Link analysis (Chapter 21) Definition 1 (Markov Transition Matrix) Given the graph ๐บ = (๐‘‰, ๐ธ) and teleport probability ๐›ผ, let ๐‘ = |๐‘‰ | and ๐ด be the ๐‘ ร— ๐‘ link matrix with elements โˆ€๐‘ข, ๐‘ฃ โˆˆ ๐‘‰ : ๐ด ๐‘ข๐‘ฃ = {๏ธƒ 1 (๐‘ข, ๐‘ฃ) โˆˆ ๐ธ 0 otherwise The transition probability matrix ๐‘ƒ is then calculated in the following way: 1. If a row of ๐ด has all 0s, then substitute all of them with 1s. 2. Divide each 1 by the number of 1s in that row. 3. Multiply each entry by 1 โˆ’ ๐›ผ. 4. Add ๐›ผ ๐‘ to each entry. Algorithm 1 (PageRank) 1: function PageRank(๐‘ƒ) 2: ๐‘– โ† 0 3: โˆ’โ†’๐‘ฅ ๐‘– โ† (1, 0, . . . , 0) 4: repeat 5: โˆ’โ†’๐‘ฅ ๐‘–+1 โ† โˆ’โ†’๐‘ฅ ๐‘– ยท ๐‘ƒ 6: ๐‘– โ† ๐‘– + 1 7: until ๐‘ฅ๐‘– = ๐‘ฅ๐‘–โˆ’1 8: end function Definition 2 (Hubs and authorities) Given the link matrix ๐ด, let โ„Ž(๐‘ฃ) denote the hub score and ๐‘Ž(๐‘ฃ) the authority score. First, set the โ„Ž(๐‘ฃ) a ๐‘Ž(๐‘ฃ) vectors to 1 ๐‘ for all vertices ๐‘ฃ โˆˆ ๐‘‰ . The scores are calculated as โ„Ž(๐‘ฃ) = ๐ด ยท ๐‘Ž(๐‘ฃ) ๐‘Ž(๐‘ฃ) = ๐ด ๐‘‡ ยท โ„Ž(๐‘ฃ) which is equivalent to โ„Ž(๐‘ฃ) = ๐ด ยท ๐ด ๐‘‡ ยท โ„Ž(๐‘ฃ) ๐‘Ž(๐‘ฃ) = ๐ด ๐‘‡ ยท ๐ด ยท ๐‘Ž(๐‘ฃ) Exercise 21/1 Assume the web graph ๐บ = (๐‘‰ = {๐‘Ž, ๐‘, ๐‘}, ๐ธ = {(๐‘Ž, ๐‘), (๐‘Ž, ๐‘), (๐‘, ๐‘), (๐‘, ๐‘)}). Count PageRank, hub and authority scores for each of the web pages. Rank the pages by the individual scores and observe the connections. You can assume that in each step of the random walk we teleport to a random page with probability 0.1 and uniform distribution. Normalize the hub and authority scores so that the maximum is 1. 1