# Topic-Sensitive PageRank


Topic-Sensitive PageRank
Reference: Taher H. Haveliwala, "Topic-Sensitive PageRank: A Context-Sensitive Ranking Algorithm for Web Search", IEEE Trans. on Knowledge and Data Engineering, vol. 15, no. 4.

The original PageRank is based purely on the hyperlinks between web pages; page contents are not considered. A single PageRank vector is used for all web pages.

Topic-Sensitive PageRank creates one PageRank vector per topic, so each page has several PageRank values, one for each topic.

Creating a PageRank vector for each topic
How to select topics? Using a small set of topics keeps the computation cost low and the response time short. The Open Directory provides 16 top-level topics.

Original PageRank
$\mathrm{Rank} = M \times \mathrm{Rank}$. Rank is a vector with one element for each web page. $M$ is an $n \times n$ matrix: if there is a link from page $j$ to page $i$, then $M_{i,j} = 1/N_j$, where $N_j$ is the number of out-links of page $j$; otherwise $M_{i,j} = 0$.

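Another Version
Let $n$ be the total number of web pages. Let $p = [1/n]_{n \times 1}$ be a vector, and let $d$ be an $n \times 1$ vector with $d_i = 1$ if page $i$ has no out-links and $d_i = 0$ otherwise. Define $D = p \times d^T$ and $E = p \times [1]_{1 \times n}$. Then

$$M' = (1-\alpha)(M + D) + \alpha E,$$

$$\mathrm{Rank} = M' \times \mathrm{Rank} = (1-\alpha)(M + D)\,\mathrm{Rank} + \alpha p.$$

The last step uses the fact that the entries of Rank sum to 1, so $E \times \mathrm{Rank} = p$.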
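The iteration above can be sketched as a simple power method. This is a minimal illustration, not the implementation from the referenced paper: the function name `pagerank`, the adjacency-list input, and the defaults `alpha=0.15` and `iters=100` are all assumptions made for the example.

```python
def pagerank(out_links, alpha=0.15, p=None, iters=100):
    """Iterate Rank = (1 - alpha)(M + D) Rank + alpha * p.

    out_links[j] lists the distinct pages that page j links to
    (pages are numbered 0..n-1). alpha and iters are illustrative
    defaults, not values taken from the slides.
    """
    n = len(out_links)
    if p is None:
        p = [1.0 / n] * n                  # uniform vector p = [1/n]
    rank = [1.0 / n] * n
    for _ in range(iters):
        linked = [0.0] * n                 # accumulates M x Rank
        dangling = 0.0                     # accumulates d^T x Rank
        for j, targets in enumerate(out_links):
            if targets:
                share = rank[j] / len(targets)   # M[i][j] = 1/N_j
                for i in targets:
                    linked[i] += share
            else:
                dangling += rank[j]              # d_j = 1
        # D = p d^T hands the rank of pages without out-links to p.
        rank = [(1 - alpha) * (linked[i] + dangling * p[i]) + alpha * p[i]
                for i in range(n)]
    return rank
```

On a toy graph such as `pagerank([[1, 2], [2], [0], []])`, the entries keep summing to 1, and the page with no in-links ends up with the smallest value.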

Topic-Sensitive PageRank
Let $T_j$ be the set of URLs in the ODP category $c_j$. Set $p = v_j$, where $v_{j,i} = 1/|T_j|$ if page $i \in T_j$ and $v_{j,i} = 0$ otherwise. The PageRank vector for topic $c_j$ is $PR(\alpha, v_j)$. It is computed in the same way as the original PageRank, except that the uniform vector $p$ is replaced by the vector $v_j$ built from $T_j$.
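A small sketch of how the biased vector changes the computation, under stated assumptions: the helper names `topic_vector` and `biased_pagerank`, the adjacency-list input, and the values `alpha=0.25` and `iters=100` are chosen for illustration only.

```python
def topic_vector(n, topic_pages):
    """Build v_j: 1/|T_j| for pages in T_j, 0 for all other pages."""
    v = [0.0] * n
    for i in topic_pages:
        v[i] = 1.0 / len(topic_pages)
    return v


def biased_pagerank(out_links, v, alpha=0.25, iters=100):
    """Iterate Rank = (1 - alpha)(M + D) Rank + alpha * v, with p = v."""
    n = len(out_links)
    rank = list(v)
    for _ in range(iters):
        linked = [0.0] * n
        dangling = 0.0
        for j, targets in enumerate(out_links):
            if targets:
                for i in targets:
                    linked[i] += rank[j] / len(targets)
            else:
                dangling += rank[j]
        rank = [(1 - alpha) * (linked[i] + dangling * v[i]) + alpha * v[i]
                for i in range(n)]
    return rank


# Two disjoint 2-cycles; the (hypothetical) topic contains only page 0.
links = [[1], [0], [3], [2]]
ranks = biased_pagerank(links, topic_vector(4, {0}))
```

In this toy graph pages 2 and 3 are unreachable from the topic set, so they receive no rank at all, which is the intended bias toward on-topic regions of the web graph.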

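The Retrieval Score
Let $r_{j,d}$ be the PageRank of document $d$ in the vector $PR(\alpha, v_j)$. The score of document $d$ for query $q$ is

$$S_{q,d} = \sum_j P(c_j \mid q) \cdot r_{j,d},$$

where $P(c_j \mid q)$ is the score indicating how strongly topic $c_j$ is related to $q$.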
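The weighted sum can be written in a few lines. The function name `query_score` and the numbers in the usage line are made up for illustration; how $P(c_j \mid q)$ is estimated is outside this sketch.

```python
def query_score(topic_probs, topic_ranks, d):
    """S_{q,d} = sum over topics j of P(c_j | q) * r_{j,d}.

    topic_probs[j] is P(c_j | q); topic_ranks[j][d] is r_{j,d}.
    """
    return sum(prob * ranks[d] for prob, ranks in zip(topic_probs, topic_ranks))


# Two topics and two documents, with invented values.
score = query_score([0.75, 0.25], [[0.5, 0.5], [0.25, 0.75]], d=1)
```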

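Similarity Measures for Induced Rankings
Let $\tau_1$ and $\tau_2$ be two rankings of documents. $OSim(\tau_1, \tau_2)$ indicates the degree of overlap between the top $k$ URLs of the two rankings:

$$OSim(\tau_1, \tau_2) = \frac{|A \cap B|}{k},$$

where $A$ and $B$ are the sets of top $k$ URLs of $\tau_1$ and $\tau_2$. $KSim$ measures agreement on relative order:

$$KSim(\tau_1, \tau_2) = \frac{|\{(u, v) : \tau_1 \text{ and } \tau_2 \text{ agree on the order of } (u, v),\ u \neq v\}|}{|U|\,(|U| - 1)},$$

where $U$ is the set of URLs appearing in either ranking. Let $\tau$ be the true ranking given by the user. To compare $\tau_1$ and $\tau_2$, we can use $OSim(\tau_1, \tau_2)$ or $KSim(\tau_1, \tau_2)$.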
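Both measures are easy to compute for small rankings. The sketch below makes two assumptions that the slide does not spell out: a URL missing from one ranking is placed after all URLs that ranking contains, and a pair counts as agreeing only when both rankings order it strictly and in the same direction. The function names are illustrative.

```python
from itertools import permutations


def osim(r1, r2, k):
    """Overlap of the top-k URLs of two rankings: |A & B| / k."""
    return len(set(r1[:k]) & set(r2[:k])) / k


def _positions(ranking, universe):
    # URLs absent from a ranking are placed after all ranked URLs (assumption).
    pos = {u: i for i, u in enumerate(ranking)}
    return {u: pos.get(u, len(ranking)) for u in universe}


def ksim(r1, r2):
    """Fraction of ordered pairs (u, v), u != v, ordered the same way by both.

    Requires at least two distinct URLs overall.
    """
    universe = set(r1) | set(r2)
    p1, p2 = _positions(r1, universe), _positions(r2, universe)
    agree = sum(1 for u, v in permutations(universe, 2)
                if (p1[u] - p1[v]) * (p2[u] - p2[v]) > 0)
    return agree / (len(universe) * (len(universe) - 1))
```

For example, the rankings `["a", "b", "c"]` and `["a", "c", "b"]` share one of their top two URLs and disagree only on the pair b, c.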

Weighted PageRank
Assign larger rank values to more important pages. Each out-link page gets a value proportional to its popularity. $W^{in}(v, u)$ is the weight of link $(v, u)$, calculated from the number of in-links of page $u$ and the number of in-links of all reference pages of page $v$:

$$W^{in}(v, u) = \frac{I_u}{\sum_{p \in R(v)} I_p},$$

where $I_u$ is the number of in-links of page $u$ and $R(v)$ is the set of all pages that $v$ points to.

Weighted PageRank

$$W^{out}(v, u) = \frac{O_u}{\sum_{p \in R(v)} O_p},$$

where $O_u$ is the number of out-links of page $u$. Let $B(u)$ be the set of pages that point to $u$. Then

$$PR(u) = (1 - d) + d \sum_{v \in B(u)} PR(v)\, W^{in}(v, u)\, W^{out}(v, u).$$

Reference: Wenpu Xing and Ali Ghorbani, "Weighted PageRank Algorithm", Proceedings of the 2nd Annual Conference on Communication Networks and Services Research (CNSR'04), 2004.
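The update rule can be sketched as a fixed-point iteration. The function name, the adjacency-list input, and the defaults `d=0.85` and `iters=50` are assumptions for illustration; in addition, when none of the pages that $v$ points to has an out-link, $W^{out}$ is undefined and this sketch simply treats it as 0.

```python
def weighted_pagerank(out_links, d=0.85, iters=50):
    """Iterate PR(u) = (1 - d) + d * sum_v PR(v) * W_in(v, u) * W_out(v, u).

    out_links[v] lists the distinct pages that page v points to, i.e. R(v).
    """
    n = len(out_links)
    in_count = [0] * n                      # I_u
    for targets in out_links:
        for u in targets:
            in_count[u] += 1
    out_count = [len(t) for t in out_links]  # O_u

    pr = [1.0] * n
    for _ in range(iters):
        new = [1.0 - d] * n
        for v, targets in enumerate(out_links):
            in_total = sum(in_count[p] for p in targets)
            out_total = sum(out_count[p] for p in targets)
            for u in targets:
                w_in = in_count[u] / in_total
                # Assumption: W_out is taken as 0 if no page in R(v) has out-links.
                w_out = out_count[u] / out_total if out_total else 0.0
                new[u] += d * pr[v] * w_in * w_out
        pr = new
    return pr


# Toy graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
pr = weighted_pagerank([[1, 2], [2], [0]])
```

In the toy graph, page 0 splits its rank unevenly: page 2 has more in-links than page 1, so it receives the larger share.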

Choices of Search Engines
Many search engines exist and compete for users. The results are not necessarily the same. Different users prefer different search engines. Search results may, in the future, be biased towards paid advertisements.

YAHOO returns, in order: City University London ("the University for business and the professions"); City University ("A University which believes in forward-thinking business and leadership skills, exposing students to the latest technology in all courses of study"); Dublin City University (Dublin 9, Ireland); City University of New York ("The University's Main Website"); City University of Hong Kong ("AD Working Group lends staff, students an ear"); Oklahoma City University ("At Oklahoma City University, our students come first").

AOL Search returns, in order: City University London ("the University for business and the professions"); City University Home ("A University which believes in forward-thinking business and leadership skills, exposing students to the latest technology in all courses of study"); Welcome to Dublin City University (Dublin 9, Ireland); an entry described as "The University's Main Website"; City University of Hong Kong ("AD Working Group lends staff, students an ear"); Welcome to Oklahoma City University ("At Oklahoma City University, our students come first").

HotBot returns, in order: (1) City University of New York (details the CUNY system's campuses and admissions requirements); (2) City University (private, nonprofit institution located in Belleview, Washington, for working professionals and lifelong learners); (3) New Jersey City University (course requirements, admissions, and financial aid information); (4) City University London ("the University for business and the professions"); (5) Oklahoma City University (campus tour and academic programs); (6) University of Missouri, Kansas City (overview of academic programs and campus news).

MetaSearch Engine
Metasearch engines are designed to increase coverage of the web by forwarding users' queries to multiple search engines, such as AlltheWeb, Google, and MSN. The results from the individual search engines are then combined into a single result set that is presented to the user.

Longest common subsequence
Definition 1: Given a sequence $X = x_1 x_2 \ldots x_m$, another sequence $Z = z_1 z_2 \ldots z_k$ is a subsequence of $X$ if there exists a strictly increasing sequence $i_1, i_2, \ldots, i_k$ of indices of $X$ such that for all $j = 1, 2, \ldots, k$ we have $x_{i_j} = z_j$.

Example 1: If X=abcdefg, then Z=abdg is a subsequence of X (take the indices 1, 2, 4, 7).

Definition 2: Given two sequences X and Y, a sequence Z is a common subsequence of X and Y if Z is a subsequence of both X and Y.

Example 2: Let X=abcdefg and Y=aaadgfd. Then Z=adf is a common subsequence of X and Y.
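Definitions 1 and 2 translate directly into a short check. This is a minimal sketch; the function names are illustrative, and the scan relies on the fact that a membership test on an iterator consumes it up to the first match, which enforces the strictly increasing indices.

```python
def is_subsequence(z, x):
    """True if z can be obtained from x by deleting zero or more letters."""
    remaining = iter(x)
    # Each `ch in remaining` resumes the scan after the previous match.
    return all(ch in remaining for ch in z)


def is_common_subsequence(z, x, y):
    """True if z is a subsequence of both x and y."""
    return is_subsequence(z, x) and is_subsequence(z, y)
```

With the sequences from the examples, `is_subsequence("abdg", "abcdefg")` and `is_common_subsequence("adf", "abcdefg", "aaadgfd")` both hold, while `is_subsequence("ba", "abc")` does not because the order is violated.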

Definition 3: A longest common subsequence (LCS) of X and Y is a common subsequence of X and Y of maximum length. (The length of a sequence is the number of letters in the sequence.) A longest common subsequence may not be unique.

Example: For abcd and acbd, both acd and abd are LCSs.

Longest common subsequence problem
Input: Two sequences $X = x_1 x_2 \ldots x_m$ and $Y = y_1 y_2 \ldots y_n$. Output: a longest common subsequence of X and Y.

A brute-force approach: Suppose that $m \le n$. Try all subsequences of X (there are $2^m$ of them), test whether each one is also a subsequence of Y, and select the longest one that is.
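The brute-force approach can be sketched as follows. One deliberate variation, stated as an assumption: instead of enumerating all subsequences and keeping the longest, this sketch tries the longest candidates first and stops at the first hit, which returns the same kind of answer. It is only practical for very short inputs.

```python
from itertools import combinations


def lcs_brute_force(x, y):
    """Return one LCS of x and y by enumerating subsequences of x."""
    def is_subsequence(z, s):
        remaining = iter(s)
        return all(ch in remaining for ch in z)

    # Candidates are generated from longest to shortest.
    for length in range(len(x), -1, -1):
        for indices in combinations(range(len(x)), length):
            candidate = "".join(x[i] for i in indices)
            if is_subsequence(candidate, y):
                return candidate
    return ""
```

Calling `lcs_brute_force("abcd", "acbd")` yields one of the two LCSs from the earlier example; which one depends on the enumeration order.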

Characterizing a longest common subsequence
Theorem (Optimal substructure of an LCS): Let $X = x_1 x_2 \ldots x_m$ and $Y = y_1 y_2 \ldots y_n$ be two sequences, and let $Z = z_1 z_2 \ldots z_k$ be any LCS of X and Y.
1. If $x_m = y_n$, then $z_k = x_m = y_n$ and Z[1..k-1] is an LCS of X[1..m-1] and Y[1..n-1].
2. If $x_m \neq y_n$, then $z_k \neq x_m$ implies that Z is an LCS of X[1..m-1] and Y.
3. If $x_m \neq y_n$, then $z_k \neq y_n$ implies that Z is an LCS of X and Y[1..n-1].

The recursive equation
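Let c[i,j] be the length of an LCS of X[1..i] and Y[1..j]. c[i,j] can be computed as follows:

$$c[i,j] = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0, \\ c[i-1,j-1] + 1 & \text{if } i, j > 0 \text{ and } x_i = y_j, \\ \max\{c[i,j-1],\ c[i-1,j]\} & \text{if } i, j > 0 \text{ and } x_i \neq y_j. \end{cases}$$

Computing the length of an LCS: There are only $n \times m$ values c[i,j], so we can compute them in a specific order, such that every value is available before it is needed.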
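The recursive equation can be evaluated directly if repeated subproblems are cached. This is a minimal sketch with an illustrative function name; it uses 0-based Python indexing, so $x_i$ is `x[i - 1]`, and it is meant for short inputs because of Python's recursion limit.

```python
from functools import lru_cache


def lcs_length(x, y):
    """Length of an LCS of x and y, following the recursive equation."""
    @lru_cache(maxsize=None)
    def c(i, j):
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:
            return c(i - 1, j - 1) + 1
        return max(c(i, j - 1), c(i - 1, j))

    return c(len(x), len(y))
```

The cache ensures each pair (i, j) is computed once, so the work is proportional to the number of table entries rather than exponential.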

The algorithm to compute an LCS
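1. for i=1 to m do c[i,0]=0;
2. for j=0 to n do c[0,j]=0;
3. for i=1 to m do
   for j=1 to n do {
   if x[i]==y[j] then { c[i,j]=c[i-1,j-1]+1; b[i,j]=1; }
   else if c[i-1,j]>=c[i,j-1] then { c[i,j]=c[i-1,j]; b[i,j]=2; }
   else { c[i,j]=c[i,j-1]; b[i,j]=3; }
   }

Here b[i,j] records which case produced c[i,j]: 1 for a match (diagonal), 2 for c[i-1,j] (up), and 3 for c[i,j-1] (left).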

Example 3: X=BDCABA and Y=ABCBDAB.

Constructing an LCS (back-tracking)
We can find an LCS using the b[i,j] values. We start at b[m,n] and trace back until we reach some cell b[0,j] or b[i,0].

The algorithm to construct an LCS:
1. i=m;
2. j=n;
3. if i==0 or j==0 then exit;
4. if b[i,j]==1 then { print x[i]; i=i-1; j=j-1; }
5. else if b[i,j]==2 then i=i-1;
6. else j=j-1;
7. goto step 3.

The letters are printed in reverse order. The time complexity is O(nm), dominated by filling the table; the back-tracking itself takes O(n+m) steps.
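Table filling and back-tracking can be combined into one short function. This is a sketch with an illustrative function name and 0-based string indexing; it collects the letters in a list and reverses them at the end instead of printing them.

```python
def lcs(x, y):
    """Return one LCS of x and y using the c and b tables."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]   # LCS lengths
    b = [[0] * (n + 1) for _ in range(m + 1)]   # 1 = diagonal, 2 = up, 3 = left
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = 1
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = 2
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = 3

    # Trace back from b[m][n] until a border row or column is reached.
    letters = []
    i, j = m, n
    while i > 0 and j > 0:
        if b[i][j] == 1:
            letters.append(x[i - 1])
            i -= 1
            j -= 1
        elif b[i][j] == 2:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(letters))


# The sequences from Example 3.
result = lcs("BDCABA", "ABCBDAB")
```

For the sequences of Example 3 the result has length 4. Which of the possible LCSs is returned depends on how ties are broken in the `>=` comparison.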

Shortest common supersequence
Definition: Let X and Y be two sequences. A sequence Z is a common supersequence of X and Y if both X and Y are subsequences of Z.

Shortest common supersequence problem: Input: Two sequences X and Y. Output: a shortest common supersequence of X and Y.

Example: X=abc and Y=abb. Both abbc and abcb are shortest common supersequences of X and Y.

Recursive equation for the shortest common supersequence
Let c[i,j] be the length of a shortest common supersequence of X[1..i] and Y[1..j]. c[i,j] can be computed as follows:

$$c[i,j] = \begin{cases} j & \text{if } i = 0, \\ i & \text{if } j = 0, \\ c[i-1,j-1] + 1 & \text{if } i, j > 0 \text{ and } x_i = y_j, \\ \min\{c[i,j-1] + 1,\ c[i-1,j] + 1\} & \text{if } i, j > 0 \text{ and } x_i \neq y_j. \end{cases}$$
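The supersequence recurrence fills a table in the same way as the LCS one. The sketch below assumes c[i,j] denotes the length of a shortest common supersequence of the two prefixes, with one extra letter added in both non-border cases; the function name is illustrative.

```python
def scs_length(x, y):
    """Length of a shortest common supersequence of x and y."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        c[i][0] = i                 # only x[:i] has to be covered
    for j in range(n + 1):
        c[0][j] = j                 # only y[:j] has to be covered
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1   # one shared letter
            else:
                c[i][j] = min(c[i][j - 1], c[i - 1][j]) + 1
    return c[m][n]
```

For X=abc and Y=abb this gives 4, matching the example supersequences abbc and abcb.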