© 2010 IBM Corporation Diversified Ranking on Large Graphs: An Optimization Viewpoint Hanghang Tong, Jingrui He, Zhen Wen, Ching-Yung Lin, Ravi Konuru.


1 Diversified Ranking on Large Graphs: An Optimization Viewpoint
Hanghang Tong, Jingrui He, Zhen Wen, Ching-Yung Lin, Ravi Konuru
KDD 2011, August 21-24, San Diego, CA
© 2010 IBM Corporation

2 Background: Why Diversity?
• A1: Uncertainty & ambiguity in an information need
  – Case 1: Uncertainty from the query
  – Case 2: Uncertainty from the user

3 Background: Why Diversity? (cont.)
• A2: Uncertainty & ambiguity of an information need
  – C1: Product search → want different reviews
  – C2: Political issue debate → desire different opinions
  – C3: Legal search → get an overview of a topic
  – C4: Team assembling → find a set of relevant & diversified experts
• A3: Become a better and safer employee
  – Better: A 1% increase in diversity → an additional $886 of monthly revenue
  – Safer: A 1% increase in diversity → an increase of 11.8% in job retention

4 Problem Definitions & Challenges
• Problem 1 (Evaluate/measure a given top-k ranking list)
  – Given: A large graph A, the query vector p, the damping factor c, and a subset of k nodes S;
  – Measure: the goodness of the subset of nodes S by a single number in terms of (a) the relevance of each node in S wrt the query vector p, and (b) the diversity among all the nodes in the subset S.
• Problem 2 (Find a near-optimal top-k ranking list)
  – Given: A large graph A, the query vector p, the damping factor c, and the budget k;
  – Find: A subset of k nodes S that maximizes the goodness measure f(S).
• Challenges
  – (for Prob. 1) No existing measure encoding both relevance and diversity
  – (for Prob. 2) Subset-level optimization

5 Our Solutions (10 seconds introduction!)
• Problem 1 (Evaluate/measure a given top-k ranking list)
  – A1: A weighted sum between relevance and similarity
• Problem 2 (Find a near-optimal top-k ranking list)
  – A2: A greedy algorithm (near-optimal, linear scalability)
[Figure: the goodness measure, annotated with its weight, relevance, and diversity terms]

6 Measure Relevance (r) by RWR (a.k.a. Personalized PageRank)
$r = c A r + (1-c) e$
  – r: ranking vector (n x 1)
  – A: adjacency matrix (n x n)
  – e: starting vector (n x 1)
  – 1-c: restart probability
[Figure: example graph with nodes 1-12]
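The RWR equation on this slide can be solved by simple fixed-point iteration. The sketch below is only an illustration, not the authors' implementation: the function name `rwr`, the 3-node path graph, and the choice c = 0.5 are assumptions made for the example, and A is assumed to be column-normalized (each column sums to 1) so that the entries of r sum to 1.

```python
def rwr(A, e, c=0.9, iters=200, tol=1e-12):
    """Iterate r <- c*A*r + (1-c)*e until r stops changing (hypothetical helper)."""
    n = len(e)
    r = list(e)
    for _ in range(iters):
        new_r = [c * sum(A[i][j] * r[j] for j in range(n)) + (1 - c) * e[i]
                 for i in range(n)]
        # Stop once the largest per-node change is below the tolerance.
        if max(abs(x - y) for x, y in zip(new_r, r)) < tol:
            return new_r
        r = new_r
    return r


# Assumed toy graph: path 0 - 1 - 2, column-normalized adjacency matrix.
A = [[0.0, 0.5, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 0.5, 0.0]]
e = [1.0, 0.0, 0.0]          # all restart mass on query node 0
r = rwr(A, e, c=0.5)
print(r)                      # relevance scores, highest for the query node
```

For this toy graph the fixed point can be checked by hand: r = (7/12, 1/3, 1/12), which decreases with distance from the query node.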

7 Diversity ~ reverse of weighted similarity on the personalized graph
$r = c A r + (1-c) e = [c A + (1-c) e 1'] r = B r$ (using $1' r = 1$, i.e. the entries of r sum to 1)
  – B: personalized graph (a.k.a. 'Google matrix')
  – B(i,j): how node i and node j are connected in the personalized graph
$g(S) = w \sum_{i \in S} r(i) - \sum_{i,j \in S} B(i,j) r(j)$
[Figure: example graph with nodes 1-12 and its personalized graph]
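To make the measure concrete, here is a minimal sketch that builds B = cA + (1-c) e 1' and evaluates g(S) on a tiny example. The helper names `personalized_graph` and `goodness`, the 3-node path graph, and c = 0.5 are assumptions for illustration; the vector r is the hand-computed solution of r = cAr + (1-c)e for that graph.

```python
def personalized_graph(A, e, c):
    """B = c*A + (1-c)*e*1', so B[i][j] = c*A[i][j] + (1-c)*e[i]."""
    n = len(e)
    return [[c * A[i][j] + (1 - c) * e[i] for j in range(n)] for i in range(n)]


def goodness(S, r, B, w=2.0):
    """g(S) = w * sum_{i in S} r(i) - sum_{i,j in S} B(i,j) * r(j)."""
    relevance = sum(r[i] for i in S)
    similarity = sum(B[i][j] * r[j] for i in S for j in S)
    return w * relevance - similarity


# Assumed toy graph: path 0 - 1 - 2 (column-normalized), query node 0, c = 0.5.
A = [[0.0, 0.5, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 0.5, 0.0]]
e = [1.0, 0.0, 0.0]
B = personalized_graph(A, e, c=0.5)
r = [7 / 12, 1 / 3, 1 / 12]   # hand-computed fixed point of r = cAr + (1-c)e

print(goodness([0], r, B))        # relevance of node 0 minus its self-similarity
print(goodness([0, 1, 2], r, B))  # goodness of the full node set
```

Because every column of B sums to 1 here, the similarity term for the full node set equals the sum of r, so g of the full set is w - 1.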

8 Properties of g(S): Why is it a Good Measure?
• P1: g(S) = 0 for an empty set S
• P2: g(S) is sub-modular for any w > 0
• P3: g(S) is monotonically non-decreasing for any w >= 2
• A greedy algorithm (Dragon) leads to a near-optimal solution (for any w >= 2)
  – Quality: $g(S) \ge (1 - 1/e) g(S^*)$, where S is the subset returned by the greedy algorithm and S* is the optimal subset maximizing g(S)
  – Complexity: O(m) for both time and space
Footnote: Dragon stands for Diversified Ranking on Graph: An Optimization Viewpoint
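The greedy idea can be sketched as follows: repeatedly add the node with the largest marginal gain in g(S). This is a naive illustration that re-evaluates g(S) from scratch for every candidate; it is not the authors' O(m) Dragon implementation. The function names, the 3-node toy data, and w = 2 are assumptions for the example.

```python
def goodness(S, r, B, w=2.0):
    """g(S) = w * sum_{i in S} r(i) - sum_{i,j in S} B(i,j) * r(j)."""
    return (w * sum(r[i] for i in S)
            - sum(B[i][j] * r[j] for i in S for j in S))


def greedy_top_k(r, B, k, w=2.0):
    """Pick k nodes, each time adding the node with the largest marginal gain."""
    chosen = []
    for _ in range(k):
        current = goodness(chosen, r, B, w)
        best, best_gain = None, float("-inf")
        for i in range(len(r)):
            if i in chosen:
                continue
            gain = goodness(chosen + [i], r, B, w) - current
            if gain > best_gain:
                best, best_gain = i, gain
        chosen.append(best)
    return chosen


# Assumed toy data: path graph 0 - 1 - 2, query node 0, c = 0.5.
# B = c*A + (1-c)*e*1' and r solves r = B r (both worked out by hand).
B = [[0.5, 0.75, 0.5],
     [0.5, 0.0, 0.5],
     [0.0, 0.25, 0.0]]
r = [7 / 12, 1 / 3, 1 / 12]

picked = greedy_top_k(r, B, k=2)
print(picked, goodness(picked, r, B))
```

On this toy input the query node is picked first. The second pick has two equally good candidates, and either choice reaches the best goodness value attainable with two nodes.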

9 Experimental Results
[Figures: An Illustrative Example; Compare w/ alternative choices; Quality-Time Balance; Scalability. Axis labels: Quality, Budget, Time, Opt. Quality]

10 Conclusion
• Problem 1 (Evaluate/measure a given top-k ranking list)
  – A1: A weighted sum between relevance and similarity
• Problem 2 (Find a near-optimal top-k ranking list)
  – A2: A greedy algorithm (near-optimal, linear scalability)
• Contact: Hanghang Tong (htong@us.ibm.com)

11 Academic Literature: More Detailed Comparison
[Table: comparison with [6] and [7]]
This disclosure proposes:
(1) For Problem 1: the first measure that combines both relevance & diversity
(2) For Problem 2: the first method that (a) leads to a near-optimal solution with (b) linear complexity

