RoundTripRank: Graph-based Proximity with Importance and Specificity


1 RoundTripRank: Graph-based Proximity with Importance and Specificity
Yuan Fang (Univ. of Illinois at Urbana-Champaign), Kevin C.-C. Chang (Univ. of Illinois at Urbana-Champaign), Hady W. Lauw (Singapore Management University)
ICDE 2013 @ Brisbane, Australia, April 10, 2013
arise.adsc.com.sg

2 Graph-based proximity has many applications with different ranking needs
[Figure: three example graphs (citation graph, social network, query log graph), each with “?” marks on it. Node labels include paper 1, paper 2, paper 3, author 1, author 2, VLDB, ICDE, “data”, “spatio”, “xml”, “temporal”, “pear”, “apple”, “apple inc”.]

3 Although various applications involve different needs, ranking by existing graph proximity is limited
Query: “spatio”, “temporal”, “data”
Matching venues by P-PageRank: SIGMOD Intl Conference, VLDB Intl Conference, ICDE Intl Conference
Looks reasonable? What’s missing?
The results favor very popular or important venues, and they are only categorically related as data topics (“schema”, “matching”?).

4 Other venues are needed for different purposes
Query: “spatio”, “temporal”, “data”
More specific venues? Spatio-Temporal Databases (Springer Book), Spatio-Temporal Data Mining (Intl Workshop), Temporal Aspects in Information Systems (Working Conference)
A balanced mixture of venues? VLDB (Intl Conference), Spatio-Temporal Databases (Springer Book), ACM SIGSPATIAL/GIS (Intl Conference)
Venue labels: important, specific, balanced
Example purposes: quick background study, report preliminary results

5 Specificity has been traditionally ignored
[Table: existing measures organized by semantics (closeness, importance, specificity) and by methodology (reachability, ad-hoc).]
Closeness: Common neighbor; Jaccard coefficient [Jaccard1901]; AdamicAdar [Adamic2003]; Hitting time; Escape probability [Koren2006, Tong2007]; SimRank [Jeh2002]; Katz [Katz1953]
Importance: P-PageRank [Page1999]; ObjectRank [Balmin2004]; PopRank [Nie2005]
Specificity: InvObjectRank; Inverse global ObjectRank; Inverse node degree [Hristidis2008]
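Three of the closeness measures named on this slide have short textbook definitions. The sketch below computes common neighbors, the Jaccard coefficient, and the Adamic-Adar score on a made-up toy bibliographic graph. The graph, its node names, and the function names are illustrative assumptions and do not come from the talk.

```python
import math

# Hypothetical undirected toy graph: node -> set of neighbors.
GRAPH = {
    "paper 1": {"spatio", "STDB"},
    "paper 2": {"spatio", "ICDE"},
    "paper 3": {"spatio", "ICDE"},
    "paper 4": {"cache", "ICDE"},
    "paper 5": {"lock", "ICDE"},
    "spatio": {"paper 1", "paper 2", "paper 3"},
    "cache": {"paper 4"},
    "lock": {"paper 5"},
    "STDB": {"paper 1"},
    "ICDE": {"paper 2", "paper 3", "paper 4", "paper 5"},
}


def common_neighbors(graph, u, v):
    """Number of nodes adjacent to both u and v."""
    return len(graph[u] & graph[v])


def jaccard(graph, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = graph[u] | graph[v]
    return len(graph[u] & graph[v]) / len(union) if union else 0.0


def adamic_adar(graph, u, v):
    """Shared neighbors, each weighted by 1 / log(degree of that neighbor)."""
    return sum(
        1.0 / math.log(len(graph[z]))
        for z in graph[u] & graph[v]
        if len(graph[z]) > 1  # guard against log(1) = 0
    )


for pair in [("paper 1", "paper 2"), ("paper 2", "paper 3"), ("paper 4", "paper 5")]:
    print(pair, common_neighbors(GRAPH, *pair),
          round(jaccard(GRAPH, *pair), 3), round(adamic_adar(GRAPH, *pair), 3))
```

All three are symmetric in the two nodes and have no separate notion of a query and a target, which is one way to see why the slide files them under closeness rather than importance or specificity.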

6 Applications require varying degrees of trade-off between importance and specificity
Observation 1: Most tasks require both importance and specificity.
Observation 2: The desirable trade-off varies from task to task.
Finding a reviewer. Overly important: may be too broad and unaware of details. Overly specific: may be a student who lacks authoritativeness.
Choosing a venue. Purpose? To submit best work: important conferences ++. To build background: specific book chapters ++.

7 Addressing the two observations is challenging
Challenge 1: How do we unify importance & specificity into a single proximity measure? Generalize random walk based importance to integrate specificity.
Challenge 2: How do we generalize our unified model to accommodate flexible trade-offs? (Scale: more importance ↔ more specificity.)
Challenge 3: How do we efficiently compute the proximity measure? Real-time search is indispensable.

8 Challenge 1: How do we unify importance & specificity into a single proximity measure?

9 Let’s first review reachability-based importance for generalization to specificity
[Figure: paper 1 cites paper 2; paper 1 is published in ICDE (ICDE accepts paper 1). Such edges are treated as “citations” or “endorsements”.]

10 Generalize importance to specificity based on the same citation analogy
[Figure: example graph with venues ICDE and STDB (book), papers 1 to 6, and terms “cache”, “spatio”, “lock”; two “?” marks on the graph.]

11 Unify forward and backward walks into a round trip for both importance & specificity
[Slide defines, in order: random walk, round trip, target node, RoundTripRank. The formulas are not preserved in this transcript. Figure: query ↔ target.]
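Since the formulas on this slide did not survive, the following is only a sketch of one reading of the title: a surfer leaves the query, walks a random number of steps to a target node, then walks a random number of further steps, and the trip counts as a round trip only if it ends back at the query. A node's score is then the share of completed round trips that had it as the target. The geometric walk length (stop with probability `alpha` before each step), the toy graph, and every name below are assumptions made for illustration.

```python
import random
from collections import Counter

# Hypothetical undirected toy graph: node -> list of neighbors.
GRAPH = {
    "spatio": ["paper 1", "paper 2", "paper 3"],
    "cache": ["paper 4", "paper 6"],
    "lock": ["paper 5"],
    "paper 1": ["spatio", "STDB"],
    "paper 2": ["spatio", "ICDE"],
    "paper 3": ["spatio", "ICDE"],
    "paper 4": ["cache", "ICDE"],
    "paper 5": ["lock", "ICDE"],
    "paper 6": ["cache", "ICDE"],
    "STDB": ["paper 1"],
    "ICDE": ["paper 2", "paper 3", "paper 4", "paper 5", "paper 6"],
}


def geometric_walk(graph, start, alpha, rng):
    """Walk from start; before each step, stop with probability alpha."""
    node = start
    while rng.random() >= alpha:
        node = rng.choice(graph[node])
    return node


def round_trip_estimate(graph, q, alpha=0.25, trials=20000, seed=0):
    """Monte Carlo estimate: share of completed round trips per target node."""
    rng = random.Random(seed)
    hits = Counter()
    for _ in range(trials):
        target = geometric_walk(graph, q, alpha, rng)       # forward leg
        if geometric_walk(graph, target, alpha, rng) == q:   # backward leg
            hits[target] += 1                                # trip completed
    total = sum(hits.values())
    return {v: c / total for v, c in hits.items()}


R = round_trip_estimate(GRAPH, "spatio")
for node, score in sorted(R.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{node:8s} {score:.3f}")
```

In this toy graph the query node itself collects the largest share, because a zero-length trip is already a completed round trip. A target scores well only if it is both easy to reach from the query and likely to lead back to it.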

12 Challenge 2: How do we generalize our unified model to accommodate flexible trade-offs? Based on the same principle of random walk in a round trip.

13 Further generalize RoundTripRank using hybrid random surfers of different goals
Goal: balance between importance and specificity …

14 Generalize the behaviors of hybrid random surfers for customizable trade-offs
Hybrid surfers: balance, importance, specificity, …
Adjusting the composition of the surfers gives RoundTripRank+.
Example venues: SIGSPATIAL/GIS (mostly balanced), VLDB (mostly important), STDB (mostly specific)
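The transcript does not preserve how the surfer composition is turned into a score. One simple form that is consistent with the slide is a weighted geometric mean of an importance score and a specificity score, with a single bias parameter; the sketch below uses that form as an assumption. The parameter name `beta`, the function names, and all the numbers are made up, chosen only so that the three example venues behave as labeled on the slide.

```python
def trade_off_scores(importance, specificity, beta):
    """Blend two scores per node: beta = 0 is pure importance,
    beta = 1 is pure specificity, values in between mix the two."""
    raw = {
        v: (importance[v] ** (1.0 - beta)) * (specificity[v] ** beta)
        for v in importance
    }
    total = sum(raw.values())
    return {v: s / total for v, s in raw.items()}


def top(scores):
    """Node with the highest score."""
    return max(scores, key=scores.get)


# Made-up scores for the three venues on the slide (not measured values).
IMPORTANCE = {"VLDB": 0.5, "SIGSPATIAL/GIS": 0.3, "STDB": 0.1}
SPECIFICITY = {"VLDB": 0.1, "SIGSPATIAL/GIS": 0.3, "STDB": 0.5}

for beta in (0.0, 0.5, 1.0):
    print(beta, top(trade_off_scores(IMPORTANCE, SPECIFICITY, beta)))
```

With these numbers the top venue moves from VLDB to SIGSPATIAL/GIS to STDB as `beta` goes from 0 to 0.5 to 1, which mirrors the "mostly important", "mostly balanced", and "mostly specific" labels.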

15 Challenge 3: How do we efficiently compute the proximity measure?

16 Compute RoundTripRank by “divide & conquer”
[Figure: “divide” step and “conquer” step.]

17 Compute RoundTripRank by “divide & conquer”
[Figure: “divide” step and “conquer” step, continued.]
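The details of the two steps are not preserved in the transcript. A natural reading, used here as an assumption, is that a round trip splits into a forward leg (query to target) and a backward leg (target to query), so the two legs can be computed separately ("divide") and multiplied per node ("conquer"). The sketch computes both legs exactly by fixed-point iteration for walks that stop with probability `alpha` before each step. The names `forward_rank` and `backward_rank`, the value of `alpha`, and the toy graph are all hypothetical.

```python
# Hypothetical undirected toy graph: node -> list of neighbors.
GRAPH = {
    "spatio": ["paper 1", "paper 2", "paper 3"],
    "cache": ["paper 4", "paper 6"],
    "lock": ["paper 5"],
    "paper 1": ["spatio", "STDB"],
    "paper 2": ["spatio", "ICDE"],
    "paper 3": ["spatio", "ICDE"],
    "paper 4": ["cache", "ICDE"],
    "paper 5": ["lock", "ICDE"],
    "paper 6": ["cache", "ICDE"],
    "STDB": ["paper 1"],
    "ICDE": ["paper 2", "paper 3", "paper 4", "paper 5", "paper 6"],
}


def forward_rank(graph, q, alpha=0.25, iters=100):
    """f[v]: probability that a walk started at q ends at v."""
    f = {v: 0.0 for v in graph}
    for _ in range(iters):
        nxt = {v: (alpha if v == q else 0.0) for v in graph}
        for u, nbrs in graph.items():
            share = (1.0 - alpha) * f[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share  # push probability mass along each edge
        f = nxt
    return f


def backward_rank(graph, q, alpha=0.25, iters=100):
    """t[v]: probability that a walk started at v ends at q."""
    t = {v: 0.0 for v in graph}
    for _ in range(iters):
        t = {
            v: (alpha if v == q else 0.0)
            + (1.0 - alpha) * sum(t[u] for u in nbrs) / len(nbrs)
            for v, nbrs in graph.items()
        }
    return t


def combine(f, t):
    """'Conquer': multiply the two legs per node and normalize."""
    raw = {v: f[v] * t[v] for v in f}
    total = sum(raw.values())
    return {v: s / total for v, s in raw.items()}


F = forward_rank(GRAPH, "spatio")
T = backward_rank(GRAPH, "spatio")
R = combine(F, T)
for v in ("ICDE", "STDB"):
    print(v, round(F[v], 4), round(T[v], 4), round(R[v], 4))
```

In this toy graph the forward leg favors ICDE, which is reachable through two "spatio" papers, while the backward leg favors STDB, whose only paper leads straight back to "spatio". The combined score has to satisfy both legs.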

18 Top-K ranking is more practical & scalable
Full ranking [over the entire graph] vs. top-K ranking [based on a neighborhood]

19 Branch-and-bound algorithm
Neighborhood expansion … Bounds
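The slide names only the two ingredients, neighborhood expansion and bounds, so the sketch below illustrates the generic stopping rule that such bound-based top-K schemes rely on, not the authors' actual bounds. It assumes each node seen so far carries a lower and an upper bound on its score, and that one upper bound covers every node not yet seen. The top-K set is settled once the K-th best lower bound is at least as large as every other upper bound. All names and numbers are hypothetical.

```python
def decided_top_k(bounds, unseen_upper, k):
    """Return the top-k nodes if the bounds already settle them, else None.

    bounds: node -> (lower, upper) for nodes in the current neighborhood.
    unseen_upper: upper bound on the score of any node outside it.
    The returned list is ordered by lower bound; only the set is guaranteed.
    """
    ranked = sorted(bounds, key=lambda v: bounds[v][0], reverse=True)
    if len(ranked) < k:
        return None
    top, rest = ranked[:k], ranked[k:]
    kth_lower = bounds[top[-1]][0]
    best_other = max([bounds[v][1] for v in rest] + [unseen_upper])
    return top if kth_lower >= best_other else None


# Made-up bound snapshots after three successive neighborhood expansions.
ROUNDS = [
    ({"a": (0.10, 0.60), "b": (0.05, 0.50)}, 0.40),
    ({"a": (0.30, 0.45), "b": (0.20, 0.35), "c": (0.05, 0.25)}, 0.15),
    ({"a": (0.33, 0.40), "b": (0.24, 0.30), "c": (0.08, 0.20),
      "d": (0.02, 0.10)}, 0.05),
]

answer, rounds_used = None, 0
for bounds, unseen_upper in ROUNDS:
    rounds_used += 1
    answer = decided_top_k(bounds, unseen_upper, k=2)
    if answer is not None:
        break  # no further expansion can change the top-k set

print(rounds_used, answer)
```

Here the top-2 set is settled after the third expansion, without ever scoring the rest of the graph, which is the point of ranking from a neighborhood rather than the full graph.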

21 Experiments

22 Experimental Setup
Graphs: bibliographic network (BibNet), with paper, author, venue, and term nodes; query log graph (QLog), with phrase and URL nodes.
Evaluation methodology (hide-and-rediscover): Reserve nodes with known associations to the query. Remove those associations from the graph. Can a proximity measure still rank those nodes highly?
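The hide-and-rediscover protocol described on this slide can be sketched in a few lines. The example below hides a known paper-author edge, scores all nodes from the paper with a stand-in proximity measure, and reports the rank of the hidden author among the candidate authors. The stand-in is a truncated Katz-style walk count, chosen only because it is short; it is not the measure evaluated in the talk. The toy graph, `beta`, `max_len`, and the function names are assumptions.

```python
# Hypothetical undirected toy graph: node -> set of neighbors.
GRAPH = {
    "paper 1": {"alice", "spatio", "data"},
    "paper 2": {"alice", "spatio", "data"},
    "paper 3": {"carol", "data", "xml"},
    "alice": {"paper 1", "paper 2"},
    "carol": {"paper 3"},
    "spatio": {"paper 1", "paper 2"},
    "data": {"paper 1", "paper 2", "paper 3"},
    "xml": {"paper 3"},
}


def katz_scores(graph, q, beta=0.1, max_len=4):
    """Stand-in proximity: sum over walk lengths of beta**length * (#walks)."""
    scores = {v: 0.0 for v in graph}
    walks = {v: 0 for v in graph}
    walks[q] = 1  # one walk of length 0, ending at q
    for length in range(1, max_len + 1):
        nxt = {v: 0 for v in graph}
        for u, count in walks.items():
            for v in graph[u]:
                nxt[v] += count  # extend every walk ending at u by one edge
        walks = nxt
        for v, count in walks.items():
            scores[v] += (beta ** length) * count
    return scores


def hide_and_rediscover(graph, q, hidden, candidates, proximity):
    """Remove the q-hidden edge, rank candidates from q, return hidden's rank."""
    pruned = {u: set(nbrs) for u, nbrs in graph.items()}
    pruned[q].discard(hidden)
    pruned[hidden].discard(q)
    scores = proximity(pruned, q)
    ranking = sorted(candidates, key=lambda v: scores[v], reverse=True)
    return ranking.index(hidden) + 1  # 1-based rank


rank = hide_and_rediscover(GRAPH, "paper 1", "alice", ["alice", "carol"], katz_scores)
print("rank of hidden author:", rank)
```

Because the proximity function is passed in as an argument, the same harness can compare several measures on the same hidden associations.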

23 Evaluation Tasks
1. Find authors of a paper
2. Find venues of a paper
3. Find URL of a phrase
4. Find equivalent phrase
[Figure: the tasks are placed on a scale from more importance to more specificity.]

24 Both importance & specificity are needed
Quantitative evaluation (hide-and-rediscover): +8% ~ 10% NDCG
Venues matching “spatio temporal data”: important, specific, balanced
Phrases similar to “dell notebook”:
F-Rank/PPR (important): dell; dell com; dell computers
T-Rank (specific): dell c1295; battery for dell inspiron 8000; 312 0068
RoundTripRank (balanced): dell battery; battery for dell inspiron 8000; dell
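The gains on this slide are reported in NDCG. As a reminder of what that number measures, here is a minimal sketch of one common NDCG variant (gain divided by log2 of the rank plus one). The talk does not say which variant or cutoff was used, so the formula choice, the cutoff `k`, and the example rankings are assumptions.

```python
import math


def dcg(gains):
    """Discounted cumulative gain: the gain at rank i is divided by log2(i + 1)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg(ranking, relevance, k):
    """DCG of the top-k of `ranking`, normalized by the best achievable DCG.

    relevance: node -> gain; nodes missing from the dict count as 0.
    """
    gains = [relevance.get(v, 0) for v in ranking[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    best = dcg(ideal)
    return dcg(gains) / best if best > 0 else 0.0


# In hide-and-rediscover, the hidden nodes are the relevant ones.
RELEVANCE = {"alice": 1}
print(ndcg(["alice", "carol", "bob"], RELEVANCE, k=3))  # hidden node ranked first
print(ndcg(["carol", "alice", "bob"], RELEVANCE, k=3))  # hidden node ranked second
```

A ranking that puts the hidden node first scores 1.0, and every position it slips costs progressively less, so the metric rewards rediscovering hidden nodes near the top.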

25 [Figure: four panels, one per task (1. Find authors of a paper; 2. Find venues of a paper; 3. Find URL of a phrase; 4. Find equivalent phrase), with axis ticks at 0 and 1.]

26 Comparison to non-customizable dual-sensed proximity
+6% ~ 7% NDCG

27 Our top-K method is efficient & scalable
[Figure: Efficiency and Scalability panels, annotated “300 ms”, “x 7.4”, and “x 1.9”.]

28 Importance as “Reachability”
Specificity as “Returnability”
“Reachability” + “Returnability” → a Round Trip

