Center-Piece Subgraphs: Problem definition and Fast Solutions Hanghang Tong Christos Faloutsos Carnegie Mellon University.


1 Center-Piece Subgraphs: Problem definition and Fast Solutions Hanghang Tong Christos Faloutsos Carnegie Mellon University

2 Center-Piece Subgraph (Ceps)
Given Q query nodes, find the center-piece
Input of Ceps
–Q query nodes
–Budget b
–K_SoftAnd coefficient k
Applications
–Social network
–Law enforcement
–Gene network
–…

3 Challenges in Ceps
Q1: How to measure importance?
Q2: How to extract the connection subgraph?
Q3: How to do it efficiently?

4 Roadmap
Ceps Overview
Q1: Goodness Score Calculation
Q2: “Extract” Alg.
Q3: Efficiency Issue
Experimental Results
Conclusion

5 Ceps Overview
Individual Score Calculation
–Measure importance wrt. each individual query node
Combine Individual Scores
–Measure importance wrt. the query set
“Extract” Alg.
–… the connection subgraphs

6 Roadmap
Ceps Overview
Q1: Goodness Score Calculation
Q2: “Extract” Alg.
Q3: Efficiency Issue
Experimental Results
Conclusion

7 RWR: Individual Score Calculation
Goal
–Individual importance score r(i,j) = r_{i,j}
–For each node j wrt. each query i
How to
–Random walk with restart
–Steady-state probability

8 An Illustrating Example
[Figure: example graph with nodes 1 to 13]
Starting from node 1
–Move randomly to a neighbor
–With some probability p, return to node 1
Score of node j: Prob(the random walk will finally stay at j)
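The walk described above can be sketched as a power iteration. This is a minimal illustration, not the authors' implementation: the function name `rwr_scores`, the column normalization of the adjacency matrix, the default restart probability of 0.5, and the 3-node toy graph are all assumptions made for the example.

```python
import numpy as np

def rwr_scores(adj, query, restart=0.5, tol=1e-10, max_iter=1000):
    """Steady-state probabilities of a random walk with restart from `query`.

    `restart` is the probability of jumping back to the query node at each
    step (an assumed value; the slides only say "some p").
    """
    adj = np.asarray(adj, dtype=float)
    # Column-normalize: column j holds the transition probabilities out of j.
    # Assumes every node has at least one edge.
    W = adj / adj.sum(axis=0)
    e = np.zeros(adj.shape[0])
    e[query] = 1.0                      # restart vector: all mass on the query
    r = e.copy()
    for _ in range(max_iter):
        r_next = (1 - restart) * (W @ r) + restart * e
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r

# Toy path graph 0 - 1 - 2, walking from node 0.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
r = rwr_scores(adj, query=0)
print(r)  # scores decrease with distance from the query node
```

Running one such walk per query node gives one score column per query, which is the shape of the individual-score table on the following slide.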

9 Individual Score Calculation
          Q1      Q2      Q3
Node 1    0.5767  0.0088  0.0088
Node 2    0.1235  0.0076  0.0076
Node 3    0.0283  0.0283  0.0283
Node 4    0.0076  0.1235  0.0076
Node 5    0.0088  0.5767  0.0088
Node 6    0.0076  0.0076  0.1235
Node 7    0.0088  0.0088  0.5767
Node 8    0.0333  0.0024  0.1260
Node 9    0.1260  0.0024  0.0333
Node 10   0.1260  0.0333  0.0024
Node 11   0.0333  0.1260  0.0024
Node 12   0.0024  0.1260  0.0333
Node 13   0.0024  0.0333  0.1260

11 AND: Combine Scores
Q: How to combine scores?
A: Multiply
… = prob. that 3 random particles coincide on node j

12 K_SoftAnd: Combine Scores
Generalization: K_SoftAnd
We want nodes close to k of the Q query nodes (k < Q).
Q: How to do that?

13 K_SoftAnd: Combine Scores
Generalization: K_SoftAnd
We want nodes close to k of the Q query nodes (k < Q).
Q: How to do that?
A: Prob(at least k-out-of-Q will meet each other at j)

14 K_SoftAnd: Relaxation of AND
Asking an AND query? No answer!
–Disconnected communities
–Noise

15 K_SoftAnd: Combine Score
Goal
–Importance wrt. the query set
–Depends on the query scenario!
How to…
–Meeting probability
–K_SoftAnd
–Prob(at least k-out-of-Q will meet each other at j)
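The meeting probability "at least k-out-of-Q" can be sketched as follows, assuming the Q random particles move independently, which is what multiplying scores in the AND case implies. The function name `k_softand` is hypothetical; the three input scores are the first three individual scores shown in the slides.

```python
def k_softand(probs, k):
    """Probability that at least k of the independent particles are at the node.

    `probs[i]` is the individual score of the node wrt. query i.
    """
    # dist[m] = probability that exactly m of the particles seen so far
    # sit on the node; updated one particle at a time.
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for m, mass in enumerate(dist):
            nxt[m] += mass * (1 - p)      # this particle is elsewhere
            nxt[m + 1] += mass * p        # this particle is on the node
        dist = nxt
    return sum(dist[k:])

scores = [0.5767, 0.0088, 0.0088]         # one node's scores wrt. Q1, Q2, Q3
and_score = k_softand(scores, 3)          # AND: all three meet, about 4.47e-05
soft2_score = k_softand(scores, 2)        # 2_SoftAnd: about 0.0101
or_score = k_softand(scores, 1)           # 1_SoftAnd (OR): about 0.584
print(and_score, soft2_score, or_score)
```

With k = Q the result reduces to the plain product (the AND query), and with k = 1 it reduces to 1 minus the probability that no particle is on the node (the OR query).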

16 AND query vs. K_SoftAnd query
[Figure panels: AND query; 2_SoftAnd query; scale label: x 1e-4]

17 1_SoftAnd query = OR query

18 Measuring Importance
Individual scores: random walk with restart (steady-state prob.)
          Q1      Q2      Q3
Node 1    0.5767  0.0088  0.0088
Node 2    0.1235  0.0076  0.0076
Node 3    0.0283  0.0283  0.0283
Node 4    0.0076  0.1235  0.0076
Node 5    0.0088  0.5767  0.0088
Node 6    0.0076  0.0076  0.1235
Node 7    0.0088  0.0088  0.5767
Node 8    0.0333  0.0024  0.1260
Node 9    0.1260  0.0024  0.0333
Node 10   0.1260  0.0333  0.0024
Node 11   0.0333  0.1260  0.0024
Node 12   0.0024  0.1260  0.0333
Node 13   0.0024  0.0333  0.1260
Combining scores: K_SoftAnd (meeting prob.); labels shown: OR, And, 2_SoftAnd
And:       0.4505 0.0710 0.2267 0.0710 0.4505 0.0710 0.4505 0.1010
2_SoftAnd: 0.0103 0.0019 0.0103 0.0019 0.0103 0.0019 0.0024 0.0046

19 Roadmap
Ceps Overview
Q1: Goodness Score Calculation
Q2: “Extract” Alg.
Q3: Efficiency
Experimental Results
Conclusion

20 “Extract” Alg.
Goal
–Maximize total scores and
–‘Appropriate’ connections
How to…
–Dynamic programming
–Greedy alg.: pick up a promising node, then find the ‘best’ path
[Figure: two example graphs, with nodes 1 to 16 and nodes 1 to 13]
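The greedy loop named above (pick a promising node, then find a "best" path to it) might look like the sketch below. It is a simplified stand-in, not the “Extract” algorithm itself: the slides mention dynamic programming for the path step, whereas this sketch swaps in Dijkstra's algorithm and defines the "best" path as the one maximizing the product of node scores. The function names, the adjacency-dict graph format, the budget semantics (nodes added beyond the query nodes), and the toy data are all assumptions.

```python
import heapq
import math

def best_path(graph, scores, source, target):
    """Path from source to target maximizing the product of node scores.

    Implemented as Dijkstra with cost -log(score) per visited node
    (an assumed notion of 'best'; scores must lie in (0, 1]).
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, math.inf):
            continue                      # stale heap entry
        for v in graph[u]:
            nd = d - math.log(scores[v])
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return []                         # target unreachable from source
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

def extract(graph, scores, queries, budget):
    """Greedily grow a subgraph around the query nodes within the budget."""
    chosen = set(queries)
    limit = len(queries) + budget         # budget counts non-query nodes
    # Promising nodes first: highest combined score.
    candidates = sorted((n for n in graph if n not in chosen),
                        key=lambda n: scores[n], reverse=True)
    for node in candidates:
        if len(chosen) >= limit:
            break
        if node in chosen:
            continue
        new_nodes = set()
        for q in queries:                 # connect the node to every query
            new_nodes.update(best_path(graph, scores, q, node))
        new_nodes -= chosen
        if len(chosen) + len(new_nodes) <= limit:
            chosen |= new_nodes
    return chosen

# Toy graph: node 0 sits between queries 1 and 2; nodes 3 and 4 form a detour.
graph = {0: [1, 2], 1: [0, 3], 2: [0, 4], 3: [1, 4], 4: [2, 3]}
scores = {0: 0.5, 1: 0.9, 2: 0.9, 3: 0.1, 4: 0.1}
subgraph = extract(graph, scores, queries=[1, 2], budget=1)
print(subgraph)  # {0, 1, 2}
```

In this toy case the single budgeted slot goes to node 0, the highest-scoring node that links both queries.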

21 Roadmap
Ceps Overview
Q1: Goodness Score Calculation
Q2: “Extract” Alg.
Q3: Efficiency
Experimental Results
Conclusion

22 Graph Partition: Efficiency Issue
Straightforward way
–Q linear systems:
–Linear in the number of edges
Observation
–Skewed distribution
How to…
–Graph partition
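The equation for the Q linear systems did not survive in this transcript. As a hedged reminder only, a common way to write random walk with restart as a linear system is shown below; the symbols are assumptions and not taken from the slide: \tilde{W} is the column-normalized adjacency matrix, c the restart probability, e_i the indicator vector of query node i, and r_i the score vector for that query.

```latex
% One system per query node i = 1, ..., Q (assumed notation)
r_i = (1 - c)\,\tilde{W}\, r_i + c\, e_i
\quad\Longleftrightarrow\quad
\bigl(I - (1 - c)\,\tilde{W}\bigr)\, r_i = c\, e_i
```

Solving this iteratively touches every edge once per iteration, which matches the "linear in the number of edges" remark on the slide.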

23 Roadmap
Ceps Overview
Q1: Goodness Score Calculation
Q2: “Extract” Alg.
Q3: Efficiency Issue
Experimental Results
Conclusion

24 Experimental Setup
Dataset
–DBLP/authorship
–Author-Paper
–315k nodes
–1,800k edges
Evaluation Criteria
–I Node Ratio
–I Edge Ratio

25 Experimental Setup
We want to check
–Do the goodness criteria make sense?
–Does the “extract” alg. capture most of the important nodes/edges?
–Efficiency

26 Case Study: AND query

27 2_SoftAnd query
[Figure labels: Statistic; database]

28 Evaluation of “Extract” Alg.
[Figure axes: Budget (b), Node Ratio; series: 2 query nodes, 3 query nodes]
With 20 nodes, 90%+ is preserved

29 Running Time vs. Quality for Fast Ceps
[Figure axes: Running Time, Quality]
–~90% quality
–6:1 speedup

30 Roadmap
Ceps Overview
Q1: Goodness Score Calculation
Q2: “Extract” Alg.
Q3: Efficiency Issue
Experimental Results
Conclusion

31 Conclusion
Q1: How to measure importance?
A1: RWR + K_SoftAnd
Q2: How to find the connection subgraph?
A2: “Extract” Alg.
Q3: How to do it efficiently?
A3: Graph partition (Fast Ceps)
–~90% quality
–6:1 speedup

32 Q&A
Thank you!
htong@cs.cmu.edu

