
1 Evaluating Reachability Queries over Path Collections*
P. Bouros 1, S. Skiadopoulos 2, T. Dalamagas 3, D. Sacharidis 3, T. Sellis 1,3
1 National Technical University of Athens
2 University of Peloponnese
3 Institute for Management of Information Systems – R.C. Athena
HDMS'09
* Long version of SSDBM'09 paper

2 Introduction (I)
Several applications store and query large collections of data sequences
– Recent advances in GIS and geoservices resulted in large volumes of routes (e.g., sequences of Points of Interest (POIs))
Route collections
– Points => nodes
– Sequences => routes

3 Introduction (II)
Web sites retain huge collections of routes
– ShareMyRoutes.com
– TravelByGPS.com
People visiting Athens
– Track their sightseeing
– Create routes of interesting places
Frequent updates
– Users upload new routes

4–5 Problem
Route collections
1. Too large to fit in main memory
2. Frequently updated, adding new routes
Reachability queries
– Q: path from Academy to Zappeion
– A: Academy -> University of Athens (change to route p2) -> Parliament -> Zappeion

6 Why not a graph-based solution?
Transform route collection P into graph G_P
1) Searching: depth-first or breadth-first search
– Low storage and maintenance cost
– Slow query evaluation
2) Encoding transitive closure
– Fast query evaluation
– Expensive precomputation, not suited to frequently updated graphs
– 2-hop [CH+02], HOPI [STW05]
– DAGs: geometric-based & partitioning 2-hop [CY+06,08], interval LB [AB+89]
– GRIPP [TL07]
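The searching baseline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names routes_to_graph and dfs_reachable are hypothetical, the collection is held in memory as a plain dictionary, and the five routes are the small example collection used in the worked examples of these slides.

```python
from collections import defaultdict

# Example route collection (route id -> node sequence), as in the slides.
ROUTES = {
    "p1": ("A", "B", "C", "D", "J"),
    "p2": ("A", "F", "D", "N", "B", "T"),
    "p3": ("N", "L", "M"),
    "p4": ("D", "N", "B", "F", "K"),
    "p5": ("A", "F", "K"),
}


def routes_to_graph(routes):
    """Build G_P: one directed edge per pair of consecutive nodes in a route."""
    graph = defaultdict(set)
    for nodes in routes.values():
        for u, v in zip(nodes, nodes[1:]):
            graph[u].add(v)
    return graph


def dfs_reachable(graph, source, target):
    """Plain iterative depth-first search over single graph edges."""
    stack, visited = [source], {source}
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in visited:
                visited.add(nxt)
                stack.append(nxt)
    return False


graph = routes_to_graph(ROUTES)
print(dfs_reachable(graph, "F", "C"))
print(dfs_reachable(graph, "M", "A"))
```

The graph needs only the adjacency lists, which is why storage and maintenance are cheap; the search, however, advances one edge at a time.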

7 Outline
The pfs algorithm
– Indexing route collections
– Indexing route transitions
Index maintenance
Experimental evaluation
Conclusions and Further work

8 The pfs algorithm (I)
Path-first search, basic idea:
– Examine part of routes at once, not single nodes
Extend depth-first search
– Work with routes instead of graph edges
For each route p containing current node v
– Visit each node after v (successor) in p
– Push the set of successors to the dfs stack at once

9–12 The pfs algorithm (II)
Find a path from node F to C
Routes: p1 (A, B, C, D, J), p2 (A, F, D, N, B, T), p3 (N, L, M), p4 (D, N, B, F, K), p5 (A, F, K)
Answer: (F, D, N, B, C)
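The path-first search idea can be sketched on the example above. This is a rough sketch under stated assumptions, not the authors' code: the names pfs and build_path are hypothetical, routes containing the current node are found by a linear scan (no index is assumed at this stage), and each route is assumed to contain a node at most once.

```python
ROUTES = {
    "p1": ("A", "B", "C", "D", "J"),
    "p2": ("A", "F", "D", "N", "B", "T"),
    "p3": ("N", "L", "M"),
    "p4": ("D", "N", "B", "F", "K"),
    "p5": ("A", "F", "K"),
}


def build_path(parent, node):
    """Follow parent pointers back to the source and reverse the result."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def pfs(routes, source, target):
    """Depth-first search that pushes whole route suffixes instead of single edges."""
    parent = {source: None}
    stack = [source]
    while stack:
        v = stack.pop()
        if v == target:
            return build_path(parent, target)
        for nodes in routes.values():      # linear scan; assumption: no index yet
            if v not in nodes:
                continue
            prev = v
            for w in nodes[nodes.index(v) + 1:]:   # every successor of v in this route
                if w not in parent:
                    parent[w] = prev       # w is reached from its predecessor in the route
                    stack.append(w)
                prev = w
    return None


print(pfs(ROUTES, "F", "C"))   # ['F', 'D', 'N', 'B', 'C']
```

With routes visited in the order p1 to p5, the sketch reproduces the answer on the slide; a different visiting order could return a different, equally valid path.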

13–15 P-Index
Inverted index on route collections
– For each node store routes containing it
Access routes containing current node
Better termination condition => pfsP
– Identify a route containing current node before target
Routes: p1 (A, B, C, D, J), p2 (A, F, D, N, B, T), p3 (N, L, M), p4 (D, N, B, F, K), p5 (A, F, K)
[Table: node | routes list, with rows for A, B, C, D, …; list entries lost in extraction]
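An inverted index of this kind might be built as in the sketch below. The entry format is an assumption: the slides only say that each node stores the routes containing it, and the position stored next to each route id is added here because the "before target" test needs it. The name build_p_index is hypothetical.

```python
from collections import defaultdict

ROUTES = {
    "p1": ("A", "B", "C", "D", "J"),
    "p2": ("A", "F", "D", "N", "B", "T"),
    "p3": ("N", "L", "M"),
    "p4": ("D", "N", "B", "F", "K"),
    "p5": ("A", "F", "K"),
}


def build_p_index(routes):
    """Map each node to a list of (route id, position of the node in that route)."""
    index = defaultdict(list)
    for rid, nodes in routes.items():
        for pos, node in enumerate(nodes):
            index[node].append((rid, pos))
    return dict(index)


p_index = build_p_index(ROUTES)
print(p_index["A"])   # [('p1', 0), ('p2', 0), ('p5', 0)]
print(p_index["C"])   # [('p1', 2)]
```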

16–19 The pfsP algorithm
Find a path from F to T
JOIN the routes lists of F and T
Routes: p1 (A, B, C, D, J), p2 (A, F, D, N, B, T), p3 (N, L, M), p4 (D, N, B, F, K), p5 (A, F, K)
[Table: node | routes list, with rows for F and T; list entries lost in extraction]
Answer: (F, D, N, B, T)
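The join-based termination condition can be sketched as follows. This is an illustrative sketch, not the authors' algorithm as published: pfs_p, build_p_index and build_path are hypothetical names, and the index is assumed to store (route id, position) pairs so that the join can check that the current node precedes the target.

```python
ROUTES = {
    "p1": ("A", "B", "C", "D", "J"),
    "p2": ("A", "F", "D", "N", "B", "T"),
    "p3": ("N", "L", "M"),
    "p4": ("D", "N", "B", "F", "K"),
    "p5": ("A", "F", "K"),
}


def build_p_index(routes):
    """node -> list of (route id, position); assumed entry format."""
    index = {}
    for rid, nodes in routes.items():
        for pos, node in enumerate(nodes):
            index.setdefault(node, []).append((rid, pos))
    return index


def build_path(parent, node):
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def pfs_p(routes, p_index, source, target):
    if source == target:
        return [source]
    target_pos = dict(p_index.get(target, []))    # route id -> position of target
    parent, stack = {source: None}, [source]
    while stack:
        v = stack.pop()
        # Termination: join routes[v] with routes[target]; stop as soon as
        # some route contains v before the target.
        for rid, pos in p_index.get(v, []):
            if rid in target_pos and pos < target_pos[rid]:
                tail = routes[rid][pos + 1:target_pos[rid] + 1]
                return build_path(parent, v) + list(tail)
        # Otherwise expand v route by route, as in plain pfs.
        for rid, pos in p_index.get(v, []):
            prev = v
            for w in routes[rid][pos + 1:]:
                if w not in parent:
                    parent[w] = prev
                    stack.append(w)
                prev = w
    return None


print(pfs_p(ROUTES, build_p_index(ROUTES), "F", "T"))   # ['F', 'D', 'N', 'B', 'T']
```

For the query from F to T the join succeeds immediately on p2, so no node is expanded at all.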

20–24 H-graph (I)
Graph representation of collection
– Nodes: routes of the collection
– Edges (p_i, p_j, v): all possible transitions among routes
– Edge label v => shared node, link
Better termination condition => pfsH
– Identify an edge on H-graph
Example routes: p1 (A, B, C, D, J), p4 (D, N, B, F, K)
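Constructing such a graph might look like the sketch below. It is an assumption-laden illustration: build_h_graph is a hypothetical name, and the sketch records every node shared by two routes as a link, without any pruning the authors may apply to transitions that can never be used.

```python
from collections import defaultdict

ROUTES = {
    "p1": ("A", "B", "C", "D", "J"),
    "p2": ("A", "F", "D", "N", "B", "T"),
    "p3": ("N", "L", "M"),
    "p4": ("D", "N", "B", "F", "K"),
    "p5": ("A", "F", "K"),
}


def build_h_graph(routes):
    """Return route id -> {other route id: set of shared link nodes}."""
    node_routes = defaultdict(list)
    for rid, nodes in routes.items():
        for node in nodes:
            node_routes[node].append(rid)
    h_graph = {rid: {} for rid in routes}
    for node, rids in node_routes.items():
        for pi in rids:
            for pj in rids:
                if pi != pj:               # an edge (pi, pj, node) per shared node
                    h_graph[pi].setdefault(pj, set()).add(node)
    return h_graph


h_graph = build_h_graph(ROUTES)
print(sorted(h_graph["p1"]["p4"]))   # ['B', 'D']
```

For the two example routes on the slide, p1 and p4 share the nodes B and D, so the sketch produces edges (p1, p4, B) and (p1, p4, D) and their reverses.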

25–27 H-graph (II)
Find a path from node F to J
Routes: p1 (A, B, C, D, J), p2 (A, F, D, N, B, T), p3 (N, L, M), p4 (D, N, B, F, K), p5 (A, F, K)
Answer: (F, D, J)

28–33 H-Index
In practice: H-Index, the adjacency lists of the H-graph
Routes: p1 (A, B, C, D, J), p2 (A, F, D, N, B, T), p3 (N, L, M), p4 (D, N, B, F, K), p5 (A, F, K)
[Table: route | edges list, with rows for p1, p2, …; list entries lost in extraction]
[Figure: H-graph edge between p1 and p2, labeled B, D]

34–38 The pfsH algorithm
Find a path from F to J, using routes[F] and routes[J] (set entries lost in extraction)
JOIN the edges lists of the routes containing F with routes[J]
Routes: p1 (A, B, C, D, J), p2 (A, F, D, N, B, T), p3 (N, L, M), p4 (D, N, B, F, K), p5 (A, F, K)
[Table: route | edges list, with rows for p2, p4, p5, …; list entries lost in extraction]
Answer: (F, D, J)
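The edge-based termination condition can be sketched as below. Several points are assumptions rather than the authors' design: the names pfs_h, build_p_index and build_h_index are hypothetical, the H-Index is modelled as a list of (other route, link node) pairs per route, and the sketch also keeps the same-route check of pfsP before looking for a transition.

```python
ROUTES = {
    "p1": ("A", "B", "C", "D", "J"),
    "p2": ("A", "F", "D", "N", "B", "T"),
    "p3": ("N", "L", "M"),
    "p4": ("D", "N", "B", "F", "K"),
    "p5": ("A", "F", "K"),
}


def build_p_index(routes):
    """node -> {route id: position of the node in that route}."""
    index = {}
    for rid, nodes in routes.items():
        for pos, node in enumerate(nodes):
            index.setdefault(node, {})[rid] = pos
    return index


def build_h_index(routes, p_index):
    """route id -> list of (other route id, link node) transitions."""
    return {rid: [(other, node) for node in nodes
                  for other in p_index[node] if other != rid]
            for rid, nodes in routes.items()}


def build_path(parent, node):
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def pfs_h(routes, source, target):
    if source == target:
        return [source]
    p_index = build_p_index(routes)
    h_index = build_h_index(routes, p_index)
    target_pos = p_index.get(target, {})          # route id -> position of target
    parent, stack = {source: None}, [source]
    while stack:
        v = stack.pop()
        here = p_index.get(v, {})
        for rid, pos in here.items():
            route = routes[rid]
            if rid in target_pos and pos < target_pos[rid]:   # same route (as in pfsP)
                return build_path(parent, v) + list(route[pos + 1:target_pos[rid] + 1])
            for other, link in h_index[rid]:      # one transition rid -> other via link
                if other not in target_pos:
                    continue
                a, b = p_index[link][rid], p_index[link][other]
                if pos <= a and b <= target_pos[other]:
                    return (build_path(parent, v) + list(route[pos + 1:a + 1])
                            + list(routes[other][b + 1:target_pos[other] + 1]))
        for rid, pos in here.items():             # no edge found: expand as in pfs
            prev = v
            for w in routes[rid][pos + 1:]:
                if w not in parent:
                    parent[w] = prev
                    stack.append(w)
                prev = w
    return None


print(pfs_h(ROUTES, "F", "J"))   # ['F', 'D', 'J']
```

For the query from F to J, the edge (p2, p1, D) is found while examining F itself: F precedes D in p2 and D precedes J in p1, which gives the answer on the slide without expanding any node.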

39 Index maintenance
P-Index and H-Index are stored as inverted files on disk
– Updates -> adding new routes
– Do not consider each new route separately
– Batch updates: consider a set of new routes
Basic idea:
– Build memory-resident P-Index and H-Index for the new routes
– Merge the disk-based indices with the memory-resident ones
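The batch-update idea can be illustrated for the P-Index alone. This is a simplified sketch: the disk-based inverted file is stood in for by an in-memory dictionary, the names build_p_index and merge_p_index are hypothetical, and merging the H-Index (which also needs edges between old and new routes) is not covered.

```python
def build_p_index(routes):
    """node -> list of (route id, position); assumed entry format."""
    index = {}
    for rid, nodes in routes.items():
        for pos, node in enumerate(nodes):
            index.setdefault(node, []).append((rid, pos))
    return index


def merge_p_index(disk_index, new_routes):
    """Index the whole batch in memory, then merge it into the existing index."""
    delta = build_p_index(new_routes)             # memory-resident index of the batch
    merged = {node: list(entries) for node, entries in disk_index.items()}
    for node, entries in delta.items():
        merged.setdefault(node, []).extend(entries)   # append to the node's list
    return merged


old_routes = {
    "p1": ("A", "B", "C", "D", "J"),
    "p2": ("A", "F", "D", "N", "B", "T"),
    "p3": ("N", "L", "M"),
}
new_routes = {
    "p4": ("D", "N", "B", "F", "K"),
    "p5": ("A", "F", "K"),
}

merged = merge_p_index(build_p_index(old_routes), new_routes)
print(merged["F"])   # [('p2', 1), ('p4', 3), ('p5', 1)]
```

Each node's list is touched once per batch rather than once per new route, which is the point of batching the updates.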

40 Outline
The pfs algorithm
– Indexing route collections
– Indexing route transitions
Index maintenance
Experimental evaluation
Conclusions and Further work

41 Setup
Synthetic route collections
– |P|, l_avg, |V|, zipf, U
Compare
– Convert collection to graph, dfs & adjacency lists
– pfsP & P-Index
– pfsH & P-Index, H-Index
Construction cost and query evaluation: vary one of |P|, l_avg, |V|, zipf
Maintenance cost: vary U

42 Index construction
[Plots: x-axis |P| (x 10^3), with l_avg = 10, |V| = 100000, zipf = 0.8; x-axis |V| (x 10^3), with |P| = 100000, l_avg = 10, zipf = 0.8]

43 Query evaluation (I)
[Plots: x-axis |P| (x 10^3), with l_avg = 10, |V| = 100000, zipf = 0.8; x-axis l_avg, with |P| = 100000, |V| = 100000, zipf = 0.8]

44 Query evaluation (II)
[Plots: x-axis |V| (x 10^3), with |P| = 100000, l_avg = 10, zipf = 0.8; x-axis zipf, with |P| = 100000, l_avg = 10, |V| = 100000]

45 Index maintenance
[Plot: x-axis U (%), with |P| = 100000, l_avg = 10, |V| = 100000, zipf = 0.8]

46 Conclusions
Reachability queries over frequently updated route collections
The path-first search (pfs) algorithm
– Indexing route collections: P-Index & pfsP
– Indexing route transitions: H-Index & pfsH
Handling frequent updates, adding new routes
Experimental evaluation
– P-Index & pfsP: low construction & maintenance cost
– H-Index, P-Index & pfsH: fast query evaluation

47 Further work
Ongoing
– New index that combines the advantages of P-Index & H-Index
– Low construction and maintenance cost
– Fast query evaluation
Future work
– Other types of queries
– Considering constraints

48 Thank you!

