A K-Main Routes Approach to Spatial Network Activity Summarization Authors: Dev Oliver Shashi Shekhar James M. Kang Renee Bousselaire Abdussalam Bannur.

1 A K-Main Routes Approach to Spatial Network Activity Summarization
  Authors: Dev Oliver, Shashi Shekhar, James M. Kang, Renee Bousselaire, Abdussalam Bannur

2 Outline
  - Motivation
  - Problem Statement
  - Contributions
  - Validation
    - Analytical
    - Experimental
    - Case Studies
  - Summary and Future Work

3 Motivation: Crime Analysis (application domain)
  - Crime hotspot: an area of concentrated crime (street, place, neighborhood)
  - "Most clustering algorithms will show areas of concentration even when a line is the most appropriate dimension." (National Institute of Justice)
  Sources
  - J. E. Eck et al., Mapping Crime: Understanding Hot Spots. US National Inst. of Justice (http://www.ncjrs.gov/pdffiles1/nij/ pdf)
  - Star Tribune, January 26, 2011

4 Examples of Linear Patterns
  - Linear patterns resulting from deforestation in Brazil
  - Linear patterns of crime in a major US city

5 Motivation: Environmental Criminology (scientific domain)
  Spatial theories in Environmental Criminology
  - Routine Activity Theory [1]: crime location is related to the criminal's frequently visited areas
  - Crime Pattern Theory [2]: based on a spatial model of nodes (e.g. home, work, entertainment), paths (e.g. routes between nodes), and edges
    - Crime locations are close to edges, near the criminal's activity boundaries where residents may not recognize him/her
  Network-based summarization adds value to Environmental Criminology
  - Assists with large-scale verification of whether real-world data match the theories
  - Opens opportunities to develop hypotheses for new theory formulation
  References
  [1] L. E. Cohen et al., Social change and crime rate trends: A routine activity approach, American Sociological Review.
  [2] P. L. Brantingham et al., Environmental Criminology, Waveland Press.
  Source: Rossmo, Kim (2000). Geographic Profiling. Boca Raton, FL: CRC Press.

6 Other Domains
  - Accident Analysis and Prevention
  - Disaster Relief

7 Key Concepts
  - Activity: an object of interest located at a node or edge
  - Summary path: a path chosen by KMR to summarize activities
  - Activity coverage: the total number of activities on a path or set of paths
  - Active node: a node that has n ≥ 1 activities or is joined by an edge that has n ≥ 1 activities (e.g., A, B, C, D, E)
  - Inactive node: a node that has n = 0 activities and is joined only by edges that have n = 0 activities (e.g., F)
  - Active node ratio: total number of active nodes / total number of nodes (e.g., 5/6)
  - In the example, each edge has a weight of 1
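
The active node definition above can be sketched in a few lines of Python. This is only an illustration: the six-node chain topology, the activity counts, and the function name `active_nodes` are assumptions chosen so that the result matches the slide's example (A to E active, F inactive); the slide's actual figure is not recoverable from the transcript.

```python
def active_nodes(nodes, edges, node_activities, edge_activities):
    """Return nodes that carry an activity or touch an edge that carries one."""
    active = {n for n in nodes if node_activities.get(n, 0) >= 1}
    for u, v in edges:
        if edge_activities.get((u, v), 0) >= 1:
            active.update((u, v))  # both endpoints of an active edge are active
    return active

# Hypothetical network (assumed topology, not the slide's figure).
nodes = ["A", "B", "C", "D", "E", "F"]
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F")]
node_activities = {"A": 1}                       # one activity on node A
edge_activities = {("B", "C"): 2, ("D", "E"): 1}  # activities on two edges

active = active_nodes(nodes, edges, node_activities, edge_activities)
active_node_ratio = len(active) / len(nodes)
print(sorted(active), active_node_ratio)
```

Here F stays inactive because it has no activities and its only edge, (E, F), has none either.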

8 Problem Statement
  Given
  - A spatial network G = (N, E)
  - A set of activities A and their locations (e.g. a node or edge)
  - A set of paths P
  - k (number of routes)
  - Edge weights
  Find
  - A cardinality-k subset P′ of P, i.e., a subset P′ ⊆ P with |P′| = k
  Objective
  - Maximize the activity coverage (AC) of P′
  Constraints
  - 1 ≤ k ≤ |P|
  Example (figure): k = 2, edge weights are 1, and P is the set of shortest paths
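
As a rough illustration of the objective, the sketch below computes activity coverage for a set of paths and finds the best k-subset by exhaustive search. It assumes, for simplicity, that every activity sits on a node (the problem statement also allows edges); the names `activity_coverage` and `best_k_subset` and the toy data are hypothetical.

```python
from itertools import combinations

def activity_coverage(paths, activities_on):
    """Count distinct activities located on any node of the given paths."""
    covered = set()
    for path in paths:
        for node in path:
            covered |= activities_on.get(node, set())
    return len(covered)

def best_k_subset(candidate_paths, activities_on, k):
    """Exhaustively pick the k paths with maximum activity coverage."""
    return max(
        combinations(candidate_paths, k),
        key=lambda subset: activity_coverage(subset, activities_on),
    )

# Hypothetical candidate paths and node activities.
P = [("A", "B"), ("B", "C"), ("A", "B", "C"), ("D", "E")]
activities_on = {"A": {"a1"}, "B": {"a2"}, "C": {"a3"}, "D": {"a4"}, "E": {"a5"}}

best = best_k_subset(P, activities_on, k=2)
print(best, activity_coverage(best, activities_on))
```

Exhaustive search is only feasible for tiny inputs, which is the motivation for the heuristic described later in the talk.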

9 Challenges
  - Measures of interestingness: activity coverage, average distance, etc.
  - Computational complexity
    - Choose(N, 2) paths, given N nodes
    - Exponential number of k-subsets of paths
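
To give a feel for the scale of the search space, the following sketch evaluates the two counts named above with Python's `math.comb`. The choice of 1000 nodes mirrors the node count used later in the experiments; the function names are illustrative.

```python
from math import comb

def candidate_path_count(num_nodes: int) -> int:
    """Number of unordered node pairs, i.e. Choose(N, 2)."""
    return comb(num_nodes, 2)

def k_subset_count(num_paths: int, k: int) -> int:
    """Number of ways to pick k summary paths from the candidate set."""
    return comb(num_paths, k)

paths_1000 = candidate_path_count(1000)      # 499500 candidate paths
subsets_k2 = k_subset_count(paths_1000, 2)
subsets_k5 = k_subset_count(paths_1000, 5)
print(paths_1000, subsets_k2, subsets_k5)
```

Even for k = 2 the number of subsets is on the order of 10^11, and it grows rapidly with k.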

10 SNAS is NP-Complete (Proof Sketch)
  Devising an NP-completeness proof for a decision problem Π [1]
  1. Show that Π is in NP
  2. Select a known NP-complete problem Π′
  3. Construct a transformation f from Π′ to Π
  4. Prove that f is a polynomial transformation
  [1] M. Garey and D. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, San Francisco, 1979.

11 Step 1: SNAS is in NP
  - Verify in polynomial time whether the activity coverage of P′ ≥ B
  SNAS decision problem
  Given
  - A spatial network G = (N, E)
  - A set of activities A and their locations (e.g. a node or edge)
  - A set of paths P
  - k (number of routes)
  - Edge weights
  - B (bound on number of activities)
  Find
  - A cardinality-k subset P′ of P, i.e., a subset P′ ⊆ P with |P′| = k
  Objective
  - Activity coverage (AC) of P′ ≥ B
  Constraints
  - 1 ≤ k ≤ |P|

12 Step 2: Select a known NP-Complete Problem
  Maximum Coverage
  - Input
    - Sets s_1, s_2, …, s_m over elements E = {e_1, e_2, …, e_n} (the sets may have some elements in common)
    - A number k < min(m, n)
  - Output
    - k sets such that the maximum number of elements is covered, i.e. the union of the selected sets has maximal size

13 Step 3: Construct a transformation f from Π′ to Π (1/3)
  [Figure: an instance of the known NP-complete problem (Maximum Coverage) is mapped by a polynomial transformation to an instance of the new problem (SNAS); a solution to the new problem is mapped by a polynomial transformation back to a solution of the NP-complete problem.]

14 Step 3: Construct a transformation f from Π′ to Π (2/3)
  Maximum Coverage input to SNAS input
  - Impose a total order, TO, on the n elements E = {e_1, e_2, …, e_n}
  - Convert each element in E into a node with one activity
  - Convert each set s_i to a path p_i
    - Sort the elements in s_i using TO
    - Add edge (e_i,j, e_i,j+1) for all j ∈ 1 … |s_i| − 1
  Example
  - Maximum Coverage: E = {e1, e2, e3, e5, e6}, K = 2, S1 = {e1, e2}, S2 = {e2, e3}, S3 = {e1, e2, e3}, S4 = {e5, e6}
  - KMR: P = {(e1→e2), (e2→e3), (e1→e2→e3), (e5→e6)}, K = 2, Activity = {a1, a2, a3, a5, a6}, Activity node = {a1–e1, a2–e2, a3–e3, a5–e5, a6–e6}
  - Candidate solutions: (e1→e2→e3), (e5→e6)

15 Step 3: Construct a transformation f from Π′ to Π (3/3)
  SNAS output to Maximum Coverage output
  - For each of the K routes R_i produced by SNAS, convert the activities on the route into elements and form a set S_i
  Example
  - Given the K routes (e1→e2→e3) and (e5→e6): S1 = {e1, e2, e3}, S2 = {e5, e6}
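
The two directions of the transformation in Step 3 can be sketched as follows, using the example sets from the slides. This is an illustrative reading, not the authors' code: lexicographic order stands in for the total order TO, activity names are derived from element names (e1 gives a1), and the function names are hypothetical.

```python
def max_coverage_to_snas(sets):
    """Map a Maximum Coverage instance to nodes, edges, paths, and activities."""
    elements = sorted(set().union(*sets))            # total order TO (assumed lexicographic)
    activities_on = {e: {"a" + e[1:]} for e in elements}  # one activity per node
    paths, edges = [], set()
    for s in sets:
        path = tuple(sorted(s))                      # sort the set's elements by TO
        paths.append(path)
        edges.update(zip(path, path[1:]))            # consecutive elements become edges
    return elements, edges, paths, activities_on

def snas_to_max_coverage(routes):
    """Map each route chosen by SNAS back to the set of elements it visits."""
    return [set(route) for route in routes]

sets = [{"e1", "e2"}, {"e2", "e3"}, {"e1", "e2", "e3"}, {"e5", "e6"}]
elements, edges, paths, activities_on = max_coverage_to_snas(sets)
chosen_sets = snas_to_max_coverage([("e1", "e2", "e3"), ("e5", "e6")])
print(paths, chosen_sets)
```

Both functions do work linear in the total size of the sets, apart from the sorting, which is consistent with the requirement that f be a polynomial transformation.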

16 Related Work
  Network summarization by grouping/clustering (diagram)
  - Clumping (Okabe), e.g. NT-VCM (Shiode)
  - Maximum subgraph, e.g. path, tree (Buchin)
  - Categories by number of routes: zero or one routes; multiple routes
  - Our work

17 Contributions
  - K-Main Routes (KMR) algorithm
    - Finds a set of k routes to group activities
    - New design decisions added: Network Voronoi activity assignment; divide and conquer summary path recomputation
  - Spatial network activity summarization is shown to be NP-complete
  - Analytical demonstration of the correctness of the design decisions, with cost analysis
  - Experimental evaluation of the various algorithms
    - Performance evaluated using synthetic and real-world datasets
  - Case study comparing KMR with geometry-based summarization

18 K-Main Routes (KMR) Algorithm
  K-Main Routes algorithm
  - Select k paths as initial summary paths
  - Repeat
    1. Form k clusters by assigning each activity to its closest summary path
    2. Recompute the summary path of each cluster
  - Until summary paths do not change
  Design decisions
  - Inactive node pruning
  - Network Voronoi activity assignment
  - Divide and conquer summary path recomputation
  Example (figure): P = the set of shortest paths, k = 2
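
The loop above can be sketched in Python as follows. This is a minimal, unoptimized reading of the basic algorithm (without the three design decisions), assuming unit edge weights, a connected network, and activities located on nodes only. The 2×4 grid, the activity placement, and all function names are assumptions made for illustration.

```python
from collections import deque
from itertools import combinations

def bfs_dist(adj, sources):
    """Hop distance from the nearest source node (unit edge weights)."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def shortest_path(adj, src, dst):
    """One shortest path from src to dst as a tuple of nodes (BFS)."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                queue.append(v)
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return tuple(reversed(path))

def k_main_routes(adj, activity_node, initial_paths, max_iter=100):
    """activity_node maps each activity id to the node it sits on."""
    summary, clusters = list(initial_paths), []
    for _ in range(max_iter):
        # Step 1: assign each activity to its closest summary path.
        dists = [bfs_dist(adj, path) for path in summary]
        clusters = [[] for _ in summary]
        for act, node in activity_node.items():
            closest = min(range(len(summary)), key=lambda i: dists[i][node])
            clusters[closest].append(act)
        # Step 2: recompute each cluster's summary path as the shortest path
        # between two of its activity nodes that covers the most activities.
        new_summary = []
        for path, cluster in zip(summary, clusters):
            best = path
            best_cov = sum(activity_node[a] in path for a in cluster)
            nodes = sorted({activity_node[a] for a in cluster})
            for u, v in combinations(nodes, 2):
                cand = shortest_path(adj, u, v)
                cov = sum(activity_node[a] in cand for a in cluster)
                if cov > best_cov:
                    best, best_cov = cand, cov
            new_summary.append(best)
        if new_summary == summary:   # summary paths did not change
            break
        summary = new_summary
    return summary, clusters

# Hypothetical 2x4 grid: top row A-B-C-D, bottom row E-F-G-H.
adj = {
    "A": ["B", "E"], "B": ["A", "C", "F"], "C": ["B", "D", "G"], "D": ["C", "H"],
    "E": ["A", "F"], "F": ["B", "E", "G"], "G": ["C", "F", "H"], "H": ["D", "G"],
}
activity_node = {"a1": "A", "a2": "B", "a3": "F", "a4": "G", "a5": "H"}
summary, clusters = k_main_routes(adj, activity_node, [("A", "E"), ("D", "H")])
print(summary, clusters)
```

The `max_iter` cap is a safety measure added for this sketch; the slides state only the "until summary paths do not change" stopping rule.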

19 Design Decision: Inactive Node Pruning
  - Only consider paths between active nodes
  - The optimal solution will still be in this set
  Example (given the set of shortest paths): 20 shortest paths calculated and stored, versus 30 without pruning
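
One reading that is consistent with the 20-versus-30 figure is that the paths are counted as ordered (source, destination) pairs over the six-node example from the Key Concepts slide, where five nodes are active. The sketch below checks that arithmetic; treating the pairs as ordered is an assumption, not something the slide states.

```python
from itertools import permutations

def candidate_pairs(nodes):
    """All ordered (source, destination) pairs of distinct nodes."""
    return list(permutations(nodes, 2))

all_nodes = ["A", "B", "C", "D", "E", "F"]
active_only = ["A", "B", "C", "D", "E"]   # F is inactive and is pruned

pairs_without_pruning = len(candidate_pairs(all_nodes))   # 6 * 5
pairs_with_pruning = len(candidate_pairs(active_only))    # 5 * 4
print(pairs_without_pruning, pairs_with_pruning)
```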

20 Design Decision: Network Voronoi (NV) Activity Assignment
  Goals
  - Form k clusters by assigning each activity to its closest summary path
  - Improve execution time of the current assignment strategy
  K-Main Routes algorithm, with step 1 replaced
  - Select k shortest paths as initial summary paths
  - Repeat
    1. Network Voronoi activity assignment (previously: form k clusters by assigning each activity to its closest summary path)
    2. Recompute summary path of each cluster
  - Until summary paths do not change
  Next: example (execution trace)

21-28 Design Decision: Network Voronoi (NV) Activity Assignment (execution trace)
  [Figure sequence: a network with nodes A to H, a virtual node X, and summary paths AE and DH. Legend: activity, active node, inactive node, virtual node, summary path, closed node, edge weight = 1, edge weight = 0. Each step shows the Open and Closed lists and the distance of each node from X.]

29 Design Decision: Network Voronoi (NV) Activity Assignment
  Network Voronoi Activity Assignment algorithm
  Input: graph G = (N, E), a set of activities A, a set of k summary paths S
  Output: a set of k clusters formed by assigning every a_i ∈ A to one s_i ∈ S, where dist(a_i, s_i) ≤ dist(a_i, s_j) for all s_j ∈ S with s_j ≠ s_i
    Open ← all nodes ∈ S, Closed ← Ø
    Tnodes ← all nodes ∈ S
    Tactivities ← activities on s_i ∈ S
    repeat
      nc ← next node ∈ Open
      remove nc from Open
      Closed ← nc
      X ← neighbors of nc
      foreach xi ∈ X
        if xi ∉ Tnodes and xi ∉ Closed
          Tnodes ← xi
          xi.prev ← nc
          xi.dist ← dist(xi, nc) + nc.dist
          xi.sp ← nc.sp
        else if xi ∈ Tnodes
          update xi if new dist < xi.dist
        if xi ∉ Open
          Open ← xi
        Y ← activities on edge {nc, xi}
        foreach yi ∈ Y
          if yi ∉ Tactivities
            Tactivities ← yi
            yi.prev ← nc
            yi.dist ← xi.dist
            yi.sp ← xi.sp
          else
            update yi if new dist < yi.dist
    until all active nodes ∈ Closed
    return currentClusters
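
The idea behind the pseudocode, a single shortest-path expansion seeded from all summary paths at once, can be sketched with a multi-source Dijkstra search. This is a simplified stand-in rather than a line-by-line port: it seeds every summary-path node at distance 0 instead of using a virtual node, and it assigns an activity on an edge to the cluster of that edge's nearer endpoint. The grid network, activity locations, and names are assumptions.

```python
import heapq

def network_voronoi_assignment(adj, summary_paths, activity_location):
    """adj: node -> list of (neighbor, weight). Returns activity -> path index."""
    dist, owner, heap = {}, {}, []
    for idx, path in enumerate(summary_paths):
        for node in path:
            if node not in dist:             # seed every summary-path node at 0
                dist[node], owner[node] = 0, idx
                heapq.heappush(heap, (0, node))
    closed = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in closed:                      # stale heap entry
            continue
        closed.add(u)
        for v, w in adj[u]:
            if v not in dist or d + w < dist[v]:
                dist[v], owner[v] = d + w, owner[u]   # inherit nearest summary path
                heapq.heappush(heap, (d + w, v))
    assignment = {}
    for act, loc in activity_location.items():
        ends = loc if isinstance(loc, tuple) else (loc,)  # edge or single node
        nearest = min(ends, key=lambda n: dist[n])
        assignment[act] = owner[nearest]
    return assignment

# Hypothetical 2x4 grid with unit weights: top row A-B-C-D, bottom row E-F-G-H.
edge_list = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F"), ("F", "G"),
             ("G", "H"), ("A", "E"), ("B", "F"), ("C", "G"), ("D", "H")]
adj = {}
for u, v in edge_list:
    adj.setdefault(u, []).append((v, 1))
    adj.setdefault(v, []).append((u, 1))

activity_location = {"a1": "B", "a2": ("C", "G"), "a3": "F"}
assignment = network_voronoi_assignment(adj, [("A", "E"), ("D", "H")], activity_location)
print(assignment)
```

The benefit of this style of assignment is that one graph traversal labels every node, instead of computing a distance from each activity to each summary path separately.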

30 Design Decision: Divide and Conquer Summary Path Recomputation
  Goals
  - Recompute the summary path of each cluster
  - Improve execution time of the current recomputation strategy
  K-Main Routes algorithm, with step 2 replaced
  - Select k shortest paths as initial summary paths
  - Repeat
    1. Network Voronoi activity assignment
    2. Divide and conquer summary path recomputation (previously: recompute summary path of each cluster)
  - Until summary paths do not change
  Next: example (execution trace)

31 Design Decision: Divide and Conquer Summary Path Recomputation
  Summary Path Recomputation algorithm
  Input: graph G = (N, E), a set of clusters C
  Output: a set of summary paths S, where each s_i ∈ S has maximum coverage for c_i ∈ C and s_i ∈ c_i
    nextClusters ← Ø
    foreach c_i ∈ C
      X ← active nodes of c_i
      maxP ← Ø
      foreach x_i ∈ X
        foreach x_j ∈ X
          if (i ≠ j)
            cP ← getSP(x_i, x_j)
            if (maxP = Ø)
              maxP ← cP
            if (maxP.activities < cP.activities)
              maxP ← cP
      if (maxP ≠ c_i.summaryPath)
        nextClusters ← maxP
      else
        nextClusters ← c_i.summaryPath
    return nextClusters
  [Figure: example network with nodes A to H. Legend: activity, active node, inactive node, summary path, cluster. Edge weights are 1.]
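
A compact Python rendering of the recomputation step for a single cluster is sketched below. It assumes unit edge weights and a connected network, uses breadth-first search in place of the `getSP` lookup, and counts only the cluster's own activities. The grid network, the activity counts, and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations

def shortest_path(adj, src, dst):
    """One shortest path from src to dst as a tuple of nodes (BFS, unit weights)."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                queue.append(v)
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return tuple(reversed(path))

def recompute_summary_path(adj, activity_count, current_path):
    """Pick the shortest path between two of the cluster's active nodes
    that covers the most of the cluster's activities."""
    def covered(path):
        return sum(activity_count.get(n, 0) for n in path)

    best = None
    for u, v in combinations(sorted(activity_count), 2):
        cand = shortest_path(adj, u, v)
        if best is None or covered(cand) > covered(best):
            best = cand
    # A cluster with fewer than two active nodes keeps its current path.
    return best if best is not None else current_path

# Hypothetical 2x4 grid: top row A-B-C-D, bottom row E-F-G-H.
adj = {
    "A": ["B", "E"], "B": ["A", "C", "F"], "C": ["B", "D", "G"], "D": ["C", "H"],
    "E": ["A", "F"], "F": ["B", "E", "G"], "G": ["C", "F", "H"], "H": ["D", "G"],
}
cluster_activity_count = {"A": 1, "B": 2, "F": 1}   # activities per node in one cluster
new_path = recompute_summary_path(adj, cluster_activity_count, ("A", "E"))
print(new_path)
```

Because only node pairs inside one cluster are examined, the number of candidate paths per cluster is much smaller than the number over the whole network, which is where the savings in the cost analysis come from.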

32 Validation
  - Analytical
    - Cost analysis explaining computational savings
  - Experimental
    - Comparative analysis of KMR with various design decisions
    - Performed on real and synthetic data
    - Network Voronoi activity assignment and divide and conquer summary path recomputation save computational cost
    - Savings increase with the number of nodes, routes, and activities, and with the active node ratio
  - Case studies
    - Qualitatively show the usefulness of network-based summarization on crime data

33 Analytical Evaluation: Computational Analysis
  KMR execution time = number of iterations × (activity assignment cost + summary path recomputation cost)
  $$T_{\mathrm{KMR}} = I \times \left( \left[ K \times |A| \times \mathrm{cost}(a_i, c_i) \right] + \left[ K \times d_c \times |N|^2 \right] \right)$$
  $$T_{\mathrm{KMR\_I}} = I \times \left( \left[ K \times |A| \times \mathrm{cost}(a_i, c_i) \right] + \left[ K \times d_c \times (|N| \times r)^2 \right] \right)$$
  $$T_{\mathrm{KMR\_IAS}} = I \times \left( \left[ |E| + |N| \times \log |N| \right] + \left[ K \times d_c \times \left( \frac{|N|}{K} \times r \right)^2 \right] \right)$$
  where
  - I = number of iterations
  - K = number of clusters
  - A = set of activities
  - cost(a_i, c_i) = cost of calculating the distance between activity a_i and cluster c_i
  - d_c = cost of looking up a path
  - N = set of nodes
  - E = set of edges
  - r = active node ratio, 0 ≤ r ≤ 1
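
To see how the three cost expressions compare, they can be evaluated for sample parameter values. The sketch below is a direct transcription of the formulas; the numeric inputs (10 iterations, unit costs, 3000 edges) are made-up values for illustration, and the logarithm base, which the slide does not specify, is assumed to be 2.

```python
from math import log2

def cost_kmr(I, K, A, N, assign_cost, d_c):
    """Basic KMR: pairwise assignment plus all-pairs path recomputation."""
    return I * (K * A * assign_cost + K * d_c * N ** 2)

def cost_kmr_i(I, K, A, N, assign_cost, d_c, r):
    """With inactive node pruning: only active nodes (N * r) are considered."""
    return I * (K * A * assign_cost + K * d_c * (N * r) ** 2)

def cost_kmr_ias(I, K, N, E, d_c, r):
    """With Voronoi assignment and per-cluster recomputation (log base 2 assumed)."""
    return I * ((E + N * log2(N)) + K * d_c * (N / K * r) ** 2)

# Hypothetical parameter values.
params = dict(I=10, K=2, A=1200, N=1000, assign_cost=1, d_c=1)
t_kmr = cost_kmr(**params)
t_kmr_i = cost_kmr_i(**params, r=0.2)
t_kmr_ias = cost_kmr_ias(I=10, K=2, N=1000, E=3000, d_c=1, r=0.2)
print(t_kmr, t_kmr_i, t_kmr_ias)
```

With r = 1 the pruned cost reduces to the basic cost, since every node is active and nothing is pruned.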

34 Experimental Evaluation
  - Goal: comparative analysis
  - Candidates: KMR with various design decisions
    - KMR_I: KMR with inactive node pruning
    - KMR_IV: KMR with inactive node pruning and Network Voronoi activity assignment
    - KMR_ID: KMR with divide and conquer summary path recomputation
    - KMR_IVD: KMR with all three design decisions
  - Measure: CPU time (Unix time command)
  - Platform: Mac Pro, 2 × Xeon quad-core 2.26 GHz, 16 GB RAM
  - Variables: number of nodes, number of routes, number of activities, active node ratio
  - Fixed parameters: unit edge length
  - Datasets: synthetic and real (Haiti earthquake)
  [Figure: experiment design diagram showing the synthetic and real datasets, the variables, the candidates, a Java-based simulator, the measures, and the analysis.]

35 Data Description and Characteristics
  Synthetic data
  - 2010 Census TIGER/Line® Shapefiles used for the road network
  - Activities randomly assigned to each edge
  Real-world data: Haiti data set
  - Geospatial and temporal dataset describing recent events post-disaster
  - Collected from Jan 12, 2010 to March 23, 2010
  - 1,677 records
  - Attributes
    - Incident title (e.g., "Food, Water, Tents needed…")
    - Incident date and time
    - Location (city, port name)
    - Category (numeric category)
    - Latitude/longitude
  - Sources: Crisis Map of Haiti; OpenStreetMap

36 Effect of Number of Nodes
  - Synthetic data set: number of activities = 1200, active node ratio = 0.2, K = 2
  - Real data set: number of activities = 1206, active node ratio not given, K = 2
  Trends
  - Network Voronoi activity assignment and divide and conquer summary path recomputation save computational cost
  - Savings increase with the number of nodes

37 Effect of Number of Routes, K
  - Synthetic data set: number of nodes = 1000, number of activities = 1200, active node ratio = 0.2
  - Real data set: number of nodes = 1000, number of activities = 202, active node ratio not given
  Trends
  - Network Voronoi activity assignment and divide and conquer summary path recomputation save computational cost
  - Savings increase with the number of routes

38 Effect of Number of Activities
  - Synthetic data set: number of nodes = 1000, active node ratio = 0.2, K = 2
  Trends
  - Network Voronoi activity assignment and divide and conquer summary path recomputation save computational cost
  - Savings increase with the number of activities

39 Effect of Active Node Ratio
  - Synthetic data set: number of nodes = 1000, number of activities = 1200, K = 2
  Trends
  - Network Voronoi activity assignment and divide and conquer summary path recomputation save computational cost
  - Savings increase with the active node ratio

40-42 Case Study: Crime Analysis
  [Figure panels: input (a set of crime incidents, k = 5); KMR output; CrimeStat K-Means (Euclidean distance); CrimeStat K-Means (network distance)]

43 Summary
  - Spatial network activity summarization was shown to be NP-complete
  - The K-Main Routes (KMR) algorithm and its design decisions were described
    - Inactive node pruning
    - Network Voronoi activity assignment
    - Divide and conquer summary path recomputation
  - Correctness of the design decisions was demonstrated analytically, and a cost analysis was shown
  - Experimental evaluation
    - Performance evaluated using synthetic and real-world datasets
  - Case study comparing KMR with geometry-based summarization

44 Future Work
  Short term
  - Usefulness
    - When is it useful to domain professionals (crime analysts, emergency managers)?
    - For which use cases is the proposed solution appropriate?
    - For which geographies is the proposed solution appropriate?
  - Distance-based objective function instead of coverage-based
  - Overlapping paths
  Long term
  - Dynamically changing incidents
  - Edge lengths, e.g. activities on a small section of a long edge

45 Acknowledgements
  - Members of the Spatial Database and Spatial Data Mining Research Group, University of Minnesota, Twin Cities
  - This work was supported by grants from USARMY and USDOD
  Thank you for your time! Any questions or comments?

