Published by Annabel Hemsworth. Modified over 2 years ago.

1
**A K-Main Routes Approach to Spatial Network Activity Summarization**

Authors: Dev Oliver, Shashi Shekhar, James M. Kang, Renee Bousselaire, Abdussalam Bannur

2
**Outline**

- Motivation
- Problem Statement
- Contributions
- Validation: Analytical, Experimental, Case Studies
- Summary and Future Work

3
**Motivation: Crime Analysis (application domain)**

Figure labels: street, place, neighborhood. A crime hotspot is an area of concentrated crime.

“Most clustering algorithms will show areas of concentration even when a line is the most appropriate dimension.” – National Institute of Justice

Sources: J. E. Eck et al., Mapping Crime: Understanding Hot Spots, US National Inst. of Justice (http://www.ncjrs.gov/pdffiles1/nij/ pdf), 2005; Star Tribune, January 26, 2011.

4
**Examples of Linear Patterns**

- Linear patterns resulting from deforestation in Brazil
- Linear patterns of crime in a major US city

5
**Motivation: Environmental Criminology (scientific domain)**

Spatial theories in Environmental Criminology:

- Routine Activity Theory [1]: crime location is related to the criminal’s frequently visited areas.
- Crime Pattern Theory [2]: based on a spatial model of nodes (e.g. home, work, entertainment), paths (e.g. routes between nodes), and edges. Crime locations are close to edges, near the criminal’s activity boundaries, where residents may not recognize him/her.

Source: Rossmo, Kim (2000). Geographic Profiling. Boca Raton, FL: CRC Press.

Network-based summarization adds value to Environmental Criminology:

- It assists with large-scale verification that real-world data match the theories.
- It offers opportunities to develop hypotheses for new theory formulation.

[1] L. E. Cohen et al., Social change and crime rate trends: A routine activity approach, American Sociological Review, 1979.
[2] P. L. Brantingham et al., Environmental Criminology, Waveland Press, 1990.

6
**Other Domains**

- Disaster relief
- Accident analysis and prevention

7
**Key Concepts**

- Activity: an object of interest located at a node or on an edge.
- Summary path: a path chosen by KMR to summarize activities.
- Activity coverage: the total number of activities on a path or set of paths.
- Active node: a node having n ≥ 1 activities, or joined by an edge having n ≥ 1 activities (e.g., A, B, C, D, E).
- Inactive node: a node having n = 0 activities and joined only by edges having n = 0 activities (e.g., F).
- Active node ratio: total number of active nodes / total number of nodes (e.g., 5/6).

In the example figure, each edge has a weight of 1.
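The definitions of active node and active node ratio can be checked with a few lines of code. The sketch below is illustrative only: the six-node chain, the activity counts, and the function name `active_nodes` are assumptions chosen so that the slide's example ratio of 5/6 appears; the original figure's topology is not recoverable from the transcript.

```python
def active_nodes(nodes, edges, node_activities, edge_activities):
    """Nodes with >= 1 activity, or touching an edge with >= 1 activity."""
    active = {n for n in nodes if node_activities.get(n, 0) >= 1}
    for (u, v) in edges:
        if edge_activities.get((u, v), 0) >= 1:
            active.update((u, v))
    return active

# Hypothetical network: a chain A-B-C-D-E-F (not the slide's actual graph).
nodes = ["A", "B", "C", "D", "E", "F"]
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F")]
node_activities = {"A": 1}                        # one activity at node A
edge_activities = {("B", "C"): 2, ("D", "E"): 1}  # activities on two edges

active = active_nodes(nodes, edges, node_activities, edge_activities)
ratio = len(active) / len(nodes)  # active node ratio
```

Here F is inactive because it has no activities and its only edge (E, F) has none either.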

8
**Problem Statement**

Given:

- A spatial network G = (N, E)
- A set of activities, A, and their locations (e.g. a node or edge)
- A set of paths, P
- k (number of routes)
- Edge weights

Find: a cardinality-k subset P′ of P, i.e., a subset P′ ⊆ P with |P′| = k.

Objective: maximize the activity coverage (AC) by P′.

Constraints: 1 ≤ k ≤ |P|.

Example (figure): P = the set of shortest paths, k = 2, edge weights are 1.
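To make the objective concrete, here is a minimal brute-force sketch of the problem statement. It is not the KMR algorithm: it simply enumerates every k-subset of P and keeps one with maximum activity coverage. For simplicity it assumes activities sit on nodes only; the names `activity_coverage` and `best_k_subset` and the toy data are hypothetical.

```python
from itertools import combinations

def activity_coverage(paths, activities_at):
    """Number of distinct activities on the nodes of a set of paths."""
    covered = set()
    for path in paths:
        for node in path:
            covered.update(activities_at.get(node, ()))
    return len(covered)

def best_k_subset(candidate_paths, activities_at, k):
    """Exhaustively pick a k-subset of paths with maximum coverage."""
    return max(combinations(candidate_paths, k),
               key=lambda subset: activity_coverage(subset, activities_at))

# Toy instance (assumed): chain A-B-C-D with activities a1..a4 on nodes.
activities_at = {"A": ["a1"], "B": ["a2", "a3"], "D": ["a4"]}
P = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "B", "C")]

best = best_k_subset(P, activities_at, k=2)
best_coverage = activity_coverage(best, activities_at)
```

Enumeration is only feasible for tiny inputs, which is what motivates the heuristic described later in the deck.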

9
**Challenges**

- Measures of interestingness: activity coverage, average distance, etc.
- Computational complexity:
  - Choose(N, 2) paths, given N nodes
  - Exponential number of k-subsets of paths
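The size of the search space can be worked out directly. The sketch below counts candidate paths as Choose(N, 2), as stated above, and then counts the k-subsets of those paths; the function names are illustrative.

```python
from math import comb

def candidate_path_count(num_nodes):
    """One shortest path per unordered node pair: Choose(N, 2)."""
    return comb(num_nodes, 2)

def k_subset_count(num_nodes, k):
    """Number of ways to choose k summary paths from the candidates."""
    return comb(candidate_path_count(num_nodes), k)

small = k_subset_count(6, 2)      # 15 candidate paths, 105 pairs of paths
large = k_subset_count(1000, 5)   # already far too many to enumerate
```

Even at k = 5 on a 1000-node network the count of subsets has more than 25 digits, so exhaustive search is impractical.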

10
**SNAS is NP-Complete (Proof Sketch)**

Devising an NP-completeness proof for a decision problem Π [1]:

1. Show that Π is in NP
2. Select a known NP-complete problem Π’
3. Construct a transformation f from Π’ to Π
4. Prove that f is a polynomial transformation

[1] M. Garey and D. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, San Francisco, 1979.

11
**Step 1: SNAS is in NP**

It can be verified in polynomial time whether the activity coverage of P′ ≥ B.

SNAS decision problem. Given:

- A spatial network G = (N, E)
- A set of activities, A, and their locations (e.g. a node or edge)
- A set of paths, P
- k (number of routes)
- Edge weights
- B (bound on number of activities)

Find: a cardinality-k subset P′ of P, i.e., a subset P′ ⊆ P with |P′| = k.

Objective: activity coverage (AC) by P′ ≥ B.

Constraints: 1 ≤ k ≤ |P|.

12
**Step 2: Select a known NP-Complete Problem**

Maximum Coverage

- Input: sets s1, s2, …, sm over elements E = {e1, e2, …, en} (the sets may have some elements in common), and a number k < min(m, n).
- Output: k sets such that the maximum number of elements is covered, i.e. the union of the selected sets has maximal size.

13
**Step 3: Construct a transformation f from Π’ to Π (1/3)**

The known NP-complete problem (Maximum Coverage) is mapped by a polynomial transformation to the new problem (SNAS). A solution to the new problem is then mapped back to a solution of the NP-complete problem.

14
**Step 3: Construct a transformation f from Π’ to Π (2/3)**

Maximum Coverage input to SNAS input:

- Impose a total order, TO, on the n elements E = {e1, e2, …, en}
- Convert each element in E into a node with one activity
- Convert each set si to a path pi:
  - Sort the elements in si using TO
  - Add edge (eij, eij+1) ∀ j ∈ 1 … |si| − 1

Example

- Maximum Coverage: E = {e1, e2, e3, e5, e6}, K = 2, S1 = {e1, e2}, S2 = {e2, e3}, S3 = {e1, e2, e3}, S4 = {e5, e6}
- KMR: P = {(e1→e2), (e2→e3), (e1→e2→e3), (e5→e6)}, K = 2
- Activity = {a1, a2, a3, a5, a6}
- Activity node = {a1–e1, a2–e2, a3–e3, a5–e5, a6–e6}
- Candidate solution: (e1→e2→e3), (e5→e6)

15
**Step 3: Construct a transformation f from Π’ to Π (3/3)**

SNAS output to Maximum Coverage output: for each of the K routes, Ri, produced by SNAS, convert the activities on the route into elements and form a set Si.

Example: given the K routes (e1→e2→e3) and (e5→e6), S1 = {e1, e2, e3} and S2 = {e5, e6}.
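The two directions of the transformation can be sketched in code using the example from these slides. This is an illustration under assumptions, not the authors' construction verbatim: the total order is taken to be Python's default string ordering, and the function names and the `"a_" + element` activity labels are hypothetical.

```python
def max_coverage_to_snas(sets):
    """Map a Maximum Coverage instance to nodes, edges, paths, activities."""
    # Each set becomes a path: its elements sorted under the total order.
    paths = [tuple(sorted(s)) for s in sets]
    nodes = sorted(set().union(*sets))
    # Consecutive elements of each path are joined by an edge.
    edges = {(p[j], p[j + 1]) for p in paths for j in range(len(p) - 1)}
    # Every element becomes a node carrying exactly one activity.
    activities = {e: "a_" + e for e in nodes}
    return nodes, edges, paths, activities

def snas_routes_to_sets(routes, activities):
    """Map chosen routes back to sets of covered elements."""
    return [{e for e in route if e in activities} for route in routes]

sets = [{"e1", "e2"}, {"e2", "e3"}, {"e1", "e2", "e3"}, {"e5", "e6"}]
nodes, edges, paths, activities = max_coverage_to_snas(sets)

# The candidate solution from the slide: routes e1->e2->e3 and e5->e6.
recovered = snas_routes_to_sets([paths[2], paths[3]], activities)
```

Both functions run in time polynomial in the total size of the input sets, which is the property step 4 of the proof sketch requires.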

16
**Related Work**

Network summarization by grouping/clustering:

- Zero or one routes: clumping (Okabe), e.g. NT-VCM (Shiode); maximum subgraph, e.g. path, tree (Buchin)
- Multiple routes: our work

17
**Contributions**

- K-Main Routes (KMR) algorithm
  - Finds a set of k routes to group activities
  - New design decisions added: Network Voronoi activity assignment, and divide and conquer summary path recomputation
- Spatial network activity summarization is shown to be NP-complete.
- Analytical demonstration of the correctness of the design decisions, with a cost analysis
- Experimental evaluation of the various algorithms
  - Performance evaluated using synthetic and real-world datasets
  - Case study comparing KMR with geometry-based summarization

18
**K-Main Routes (KMR) Algorithm**

K-Main Routes algorithm:

1. Select k paths as initial summary paths
2. Repeat
   - Form k clusters by assigning each activity to its closest summary path
   - Recompute the summary path of each cluster
3. Until summary paths do not change

Design decisions:

- Inactive node pruning
- Network Voronoi activity assignment
- Divide and conquer summary path recomputation

Example (figure): P = the set of shortest paths, K = 2. The lower left graph shows 2 active nodes, N7 and N8. With inactive node pruning, we would only need to calculate and store shortest paths between these 2 nodes, as opposed to calculating and storing shortest paths between all the nodes in the graph.

19
**Design Decision: Inactive Node Pruning**

- Only consider paths between active nodes.
- The optimal solution will still be in this set, given the set of shortest paths.
- Example: 20 shortest paths calculated and stored versus 30.
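The figures 20 and 30 are consistent with counting ordered node pairs in a six-node graph with five active nodes (6 × 5 = 30 versus 5 × 4 = 20), matching the Key Concepts example. That reading is an assumption, since the original figure is not available; the sketch below just reproduces the arithmetic with hypothetical node labels.

```python
from itertools import permutations

def ordered_pairs(nodes):
    """All (source, destination) pairs with source != destination."""
    return list(permutations(nodes, 2))

all_nodes = ["A", "B", "C", "D", "E", "F"]   # assumed six-node graph
active = ["A", "B", "C", "D", "E"]           # F is inactive and pruned

all_pairs = ordered_pairs(all_nodes)         # paths stored without pruning
active_pairs = ordered_pairs(active)         # paths stored with pruning
saved = len(all_pairs) - len(active_pairs)   # shortest paths not computed
```

The saving grows as the active node ratio falls, since the number of pairs is quadratic in the number of nodes considered.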

20
**Design Decision: Network Voronoi (NV) Activity Assignment**

Goals:

- Form k clusters by assigning each activity to its closest summary path
- Improve the execution time of the current assignment strategy

K-Main Routes algorithm, before this design decision:

1. Select k shortest paths as initial summary paths
2. Repeat
   - Form k clusters by assigning each activity to its closest summary path
   - Recompute the summary path of each cluster
3. Until summary paths do not change

K-Main Routes algorithm, with this design decision:

1. Select k shortest paths as initial summary paths
2. Repeat
   - Network Voronoi activity assignment
   - Recompute the summary path of each cluster
3. Until summary paths do not change

An example execution trace follows.

21
**Design Decision: Network Voronoi (NV) Activity Assignment**

[Figure, slides 21–28: step-by-step execution trace of Network Voronoi activity assignment on an example graph with nodes A–H, activities 1–10, a virtual node X, and summary paths AE and DH. Each step shows the Open and Closed lists and the current distance of each node from the summary paths. Legend: activity, active node, inactive node, virtual node, summary path, closed node, edge weight = 1, edge weight = 0. Across the trace, nodes are closed in the order X, A, E, D, H, B, F, C.]

29
**Design Decision: Network Voronoi (NV) Activity Assignment**

Network Voronoi Activity Assignment algorithm

Input: graph G = (N, E), a set of activities A, a set of k summary paths S

Output: a set of k clusters formed by assigning every ai ∈ A to one si ∈ S, where dist(ai, si) ≤ dist(ai, sj) for all sj ∈ S with sj ≠ si

    1. Open ← all nodes ∈ S, Closed ← Ø
    2. Tnodes ← all nodes ∈ S
    3. Tactivities ← activities on si ∈ S
    4. repeat
         nc ← next node ∈ Open
         remove nc from Open
         Closed ← nc
         X ← neighbors of nc
         foreach xi ∈ X
           if xi ∉ Tnodes and xi ∉ Closed
             Tnodes ← xi
             xi.prev ← nc, xi.dist ← dist(xi, nc) + nc.dist
             xi.sp ← nc.sp
           else if xi ∈ Tnodes
             update xi if new dist < xi.dist
           if xi ∉ Open
             Open ← xi
           Y ← activities on edge {nc, xi}
           foreach yi ∈ Y
             if yi ∉ Tactivities
               Tactivities ← yi
               yi.prev ← nc
               yi.dist ← xi.dist
               yi.sp ← xi.sp
             else
               update yi if new dist < yi.dist
       until all active nodes ∈ Closed
       return currentClusters
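The core idea of the assignment step, expanding outward from all summary paths at once so that each node is labelled with its nearest path, can be sketched as a multi-source Dijkstra search. This is a simplified illustration rather than the authors' algorithm: it assumes activities sit on nodes only, and the 2 × 4 grid, the activity placement, and the name `voronoi_assign` are hypothetical.

```python
import heapq

def voronoi_assign(adj, summary_paths, activities_at):
    """Return {activity: index of nearest summary path}."""
    dist, owner, heap = {}, {}, []
    # Seed the search with every node on every summary path at distance 0.
    for idx, path in enumerate(summary_paths):
        for node in path:
            if node not in dist:
                dist[node], owner[node] = 0, idx
                heapq.heappush(heap, (0, node))
    closed = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in closed:
            continue  # stale queue entry
        closed.add(node)
        for nbr, w in adj[node]:
            nd = d + w
            if nbr not in dist or nd < dist[nbr]:
                dist[nbr], owner[nbr] = nd, owner[node]
                heapq.heappush(heap, (nd, nbr))
    return {a: owner[n] for n, acts in activities_at.items()
            for a in acts if n in owner}

# Hypothetical 2 x 4 grid with unit weights: A-B-C-D over E-F-G-H.
pairs = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F"), ("F", "G"),
         ("G", "H"), ("A", "E"), ("B", "F"), ("C", "G"), ("D", "H")]
adj = {n: [] for n in "ABCDEFGH"}
for u, v in pairs:
    adj[u].append((v, 1))
    adj[v].append((u, 1))

summary_paths = [("A", "E"), ("D", "H")]
activities_at = {"B": [1], "F": [2], "C": [3], "G": [4]}
assignment = voronoi_assign(adj, summary_paths, activities_at)
```

A single search of this kind visits each edge a bounded number of times, instead of computing a separate distance from every activity to every summary path.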

30
**Design Decision: Divide and Conquer Summary Path Recomputation**

Goals:

- Recompute the summary path of each cluster
- Improve the execution time of the current recomputation strategy

K-Main Routes algorithm, before this design decision:

1. Select k shortest paths as initial summary paths
2. Repeat
   - Network Voronoi activity assignment
   - Recompute the summary path of each cluster
3. Until summary paths do not change

K-Main Routes algorithm, with this design decision:

1. Select k shortest paths as initial summary paths
2. Repeat
   - Network Voronoi activity assignment
   - Divide and conquer summary path recomputation
3. Until summary paths do not change

An example execution trace follows.

31
**Design Decision: Divide and Conquer Summary Path Recomputation**

Summary Path Recomputation algorithm

Input: graph G = (N, E), a set of clusters C

Output: a set of summary paths S, where each si ∈ S has maximum coverage for ci ∈ C and si ∈ ci

    nextClusters ← Ø
    foreach ci ∈ C
      X ← active nodes of ci
      maxP ← Ø
      foreach xi ∈ X
        foreach xj ∈ X
          if (i ≠ j)
            cP ← getSP(xi, xj)
            if (maxP = Ø) maxP ← cP
            if (maxP.activities < cP.activities) maxP ← cP
      if (maxP ≠ ci.summaryPath)
        nextClusters ← maxP
      else
        nextClusters ← ci.summaryPath
    return nextClusters

[Figure: example graph with nodes A–H and activities 1–10, showing clusters and their summary paths. Legend: activity, active node, inactive node, summary path, cluster. Edge weights are 1.]
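The recomputation step for a single cluster can be sketched as follows. The sketch assumes unit edge weights (so breadth-first search yields shortest paths), activities on nodes only, and unordered node pairs; `shortest_path` stands in for the slide's getSP, and all names and data are hypothetical.

```python
from collections import deque
from itertools import combinations

def shortest_path(adj, src, dst):
    """BFS shortest path for unit edge weights; list of nodes or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nbr in adj[node]:
            if nbr not in prev:
                prev[nbr] = node
                queue.append(nbr)
    return None

def coverage(path, activities_at):
    return sum(len(activities_at.get(n, ())) for n in path)

def recompute_summary_path(adj, cluster_active, activities_at, current):
    """Keep the current path unless a strictly better one exists."""
    best, best_cov = current, coverage(current, activities_at)
    for u, v in combinations(sorted(cluster_active), 2):
        path = shortest_path(adj, u, v)
        if path is not None and coverage(path, activities_at) > best_cov:
            best, best_cov = path, coverage(path, activities_at)
    return best

# Toy cluster on a chain A-B-C-D (assumed data).
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
activities_at = {"A": [1], "B": [2], "C": [3, 4], "D": [5]}
new_path = recompute_summary_path(adj, {"B", "C", "D"}, activities_at,
                                  ["B", "C"])
```

Restricting the pair search to the active nodes of one cluster is what keeps the per-cluster cost small relative to searching the whole network.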

32
**Validation**

- Analytical
  - Cost analysis explaining computational savings
- Experimental
  - Comparative analysis of KMR with various design decisions
  - Performed on real and synthetic data
  - Network Voronoi activity assignment and divide and conquer summary path recomputation save computational costs
  - Savings increase with the number of nodes, routes, and activities, and with the active node ratio
- Case studies
  - Qualitatively show the usefulness of network-based summarization on crime data

33
**Analytical Evaluation: Computational Analysis**

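KMR execution time = number of iterations × (activity assignment cost + summary path recomputation cost)

$$T_{KMR} = I \times \left( \left[ K \times |A| \times cost(a_i, c_i) \right] + \left[ K \times d_c \times |N|^2 \right] \right)$$

$$T_{KMR\_I} = I \times \left( \left[ K \times |A| \times cost(a_i, c_i) \right] + \left[ K \times d_c \times (|N| \times r)^2 \right] \right)$$

$$T_{KMR\_IAS} = I \times \left( \left[ |E| + |N| \times \log |N| \right] + \left[ K \times d_c \times \left( \frac{|N|}{K} \times r \right)^2 \right] \right)$$

where:

- I = number of iterations
- K = number of clusters
- A = set of activities
- cost(a_i, c_i) = cost of calculating the distance between activity a_i and cluster c_i
- d_c = cost of looking up a path
- N = set of nodes
- E = set of edges
- r = active node ratio, 0 ≤ r ≤ 1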
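Plugging numbers into the three cost expressions shows the relative effect of each design decision. The parameter values below are illustrative assumptions only (loosely echoing the synthetic settings of 1000 nodes, 1200 activities, r = 0.2, K = 2), the logarithm is taken base 2, and the results are abstract operation counts, not measured times.

```python
from math import log2

def t_kmr(I, K, A, cost, dc, N):
    """Baseline: per-activity assignment, all node pairs."""
    return I * (K * A * cost + K * dc * N ** 2)

def t_kmr_i(I, K, A, cost, dc, N, r):
    """With inactive node pruning: only active nodes (N * r)."""
    return I * (K * A * cost + K * dc * (N * r) ** 2)

def t_kmr_ias(I, K, dc, N, E, r):
    """With pruning, Voronoi assignment, and per-cluster recomputation."""
    return I * ((E + N * log2(N)) + K * dc * (N / K * r) ** 2)

# Assumed parameter values for illustration.
I, K, A, cost, dc, N, E, r = 10, 2, 1200, 1, 1, 1000, 3000, 0.2

base = t_kmr(I, K, A, cost, dc, N)
pruned = t_kmr_i(I, K, A, cost, dc, N, r)
full = t_kmr_ias(I, K, dc, N, E, r)
```

Under these assumptions the quadratic recomputation term dominates the baseline, so shrinking the node set by r and by K has the largest effect.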

34
**Experimental Evaluation**

- Goal: comparative analysis
- Candidates: KMR with various design decisions
  - KMR_I: KMR with inactive node pruning
  - KMR_IV: KMR with inactive node pruning and Network Voronoi activity assignment
  - KMR_ID: KMR with divide and conquer summary path recomputation
  - KMR_IVD: KMR with all three design decisions
- Measure: CPU time (Unix time command), using a Java-based simulator
- Platform: Mac Pro, 2 x Xeon Quad Core 2.26 GHz, 16 GB RAM
- Variables: #Nodes, #Routes, #Activities, Active Node Ratio
- Fixed parameters: unit edge length
- Datasets: synthetic and real (Haiti earthquake)

35
**Data Description and Characteristics**

Synthetic data:

- 2010 Census TIGER/Line® Shapefiles used for the road network
- Activities randomly assigned to each edge

Real-world data (Haiti data set):

- Geospatial and temporal dataset describing recent events post-disaster
- Collected from Jan 12, 2010 to March 23, 2010
- 1,677 records
- Attributes: incident title (e.g., “Food, Water, Tents needed…”), incident date and time, location (city, port name), category (numeric category), latitude/longitude
- Sources: Crisis Map of Haiti, OpenStreetMap

36
**Effect of Number of Nodes**

- Synthetic data set: Number of Activities = 1200, Active Node Ratio = 0.2, K = 2
- Real data set: Number of Activities = 1206, Active Node Ratio not given, K = 2

Trends:

- Network Voronoi activity assignment and divide and conquer summary path recomputation save computational costs.
- Savings increase with the number of nodes.

37
**Effect of Number of Routes, K**

- Synthetic data set: Number of Nodes = 1000, Number of Activities = 1200, Active Node Ratio = 0.2
- Real data set: Number of Nodes = 1000, Number of Activities = 202, Active Node Ratio = 0.219

Trends:

- Network Voronoi activity assignment and divide and conquer summary path recomputation save computational costs.
- Savings increase with the number of routes.

38
**Effect of Number of Activities**

- Synthetic data set: Number of Nodes = 1000, Active Node Ratio = 0.2, K = 2

Trends:

- Network Voronoi activity assignment and divide and conquer summary path recomputation save computational costs.
- Savings increase with the number of activities.

39
**Effect of Active Node Ratio**

- Synthetic data set: Number of Nodes = 1000, Number of Activities = 1200, K = 2

Trends:

- Network Voronoi activity assignment and divide and conquer summary path recomputation save computational costs.
- Savings increase with the active node ratio.

40
**Case Study: Crime Analysis**

[Figure panels: input (a set of crime incidents, k = 5); KMR output; CrimeStat K-Means (Euclidean distance); CrimeStat K-Means (network distance)]


43
**Summary**

- Spatial network activity summarization was shown to be NP-complete.
- The K-Main Routes (KMR) algorithm and its design decisions were described:
  - Inactive node pruning
  - Network Voronoi activity assignment
  - Divide and conquer summary path recomputation
- The correctness of the design decisions was demonstrated analytically, together with a cost analysis.
- Experimental evaluation:
  - Performance evaluated using synthetic and real-world datasets
  - Case study comparing KMR with geometry-based summarization

44
**Future Work**

Short term:

- Usefulness
  - When is it useful to domain professionals (crime analysts, emergency managers)?
  - For which use cases is the proposed solution appropriate?
  - For which geographies is the proposed solution appropriate?
- Distance-based objective function instead of coverage-based
- Overlapping paths

Long term:

- Dynamically changing incidents
- Edge lengths, e.g. activities on a small section of a long edge

45
**Acknowledgements**

- Members of the Spatial Database and Spatial Data Mining Research Group, University of Minnesota, Twin Cities.
- This work was supported by grants from USARMY and USDOD.

Thank you for your time! Any questions or comments?
