Presentation on theme: "Chap 6: Spatial Networks 6. 1 Example Network Databases 6"— Presentation transcript:
1Chap 6: Spatial Networks 6. 1 Example Network Databases 6 Chap 6: Spatial Networks 6.1 Example Network Databases 6.2 Conceptual, Logical and Physical Data Models 6.3 Query Language for Graphs 6.4 Graph Algorithms 6.5 Trends: Access Methods for Spatial Networks
2Learning Objectives Learning Objectives (LO) LO1: Understand the concept of spatial network (SN)What is a spatial network?Why learn about spatial network?LO2 : Learn about data models for SNLO3: Learn about query languages and query processingLO4: Learn about trendsFocus on concepts not procedures!Mapping Sections to learning objectivesLOLOLO , 6.4LO
3Motivation: Navigation Systems In-vehicle navigation systemsOffered on many luxury carsServices:map destination given a street addresscompute the shortest route to a destinationHelp drivers follow selected routeRecall that many maps and GIS were made for navigation16th century shipping, trade and treasure huntsMaps were used to find destination and avoid hazardsNavigation and transportation are important applications!Can OGIS spatial abstract data types support navigation?
4Navigation Systems and OGIS Spatial ADTs Can OGIS spatial abstract data types support navigation?OGIS spatial ADTs discussed in chapter 2 and 3 are inadequate forcomputing route to a destinationRationale: Part 1Fact: Relation language (e.g. SQL2) can’t compute transitive closure!Fact: Shortest path computation is an instance of transitive closureRationale: Part 2OGIS spatial ADTs discussed in chapters 2 and 3 do not includeGraph data type, shortest_path operationsSee section 6.3.1, pp. 158 for details.
5How important are Spatial Networks? Web based navigation services attract large audienceExample: mapquest.com, yahoo route etc.Rated among most popular internet services!Transportation sector among application of GISAlready a major segment of GIS marketIs among three fastest growing segmentsSpatial networks in business and governmentLargest companies in the world in following sectors of economyCar, ship, airplane, oil, electrical power, natural gas, telephoneLogistics groups in Retailers (e.g. Walmart), Manufacturers (e.g. GE)Many departments of governmentTransportation (USDOT), Logistics of deployment (USDOD),City (utilities like water, sewer, …)
6SDBMS News on Spatial Networks SQL3 includes transitive closure operatorShortest paths can be computed in SQL3But response time may be large!An OGIS committee working on defining standard Graph ADTsThese can be added to SQL3 as user defined data typesWhat to do till standards are out?let us work with simple graph ADTsBased on commercial software and academic research
76.1 Example Spatial Networks Road, River, Railway networksFig 6.1
8Spatial network query example Shortest path querySmallest travel time path between the two points shown in blue color. It follows a freeway (I-94) which is faster but not shorter in distance.Shortest distance path between the same two points shown in red colo, which is is hidden under blue on common edges.
9Queries on Example Networks Railway network1. Find the number of stops on the Yellow West (YW) route.2. List all stops which can be reached from Downtown Berkeley.3. List the route numbers that connect Downtown Berkeley and Daly City.4. Find the last stop on the Blue West (BW) route.River network1. List the names of all direct and indirect tributaries of the Mississippi river2. List the direct tributaries of the Colorado.3. Which rivers could be affected if there is a spill in river P1?Road Network1. Find shortest path from my current location to a destination.2. Find nearest hospital by distance along road networks.3. Find shortest route to deliver goods to a list of retail stores.4. Allocate customers to nearest service center using distance along roads
10Learning Objectives Learning Objectives (LO) LO1: Understand the concept of spatial network (SN)LO2 : Learn about data models of SNRepresentative data types and operations for SNRepresentative data-structuresLO3: Learn about query languages and query processingLO4: Learn about trendsFocus on concepts not procedures!Mapping Sections to learning objectivesLOLOLO , 6.4LO
116.2 Spatial Network Data Models Recall 3 level Database DesignConceptual Data ModelGraphsLogical Data Model -Data types - Graph, Vertex, Edge, Path, …Operations - Connected(), Shortest_Path(), ...Physical Data ModelRecord and file representation - adjacency listFile-structures and access methods - CCAM
126.2 Conceptual Data Models Conceptual Data Model for Spatial NetworksA graph, G = (V,E)V = a finite set of verticesE = a set of edges E , between vertices in VExample: two graph models of a roadmap1. Nodes = road-intersections, edges = road segment between intersections2. Nodes = roads (e.g. Route 66), edge(A, B) = road A crosses road BClassifying graph modelsDo nodes represent spatial points? - spatial vs. abstract graphsAre vertex-pair in an edge order? - directed vs. undirectedExample (continued)Model 1 is a spatial graph, Model 2 is an abstract graphModel 2 is undirected but can be directed or undirected
136.2 Conceptual Data Model - Exercise Exercise: Review the graph model of river network in Figure 6.3 (pp. 157).List the nodes and edges in the graph.List 2 paths in this graph.Is it spatial graph? Justify your answer.Is it a Directed graph? Justify your answer.Fig 6.3
146.2 Logical Data Model - Data types Common data typesVertex: attributes are label, isVisited, (location for spatial graphs)DirectedEdge : attributes are start node, end node, labelGraph : attributes are setOfVertices, setOfDirectedEdges, ...Path : attributes are sequenceOfVerticesQuestions. Use above data types to model.an undirected edgetrain routes in BART network (Fig. 6.1(a), pp. 150)rivers in the river network (Fig. 6.1(b), pp.150)Note: Multiple distinct solutions are possible in last two cases!
156.2 Logical Data Model - Operations Low level operations on a Graph GIsDirected - return true if and only if G is directedAdd , AddEdge - adds a given vertex, edgeto GDelete, DeleteEdge - removes specifies node (and related edges), edge from GGet, GetEdge - return label of given vertex, edgeGet-a-successor, GetPredecessors - return start or end vertex of an edgeGetSucessors - return end vertices of all edge starting at a given vertexBuilding blocks for queriesshortest_path(vertex start, vertex end)shortest tour(vertex start, vertex end, setOfVertices stops)locate_nearest_server(vertex client, setOfVetices servers)allocate(setOfVetices servers)
166.2 Physical Data Models Categories of record/file representations Main memory basedDisk basedMain memory representations of graphsAdjacency matrix M[A, B] = 1 if and only if edge(vertex A, vertex B) existsAdjacency list : maps vertex A to a list of successors of AExample: See Figure 6.2(a), (b) and ( c) on next slidenormalized - tables, one for vertices, other for edgesdenormalized - one table for nodes with adjacency listsExample: See Figure 6.2(a), (d) and ( e) on next slide
186.2 Case Studies Revisited Goal: Compare relational schemas for spatial networksRiver networks has an edge table, FallsIntoBART train network does not an edge tableEdge table is crucial for using SQL transitive closure .Representation of river networksConceptual : abstract graph (Figure 6.3, pp. 157)nodes = riversdirectedEdges(R1, R2) if and only R1 falls into R2Table representation given in Table 6.3 (pp. 157)Representation of BART train networkConceptual :entities = Stop, Directed Routemany to many relationship: aMemberOf( Stop, Route)Table representation in Table 6.1 and 6.2 (pp. 155, 156)
19Learning Objectives Learning Objectives (LO) LO1: Understand the concept of spatial network (SN)LO2 : Learn data models for SNLO3: Learn about query languages and query processingQuery building blocksProcessing strategiesLO4: Learn about trendsFocus on concepts not procedures!Mapping Sections to learning objectivesLOLOLO , 6.4LO
206.3 Query Languages For Graphs Recall Relation algebra (RA) based languagesCan not compute transitive closure (Section 3.1, pp. 158)SQL3 provides support for transitive closure on graphssupports shortest pathsSQL support for graph queriesSQL2 - CONNECT clause in SELECT statementFor directed acyclic graphs, e.g. hierarchiesSQL 3 - WITH RECURSIVE statementTransitive closure on general graphsSQL 3 -user defined data typesCan include shortest path operation!
21Concept of Transitive Closure Fig 6.4.Consider a graph G = (V, E)Let G* = Transitive closure of GThen T = graph (V*, E*), whereV* = V(A, B) in E* if and only if there is a path from A to B in G.Example in Figure 6.4G has 5 nodes and 5 edgesG* has 5 nodes and 9 edgesNote edge (1,4) in G* forpath (1, 2, 3, 4) in G.
226.3.2 SQL2 Connect Clause Syntax details FROM clause a table for directed edges of an acyclic graphPRIOR identifies direction of traversal for the edgeSTART WITH specifies first vertex for path computationsSemanticsList all nodes reachable from first vertex using directed edge in specified tableAssumption - no cycle in the graph!Not suitable for train networks, road networksExampleQuery: List direct and indirect tributaries of MississippiTable = FallsInto(source, dest) defined in Table 6.3 (pp. 157)edge direction is from “source” to “dest” (source falls into dest)start node is Mississippi (river id = 1)
23SQL Connect Clause - Example SQL experssion on rightExecution trace of pathsstarts at vertex 1 (Mississippi)adds children of 1adds children of Missouriadds children of Platteadds children of YellostoneResult has edgesfrom descendentsto MississippiSELECT sourceFROM FallsIntoCONNECT BY PRIOR source = destSTART WITH dest =1Fig 6.5
24SQL Connect Clause - Exercise Study 2 SQL queries on rightNote different use of PRIOR keywordCompute results of each queryWhich one returns ancestors of 3?Which returns descendents of 3?Which query lists river affected byoil spill in Missouri (id = 3)?SELECT source FROM FallsIntoCONNECT BY PRIOR source = destSTART WITH dest =3CONNECT BY source = PRIOR destFig 6.3
256.3.2 SQL3: With Recursive Statement SyntaxWITH RECURSIVE <Relational Schema>AS <Query to populate relational schema> Syntax details<Relational Schema> lists columns in result table with directed edges<Query to populate relational schema> has UNION of nested sub-queriesBase cases to initialize result tableRecursive cases to expand result tableSemanticsResults relational schema say X(source, dest)Columns source and dest come from same domain, e.g. VerticesX is a edge table, X(a,b) directed from a to bResult table X is initialized using base case queriesResult expanded using X(a, b) and X(b, c) implies X(a, c)
266.3.3 SQL3 Recursion - Example Revisit Figure 6.4 (pp. 158) on transitive closuresComputing table X in Fig. 6.4(d), from table R in Fig. 6.4(b)WITH RECURSIVE X(source,dest)AS (SELECT source,dest FROM R)UNION(SELECT R.source,X.dest FROM R,X WHERE R.dest=X.source);MeaningInitialize X by copying directed edges in relation RInfer new edge(a,c) if edges (,b) and (b,c) are in XDeclarative qury does not specify algorithm needed to implement itExercise:Write a SQL expression using WITH RECURSIVE to determineall direct and indirect tributaries of the Mississippi riverall ancestors of the Missouri river
276.2 Case Studies Revisited Goal: Compare relational schemas for spatial networksRiver networks has an edge table, FallsIntoBART train network does not an edge tableEdge table is crucial for using SQL transitive closureExercise: Proposed a different set of table to model BART as a graphusing an edge table connecting stopsRiver networks - graph modelCan use SQL transitive closure to compute ancestors or descendent of a riverWe saw an examples using CONNECT BY clauseMore examples in section (pp )Exercises explore use of WITH RECURSIVE statement
286.2 Case Studies Revisited BART train network - non-graph modelentities = Stop, Route, relationship: aMemberOf( Stop, Route)Can not use SQL recursion!No table can be viewed as edge table!RouteStop table is a subset of transitive closureTransitive closure queries on edge(from_stop, to_stop)A few can be answered by querying RouteStop tableSee Query 6 (section 6.3.3, pp.163)Many can not be answeredFind all stops reachable from Downtown Berkeley.
29Learning Objectives Learning Objectives (LO) LO1: Understand the concept of spatial network (SN)LO2 : Learn data models for SNLO3: Learn about query languages and query processingQuery building blocksProcessing strategiesLO4: Learn about trendsFocus on concepts not procedures!Mapping Sections to learning objectivesLOLOLO , 6.4LO
306.4 Query Processing for Spatial Networks Recall Query Processing and Optimization (chapter 5)DBMS decomposes a query into building blocksKeeps a couple of strategy for each building blockSelects most suitable one for a given situationBuilding blocks for graph transitive closure operationsConnectivity(A, B)Is node B reachable from node A?Shortest path(A, B)Identify least cost path from node A to node BFocus on conceptsNot on procedural details which change often
316.4 Strategies for Graph Transitive Closure Categorizing Strategies for transitive closureQ? Building blocksStrategies for Connectivity queryStrategies for shortest path queryQ? Assumption on storage area holding graph tablesMain memory algorithmsDisk based external algorithmsRepresentative strategies for single pair shortest pathConnectivity: Breadth first search, depth first searchShortest path: Dijktra’s algorithm, Best first algorithmDisk based, externalShortest path - Hierarchical routing algorithmConnectivity strategies are already implemented in relation DBMS
326.4.2 Strategies for Connectivity Query Breadth first search -Visit descendent by generationchildren before grandchildrenExample: 1 - (2,4) - (3, 5)Depth first search - basic ideaTry a path till deadendBacktrack to try different pathsLike a maze gameExample:Note backtrack from3 to 2Fig. 6.6
336.4.2 Shortest Path strategies -1 Dijktra’s algorithmIdentify paths to descendent by depth first searchEach iterationExpand descendent with smallest cost path so farUpdate current best path to each node, if a better path is foundTill destination node is expandedProof of correctness is based on assumption of positive edge costsExample:Consider shortest_path(1,5) for graph in Figure 6.2(a), pp. 154Iteration 1 expands node 1 and edges (1,2), (1,4)set cost(1,2) = sqrt(8); cost(1,4) = sqrt(10) using Edge table in Fig. 6.2(d)Iteration 2 expands least cost node 2 and edges (2,3), (2,4)set cost(1,3) = sqrt(8) + sqrt(5)Iteration 3 expands least cost node 4 and edges (4,5)set cost(1,5) = sqrt(10) + sqrt(5)Iteration 4 expands node 3 and Iteration 5 stops node 5.Answer is the path (1-4-5)
356.4.2 Shortest Path Strategies-2 Best first algorithmSimilar to Dijktra’s algorithm with one changeCost(node) = actual_cost(source, node) + estimated_cost(node, destination)estimated_cost should be an underestimate of actual costExample - euclidean distanceGiven effective estimated_cost() function, it is faster than Dijktra’s algorithmExample:Revisit shortest_path(1,5) for graph in Figure 6.2(a), pp. 154Iteration 1 expands node 1 and edges (1,2), (1,4)set actual_cost(1,2) = sqrt(8); actual_cost(1,4) = sqrt(10);estimated_cost(2,5) = 5; estimated_cost(4,5) = sqrt(5)Iteration 2 expands least cost node 4 and edges (4,5)set actual_cost(1,5) = sqrt(10) + sqrt(5), estimated_cost(5,5) = 0Iteration 3 expands node 5Answer is the path (1-4-5)
366.4.2 Shortest Path Strategies-3 Dijktra’s and Best first algorithmsWork well when entire graph is loaded in main memoryOtherwise their performance degrades substantiallyHierarchical Routing AlgorithmsWorks with graphs on seconday storageLoads small pieces of the graph in main memoriesCan compute least cost routesKey ideas behind Hierarchical Routing AlgorithmFragment graphs - pieces of original graph obtained via node partitioningBoundary nodes - nodes of with edges to two fragmentsBoundary graph - a summary of original graphContains Boundary nodesBoundary edges: edges across fragments or paths within a fragment
376.4.2 Shortest Path Strategies-3 Insight:A Summary of Optimal path in original graph can be computedusing Boundary graph and 2 fragmentsThe summary can be expanded into optimal path in original graphexamining a fragments overlapping with the pathloading one fragment in memory at a timeSee Theorems 1 and 2 (pp , section 6.4.4)Illustration of the algorithmFigure 6.7(a) - fragments of source and destination nodesFigure 6.7(b) - computing summary of optimal path usingBoundary graph and 2 fragmentsNote use of boundary edges only in the path computationFigure 6.8(a) - The summary of optimal path using boundary edgesFigure 6.8(b) Expansion back to optimal path in original graph
38Hierarchical Routing Algorithm-Step 1 Step 1: Choose Boundary Node PairMinimize COST(S,Ba)+COST(Ba,Bd)+COST(Bd,D)Determining Cost May Be Non-TravialFig 6.7(a)
396.4.4 Hierarchical Routing- Step 2 Step 2: Examine Alternative Boundary PathsBetween Chosen Pair (Ba,Bd) of boundary nodesFig 6.7(b)
42Learning Objectives Learning Objectives (LO) LO1: Understand the concept of spatial network (SN)LO2 : Learn about data models for SNLO3: Learn about query languages and query processingLO4: Learn about trendsStorage methods for SNFocus on concepts not procedures!Mapping Sections to learning objectivesLOLOLO , 6.4LO
43Spatial Network Storage Problem StatementGiven a spatial networkFind efficient data-structure to store it on disk sectorsGoal - Minimize I/O-cost of operationsFind(), Insert(), Delete(), Create()Get-A-Successor(), Get-Successors()Constraintsspatial networks are much larger than main memoriesProblems with Geometric indices, e.g. R-treeclusters objects by proximity not edge connectivityPerforms poorly if edge connectivity not correlated with proximityTrends: graph based methods
44Graph Based Storage Methods Fig 6.9Insight:I/O cost of operations (e.g. get-a-successor) minimized by maximizing CRRCRR = Pr. (node-pairs connected by an edge are together in a disk sector)Example: spatial network in Fig. 6.9Adjacency list table with node recordsConsider disk sector hold 3 node records2 sectors are (1, 2, 3), (4,5,6)5 edges out of 9 keep node pairs togetherCRR = 5/8
45Graph Based Storage Methods Example: Consider two paging of a spatial networknon-white edges => node pair in same pageFile structure using node partitions on right is preferredit has fewer white edges => higher CRR
46Partitioning a spatial network into sectors Goal: Maximize CRRFig 6.10Fig 6.11
47Clustering and Storing a Sample Network Fig 6.12Storage method ideaDivide nodes into sectorsto maximize CRRUse a secondary indexfor find()using R-tree or B-treeExample: Figure 6.12left part : node divisionright partdisk sectorssecondary indexB-tree/Z-order
48Summary Spatial Networks are a fast growing applications of SDBs Spatial Networks are modeled as graphsGraph queries, like shortest path, are transitive closurenot supported in relational algebraSQL features for transitive closure: CONNECT BY, WITH RECURSIVEGraph Query ProcessingBuilding blocks - connectivity, shortest pathsStrategies - Best first, Dijktra’s and Hierachical routingStorage and access methodsMinimize CRR