STUN: SPATIO-TEMPORAL UNCERTAIN (SOCIAL) NETWORKS Chanhyun Kang Computer Science Dept. University of Maryland, USA Andrea Pugliese.

Presentation transcript:

STUN: SPATIO-TEMPORAL UNCERTAIN (SOCIAL) NETWORKS
Chanhyun Kang, Computer Science Dept., University of Maryland, USA
Andrea Pugliese, DEIS Dept., University of Calabria, Italy
John Grant and V.S. Subrahmanian, Computer Science Dept., University of Maryland, USA

Motivation
Assume a social network whose relationships carry spatio-temporal information and certainty values.
[Figure: example network over the regions Maryland, Bethesda and Potomac]

Motivation: Query example
Find all people who attended a party in Maryland at time point 5 with certainty at least 0.5.
- Common subgraph matching query: people who attended a party
- Spatial constraint: within Maryland
- Temporal constraint: at time point 5
- Certainty constraint: at least 0.5 certainty
The query contains not only an ordinary graph query but also constraints on spatio-temporal information and certainty values.

Motivation
- Graph query research has proposed several subgraph matching algorithms and index structures.
- These indexes and algorithms consider only the structure of the graph.
- To answer the query efficiently, we need to consider:
  - Graph structure
  - Spatio-temporal information
  - Certainty information
- We therefore propose a new index structure that takes these properties into account, and a query processing algorithm that uses the index.

In this paper
- Introduce STUN: Spatio-Temporal Uncertainty (Social) Network
- Define the STUN query language
- Develop the STUN index, a disk-based index structure
- Develop a query processing algorithm that uses the STUN index
- Evaluate the algorithms

STUN
- A Spatio-Temporal Uncertainty (Social) Network is an extension of social networks.
- It supports aspects of spatio-temporal uncertainty in networks:
  - Where and when the relationships are/were true
  - How certain we are that the relationships hold/held
- It is defined by a set of STUN tuples.
  - STUN tuple: STUN quadruple + STUN annotation
  - STUN quadruple: two vertices, a relationship and a certainty value
  - STUN annotation: spatio-temporal information

Syntax: STUN quadruple
- STUN quadruple: (v, l, v'; c)
  - v, v' ∈ V (vertices) and l ∈ L (labels)
  - Certainty factor c ∈ [0,1]
- Example: (Jim, Friend, Ed; 0.7) states that "Jim" is a friend of "Ed" with certainty 0.7.

Syntax: STUN annotation
- STUN annotation: [R, T], expressing spatial information and temporal information
- R is a region: a set of space points in a spatial reference system S
  - S ⊆ [0,M] × [0,N] with M, N ∈ ℝ (real numbers)
  - A space point is a member of S
- T is a time interval: a pair (st, et) with st ≤ et
  - st and et are time points expressing the start and the end of a specific period
  - A time point is a member of a temporal reference system [L, U]

Syntax: STUN tuple
- STUN tuple: (v, l, v'; c) : [R, T], that is, STUN quadruple + STUN annotation
- A STUN knowledge base is a finite set of STUN tuples.
- Example: (Phil, Organized, Party2; 1) : [Bethesda, (15,15)] states that "Phil" organized "Party2" with certainty 1, and that the event occurred at time 15 at some location within the region "Bethesda".
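A STUN tuple could be held in memory along the following lines. This is only a sketch: the class names `Region` and `StunTuple`, the restriction of regions to axis-aligned rectangles, and the coordinates used for Bethesda are assumptions made for illustration (the slides define a region as any set of space points and give no coordinates).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical rectangular region (x1, y1)-(x2, y2); the slides allow any point set."""
    x1: float
    y1: float
    x2: float
    y2: float


@dataclass(frozen=True)
class StunTuple:
    """(v, l, v'; c) : [R, T] as defined on the slides."""
    v: str
    label: str
    v2: str
    c: float          # certainty factor in [0, 1]
    region: Region    # R
    interval: tuple   # T = (st, et) with st <= et

    def __post_init__(self):
        if not 0.0 <= self.c <= 1.0:
            raise ValueError("certainty must lie in [0, 1]")
        st, et = self.interval
        if st > et:
            raise ValueError("interval start must not exceed its end")


# Made-up coordinates for Bethesda, for illustration only.
bethesda = Region(2.0, 2.0, 4.0, 4.0)
t = StunTuple("Phil", "Organized", "Party2", 1.0, bethesda, (15, 15))
```

Validating c and the interval at construction time keeps malformed tuples out of the knowledge base, which is then simply a finite collection of such objects.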

STUN QUERY LANGUAGE

STUN Queries
A STUN query q contains:
- Graph part (G_q)
  - Subgraph query
  - Minimum certainty values for the relationships in the graph query
- Constraint part (C_q)
  - Constraints on spatial information
  - Constraints on temporal information
Example: Find all people who attended a party in Maryland at time point 5 with certainty at least 0.5. Here "people who attended a party" is the subgraph query, "in Maryland at time point 5" is the constraint on spatio-temporal information, and 0.5 is the minimum certainty value.

STUN Queries: Graph part G_q
- Subgraph query and minimum certainty values, given as a set of query graph tuples
- Variables are denoted using "?"; output variables are underlined
- A query graph tuple is (v, l, v'; c) : [R, T] where v, v' ∈ V ∪ VAR_V, l ∈ L ∪ VAR_L, c ∈ [0,1], R ∈ VAR_R and T ∈ VAR_T
Example: Find all people (?I) who attended a party (?P) in Maryland at time point 5 with certainty at least 0.5.
G_q = {(?I, attended, ?P; 0.5):[?s,?t]}

STUN Queries: Constraint part C_q
- Specifies spatial constraints and temporal constraints
- Expressed by:
  - Predicate symbols, each representing a spatial relation or a temporal relation
  - Parameters for the predicates: ground terms or variables from the graph part
Example: Find all people (?I) who attended a party (?P) in Maryland at time point 5 with certainty at least 0.5.
C_q = {inside(?s, Maryland), during(?t, [5,5])}
Here inside(?s, Maryland) is the spatial constraint and during(?t, [5,5]) is the temporal constraint.

STUN Query example
Find all people (?I) who attended a party (?P) in Maryland at time point 5 with certainty at least 0.5.
G_q = {(?I, attended, ?P; 0.5):[?s,?t]}
C_q = {inside(?s, Maryland), during(?t, [5,5])}
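To make the example query concrete, here is a naive evaluation by scanning a toy knowledge base, without any index. Everything in it is an assumption for illustration: regions are rectangles (x1, y1, x2, y2), `inside` and `during` are read as plain containment (the slides do not spell out their semantics), and the tuples and coordinates are invented.

```python
def inside(region, container):
    """True if rectangle `region` lies within rectangle `container` (x1, y1, x2, y2)."""
    return (container[0] <= region[0] and container[1] <= region[1]
            and region[2] <= container[2] and region[3] <= container[3])


def during(interval, window):
    """True if `interval` (st, et) lies within `window` (st, et)."""
    return window[0] <= interval[0] and interval[1] <= window[1]


MARYLAND = (0, 0, 10, 10)  # made-up coordinates

# Toy knowledge base: (v, l, v', c, region, interval); all values are invented.
kb = [
    ("Ed", "attended", "Party1", 0.8, (2, 2, 4, 4), (5, 5)),
    ("Jim", "attended", "Party1", 0.4, (2, 2, 4, 4), (5, 5)),       # certainty too low
    ("Ann", "attended", "Party2", 0.9, (20, 20, 22, 22), (5, 5)),   # outside Maryland
    ("Bob", "attended", "Party3", 0.7, (1, 1, 2, 2), (6, 7)),       # wrong time
]


def answer(kb):
    """Evaluate G_q = {(?I, attended, ?P; 0.5):[?s,?t]} with
    C_q = {inside(?s, Maryland), during(?t, [5,5])} by a full scan."""
    return [(v, v2) for (v, l, v2, c, s, t) in kb
            if l == "attended" and c >= 0.5
            and inside(s, MARYLAND) and during(t, (5, 5))]


print(answer(kb))  # [('Ed', 'Party1')]
```

A full scan like this touches every tuple, which is the cost the STUN index is meant to avoid.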

STUN Query example
Find all people (?I) who have been a friend of "Jim" in the time interval [10, 20] with certainty at least 0.9, as well as a friend of "Phil" in the same interval with certainty at least 0.6, and who attended a party (?P) in Maryland organized by "Phil" that occurred during the time interval [0, 20].
G_q = {(?I, attended, ?P; 1.0):[?s1,?t1], (?I, friend, Jim; 0.9):[?s2,?t2], (?I, friend, Phil; 0.6):[?s2,?t2], (Phil, organized, ?P; 1.0):[?s1,?t1]}
C_q = {inside(?s1, Maryland), during(?t1, [0,20]), during(?t2, [10,20])}

STUN query answer
[Figure: a substitution θ applied to the query edge (Phil, Organized, ?P); labels shown: Phil, Party3]

STUN INDEX

STUN Index
- A balanced tree
- Each leaf node represents a portion of the STUN knowledge base.
- Each inner node captures the subgraph represented by its child nodes.

STUN Index
Each node occupies a disk page and contains:
- MBR (minimum bounding rectangle): envelops the regions associated with the STUN tuples in the subgraph of the child nodes
- MBI (minimum bounding interval): envelops the time intervals associated with the STUN tuples in the subgraph of the child nodes
During query processing, MBRs and MBIs are used to prune nodes that cannot contribute answers, given the query's spatial and temporal constraints.
[Figure: a spatial reference system with regions R1, R2, R3 and nodes N1, N2, N3, showing the MBR of each node]
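The pruning idea can be sketched as follows. This is not the authors' implementation: the function names are hypothetical, regions are assumed to be rectangles (x1, y1, x2, y2), and the pruning test is the usual conservative one, skipping a node only when its MBR or MBI cannot intersect the query's region or time window at all.

```python
def mbr(regions):
    """Smallest rectangle (x1, y1, x2, y2) enclosing all given rectangles."""
    return (min(r[0] for r in regions), min(r[1] for r in regions),
            max(r[2] for r in regions), max(r[3] for r in regions))


def mbi(intervals):
    """Smallest interval (st, et) enclosing all given intervals."""
    return (min(i[0] for i in intervals), max(i[1] for i in intervals))


def rects_overlap(a, b):
    """True if rectangles a and b share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def intervals_overlap(a, b):
    """True if intervals a and b share at least one time point."""
    return a[0] <= b[1] and b[0] <= a[1]


def can_prune(node_mbr, node_mbi, query_region, query_window):
    """A node is skipped when its MBR or its MBI misses the query's constraints."""
    return not (rects_overlap(node_mbr, query_region)
                and intervals_overlap(node_mbi, query_window))


# Invented regions and intervals for one index node.
n1_mbr = mbr([(0, 0, 2, 2), (1, 1, 3, 4)])   # (0, 0, 3, 4)
n1_mbi = mbi([(3, 6), (5, 9)])               # (3, 9)

print(can_prune(n1_mbr, n1_mbi, (5, 5, 8, 8), (4, 4)))  # True: MBR misses the query region
print(can_prune(n1_mbr, n1_mbi, (2, 2, 8, 8), (4, 4)))  # False: node must be read
```

Because an MBR or MBI only over-approximates what a node holds, a node that is not pruned may still yield no answers; pruning can only rule nodes out, never confirm a match.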

STUN Index
Goal: reduce the number of nodes that must be read to answer queries. Each index node should have:
- Few cross edges with other nodes at the same level
- A small MBR (minimum bounding rectangle) and a small MBI (minimum bounding interval)
- Small MBR overlaps with other nodes at the same level
- Small MBI overlaps with other nodes at the same level
To meet these requirements:
- Build a vertex- and edge-weighted undirected graph (WUG) from the STUN KB
- Then use the weights while building the index

Building STUN Index
I. Initial step
  - Build a vertex- and edge-weighted undirected graph (WUG) from the STUN KB
  - The weights are used to satisfy the requirements: few cross edges; small MBR and small MBI; small MBR overlaps and small MBI overlaps
II. Coarsening step
  - Merge vertices using the weights of vertices and edges
III. Partitioning step
  - Build a tree index from the coarsened graphs

Building Index: Initial Step
Each edge contains spatio-temporal information with a certainty value.
[Figure: vertices v0, v1, v2 and edges e0, e1, e2 with their labels, and the resulting WUG]

Building Index: Coarsening
- Coarsen the graph until the size of the coarsened graph is less than one disk page.
- At each coarsening level l, the number of vertices in G_l is half the number of vertices in G_{l-1}.
[Figure: merging vertices coarsens the original graph G_0 (level 0, N vertices) into G_1 (level 1, N/2 vertices), G_2 (level 2, N/4 vertices), ..., G_k (level k, N/2^k vertices)]
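Since each level halves the vertex count, the number of coarsening levels k grows only logarithmically with N. A small sketch of that arithmetic, assuming (hypothetically) that page capacity is measured as a number of vertices and that odd counts round up when halved:

```python
def coarsening_levels(n_vertices, vertices_per_page):
    """Number of halving steps until the graph fits in one page.

    Measuring page capacity in vertices is an assumption for illustration;
    the slides only say the coarsened graph must be smaller than one disk page.
    """
    k = 0
    while n_vertices > vertices_per_page:
        n_vertices = (n_vertices + 1) // 2  # halve, rounding up
        k += 1
    return k


print(coarsening_levels(1000, 100))  # 4  (1000 -> 500 -> 250 -> 125 -> 63)
```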

How to merge vertices
- Choose a vertex v at random to merge.
- Select the neighbor m of v with minimum edge weight (v is merged into m).
- Update the weight, MBR and MBI of the edges of v and m. (If there is no edge between m and a neighbor of v, add an edge between m and that neighbor.)
- Delete the edge between v and m, and delete the vertex v.
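The merge steps above can be sketched on an adjacency-map graph. The transcript does not preserve how weights are combined, so this sketch makes assumptions that are not from the slides: when v and m share a neighbor, the two edge weights are summed, and the MBRs and MBIs are unioned. The names `merge_vertex`, `pick_vertex`, `union_rect` and `union_interval` are hypothetical.

```python
import random


def union_rect(a, b):
    """Smallest rectangle (x1, y1, x2, y2) covering rectangles a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def union_interval(a, b):
    """Smallest interval (st, et) covering intervals a and b."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def pick_vertex(adj, rng=random):
    """Choose a vertex at random, as in the first merge step."""
    return rng.choice(sorted(adj))


def merge_vertex(adj, v):
    """Merge v into its neighbor m joined by the minimum-weight edge.

    adj maps vertex -> {neighbor: (weight, mbr, mbi)} and is kept symmetric.
    v must have at least one neighbor. Summing weights is an assumption.
    """
    m = min(adj[v], key=lambda n: adj[v][n][0])
    for n, (w, r, i) in adj[v].items():
        if n == m:
            continue
        if n in adj[m]:   # m already linked to n: combine the two edges
            w0, r0, i0 = adj[m][n]
            merged = (w0 + w, union_rect(r0, r), union_interval(i0, i))
        else:             # no edge yet: add one between m and n
            merged = (w, r, i)
        adj[m][n] = merged
        adj[n][m] = merged
        del adj[n][v]
    del adj[m][v]         # delete the edge between v and m ...
    del adj[v]            # ... and the vertex v itself
    return m


# Invented triangle a-b-c; vertex "a" is fixed here instead of calling pick_vertex.
ab = (1, (0, 0, 1, 1), (1, 2))
ac = (5, (2, 2, 3, 3), (4, 6))
bc = (2, (0, 0, 1, 1), (0, 1))
adj = {"a": {"b": ab, "c": ac}, "b": {"a": ab, "c": bc}, "c": {"a": ac, "b": bc}}
m = merge_vertex(adj, "a")
print(m, adj["b"]["c"])  # b (7, (0, 0, 3, 3), (0, 6))
```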

Building Index: Partitioning
The coarsened graphs ..., G_{k-2}, G_{k-1}, G_k are used; each of their edges already has an MBR and an MBI.
1. Store G_k as the root page, with MBR(all edges of G_k) and MBI(all edges of G_k).
2. Partition.
3. Induce subgraphs (a and b in the figure) using the mapping information from G_{k-1} to G_k.
4. Store the subgraphs as child pages, each with its own MBR and MBI: MBR(all edges of a), MBI(all edges of a), MBR(all edges of b), MBI(all edges of b).
5. Repeat recursively until the lowest coarsening level is reached.

Query Answering
- The STUN index is used to get candidates for variables.
- The index tree is traversed using mapping information for the ground terms (constants) in the query.
- MBRs and MBIs are used to filter out pages that are unnecessary for the query answer with regard to the spatial and temporal constraints.
[Figure: query edges (?I, friend, Phil), (?I, friend, Jim) and (Phil, organized, ?P) matched against the STUN index; the MBRs and MBIs of pages are checked against the constraints for pruning]

Query Answering: Overall algorithm
I. Get candidates for each variable of the query.
II. Select the variable that has the smallest number of candidates.
III. Substitute each candidate for the variable.
IV. For each substitution, repeat steps II and III for the remaining variables, recursively and in a depth-first manner.
V. If no variable is left, return the substitutions.
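Steps II to V amount to a backtracking search that always branches on the variable with the fewest candidates. The sketch below illustrates that control flow only. It assumes the candidate sets of step I are already given (in STUN they would come from the index), it ignores certainty values and spatio-temporal constraints, and it adds a consistency check after each substitution, which the slides do not describe. All names and data are hypothetical.

```python
def answer_query(query_edges, kb_edges, candidates):
    """Depth-first search over substitutions, smallest candidate set first.

    query_edges: list of (s, label, o); terms starting with '?' are variables.
    kb_edges: set of ground (s, label, o) triples.
    candidates: dict variable -> set of candidate constants (step I, assumed given).
    """
    results = []

    def consistent(theta):
        # Check only the query edges whose terms are all bound under theta.
        for s, l, o in query_edges:
            s2, o2 = theta.get(s, s), theta.get(o, o)
            if s2.startswith("?") or o2.startswith("?"):
                continue
            if (s2, l, o2) not in kb_edges:
                return False
        return True

    def search(theta, remaining):
        if not remaining:                                  # step V
            results.append(dict(theta))
            return
        # Step II: fewest candidates first (name breaks ties deterministically).
        var = min(remaining, key=lambda x: (len(candidates[x]), x))
        for c in sorted(candidates[var]):                  # step III
            theta[var] = c
            if consistent(theta):
                search(theta, remaining - {var})           # step IV
            del theta[var]

    search({}, frozenset(candidates))
    return results


# Invented knowledge base and candidate sets.
kb = {("Ed", "friend", "Jim"), ("Ed", "friend", "Phil"), ("Ann", "friend", "Jim"),
      ("Ed", "attended", "Party3"), ("Phil", "organized", "Party3")}
query = [("?I", "friend", "Jim"), ("?I", "friend", "Phil"),
         ("?I", "attended", "?P"), ("Phil", "organized", "?P")]
cands = {"?I": {"Ed", "Ann"}, "?P": {"Party3"}}
print(answer_query(query, kb, cands))  # [{'?P': 'Party3', '?I': 'Ed'}]
```

Branching on the smallest candidate set first keeps the search tree narrow near the root, so inconsistent partial substitutions are discarded early.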

EVALUATION

Experiment: Environment
- We developed a prototype implementation in about 10,600 lines of Java code.
- We ran the code on a laptop with a dual-core 2.8 GHz CPU and 8 GB of RAM, running Windows 7.
- Indexes are on disk (no explicit buffer to load the index).
- The experiments test the scalability of the STUN index by varying:
  - The size of the graph
  - The complexity of queries
  - The number of constraints in queries
- Queries are randomly generated from STUN KBs; each query has at least one answer.
- More than [number missing] queries were tested.

Experiment: Dataset
- YouTube dataset
- Vertices: people and groups
  - 20% of groups have a randomly assigned region
- Edge relations
  - 'follow': person to person, with a time interval
  - 'membership': person to group, with a time interval
  - 'co-located': person to group, with a time interval and a region
- Time intervals are randomly assigned to 'follow' and 'membership' relationships.
- A 'co-located' edge is added between two members if:
  - they have 'membership' relationships with the same group,
  - their time intervals with that group overlap, and
  - that group has an assigned region.

Experiment: Result
Every single data point was obtained by running 200 queries.

Experiment: Result
The query processing time increases slightly super-linearly with the size of the database, though the slope of the graph increases with the complexity of the query.

Conclusion
- Introduce the Spatio-Temporal Uncertainty (Social) Network
- Define the STUN query language
- Develop a disk-based index structure
- Develop a query processing algorithm
- Run experiments to evaluate the STUN system

Questions