1 gStore: Answering SPARQL Queries Via Subgraph Matching. Lei Zou (1), Jinghui Mo (1), Lei Chen (2), M. Tamer Özsu (3), Dongyan Zhao (1). (1) Peking University, (2) Hong Kong University of Science and Technology, (3) University of Waterloo

2 Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions

3 Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions

4 Semantic Web: “Semantic Web Technologies” is a collection of standard technologies to realize a Web of Data.

5 RDF Data Model: URI, Literals

6 RDF Graph: Entity Vertex, Literal Vertex

7 SPARQL Queries: SPARQL Query: Select ?name Where { ?m hasName ?name. ?m BornOnDate “ ”. ?m DiedOnDate “ ”. } Query Graph

8 Subgraph Match vs. SPARQL Queries

9 Naïve Triple Store: SPARQL Query: Select ?name Where { ?m hasName ?name. ?m BornOnDate “ ”. ?m DiedOnDate “ ”. } SQL: Select T3.Object From T as T1, T as T2, T as T3 Where T1.Predicate = “BornOnDate” and T1.Object = “ ” and T2.Predicate = “DiedOnDate” and T2.Object = “ ” and T3.Predicate = “hasName” and T1.Subject = T2.Subject and T2.Subject = T3.Subject. Too many self-joins.
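
The self-join pattern on this slide can be tried out with a small in-memory table. This is a minimal sketch, not the system shown in the talk: the column names (Subject, Predicate, Object), the subject ids s1 and s2, the second person, and the literals "date-A", "date-B", "date-C" are all hypothetical stand-ins, because the transcript elides the actual date values. It selects the object of the hasName triple, since that is what ?name binds to.

```python
import sqlite3

# Hypothetical triples; the date literals are placeholders, not values from the slides.
triples = [
    ("s1", "hasName", "Abraham Lincoln"),
    ("s1", "BornOnDate", "date-A"),
    ("s1", "DiedOnDate", "date-B"),
    ("s2", "hasName", "Someone Else"),
    ("s2", "BornOnDate", "date-A"),
    ("s2", "DiedOnDate", "date-C"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Subject TEXT, Predicate TEXT, Object TEXT)")
conn.executemany("INSERT INTO T VALUES (?, ?, ?)", triples)

# One copy of T per triple pattern: three patterns mean two self-joins.
query = """
SELECT T3.Object
FROM T AS T1, T AS T2, T AS T3
WHERE T1.Predicate = 'BornOnDate' AND T1.Object = ?
  AND T2.Predicate = 'DiedOnDate' AND T2.Object = ?
  AND T3.Predicate = 'hasName'
  AND T1.Subject = T2.Subject AND T2.Subject = T3.Subject
"""
self_join_names = [row[0] for row in conn.execute(query, ("date-A", "date-B"))]
print(self_join_names)
```

Every additional triple pattern in the SPARQL query adds another alias of T and another join condition, which is the cost the slide points at.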

10 Existing Solutions: Three categories of solutions have been proposed to speed up query processing: 1. Property Table: Jena [K. Wilkinson et al., SWDB 03], … 2. Vertically Partitioned Solution: SW-store [D. J. Abadi et al., VLDB 07], … 3. Exhaustive Indexing: RDF-3X [T. Neumann et al., VLDB 08], Hexastore [C. Weiss et al., VLDB 08], …

11 Existing Solutions - Property Table: SPARQL Query: Select ?name Where { ?m hasName ?name. ?m BornOnDate “ ”. ?m DiedOnDate “ ”. } SQL: Select People.hasName from People where People.BornOnDate = “ ” and People.DiedOnDate = “ ”. Reducing the number of join steps.
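
For comparison, the same lookup against a property table needs no join at all. The sketch below assumes a People table with one column per property, as the SQL on the slide suggests; the Subject column, the row ids, the second person, and the literals "date-A", "date-B", "date-C" are hypothetical placeholders for values the transcript does not show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per entity, one column per property (assumed layout).
conn.execute(
    "CREATE TABLE People ("
    "Subject TEXT PRIMARY KEY, hasName TEXT, BornOnDate TEXT, DiedOnDate TEXT)"
)
conn.executemany(
    "INSERT INTO People VALUES (?, ?, ?, ?)",
    [
        ("s1", "Abraham Lincoln", "date-A", "date-B"),
        ("s2", "Someone Else", "date-A", "date-C"),
    ],
)

# All three triple patterns share the subject ?m, so a single row answers them.
rows = conn.execute(
    "SELECT hasName FROM People WHERE BornOnDate = ? AND DiedOnDate = ?",
    ("date-A", "date-B"),
).fetchall()
property_table_names = [r[0] for r in rows]
print(property_table_names)
```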

12 Existing Solutions - Vertically Partitioned Solution: Fast Merge Join
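
A merge join over vertically partitioned data can be sketched as follows, assuming one two-column (subject, object) table per predicate, each sorted by subject and holding at most one row per subject. The function name merge_join, the subject ids, and the "date-…" literals are hypothetical and chosen only for illustration.

```python
def merge_join(left, right):
    """Join two (subject, object) lists that are sorted by subject.

    Assumes each subject appears at most once per list.
    """
    i = j = 0
    out = []
    while i < len(left) and j < len(right):
        ls, rs = left[i][0], right[j][0]
        if ls == rs:
            out.append((ls, left[i][1], right[j][1]))
            i += 1
            j += 1
        elif ls < rs:
            i += 1  # left subject has no partner yet, advance left
        else:
            j += 1  # right subject has no partner yet, advance right
    return out


# One table per predicate, kept sorted by subject (placeholder values).
born = sorted([("s1", "date-A"), ("s3", "date-D"), ("s2", "date-A")])
died = sorted([("s2", "date-C"), ("s1", "date-B")])

joined = merge_join(born, died)
print(joined)
```

Because both inputs are already sorted by subject, the join is a single linear pass over the two lists.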

13 Existing Solutions - Exhaustive Indexing: Each SPARQL query statement can be translated into one “range query”. SPARQL Query: Select ?name Where { ?m hasName ?name. ?m BornOnDate “ ”. ?m DiedOnDate “ ”. } Range query & Merge Join
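
The idea that a statement with a fixed predicate and object becomes a range query can be illustrated with one sorted permutation of the triples. This is a rough sketch under stated assumptions: it keeps only a (predicate, object, subject) ordering in a Python list, whereas exhaustive-indexing systems maintain several orderings in compressed indexes; the data and the "date-…" literals are hypothetical, and a set intersection stands in for the merge join named on the slide.

```python
from bisect import bisect_left

# Hypothetical triples (subject, predicate, object) with placeholder dates.
triples = [
    ("s1", "hasName", "Abraham Lincoln"),
    ("s1", "BornOnDate", "date-A"),
    ("s1", "DiedOnDate", "date-B"),
    ("s2", "hasName", "Someone Else"),
    ("s2", "BornOnDate", "date-A"),
    ("s2", "DiedOnDate", "date-C"),
]

# One permutation of the triples, sorted by (predicate, object, subject).
pos_index = sorted((p, o, s) for s, p, o in triples)


def range_query(index, predicate, obj):
    """Return subjects of all triples with the given predicate and object."""
    start = bisect_left(index, (predicate, obj, ""))
    subjects = []
    for k in range(start, len(index)):
        p, o, s = index[k]
        if (p, o) != (predicate, obj):
            break  # left the contiguous range for this prefix
        subjects.append(s)
    return subjects


born = range_query(pos_index, "BornOnDate", "date-A")
died = range_query(pos_index, "DiedOnDate", "date-B")
# Set intersection used here in place of a merge join on the subject.
matches = sorted(set(born) & set(died))
print(born, died, matches)
```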

14 Some Limitations: 1. Difficult to handle “wildcard queries”. 2. Difficult to handle updates.

15 Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions

16 Intuition of gStore: Finding matches over a large graph is not a trivial task.

17 Preliminaries: Entity Vertex, Literal Vertex

18 Storage Schema in gStore: Encoding all neighbors of a vertex into a “bit-string”, called a signature.

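19 Encoding Technique (1): The literal in (hasName, “Abraham Lincoln”) is split into 3-grams “Abr”, “bra”, “rah”, “aha”, …, whose hash codes are combined by bitwise OR. The codes of the vertex's adjacent edges (hasName, “Abraham Lincoln”), (BornOnDate, “ ”), (DiedOnDate, “ ”), (DiedIn, “y:Washington_D.c”) are then combined by bitwise OR.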
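
The encoding on this slide can be sketched in a few lines. The bit widths (STRING_BITS, LABEL_BITS), the use of CRC32 as the hash, and the function names are assumptions made for illustration and are not taken from the slides; the placeholder dates are hypothetical as well.

```python
import zlib

STRING_BITS = 64  # assumed width of the part that encodes the neighbor string
LABEL_BITS = 16   # assumed width of the part that encodes the edge label


def ngrams(text, n=3):
    """Split a string into overlapping n-grams ("Abr", "bra", "rah", ...)."""
    if len(text) < n:
        return [text]
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def encode_string(text):
    """Hash every 3-gram to one bit position and OR the bits together."""
    sig = 0
    for gram in ngrams(text):
        sig |= 1 << (zlib.crc32(gram.encode("utf-8")) % STRING_BITS)
    return sig


def encode_edge(label, neighbor):
    """Edge code: label bit in the high part, neighbor string code in the low part."""
    label_bit = 1 << (zlib.crc32(label.encode("utf-8")) % LABEL_BITS)
    return (label_bit << STRING_BITS) | encode_string(neighbor)


def vertex_signature(edges):
    """Vertex signature: bitwise OR over the codes of all adjacent edges."""
    sig = 0
    for label, neighbor in edges:
        sig |= encode_edge(label, neighbor)
    return sig


edges = [
    ("hasName", "Abraham Lincoln"),
    ("BornOnDate", "date-A"),   # placeholder literal
    ("DiedOnDate", "date-B"),   # placeholder literal
    ("DiedIn", "y:Washington_D.c"),
]
vertex_sig = vertex_signature(edges)
print(ngrams("Abraham Lincoln")[:4], bin(vertex_sig))
```

Because a signature is an OR of edge codes, every adjacent edge's bits are contained in the vertex signature, which is what makes a cheap bitwise containment test possible at query time.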

20 Encoding Technique (2)

21 Encoding Technique (3): 1. Find matches over the signature graph G*. 2. Verify each match in the RDF graph G.
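
The two steps on this slide amount to a filter-and-verify scheme, which can be sketched for a single query vertex as follows. The hashing of whole (label, value) pairs, the 32-bit width, and the sample data are assumptions made to keep the example short; it does not model the signature graph's edges or multi-vertex queries.

```python
import zlib

BITS = 32  # assumed signature width


def signature(edges):
    """OR together one hashed bit per (label, value) pair."""
    sig = 0
    for label, value in edges:
        sig |= 1 << (zlib.crc32(f"{label}|{value}".encode("utf-8")) % BITS)
    return sig


# Hypothetical RDF graph G: vertex -> set of (label, value) edges.
graph = {
    "s1": {("hasName", "Abraham Lincoln"), ("BornOnDate", "date-A"), ("DiedOnDate", "date-B")},
    "s2": {("hasName", "Someone Else"), ("BornOnDate", "date-A"), ("DiedOnDate", "date-C")},
}
# Signature graph G*, reduced here to one signature per vertex.
signatures = {v: signature(edges) for v, edges in graph.items()}

query_edges = {("BornOnDate", "date-A"), ("DiedOnDate", "date-B")}
query_sig = signature(query_edges)

# Step 1 (filter): keep vertices whose signature contains all query bits.
# Hash collisions can let false positives through, never false negatives.
candidates = [v for v, s in signatures.items() if (s & query_sig) == query_sig]

# Step 2 (verify): check each candidate against the real edges in G.
verified = [v for v in candidates if query_edges <= graph[v]]
print(candidates, verified)
```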

22 Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS-tree & Query Algorithm; Experiments; Conclusions

23 A Straightforward Solution (1) [Figure labels: u1, u2, L1, L2]

24 A Straightforward Solution (2): Large join space! [Figure labels: L1, L2]

25 VS-tree

26 Pruning Technique: Reduced join space! [Figure labels: u1, u2]

27 An Example for Pruning Effect: Query: ?x1 y:hasGivenName ?x5. ?x1 y:hasFamilyName ?x6. ?x1 rdf:type […]. ?x1 y:bornIn ?x2. ?x1 y:hasAcademicAdvisor ?x4. ?x2 y:locatedIn […]. ?x3 y:locatedIn […]. ?x4 y:bornIn ?x3. Before Pruning / After Pruning: x1 810 X x3 66 x

28 Query Algorithm - Top-Down
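
A top-down search over a tree of signatures can be sketched as below. This is a deliberate simplification and not the VS-tree or the query algorithm from the talk: it only models a tree in which each internal node stores the bitwise OR of its children's signatures, and it ignores edges between nodes entirely. The names Node, build and search, the fanout, and the 4-bit signatures are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    sig: int
    children: List["Node"] = field(default_factory=list)
    vertex: Optional[str] = None  # set only on leaves


def build(leaves, fanout=2):
    """Build the tree bottom-up; a parent's signature is the OR of its children.

    Expects a non-empty list of (vertex, signature) pairs.
    """
    level = [Node(sig=s, vertex=v) for v, s in leaves]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), fanout):
            group = level[i:i + fanout]
            sig = 0
            for child in group:
                sig |= child.sig
            parents.append(Node(sig=sig, children=group))
        level = parents
    return level[0]


def search(node, query_sig):
    """Descend from the root, skipping subtrees that cannot contain a match."""
    if (node.sig & query_sig) != query_sig:
        return []  # no descendant can contain all query bits
    if node.vertex is not None:
        return [node.vertex]
    found = []
    for child in node.children:
        found.extend(search(child, query_sig))
    return found


root = build([("v1", 0b0011), ("v2", 0b0101), ("v3", 0b1001), ("v4", 0b0110)])
print(search(root, 0b0001), search(root, 0b1000), search(root, 0b0110))
```

In the query for 0b1000, the subtree holding v1 and v2 is discarded at its parent without looking at either leaf, which is the kind of pruning a top-down traversal is meant to achieve.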

29 Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions

30 Datasets: Yago: 20 million triples, 3.1 GB; DBLP: 8 million triples, 0.8 GB

31 Exact Queries

32 Wildcard Queries

33 Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions

34 Conclusions: a vertex encoding technique; an efficient index structure, the VS-tree; a novel filtering technique.

36 Updates - Insertion in G*

37 Updates - Insertion in VS*-tree

38 Updates - Deletion in VS*-tree [Figure label: To be deleted]

39 Framework in gStore: 1. Find candidate matches over G*. 2. Verify each candidate match.

40 A Straightforward Solution (1): u & 001 = u

