gStore: Answering SPARQL Queries Via Subgraph Matching


1 gStore: Answering SPARQL Queries Via Subgraph Matching. Lei Zou (1), Jinghui Mo (1), Lei Chen (2), M. Tamer Özsu (3), Dongyan Zhao (1). (1) Peking University, (2) Hong Kong University of Science and Technology, (3) University of Waterloo

2 Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions

3 Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions

4 Semantic Web: “Semantic Web Technologies” is a collection of standard technologies for realizing a Web of Data.

5 RDF Data Model: URIs, literals

6 RDF Graph: entity vertices, literal vertices

7 SPARQL Queries. SPARQL Query: Select ?name Where { ?m <hasName> ?name. ?m <BornOnDate> “1809-02-12”. ?m <DiedOnDate> “1865-04-15”. } [figure: query graph]

8 Subgraph Match vs. SPARQL Queries

9 Naïve Triple Store. SPARQL Query: Select ?name Where { ?m <hasName> ?name. ?m <BornOnDate> “1809-02-12”. ?m <DiedOnDate> “1865-04-15”. } SQL: Select T3.Object From T as T1, T as T2, T as T3 Where T1.Predicate=“BornOnDate” and T1.Object=“1809-02-12” and T2.Predicate=“DiedOnDate” and T2.Object=“1865-04-15” and T3.Predicate=“hasName” and T1.Subject = T2.Subject and T2.Subject = T3.Subject. Too many self-joins.
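The self-join pattern on this slide can be tried out with a small in-memory table. The sketch below is only an illustration: it uses Python's built-in sqlite3 module, the column names Subject, Predicate and Object are assumed, and the rows (including the y:Charles_Darwin entries) are sample data that is not part of the slides. The person's name is read from the Object column of the hasName row.

```python
import sqlite3

# One three-column table T holds every triple (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Subject TEXT, Predicate TEXT, Object TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [
        # Illustrative sample rows, not taken from the slides.
        ("y:Abraham_Lincoln", "hasName", "Abraham Lincoln"),
        ("y:Abraham_Lincoln", "BornOnDate", "1809-02-12"),
        ("y:Abraham_Lincoln", "DiedOnDate", "1865-04-15"),
        ("y:Charles_Darwin", "hasName", "Charles Darwin"),
        ("y:Charles_Darwin", "BornOnDate", "1809-02-12"),
        ("y:Charles_Darwin", "DiedOnDate", "1882-04-19"),
    ],
)

# Each triple pattern of the SPARQL query needs its own alias of T,
# so three patterns cost two self-joins.
NAIVE_SQL = """
SELECT T3.Object
FROM T AS T1, T AS T2, T AS T3
WHERE T1.Predicate = 'BornOnDate' AND T1.Object = '1809-02-12'
  AND T2.Predicate = 'DiedOnDate' AND T2.Object = '1865-04-15'
  AND T3.Predicate = 'hasName'
  AND T1.Subject = T2.Subject
  AND T2.Subject = T3.Subject
"""

names = [row[0] for row in conn.execute(NAIVE_SQL)]
print(names)
```

Every additional triple pattern in the query adds one more alias of T and one more join condition, which is the cost the slide points at.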

10 Existing Solutions: three categories of solutions have been proposed to speed up query processing: 1. Property Table: Jena [K. Wilkinson et al. SWDB 03], … 2. Vertically Partitioned Solution: SW-Store [D. J. Abadi et al. VLDB 07], … 3. Exhaustive Indexing: RDF-3X [T. Neumann et al. VLDB 08], Hexastore [C. Weiss et al. VLDB 08], …

11 Existing Solutions: Property Table. SPARQL Query: Select ?name Where { ?m <hasName> ?name. ?m <BornOnDate> “1809-02-12”. ?m <DiedOnDate> “1865-04-15”. } SQL: Select People.hasName from People where People.BornOnDate = “1809-02-12” and People.DiedOnDate = “1865-04-15”. Reduces the number of join steps.
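For comparison with the triple-table layout, here is a minimal sketch of the property-table idea, again using Python's sqlite3 module. The People table and its column names follow the SQL on the slide; the Subject key column and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per entity, one column per predicate (Subject column is assumed).
conn.execute(
    "CREATE TABLE People ("
    "Subject TEXT PRIMARY KEY, hasName TEXT, BornOnDate TEXT, DiedOnDate TEXT)"
)
conn.executemany(
    "INSERT INTO People VALUES (?, ?, ?, ?)",
    [
        # Illustrative sample rows, not taken from the slides.
        ("y:Abraham_Lincoln", "Abraham Lincoln", "1809-02-12", "1865-04-15"),
        ("y:Charles_Darwin", "Charles Darwin", "1809-02-12", "1882-04-19"),
    ],
)

# The three triple patterns collapse into one single-table selection.
rows = conn.execute(
    "SELECT hasName FROM People WHERE BornOnDate = ? AND DiedOnDate = ?",
    ("1809-02-12", "1865-04-15"),
).fetchall()
property_table_names = [row[0] for row in rows]
print(property_table_names)
```

No join is needed here because all three predicates share a subject and live in the same row.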

12 Existing Solutions: Vertically Partitioned Solution. Fast merge join.
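The slide gives no detail beyond the phrase "fast merge join", so the following is a sketch under stated assumptions rather than the layout of any particular system: each predicate gets its own two-column (subject, object) table kept sorted by subject, and each subject appears at most once per table. The table contents and the helper name merge_join are hypothetical.

```python
def merge_join(left, right):
    """Return the subjects present in both subject-sorted lists."""
    i = j = 0
    common = []
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            i += 1
        elif left[i] > right[j]:
            j += 1
        else:
            common.append(left[i])
            i += 1
            j += 1
    return common


# One (subject, object) table per predicate, sorted by subject (sample data).
tables = {
    "hasName": [
        ("y:Abraham_Lincoln", "Abraham Lincoln"),
        ("y:Charles_Darwin", "Charles Darwin"),
    ],
    "BornOnDate": [
        ("y:Abraham_Lincoln", "1809-02-12"),
        ("y:Charles_Darwin", "1809-02-12"),
    ],
    "DiedOnDate": [
        ("y:Abraham_Lincoln", "1865-04-15"),
        ("y:Charles_Darwin", "1882-04-19"),
    ],
}

# Filtering a sorted table keeps it sorted, so the joins stay linear scans.
born = [s for s, o in tables["BornOnDate"] if o == "1809-02-12"]
died = [s for s, o in tables["DiedOnDate"] if o == "1865-04-15"]
matches = merge_join(born, died)
with_name = merge_join(matches, [s for s, _ in tables["hasName"]])

name_of = dict(tables["hasName"])
vp_names = [name_of[s] for s in with_name]
print(vp_names)
```

Because both inputs are already ordered by subject, each join advances two cursors once through the lists and needs no sorting or hashing step.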

13 Existing Solutions: Exhaustive Indexing. Each SPARQL query statement can be translated into one “range query”. SPARQL Query: Select ?name Where { ?m <hasName> ?name. ?m <BornOnDate> “1809-02-12”. ?m <DiedOnDate> “1865-04-15”. } Range query & merge join.
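A rough way to picture "one range query per statement" is a sorted list of permuted triples searched by prefix. The sketch below assumes just two permutations (predicate-object-subject and subject-predicate-object) held as sorted Python lists; real systems of this kind keep more permutations in compressed on-disk indexes. The names range_scan and merge_intersect and the sample triples are hypothetical.

```python
from bisect import bisect_left

# Illustrative sample triples (subject, predicate, object).
triples = [
    ("y:Abraham_Lincoln", "hasName", "Abraham Lincoln"),
    ("y:Abraham_Lincoln", "BornOnDate", "1809-02-12"),
    ("y:Abraham_Lincoln", "DiedOnDate", "1865-04-15"),
    ("y:Charles_Darwin", "hasName", "Charles Darwin"),
    ("y:Charles_Darwin", "BornOnDate", "1809-02-12"),
    ("y:Charles_Darwin", "DiedOnDate", "1882-04-19"),
]

pos_index = sorted((p, o, s) for s, p, o in triples)  # predicate, object, subject
spo_index = sorted(triples)                           # subject, predicate, object


def range_scan(index, prefix):
    """Return the contiguous run of entries that start with `prefix`."""
    start = bisect_left(index, prefix)
    end = start
    while end < len(index) and index[end][: len(prefix)] == prefix:
        end += 1
    return index[start:end]


def merge_intersect(a, b):
    """Intersect two sorted lists with a single merge pass."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1
        elif a[i] > b[j]:
            j += 1
        else:
            out.append(a[i])
            i += 1
            j += 1
    return out


# Each triple pattern with a bound predicate and object is one prefix range.
born = [s for _, _, s in range_scan(pos_index, ("BornOnDate", "1809-02-12"))]
died = [s for _, _, s in range_scan(pos_index, ("DiedOnDate", "1865-04-15"))]
people = merge_intersect(born, died)  # both runs come back sorted by subject

range_names = [
    o for s in people for _, _, o in range_scan(spo_index, (s, "hasName"))
]
print(range_names)
```

Within one (predicate, object) prefix the entries are ordered by subject, which is why the two scan results can be merge-joined directly.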

14 Some Limitations: 1. Difficult to handle “wildcard queries”. 2. Difficult to handle updates.

15 Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions

16 Intuition of gStore: finding matches over a large graph is not a trivial task.

17 Preliminaries: entity vertices, literal vertices

18 Storage Schema in gStore: encode all neighbors of a vertex into a “bit-string”, called its signature.

19 Encoding Technique (1): the n-grams of a literal (“Abr”, “bra”, “rah”, “aha”, …) are each hashed to a bit string, and these are ORed into the literal's bit string (1000 0010 0100 0001 for “Abraham Lincoln”). Each adjacent (predicate, literal) pair is then written as predicate bits | literal bits:
(hasName, “Abraham Lincoln”): 0010 0000 0000 | 1000 0010 0100 0001
(BornOnDate, “1809-02-12”): 0100 0000 0000 | 0100 0010 0100 1000
(DiedOnDate, “1865-04-15”): 0000 1000 0000 | 0000 0010 0100 0000
(DiedIn, “y:Washington_D.c”): 0000 0010 0000 | 1000 0010 0100 0001
OR: 0000 0010 0000 | 1100 0010 0100 1001
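The encoding idea on this slide can be sketched in a few lines. This is an illustration under several assumptions, not the hash scheme used by gStore: the 12-bit and 16-bit widths are inferred from the bit strings on the slide, each predicate or 3-gram sets exactly one bit, and SHA-256 stands in for whatever hash functions the system actually uses. All function names are hypothetical.

```python
import hashlib

PRED_BITS = 12  # assumed width of the predicate part
LIT_BITS = 16   # assumed width of the literal part
N = 3           # n-gram length, matching "Abr", "bra", "rah", ...


def _bit(text, width):
    """Map a string to a single set bit inside a `width`-bit string."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return 1 << (int.from_bytes(digest[:4], "big") % width)


def literal_signature(literal):
    """OR together one bit per n-gram of the literal."""
    grams = [literal[i:i + N] for i in range(len(literal) - N + 1)] or [literal]
    sig = 0
    for gram in grams:
        sig |= _bit(gram, LIT_BITS)
    return sig


def edge_signature(predicate, literal):
    """Predicate bits in the high part, literal bits in the low part."""
    return (_bit(predicate, PRED_BITS) << LIT_BITS) | literal_signature(literal)


def vertex_signature(edges):
    """OR the signatures of all (predicate, literal) pairs of one vertex."""
    sig = 0
    for predicate, literal in edges:
        sig |= edge_signature(predicate, literal)
    return sig


lincoln_edges = [
    ("hasName", "Abraham Lincoln"),
    ("BornOnDate", "1809-02-12"),
    ("DiedOnDate", "1865-04-15"),
    ("DiedIn", "y:Washington_D.c"),
]
lincoln_sig = vertex_signature(lincoln_edges)
print(format(lincoln_sig, "028b"))
```

One consequence of hashing n-grams rather than whole literals is that the signature of a substring such as "Abraham" only sets bits that the signature of "Abraham Lincoln" also sets, which is what lets a signature test act as a filter for wildcard-style literal constraints.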

20 Encoding Technique (2)

21 Encoding Technique (3): find matches over the signature graph G*, then verify each match in the RDF graph G.
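The two phases named on this slide, filtering on signatures and then verifying against the real data, can be sketched for the simplest case of a single query vertex with literal-valued edges. Everything here is an assumption made for illustration: the 16-bit width, the use of zlib.crc32 as the hash, the sample data, and the function names. Matching a full query graph over G* involves edges between vertices as well, which this sketch leaves out.

```python
import zlib

BITS = 16  # deliberately small so that hash collisions are possible


def signature(edges):
    """OR one bit per (predicate, literal) pair into a bit string."""
    sig = 0
    for predicate, literal in edges:
        sig |= 1 << (zlib.crc32(f"{predicate}|{literal}".encode("utf-8")) % BITS)
    return sig


def filter_candidates(query_edges, data):
    """Phase 1: keep vertices whose signature covers the query signature."""
    q = signature(query_edges)
    return [v for v, edges in data.items() if signature(edges) & q == q]


def verify(query_edges, data, candidates):
    """Phase 2: check the real edges, discarding false positives."""
    wanted = set(query_edges)
    return [v for v in candidates if wanted <= set(data[v])]


# Illustrative sample data: vertex -> list of (predicate, literal) pairs.
data = {
    "y:Abraham_Lincoln": [
        ("hasName", "Abraham Lincoln"),
        ("BornOnDate", "1809-02-12"),
        ("DiedOnDate", "1865-04-15"),
    ],
    "y:Charles_Darwin": [
        ("hasName", "Charles Darwin"),
        ("BornOnDate", "1809-02-12"),
        ("DiedOnDate", "1882-04-19"),
    ],
}
query = [("BornOnDate", "1809-02-12"), ("DiedOnDate", "1865-04-15")]

candidates = filter_candidates(query, data)
answers = verify(query, data, candidates)
print(candidates, answers)
```

The filter can let through vertices that do not really match, because different pairs may hash to the same bit, but it never drops a true match: any vertex that has all the query edges necessarily has all the query bits set. That is why a verification pass over the surviving candidates is enough.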

22 Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions

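23 A Straightforward Solution (1): [figure: query vertices u1 and u2 with candidate lists L1 and L2 over data vertices 001, 004, 006 and 002, 003, 006]

24 A Straightforward Solution (2): [figure: candidate lists L1 and L2 over data vertices 001, 004, 006 and 002, 003, 006] Large join space!

25 VS-tree

26 Pruning Technique: [figure: query vertices u1 and u2, bit string 10010, data vertices 001, 004, 006 and 002, 003, 006] Reduced join space!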

27 An Example for Pruning Effect. Query: ?x1 y:hasGivenName ?x5; ?x1 y:hasFamilyName ?x6; ?x1 rdf:type <…>; ?x1 y:bornIn ?x2; ?x1 y:hasAcademicAdvisor ?x4; ?x2 y:locatedIn <…>; ?x3 y:locatedIn <…>; ?x4 y:bornIn ?x3. Before Pruning / After Pruning: x1 810 x2 424197 x3 66 x4 361876686

28 Query Algorithm: Top-Down

29 Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions
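The transcript keeps only the title of this slide, so the following is a generic sketch of top-down search in a signature tree, not the VS*-tree algorithm itself. It assumes that an inner node stores the OR of its children's signatures, so a whole subtree can be skipped when the node's signature does not cover the query signature. Leaves are grouped simply in input order, the 4-bit signatures are made up, and the links between nodes at the same level that a VS-tree adds are not modelled. The vertex labels reuse the 001 to 006 labels from the earlier figure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    signature: int
    children: List["Node"] = field(default_factory=list)
    vertex_id: Optional[str] = None  # set on leaves only


def build(leaves, fanout=2):
    """Build a tree bottom-up from a non-empty list of (vertex_id, signature)."""
    level = [Node(sig, vertex_id=vid) for vid, sig in leaves]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), fanout):
            group = level[i:i + fanout]
            combined = 0
            for child in group:
                combined |= child.signature  # parent covers all its children
            parents.append(Node(combined, children=group))
        level = parents
    return level[0]


def search(node, query_sig, visited=None):
    """Top-down search that prunes subtrees whose signature misses query bits."""
    if visited is not None:
        visited.append(node)
    if node.signature & query_sig != query_sig:
        return []  # nothing below this node can match
    if not node.children:
        return [node.vertex_id]
    found = []
    for child in node.children:
        found.extend(search(child, query_sig, visited))
    return found


# Made-up 4-bit signatures for six data vertices.
leaves = [
    ("001", 0b1001), ("004", 0b1101),
    ("006", 0b1011), ("003", 0b0011),
    ("002", 0b0110), ("005", 0b0100),
]
root = build(leaves)

visited = []
matches = search(root, 0b1001, visited)
visited_leaves = [n.vertex_id for n in visited if n.vertex_id is not None]
print(matches, visited_leaves)
```

In this run the subtree holding 002 and 005 is rejected at an inner node, so neither leaf is ever inspected. How much gets pruned depends heavily on which vertices are grouped together; grouping in input order, as done here, is the simplest choice and not a tuned one.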

30 Datasets: Yago: 20 million triples, 3.1 GB; DBLP: 8 million triples, 0.8 GB

31 Exact Queries

32 Wildcard Queries

33 Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions

34 Conclusions: a vertex encoding technique; an efficient index structure, the VS-tree; a novel filtering technique.

35 Contact: zoulei@pku.edu.cn

36 Updates: Insertion in G*

37 Updates: Insertion in VS*-tree

38 Updates: Deletion in VS*-tree [figure: node marked “To be deleted”]

39 Framework in gStore: find candidate matches over G*, then verify each candidate match.

40 A Straightforward Solution (1): u & 001 = u

