1 SPARQLing Constraints for RDF Michael Schmidt EDBT, 2008 March 28 joint work with Prof. Georg Lausen, Michael Meier.


1 SPARQLing Constraints for RDF
  Michael Schmidt
  EDBT 2008, March 28
  Joint work with Prof. Georg Lausen and Michael Meier

2 SPARQLing Constraints for RDF
  RDF Data Format
    Machine-readable information
    Established in the Semantic Web
  SPARQL Query Language
    Declarative language
    W3C Recommendation since Jan.
  Constraints
    Primary and foreign keys
    Cardinality constraints, …
  [Diagram arrow label: based on]
  Extension of RDF by constraints
    With fixed semantics
    Integration into the framework
  The role of SPARQL in this context
    Extracting constraints
    Checking constraints
    Optimization of SPARQL queries under constraints

3 Why Constraints?
  Restricting the state space of the database
  Maintenance of data consistency (e.g. when data is updated)
  Semantic Query Optimization
  Better understanding of the data
  Here: translation of relational schemata to RDF without loss of information

4 The RDF Data Format
  [Figure: RDF graph with nodes t1 and t2 of rdf:type Teachers, literals "Joe", "Fred", "CS", "43", and edges name, faculty, age, knows]
  "Triples of Knowledge":
    (t1, name, "Joe"), (t1, faculty, "CS"), (t1, knows, t2)

5 The RDF Data Format
  [Figure: the same RDF graph as on the previous slide]
  Three elementary types
    URIs (describe physical/logical entities and properties)
    Literals (string values)
    Blank nodes (not considered)

6 A Relational Data Schema
  Teachers                 Students
    name   faculty           matric  name
    Joe    CS                11111   John
    Fred   CS                22222   Ed

  Courses                  Participants
    taught_by  name          c_id   s_id
    Joe        DB            Fred   11111
    Fred       Web           Fred   22222

  + NOT NULL constraints on each column

7 A Translation into RDF
  [Figure: RDF graph with the classes Teachers (t1, t2), Students (s1, s2), Courses (c1, c2) and Participants (p1, p2), connected by rdf:type; literals Joe, Fred, "CS", 11111, 22222, "John", "Ed", "DB", "Web"; properties name, faculty, matric, taught_by, s_id, c_id]
  Problem: constraints are only implicitly given!
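
A minimal sketch of such a translation, assuming one RDF node per table row and one property per column. The function name `rows_to_triples` and the node-naming scheme (prefix plus row number) are hypothetical choices made for this illustration; the data is taken from the relational tables shown earlier.

```python
def rows_to_triples(class_name, prefix, columns, rows):
    """Create one node per row: an rdf:type triple plus one triple per column."""
    triples = set()
    for i, row in enumerate(rows, start=1):
        node = f"{prefix}{i}"  # e.g. t1, t2 for Teachers
        triples.add((node, "rdf:type", class_name))
        for column, value in zip(columns, row):
            triples.add((node, column, value))
    return triples

graph = set()
graph |= rows_to_triples("Teachers", "t", ["name", "faculty"],
                         [("Joe", "CS"), ("Fred", "CS")])
graph |= rows_to_triples("Students", "s", ["matric", "name"],
                         [("11111", "John"), ("22222", "Ed")])
graph |= rows_to_triples("Courses", "c", ["taught_by", "name"],
                         [("Joe", "DB"), ("Fred", "Web")])
graph |= rows_to_triples("Participants", "p", ["c_id", "s_id"],
                         [("Fred", "11111"), ("Fred", "22222")])
```

Nothing in the resulting triples records that `name` is a key of Teachers or that `taught_by` references it, which is exactly the problem the slide points out.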

8 Constraints for RDF
  Encoding in the schema layer
  New namespace "rdfc" provides a constraint vocabulary with fixed semantics
    rdfc:Key for primary keys
    rdfc:FKey for foreign keys
    rdfc:ref links foreign keys to primary keys
  Use the built-in RDF container class rdf:Seq

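9 Encoding Constraints
  [Figure: RDF graph encoding the constraints. Teachers (t1, t2; name, faculty) is linked via rdfc:Key to the key node T_Key (rdfc:Key, rdf:Seq), whose rdf:_1 points to name. Courses (c1, c2; name, taught_by) is linked via rdfc:FKey to the foreign key node C_FKey (rdfc:FKey, rdf:Seq), whose rdf:_1 points to taught_by. rdfc:ref connects the foreign key to the key.]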

10 Types of Constraints
  Let C, C1, C2 be classes and Qi, Ri properties
  Primary keys, foreign keys
    Key(C,[Q1,…,Qn]), FKey(C1,[Q1,…,Qn],C2,[R1,…,Rn])
  Cardinality constraints
    Min(C,n,R), Max(C,n,R) for n ∈ ℕ
  Functionality constraints, totality constraints
    Func(C,Q), Total(C,Q)
  … and many more in the full paper: singleton, subclass, subproperty, property domain, property range
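
The slide names the constraint types without spelling out their semantics. The sketch below therefore rests on an assumed reading: Min(C,n,R) and Max(C,n,R) bound the number of distinct R-values of every instance of C, Total(C,Q) means "at least one value" and Func(C,Q) means "at most one value". All function names are hypothetical.

```python
def instances(graph, cls):
    """All subjects declared to be of class cls via rdf:type."""
    return {s for (s, p, o) in graph if p == "rdf:type" and o == cls}

def count_values(graph, subject, prop):
    """Number of distinct values of prop for the given subject."""
    return len({o for (s, p, o) in graph if s == subject and p == prop})

def holds_min(graph, cls, n, prop):
    # Assumed semantics: every instance of cls has at least n values for prop.
    return all(count_values(graph, x, prop) >= n for x in instances(graph, cls))

def holds_max(graph, cls, n, prop):
    # Assumed semantics: every instance of cls has at most n values for prop.
    return all(count_values(graph, x, prop) <= n for x in instances(graph, cls))

def holds_total(graph, cls, prop):
    return holds_min(graph, cls, 1, prop)

def holds_func(graph, cls, prop):
    return holds_max(graph, cls, 1, prop)

graph = {
    ("t1", "rdf:type", "Teachers"), ("t1", "name", "Joe"), ("t1", "faculty", "CS"),
    ("t2", "rdf:type", "Teachers"), ("t2", "name", "Fred"), ("t2", "faculty", "CS"),
    ("t2", "title", "Professor"),
}
```

On this sample graph, `name` is total and functional on Teachers, while `title` is functional but not total because t1 has no title.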

11 Satisfiability
  Given an RDF vocabulary and a set of constraints: is there a non-empty RDF graph that satisfies the constraints?
  In general undecidable
    Shown by reduction from the key implication problem in relational databases
  In the paper, we indicate
    satisfiable constraint subclasses
    decidable constraint subclasses

12 The SPARQL Query Language
  SELECT ?name ?faculty ?title
  WHERE {
    ?teacher rdf:type Teachers.
    ?teacher name ?name.
    ?teacher faculty ?faculty.
    OPTIONAL { ?teacher title ?title. }
  }

  Declarative language
  Based on graph patterns that are matched against the input graph
  Different operators to combine these patterns
    AND (".")
    OPTIONAL
    UNION
    FILTER

13 SPARQL Query Evaluation
  SELECT ?name ?faculty ?title
  WHERE {
    ?teacher rdf:type Teachers.
    ?teacher name ?name.
    ?teacher faculty ?faculty.
    OPTIONAL { ?teacher title ?title. }
  }

  [Figure: input graph with t1 and t2 of rdf:type Teachers; names Joe and Fred, faculty "CS"; one teacher carries title "Professor"]

  Result:
    ?name   ?faculty   ?title
    Joe     "CS"       (unbound)
    Fred    "CS"       "Professor"

  Variables are matched against the input graph
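
How OPTIONAL keeps rows whose optional part does not match can be sketched as follows. This is a hand-written evaluation of this one query over a set of tuples, not a general SPARQL engine; the helper names are hypothetical and `None` stands in for an unbound variable.

```python
graph = {
    ("t1", "rdf:type", "Teachers"), ("t1", "name", "Joe"), ("t1", "faculty", "CS"),
    ("t2", "rdf:type", "Teachers"), ("t2", "name", "Fred"), ("t2", "faculty", "CS"),
    ("t2", "title", "Professor"),
}

def values(graph, subject, predicate):
    """Sorted list of objects for (subject, predicate, ?o)."""
    return sorted(o for (s, p, o) in graph if s == subject and p == predicate)

def teachers_with_optional_title(graph):
    rows = []
    teachers = sorted(s for (s, p, o) in graph
                      if p == "rdf:type" and o == "Teachers")
    for teacher in teachers:
        # Mandatory part: name and faculty must both match.
        for name in values(graph, teacher, "name"):
            for faculty in values(graph, teacher, "faculty"):
                # Optional part: keep the row even if no title exists.
                titles = values(graph, teacher, "title") or [None]
                for title in titles:
                    rows.append((name, faculty, title))
    return rows

result = teachers_with_optional_title(graph)
```

The `or [None]` fallback is what distinguishes OPTIONAL from AND: with a plain AND the row for Joe would be dropped.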

14 Extracting Key Constraints
  SELECT ?keyname ?class ?keyatt
  WHERE {
    ?class rdfc:Key ?keyname.
    ?keyname rdf:type rdfc:Key.
    ?keyname ?seq ?keyatt.
    FILTER (?seq != rdf:type)
  }

  [Figure: Teachers linked via rdfc:Key to T_Key (rdfc:Key, rdf:Seq), with rdf:_1 pointing to name]

  Result:
    ?keyname   ?class     ?keyatt
    T_Key      Teachers   name
    …          …          …

  Extraction of foreign keys is very similar
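
The extraction query above can be mirrored in plain Python over a tuple-based graph. The sample triples below are an assumed rendering of the key encoding in the figure (including the `rdf:type rdf:Seq` triple), and `extract_keys` is a hypothetical helper name.

```python
graph = {
    ("Teachers", "rdfc:Key", "T_Key"),
    ("T_Key", "rdf:type", "rdfc:Key"),
    ("T_Key", "rdf:type", "rdf:Seq"),
    ("T_Key", "rdf:_1", "name"),
}

def extract_keys(graph):
    """Return (keyname, class, key attribute) rows, like the SELECT query."""
    rows = []
    for (cls, predicate, keyname) in graph:
        # ?class rdfc:Key ?keyname
        if predicate != "rdfc:Key":
            continue
        # ?keyname rdf:type rdfc:Key
        if (keyname, "rdf:type", "rdfc:Key") not in graph:
            continue
        # ?keyname ?seq ?keyatt with FILTER (?seq != rdf:type)
        for (s, seq, keyatt) in graph:
            if s == keyname and seq != "rdf:type":
                rows.append((keyname, cls, keyatt))
    return sorted(rows)

keys = extract_keys(graph)
```

The filter on `rdf:type` removes the typing triples of the key node, so only the sequence members (`rdf:_1`, `rdf:_2`, …) remain as key attributes.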

15 Checking Constraints with SPARQL
  Constraint checks are possible for many types of constraints
  A SPARQL query checks a constraint C if it returns "yes" for each graph that violates C, and "no" otherwise
  Use the SPARQL "ASK" query form (returns "yes" exactly if the query has a result, "no" otherwise)

16 Checking Constraints with SPARQL
  Checking primary key constraints: Key(C,[p1,...,pn])

  ASK {
    ?x rdf:type C.
    ?y rdf:type C.
    ?x p1 ?p1; [...]; pn ?pn.
    ?y p1 ?p1; [...]; pn ?pn.
    FILTER (?x != ?y)
  }

  Returns "yes" exactly if the constraint is violated.
  Checking of foreign keys is a little more complicated, but also possible
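
The same check can be sketched outside SPARQL. The function below follows the ASK pattern: it looks for two distinct instances of the class that share a value for every key property. `key_violated` is a hypothetical name, and the graph is again an assumed set of string tuples.

```python
from itertools import combinations

def key_violated(graph, cls, props):
    """True exactly if two distinct instances of cls agree on all props."""
    members = sorted(s for (s, p, o) in graph if p == "rdf:type" and o == cls)

    def values(subject, prop):
        return {o for (s, p, o) in graph if s == subject and p == prop}

    for x, y in combinations(members, 2):
        # ?x pi ?pi and ?y pi ?pi: a shared value must exist for every pi.
        if all(values(x, q) & values(y, q) for q in props):
            return True
    return False

graph = {
    ("t1", "rdf:type", "Teachers"), ("t1", "name", "Joe"), ("t1", "faculty", "CS"),
    ("t2", "rdf:type", "Teachers"), ("t2", "name", "Fred"), ("t2", "faculty", "CS"),
}

name_is_key = not key_violated(graph, "Teachers", ["name"])
faculty_is_key = not key_violated(graph, "Teachers", ["faculty"])
```

On the sample data `name` passes as a key, while `faculty` does not because both teachers share "CS".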

17 Semantic Query Optimization
  Idea: use constraint knowledge to find a more efficient query execution plan
  Has been studied in the context of relational and datalog databases …
  … and is now applicable in the context of RDF and SPARQL

18 Semantic Query Optimization
  SELECT ?teachername ?coursename ?studentname
  WHERE {
    ?course rdf:type Courses;
            taught_by ?teachername;
            name ?coursename.
    ?participant rdf:type Participants;
                 c_id ?teachername;
                 s_id ?studentmatric.
    ?teacher rdf:type Teachers;
             name ?teachername.
    OPTIONAL {
      ?student rdf:type Students;
               matric ?studentmatric;
               name ?studentname.
    }
  }

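19 A Solution Candidate Subgraph
  [Figure: the translated RDF graph with Teachers (t1, t2), Students (s1, s2), Courses (c1, c2) and Participants (p1, p2); literals Joe, Fred, "CS", 11111, 22222, "John", "Ed", "DB", "Web"; properties name, matric, faculty, taught_by, s_id, c_id; one candidate solution subgraph is highlighted]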

20 Semantic Query Optimization
  SELECT ?teachername ?coursename ?studentname
  WHERE {
    ?course rdf:type Courses;
            taught_by ?teachername;
            name ?coursename.
    ?participant rdf:type Participants;
                 c_id ?teachername;
                 s_id ?studentmatric.
    ?teacher rdf:type Teachers;
             name ?teachername.
    OPTIONAL {
      ?student rdf:type Students;
               matric ?studentmatric;
               name ?studentname.
    }
  }

  Constraints used:
    Key(Students,[matric])
    FKey(Participants,[s_id],Students,[matric])
    Total(Students,[name])
  With these constraints every matched participant references a student that has a name, so the OPTIONAL block always matches and can be turned into a mandatory pattern.

21 Semantic Query Optimization
  SELECT ?teachername ?coursename ?studentname
  WHERE {
    ?course rdf:type Courses;
            taught_by ?teachername;
            name ?coursename.
    ?participant rdf:type Participants;
                 c_id ?teachername;
                 s_id ?studentmatric.
    ?teacher rdf:type Teachers;
             name ?teachername.
    ?student rdf:type Students;
             matric ?studentmatric;
             name ?studentname.
  }

  Constraints used:
    Key(Teachers,[name])
    FKey(Courses,[taught_by],Teachers,[name])
  With these constraints every course references exactly one teacher by name, so the Teachers pattern adds no information and can be dropped.

22 Semantic Query Optimization
  SELECT ?teachername ?coursename ?studentname
  WHERE {
    ?course rdf:type Courses;
            taught_by ?teachername;
            name ?coursename.
    ?participant rdf:type Participants;
                 c_id ?teachername;
                 s_id ?studentmatric.
    ?student rdf:type Students;
             matric ?studentmatric;
             name ?studentname.
  }

  Many more optimizations possible
    Rewriting of filter expressions
    Elimination of redundant rdf:type specifications
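
That dropping the Teachers pattern does not change the answer on the sample data can be checked with a tiny pattern matcher. This is a sketch under assumptions: the graph is a set of string tuples built from the relational example, the matcher `match` is a hypothetical naive backtracking join, and results are compared as sets rather than as SPARQL bags.

```python
def match(graph, patterns, binding=None):
    """Yield all variable bindings that satisfy every triple pattern."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        new = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):          # variable: bind or compare
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:               # constant: must be equal
                ok = False
                break
        if ok:
            yield from match(graph, rest, new)

graph = {
    ("t1", "rdf:type", "Teachers"), ("t1", "name", "Joe"), ("t1", "faculty", "CS"),
    ("t2", "rdf:type", "Teachers"), ("t2", "name", "Fred"), ("t2", "faculty", "CS"),
    ("s1", "rdf:type", "Students"), ("s1", "matric", "11111"), ("s1", "name", "John"),
    ("s2", "rdf:type", "Students"), ("s2", "matric", "22222"), ("s2", "name", "Ed"),
    ("c1", "rdf:type", "Courses"), ("c1", "taught_by", "Joe"), ("c1", "name", "DB"),
    ("c2", "rdf:type", "Courses"), ("c2", "taught_by", "Fred"), ("c2", "name", "Web"),
    ("p1", "rdf:type", "Participants"), ("p1", "c_id", "Fred"), ("p1", "s_id", "11111"),
    ("p2", "rdf:type", "Participants"), ("p2", "c_id", "Fred"), ("p2", "s_id", "22222"),
}

optimized = [
    ("?course", "rdf:type", "Courses"),
    ("?course", "taught_by", "?teachername"),
    ("?course", "name", "?coursename"),
    ("?participant", "rdf:type", "Participants"),
    ("?participant", "c_id", "?teachername"),
    ("?participant", "s_id", "?studentmatric"),
    ("?student", "rdf:type", "Students"),
    ("?student", "matric", "?studentmatric"),
    ("?student", "name", "?studentname"),
]
# The unoptimized query additionally joins with the Teachers class.
original = optimized + [
    ("?teacher", "rdf:type", "Teachers"),
    ("?teacher", "name", "?teachername"),
]

def answers(patterns):
    return {(b["?teachername"], b["?coursename"], b["?studentname"])
            for b in match(graph, patterns)}

same = answers(original) == answers(optimized)
```

The equality holds here because the sample graph satisfies the key and foreign key constraints; on a graph that violates them, the two queries may differ.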

23 Future Work
  Study of other types of constraints and the interaction between constraints
  Development of a schematic approach to Semantic Query Optimization
    Mapping to SQL/Datalog?
    SPARQL-specific semantic optimizations?
  Efficient constraint checking algorithms

24 Thank you for your attention!

25 Additional Resources

26 Checking Constraints with SPARQL
  Checking foreign key constraints: FKey(C,[p1,...,pn],D,[q1,...,qn])

  ASK {
    ?x rdf:type C; p1 ?p1; [...]; pn ?pn.
    OPTIONAL { ?y rdf:type D; q1 ?p1; [...]; qn ?pn. }
    FILTER (!bound(?y))
  }

  Bind objects of type C, with properties bound to ?p1, …, ?pn
  Bind the (referenced) object to variable ?y, if any
  Only keep results for which no referenced object exists
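
The three steps of the foreign key check can be sketched in Python as well. As before, the graph is an assumed set of string tuples and `fkey_violated` is a hypothetical helper; it returns True in the same situations in which the ASK query would answer "yes".

```python
from itertools import product

def fkey_violated(graph, cls, props, ref_cls, ref_props):
    """True exactly if some instance of cls references no instance of ref_cls."""
    def members(c):
        return sorted(s for (s, p, o) in graph if p == "rdf:type" and o == c)

    def values(subject, prop):
        return {o for (s, p, o) in graph if s == subject and p == prop}

    targets = members(ref_cls)
    for x in members(cls):
        # Step 1: bind the foreign key values ?p1, ..., ?pn of x.
        for combo in product(*(values(x, p) for p in props)):
            # Step 2: look for a referenced object y with matching q1, ..., qn.
            referenced = any(
                all(v in values(y, q) for v, q in zip(combo, ref_props))
                for y in targets
            )
            # Step 3: a binding without a referenced object is a violation.
            if not referenced:
                return True
    return False

graph = {
    ("t1", "rdf:type", "Teachers"), ("t1", "name", "Joe"),
    ("t2", "rdf:type", "Teachers"), ("t2", "name", "Fred"),
    ("c1", "rdf:type", "Courses"), ("c1", "taught_by", "Joe"),
    ("c2", "rdf:type", "Courses"), ("c2", "taught_by", "Fred"),
}
# Hypothetical extra course that points to a teacher who does not exist.
broken = graph | {("c3", "rdf:type", "Courses"), ("c3", "taught_by", "Ann")}

ok_graph_violated = fkey_violated(graph, "Courses", ["taught_by"], "Teachers", ["name"])
broken_graph_violated = fkey_violated(broken, "Courses", ["taught_by"], "Teachers", ["name"])
```

An instance that lacks one of the foreign key properties produces no binding in step 1 and is therefore not reported, which matches the behaviour of the graph pattern in the ASK query.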

27 RDFS Constraints
  Let Ci denote classes, Qi denote properties
  Subclass constraint: SubC(C1,C2)
  Subproperty constraint: SubP(Q1,Q2)
  Property domain/range: PropD(Q,C), PropR(Q,C)
  Restrict the state space of the database
  No "axioms" that are used for inferencing

28 Satisfiability
  Given an RDF vocabulary and a set of constraints: is there a non-empty RDF graph that satisfies the constraints?
  In general undecidable
  Always satisfiable:
    Primary keys + foreign keys
    Singleton
    Max-cardinality
    Subclass + subproperty
    Property domain + property range

29 Satisfiability
  Given an RDF vocabulary and a set of constraints: is there a non-empty RDF graph that satisfies the constraints?
  In general undecidable
  Undecidable:
    Primary keys + foreign keys
    Singleton
    Max-cardinality
    Subclass + subproperty
    Property domain + property range
    Min-cardinality

30 Satisfiability
  Given an RDF vocabulary and a set of constraints: is there a non-empty RDF graph that satisfies the constraints?
  In general undecidable
  Decidable in ExpTime:
    Unary primary keys
    Unary foreign keys
    Min-cardinality + max-cardinality
    Subclass + subproperty
    Property domain + property range

31 The SPARQL Query Language
  Operator AND (".")

  SELECT ?name ?faculty
  WHERE {
    ?teacher rdf:type Teachers.
    ?teacher name ?name.
    ?teacher faculty ?faculty.
  }

  [Figure: input graph with t1 and t2 of rdf:type Teachers; names Joe and Fred, faculty "CS"]

  Result:
    ?name   ?faculty
    Joe     "CS"
    Fred    "CS"

32 The SPARQL Query Language
  Operator UNION

  SELECT ?name ?faculty
  WHERE {
    {
      ?teacher rdf:type Teachers.
      ?teacher name ?name.
      ?teacher faculty ?faculty.
      FILTER (?name = "Joe")
    }
    UNION
    {
      ?teacher rdf:type Teachers.
      ?teacher name ?name.
      ?teacher faculty ?faculty.
      FILTER (?name = "Fred")
    }
  }

  [Figure: input graph with t1 and t2 of rdf:type Teachers; names Joe and Fred, faculty "CS"]

  Result:
    ?name   ?faculty
    Joe     "CS"
    Fred    "CS"

33 The SPARQL Query Language
  Operator FILTER

  SELECT ?name ?faculty
  WHERE {
    ?teacher rdf:type Teachers.
    ?teacher name ?name.
    ?teacher faculty ?faculty.
    FILTER (?name = "Joe")
  }

  [Figure: input graph with t1 and t2 of rdf:type Teachers; names Joe and Fred, faculty "CS"]

  Result:
    ?name   ?faculty
    Joe     "CS"

34 The SPARQL Query Language
  Operator OPTIONAL

  SELECT ?name ?faculty ?title
  WHERE {
    ?teacher rdf:type Teachers.
    ?teacher name ?name.
    ?teacher faculty ?faculty.
    OPTIONAL { ?teacher title ?title. }
  }

  [Figure: input graph with t1 and t2 of rdf:type Teachers; names Joe and Fred, faculty "CS"; one teacher carries title "Professor"]

  Result:
    ?name   ?faculty   ?title
    Joe     "CS"       (unbound)
    Fred    "CS"       "Professor"

