Logics for Data and Knowledge Representation SPARQL Protocol and RDF Query Language (SPARQL) Feroz Farazi.


1 Logics for Data and Knowledge Representation SPARQL Protocol and RDF Query Language (SPARQL) Feroz Farazi

2 SPARQL
- A language for expressing queries that retrieve information from data represented in RDF [SPARQL Spec.]
- A query language with the capability to search graph patterns [SPARQL Spec.]
- SPARQL queries often contain
  - a basic graph pattern: a set of (subject, predicate, object) triple patterns
  - RDF terms possibly substituted with variables
- Result of the query
  - a subgraph of the RDF data graph

3 Terminologies
- RDF Terms: Let I be the set of all IRIs, L the set of all RDF literals, and B the set of all blank nodes in an RDF graph. The set of all RDF Terms in the graph is T = I ∪ L ∪ B
- RDF Dataset: D = {G, (I_1, G_1), (I_2, G_2), …, (I_k, G_k)}, where G is the default graph and (I_i, G_i) are named graphs, i = 1 to k, k ≥ 0
  - An RDF dataset always contains a default graph, which does not have a name
  - It contains zero or more named graphs
  - Each named graph is identified by an IRI

4 Terminologies
- Query Variable: a query variable v ∈ V, where V is an infinite set of variables and V ∩ T = ∅
- Triple Pattern: a triple pattern P ∈ (T ∪ V) × (I ∪ V) × (T ∪ V)
- Solution Mapping: a partial function M: V → T, where V is the set of query variables and T is the set of all RDF Terms
- Solution Sequence: a list of solutions, possibly unordered. It may contain zero, one, or more solutions.
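The definitions of triple pattern and solution mapping can be made concrete with a small sketch. It assumes terms are plain strings and that variables are strings starting with "?"; the helper names `is_var` and `match_triple` are hypothetical, not part of any SPARQL library.

```python
def is_var(term):
    """Assumption: a query variable is any string that starts with '?'."""
    return isinstance(term, str) and term.startswith("?")


def match_triple(pattern, triple, mapping=None):
    """Return an extended solution mapping if `triple` matches `pattern`, else None."""
    solution = dict(mapping or {})
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or reject if it is already bound to another term.
            if solution.setdefault(p, t) != t:
                return None
        elif p != t:
            # A constant term in the pattern must be identical to the data term.
            return None
    return solution


graph = [
    (":paper1", ":title", '"Semantic Matching"'),
    (":paper1", ":creator", '"Fausto Giunchiglia"'),
]
pattern = (":paper1", ":title", "?title")

# The solution sequence: one mapping per matching triple.
solutions = [m for t in graph if (m := match_triple(pattern, t)) is not None]
```

Each element of `solutions` is a partial function from variables to RDF terms, represented here as a Python dict.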

5 Terminologies
- Solution Sequence Modifiers: (i) Order By (ii) Projection (iii) Distinct (iv) Reduced (v) Offset (vi) Limit
- Others: IRI (Internationalized Resource Identifier), lexical form, language tag (e.g., en, it), datatype IRI (e.g., xsd:boolean), literal, plain literal, and typed literal
- IRIs and URIs
  - URIs use a subset of the ASCII character set
  - IRIs can include Unicode characters (Universal Character Set)
- ASK: tests whether a query expression has a solution. It replies with yes/no.

6 Query
- Dataset:
    :paper1 :title "Semantic Matching" .
  Query Expression:
    SELECT ?title WHERE { :paper1 :title ?title . }
  Query Result:
    title
    "Semantic Matching"
- Dataset:
    :paper1 :creator "Fausto Giunchiglia" .
  Query Expression:
    SELECT ?author WHERE { :paper1 :creator ?author . }
  Query Result:
    author
    "Fausto Giunchiglia"
- The SELECT query form returns RDF Terms bound to the variables

7 Query: Multiple Matches
- Dataset:
    _:a :name "Tim Berners-Lee".
    _:a :homepage.
    _:b :name "Fausto Giunchiglia".
    _:b :homepage.
  Query Expression:
    SELECT ?name ?homepage WHERE { ?x :name ?name. ?x :homepage ?homepage }
  Query Result:
    name                    homepage
    Tim Berners-Lee
    Fausto Giunchiglia
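The multiple-match behaviour comes from joining the two triple patterns on the shared variable ?x. The following is a minimal sketch of that join, not the evaluation algorithm of any particular engine. The homepage IRIs are hypothetical placeholders, and the helper names are invented for illustration.

```python
def match_triple(pattern, triple, mapping):
    """Extend `mapping` so that `pattern` matches `triple`, or return None."""
    solution = dict(mapping)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):  # assumption: variables start with '?'
            if solution.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return solution


def evaluate_bgp(patterns, graph):
    """Evaluate a basic graph pattern by extending solutions one pattern at a time."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for mapping in solutions
            for triple in graph
            if (extended := match_triple(pattern, triple, mapping)) is not None
        ]
    return solutions


graph = [
    ("_:a", ":name", '"Tim Berners-Lee"'),
    ("_:a", ":homepage", "<http://example.org/home-a>"),  # hypothetical IRI
    ("_:b", ":name", '"Fausto Giunchiglia"'),
    ("_:b", ":homepage", "<http://example.org/home-b>"),  # hypothetical IRI
]
patterns = [("?x", ":name", "?name"), ("?x", ":homepage", "?homepage")]

# Projection onto ?name and ?homepage, as in SELECT ?name ?homepage.
rows = [(s["?name"], s["?homepage"]) for s in evaluate_bgp(patterns, graph)]
```

Because ?x must take the same value in both patterns, each name is paired only with the homepage of the same blank node.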

8 Query: RDF Literals Matching
- Dataset:
    :x :name "Tim Berners-Lee"@en .
    :y :name "Fausto Giunchiglia"@en .
  Query Expression 1:
    SELECT ?u WHERE { ?u :name "Tim Berners-Lee" }
  Query Result:
    u
    (no solutions)
  This query has no solution because, without the language tag, the literal in the query does not match the literal in the dataset
  Query Expression 2:
    SELECT ?u WHERE { ?u :name "Tim Berners-Lee"@en }
  Query Result:
    u
    :x
  This query has one solution because including the language tag binds ?u to :x
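One way to see why the untagged literal fails to match is to model a literal as a pair of lexical form and language tag, so that equality compares both parts. This is a sketch under that assumption; the `Literal` type and `subjects_named` helper are hypothetical.

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """A plain literal: lexical form plus an optional language tag."""
    lexical: str
    lang: Optional[str] = None


graph = [
    (":x", ":name", Literal("Tim Berners-Lee", "en")),
    (":y", ":name", Literal("Fausto Giunchiglia", "en")),
]


def subjects_named(graph, literal):
    """Return subjects whose :name object equals `literal` in both form and tag."""
    return [s for s, p, o in graph if p == ":name" and o == literal]


without_tag = subjects_named(graph, Literal("Tim Berners-Lee"))      # Query Expression 1
with_tag = subjects_named(graph, Literal("Tim Berners-Lee", "en"))  # Query Expression 2
```

Tuple equality compares the language tag as well as the lexical form, which mirrors the behaviour described on the slide.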

9 Building RDF Graphs
- CONSTRUCT: this query form returns an RDF graph
- Dataset:
    _:a :creator "Tim Berners-Lee" .
    _:b :creator "Fausto Giunchiglia" .
  Query Expression:
    CONSTRUCT { ?x :name ?name } WHERE { ?x :creator ?name }
  Query Result:
    _:c :name "Tim Berners-Lee" .
    _:d :name "Fausto Giunchiglia" .
- In this dataset, :creator stands for the Dublin Core (dc) creator metadata
- In the query, :name stands for the FOAF name metadata
- We built a graph with the FOAF name attribute, which was not available in the source dataset
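CONSTRUCT can be read as template instantiation: every solution of the WHERE clause is substituted into the template, and the resulting triples are collected into a graph. The sketch below assumes the solutions have already been computed and keeps the blank node labels of the data (a real engine may relabel them, as the slide's _:c and _:d show). The function name `construct` is hypothetical.

```python
def construct(template, solutions):
    """Instantiate each template triple once per solution mapping."""
    graph = set()
    for solution in solutions:
        for triple in template:
            built = tuple(solution.get(term, term) for term in triple)
            # Triples that still contain an unbound variable are left out.
            if not any(term.startswith("?") for term in built):
                graph.add(built)
    return graph


# Solutions of WHERE { ?x :creator ?name } over the slide's dataset.
solutions = [
    {"?x": "_:a", "?name": '"Tim Berners-Lee"'},
    {"?x": "_:b", "?name": '"Fausto Giunchiglia"'},
]
template = [("?x", ":name", "?name")]

result = construct(template, solutions)
```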

10 RDF Term Restrictions
- FILTER: solutions are restricted to those whose RDF Terms satisfy the filter expression
- Dataset:
    _:a :creator "Tim Berners-Lee" .
    _:a :age 53 .
    _:b :creator "Fausto Giunchiglia" .
    _:b :age 54 .
  Query Expression:
    SELECT ?author WHERE { ?x :creator ?author . FILTER regex(?author, "Tim") }
  Query Result:
    author
    "Tim Berners-Lee"
- The above query can be made case insensitive by adding the "i" flag to the filter:
    FILTER regex(?author, "tim", "i")
  Query Expression:
    SELECT ?author ?age WHERE { ?x :creator ?author . ?x :age ?age FILTER (?age > 53) }
  Query Result:
    author                  age
    "Fausto Giunchiglia"    54
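A FILTER keeps only the solutions for which an expression evaluates to true. As a rough sketch, the two filters above can be imitated over already-computed solutions. Python's `re` module stands in for SPARQL's regex function here, which is an approximation since the two regex dialects are not identical; `regex_filter` is a hypothetical helper.

```python
import re

solutions = [
    {"author": "Tim Berners-Lee", "age": 53},
    {"author": "Fausto Giunchiglia", "age": 54},
]


def regex_filter(solutions, var, pattern, flags=""):
    """Keep solutions whose binding for `var` contains a match for `pattern`."""
    re_flags = re.IGNORECASE if "i" in flags else 0
    return [s for s in solutions if re.search(pattern, s[var], re_flags)]


by_name = regex_filter(solutions, "author", "Tim")           # FILTER regex(?author, "Tim")
by_name_ci = regex_filter(solutions, "author", "tim", "i")   # FILTER regex(?author, "tim", "i")
lowercase_only = regex_filter(solutions, "author", "tim")    # no flag: case matters
older = [s for s in solutions if s["age"] > 53]              # FILTER (?age > 53)
```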

11 Querying Optional Pattern
- OPTIONAL: binds variables to RDF Terms and includes them in the solution when matching data is available; otherwise the solution is kept with those variables unbound
- Dataset:
    _:a :creator "Tim Berners-Lee".
    _:a :age 53.
    _:a :homepage.
    _:b :creator "Fausto Giunchiglia".
    _:b :age 54.
  Query Expression:
    SELECT ?author ?homepage WHERE { ?x :creator ?author. OPTIONAL { ?x :homepage ?homepage } }
  Query Result:
    author                  homepage
    "Tim Berners-Lee"
    "Fausto Giunchiglia"
- It is a left-associative operator
- Why do we need it? Not all entities have the same set of attributes
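OPTIONAL behaves like a left join between the solutions of the required pattern and those of the optional pattern. The sketch below assumes both solution lists are already available; the homepage IRI is a hypothetical placeholder and the helper names are invented for illustration.

```python
def compatible(a, b):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(b[k] == v for k, v in a.items() if k in b)


def left_join(left, right):
    """Extend each left solution with compatible right solutions, keeping it otherwise."""
    result = []
    for l in left:
        matches = [{**l, **r} for r in right if compatible(l, r)]
        result.extend(matches if matches else [l])
    return result


required = [
    {"x": "_:a", "author": "Tim Berners-Lee"},
    {"x": "_:b", "author": "Fausto Giunchiglia"},
]
optional = [
    {"x": "_:a", "homepage": "<http://example.org/home-a>"},  # hypothetical IRI
]

# An unbound ?homepage shows up as None in the projected row.
rows = [(s["author"], s.get("homepage")) for s in left_join(required, optional)]
```

The second author has no homepage triple, yet still appears in the result, which is the point of OPTIONAL.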

12 ORDER BY Clause
- ORDER BY: a facility to order a solution sequence
- Dataset:
    _:a :creator "Tim Berners-Lee" .
    _:a :age 53 .
    _:b :creator "Fausto Giunchiglia" .
    _:b :age 54 .
  Query Expression:
    SELECT ?author WHERE { ?x :creator ?author . ?x :age ?age } ORDER BY ?author DESC(?age)
  Query Result:
    author
    "Fausto Giunchiglia"
    "Tim Berners-Lee"
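ORDER BY with several conditions sorts by the first one and uses the later ones to break ties. A minimal sketch of `ORDER BY ?author DESC(?age)`, assuming string authors compared by code point and integer ages:

```python
solutions = [
    {"author": "Tim Berners-Lee", "age": 53},
    {"author": "Fausto Giunchiglia", "age": 54},
]

# Ascending by author, then descending by age (negated so larger ages sort first).
ordered = sorted(solutions, key=lambda s: (s["author"], -s["age"]))

# Projection onto ?author, as in SELECT ?author.
authors = [s["author"] for s in ordered]
```

Here the age condition never comes into play because the two authors differ; it would only matter for solutions with the same author.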

13 DISTINCT and REDUCED Modifiers
- DISTINCT: removes duplicates from a solution sequence
  Dataset:
    _:b :creator "Fausto Giunchiglia" .
    _:b :age 54 .
    _:c :creator "Fausto Giunchiglia" .
    _:c :age 54 .
  Query Expression:
    SELECT DISTINCT ?creator WHERE { ?x :creator ?creator }
  Query Result:
    creator
    "Fausto Giunchiglia"
- REDUCED: permits duplicates to be removed, without requiring it
  Query Expression:
    SELECT REDUCED ?creator WHERE { ?x :creator ?creator }
  With REDUCED, each solution appears at least once and no more often than it would without duplicate removal
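The difference between the two modifiers can be sketched over a projected solution sequence. DISTINCT must return each solution exactly once, while REDUCED may return anything between the deduplicated and the original sequence. The `distinct` helper is hypothetical.

```python
# Projection onto ?creator of the slide's dataset: two identical solutions.
solutions = [
    {"creator": "Fausto Giunchiglia"},
    {"creator": "Fausto Giunchiglia"},
]


def distinct(solutions):
    """Drop repeated solutions while keeping the first occurrence of each."""
    seen = set()
    unique = []
    for s in solutions:
        key = frozenset(s.items())  # hashable view of one solution mapping
        if key not in seen:
            seen.add(key)
            unique.append(s)
    return unique


deduped = distinct(solutions)

# For REDUCED, both `solutions` (2 rows) and `deduped` (1 row) are acceptable answers.
```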

14 OFFSET and LIMIT Clauses
- OFFSET: returns the elements of the solution sequence starting after the specified number of solutions. An offset of zero has no effect.
  Dataset:
    _:b :creator "Fausto Giunchiglia" .
    _:b :age 54 .
    _:c :creator "Tim Berners-Lee" .
    _:c :age 53 .
  Query Expression:
    SELECT ?author WHERE { ?x :creator ?author } ORDER BY ?author OFFSET 1
  Query Result:
    author
    "Tim Berners-Lee"
- LIMIT: puts an upper bound on the number of solutions returned
  Query Expression:
    SELECT ?author WHERE { ?x :creator ?author } ORDER BY ?author LIMIT 1 OFFSET 1
  Query Result:
    author
    "Tim Berners-Lee"
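On an ordered solution sequence, OFFSET and LIMIT amount to taking a slice. A small sketch, with `apply_slice` as a hypothetical helper name:

```python
def apply_slice(ordered, offset=0, limit=None):
    """Skip `offset` solutions, then return at most `limit` of the rest."""
    end = None if limit is None else offset + limit
    return ordered[offset:end]


# ORDER BY ?author over the slide's dataset.
authors = sorted(["Tim Berners-Lee", "Fausto Giunchiglia"])

offset_only = apply_slice(authors, offset=1)             # OFFSET 1
limit_offset = apply_slice(authors, offset=1, limit=1)   # LIMIT 1 OFFSET 1
limit_only = apply_slice(authors, limit=1)               # LIMIT 1
```

Without ORDER BY the sequence has no guaranteed order, so the slice would not be predictable.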

15 Relational vs RDF queries
- Relational queries consist of (among others):
  - Relational algebra of joins
  - Foreign key references
- RDF queries consist of (among others):
  - (Logical) statements in triple form
  - Unification variables, which are used to connect graph patterns
- A relational query:
  - Produces a new database table that is a combination of two or more input tables (partially or completely)
- An RDF query:
  - Produces a subset of the input RDF graph
  - Simplifies some issues of table-based queries, for example, there is no need for a subquery construct [D. Allemang and J. Hendler, 2008]

16 Turtle
- Turtle is a terse RDF triple language
- Turtle is a textual syntax for RDF that facilitates writing RDF graphs
  - in a compact and natural text form
  - with abbreviations for common usage patterns and datatypes
  - compatible with the triple pattern syntax of SPARQL (and N-Triples)
- Triple:
  - a sequence of (subject, predicate, object) terms
  - separated by whitespace
  - terminated by '.' after each triple e.g.,.
- List of predicates and objects:
  - Triples for the same subject can be written without repeating the common part
  - Reference to the subject of the previous triple is indicated by the use of a semicolon e.g., ; "Weaving the Web".

17 Turtle
- List of objects:
  - For the same subject and predicate, triples can be written without repeating the common part
  - Reference to the subject and predicate of the previous triple is indicated by the use of a comma e.g., "Tim Berners-Lee", "TBL", "Tim BL".
- SPARQL differs from Turtle:
  - SPARQL permits RDF Literals as the subject of RDF triples
  - SPARQL permits variables (?name or $name) in any part of the triple of the form
  - prefix and base declarations
    - Turtle allows prefix and base declarations anywhere outside of a triple
    - In SPARQL, they are only allowed in the Prologue (at the start of the SPARQL query)
  - case sensitivity
    - SPARQL uses case-insensitive keywords, except for 'a', where 'a' means the IRI http://www.w3.org/1999/02/22-rdf-syntax-ns#type
    - Turtle's prefix and base declarations are case sensitive
    - 'true' and 'false' are case insensitive in SPARQL and case sensitive in Turtle
    - TrUe is not a valid boolean value in Turtle
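The semicolon and comma abbreviations are shorthand for several full triples. The sketch below expands such a grouped description back into (subject, predicate, object) triples. The subject `:book` and the predicates `:title` and `:name` are hypothetical stand-ins, since the slide's own terms were not preserved; only the literals come from the slides.

```python
def expand(subject, predicate_objects):
    """Expand (predicate, [objects]) groups into full (s, p, o) triples."""
    return [
        (subject, predicate, obj)
        for predicate, objects in predicate_objects  # ';' separates these groups
        for obj in objects                           # ',' separates these objects
    ]


triples = expand(
    ":book",  # hypothetical subject
    [
        (":title", ['"Weaving the Web"']),                      # hypothetical predicate
        (":name", ['"Tim Berners-Lee"', '"TBL"', '"Tim BL"']),  # hypothetical predicate
    ],
)
```

One subject with two predicate groups, the second holding three objects, yields four triples in total.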

18 SPARQL: Federated Query
- A federated SPARQL query can be used to express queries across diverse data sources if
  - data is stored natively as RDF or
  - data is viewed as RDF via middleware
- It is an opportunity for data consumers to get data distributed across the Web
- Federated Query is used for executing queries distributed over different SPARQL endpoints
- The SERVICE keyword
  - allows a query author to direct a portion of a query to a particular SPARQL endpoint
  - supports SPARQL queries merging data distributed across the Web

19 SPARQL: Federated Query
- An example query to a remote SPARQL endpoint
- Consider a query to find the names of the people we know
- Data about the names of various people is available at the http://people.example.org/sparql endpoint:
  DATASET 1:
    @prefix foaf:.
    @prefix :.
    :people1 foaf:name "Tim BL".
    :people2 foaf:name "James".
- One wants to combine it with a local FOAF file http://example.org/myfoaf.rdf that contains the single triple:
  DATASET 2:
    .
  QUERY:
    PREFIX foaf:
    SELECT ?name
    FROM
    WHERE { foaf:knows ?person. SERVICE { ?person foaf:name ?name. } }
- This query, on the datasets provided above, has one solution:
  RESULT:
    Name
    ----------
    James
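The flow of this federated query can be imitated locally: the foaf:knows pattern is answered from the local file, the SERVICE block is answered by the remote endpoint, and the two are joined on ?person. In this sketch the remote endpoint is a plain dictionary rather than a network call, and the local triple `(":me", "foaf:knows", ":people2")` is a hypothetical stand-in for the single triple that the slide did not preserve.

```python
# Stand-in for remote endpoints: endpoint URL -> list of triples.
ENDPOINTS = {
    "http://people.example.org/sparql": [
        (":people1", "foaf:name", "Tim BL"),
        (":people2", "foaf:name", "James"),
    ],
}

# Hypothetical content of the local FOAF file.
local_graph = [(":me", "foaf:knows", ":people2")]


def service(endpoint, predicate):
    """Pretend remote call: return (subject, object) pairs for one predicate."""
    return [(s, o) for s, p, o in ENDPOINTS[endpoint] if p == predicate]


# Local part of the query: bindings for ?person.
known = [o for s, p, o in local_graph if p == "foaf:knows"]

# Remote part, joined with the local bindings on ?person.
names = [
    name
    for person, name in service("http://people.example.org/sparql", "foaf:name")
    if person in known
]
```

Only the person that appears in both sources contributes a name, which matches the single solution reported on the slide.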

20 SPARQL: Federated Query
- An example query with OPTIONAL to two remote SPARQL endpoints
- Consider we want to query people and optionally obtain their interests and the names of people they know
- Data in the default graph at the remote SPARQL endpoints:
  At http://people.example.org/sparql
  DATASET 1:
    @prefix foaf:.
    @prefix :.
    :people1 foaf:name "Tim BL".
    :people2 foaf:name "James".
    :people3 foaf:name "Jerome".
    :people3 foaf:interest.
  At http://people2.example.org/sparql
  DATASET 2:
    @prefix foaf:.
    @prefix :.
    :people1 foaf:knows :people21.
    :people21 foaf:name "Chris".
    :people3 foaf:knows :people22.
    :people22 foaf:name "Frank".
  QUERY:
    PREFIX foaf:
    SELECT ?person ?interest ?known
    WHERE { SERVICE { ?person foaf:name ?name. OPTIONAL { ?person foaf:interest ?interest. SERVICE { ?person foaf:knows ?known. } } } }
- This query, on the datasets provided above, has three solutions:
  RESULT:
    person      interest      known
    ---------------------------------------------------------
    Tim BL
    James
    Jerome

21 SPARQL Update
- Graph Store: an editable repository of RDF graphs managed by a single service
- Update service: a service (often referred to by the informal term SPARQL endpoint) that accepts and processes update requests
- Like an RDF dataset, a Graph Store contains one (unnamed) slot holding a default graph and zero or more named slots holding named graphs
- Formal definition: Graph Store
  The Graph Store can be viewed as a mutable RDF Dataset. GS = {DG, (iri_1, G_1), ..., (iri_n, G_n)}
  - where the default graph DG is the RDF graph associated with the unnamed slot
  - n ≥ 0, and for each 1 ≤ i ≤ n, G_i is an RDF graph associated with the named slot identified by the IRI iri_i
  - all IRIs are distinct, i.e., i ≠ j implies iri_i ≠ iri_j

22 SPARQL Update
- Update operations can specify the named graph(s) to be edited. If no named graph is mentioned, the operation is performed on the default graph
- The unnamed or default graph may refer to a separate graph, a graph describing the named graphs, a representation of a union of other graphs, and so on
- Unlike an RDF Dataset, a Graph Store allows named graphs to be added or deleted
- A Graph Store can keep local copies of RDF graphs defined elsewhere on the Web and modify those copies independently of the original graph
- SPARQL Update supports two categories of update operations on a Graph Store
  - Graph Update: addition and removal of triples from some graphs
  - Graph Management: creation and deletion of graphs, plus operations to add, move, and copy graphs
- In the case where there is one unnamed graph and no named graphs, SPARQL Update can be used
  - as a graph update language (as opposed to a Graph Store update language)
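The Graph Store definition and the two categories of operations can be sketched as a small in-memory class. This is an illustration under simplifying assumptions, not the behaviour of any particular store: the class and method names are hypothetical, inserting into a missing named graph is assumed to create it, and dropping a missing graph is silently ignored.

```python
class GraphStore:
    """A mutable RDF dataset: one default graph plus named graphs keyed by IRI."""

    def __init__(self):
        self.default = set()  # DG, the graph in the unnamed slot
        self.named = {}       # iri -> set of triples; dict keys keep IRIs distinct

    # Graph Update operations
    def insert(self, triples, graph=None):
        # Assumption: inserting into a missing named graph creates it.
        target = self.default if graph is None else self.named.setdefault(graph, set())
        target.update(triples)

    def delete(self, triples, graph=None):
        target = self.default if graph is None else self.named.get(graph, set())
        target.difference_update(triples)

    # Graph Management operations
    def create(self, graph):
        if graph in self.named:
            raise ValueError(f"graph already exists: {graph}")
        self.named[graph] = set()

    def drop(self, graph):
        # Removes the graph if present; a missing graph is ignored.
        self.named.pop(graph, None)


title = (":paper1", ":title", '"Semantic Matching"')
creator = (":paper1", ":creator", '"Fausto Giunchiglia"')

store = GraphStore()
store.insert({title})                                   # no graph given: default graph
store.insert({creator}, graph="http://example.org/g1")  # hypothetical graph IRI
store.delete({title})                                   # default graph is empty again
store.create("http://example.org/g2")                   # hypothetical graph IRI
store.drop("http://example.org/g1")
```

With no named graphs in use, only `store.default` changes, which corresponds to using SPARQL Update as a plain graph update language.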

23 References
- SPARQL Spec. (2008). W3C Recommendation.
- SPARQL 1.1 Federated Query (2013). W3C Recommendation.
- SPARQL 1.1 Update (2013). W3C Recommendation.
- Turtle Terse RDF Triple Language (2013). W3C Recommendation.
- D. Allemang and J. Hendler. Semantic Web for the Working Ontologist: Modeling in RDF, RDFS and OWL. Morgan Kaufmann Elsevier, Amsterdam, NL, 2008.

