SPARQL: A query language for RDF

Slides:



Advertisements
Similar presentations
1 SPARQL: A query language for RDF Matthew Yau
Advertisements

Alexandra Cristea & Matthew Yau 1.
CH-4 Ontologies, Querying and Data Integration. Introduction to RDF(S) RDF stands for Resource Description Framework. RDF is a standard for describing.
Introduction to RDF Based on tutorial at
1 RDF Tutorial. C. Abela RDF Tutorial2 What is RDF? RDF stands for Resource Description Framework It is used for describing resources on the web Makes.
RDF – RESOURCE DESCRIPTION FRAMEWORK Antonio Bucchiarone FBK-IRST Trento, Italy 20 Novembre 2009.
ESDSWG2011 – Semantic Web session Semantic Web Sub-group Session ESDSWG 2011 Meeting – Semantic Web sub-group session Wednesday, November 2, 2011 Norfolk,
RDF Tutorial.
© Copyright IBM Corporation 2014 Getting started with Rational Engineering Lifecycle Manager queries Andy Lapping – Technical sales and solutions Joanne.
 Copyright 2004 Digital Enterprise Research Institute. All rights reserved. SPARQL Query Language for RDF presented by Cristina Feier.
SPARQL RDF Query.
SPARQL for Querying PML Data Jitin Arora. Overview SPARQL: Query Language for RDF Graphs W3C Recommendation since 15 January 2008 Outline: Basic Concepts.
CSCI 572 Project Presentation Mohsen Taheriyan Semantic Search on FOAF profiles.
Dr. Alexandra I. Cristea RDF.
RDF: Building Block for the Semantic Web Jim Ellenberger UCCS CS5260 Spring 2011.
Semantic Web Presented by: Edward Cheng Wayne Choi Tony Deng Peter Kuc-Pittet Anita Yong.
Semantic Web Query Processing with Relational Databases Artem Chebotko Department of Computer Science Wayne State University.
Module 2b: Modeling Information Objects and Relationships IMT530: Organization of Information Resources Winter, 2007 Michael Crandall.
Semantic Web Andrejs Lesovskis. Publishing on the Web Making information available without knowing the eventual use; reuse, collaboration; reproduction.
Semantic Web Bootcamp Dominic DiFranzo PhD Student/Research Assistant Rensselaer Polytechnic Institute Tetherless World Constellation.
Logics for Data and Knowledge Representation SPARQL Protocol and RDF Query Language (SPARQL) Feroz Farazi.
© 2006 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice Publishing data on the Web (with.
Semantic Web Series 1 Mohammad M. R. Cowdhury UniK, Kjeller.
RDF (Resource Description Framework) Why?. XML XML is a metalanguage that allows users to define markup XML separates content and structure from formatting.
SPARQL All slides are adapted from the W3C Recommendation SPARQL Query Language for RDF Web link:
1cs The Need “Most of the Web's content today is designed for humans to read, not for computer programs to manipulate meaningfully.” Berners-Lee,
Introduction to SPARQL. Acknowledgements This presentation is based on the W3C Candidate Recommendation “SPARQL Query Language for RDF” from
Logics for Data and Knowledge Representation
Chapter 3 Querying RDF stores with SPARQL. Why an RDF Query Language? Why not use an XML query language? XML at a lower level of abstraction than RDF.
By: Dan Johnson & Jena Block. RDF definition What is Semantic web? Search Engine Example What is RDF? Triples Vocabularies RDF/XML Why RDF?
SPARQL W3C Simple Protocol And RDF Query Language
XML A web enabled data description language 4/22/2001 By Mark Lawson & Edward Ryan L’Herault.
Lesley Charles November 23, 2009.
SPARQL AN RDF Query Language. SPARQL SPARQL is a recursive acronym for SPARQL Protocol And Rdf Query Language SPARQL is the SQL for RDF Example: PREFIX.
SPARQL All slides are adapted from the W3C Recommendation SPARQL Query Language for RDF Web link:
RDF and XML 인공지능 연구실 한기덕. 2 개요  1. Basic of RDF  2. Example of RDF  3. How XML Namespaces Work  4. The Abbreviated RDF Syntax  5. RDF Resource Collections.
EEL 5937 Ontologies EEL 5937 Multi Agent Systems Lecture 5, Jan 23 th, 2003 Lotzi Bölöni.
1 SPARQL A. Emrah Sanön. 2 RDF RDF is quite committed to Semantic Web. Data model Serialization by means of XML Formal semantics Still something is missing!
Semantic Web Basics Dominic DiFranzo PhD Student/Research Assistant Rensselaer Polytechnic Institute Tetherless World Constellation.
Using the DAWG Test Cases with Relational Databases Matthew Gheen October 26, 2007.
Introduction to the Semantic Web and Linked Data Module 1 - Unit 2 The Semantic Web and Linked Data Concepts 1-1 Library of Congress BIBFRAME Pilot Training.
Dr. Lowell Vizenor Ontology and Semantic Technology Practice Lead Alion Science and Technology Semantic Technology: A Basic Introduction.
05/01/2016 SPARQL SPARQL Protocol and RDF Query Language S. Garlatti.
Alexandra Cristea 1.  pronounced "sparkle“  recursive acronym for: ◦ SPARQL Protocol and RDF Query Language  a semantic query language  a query language.
RDF & SPARQL Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015.
CC L A W EB DE D ATOS P RIMAVERA 2015 Lecture 7: SPARQL (1.0) Aidan Hogan
Tool for Ontology Paraphrasing, Querying and Visualization on the Semantic Web Project By Senthil Kumar K III MCA (SS)‏
Semantic Web in Depth RDFa, GRDDL and POWDER Dr Nicholas Gibbins
Semantic Web in Depth SPARQL Protocol and RDF Query Language Dr Nicholas Gibbins –
Setting the stage: linked data concepts Moving-Away-From-MARC-a-thon.
Linked Data & Semantic Web Technology The Semantic Web Part 4. Resource Description Framework (1) Dr. Myungjin Lee.
Data Integrity & Indexes / Session 1/ 1 of 37 Session 1 Module 1: Introduction to Data Integrity Module 2: Introduction to Indexes.
SPARQL Query Andy Seaborne. Apache Jena he.org/jena ● Open source - Apache License ● Apache Incubator (accepted November 2010) ●
1 RDF Storage and Retrieval Systems Jan Pettersen Nytun, UiA.
Charlie Abela Department of Intelligent Computer Systems
The Semantic Web By: Maulik Parikh.
Vincenzo Maltese, Fausto Giunchiglia University of Trento
CC La Web de Datos Primavera 2017 Lecture 7: SPARQL [i]
SPARQL.
Introduction to SPARQL
Service-Oriented Computing: Semantics, Processes, Agents
SPARQL SPARQL Protocol and RDF Query Language
Grid Computing 7700 Fall 2005 Lecture 18: Semantic Grid
Logics for Data and Knowledge Representation
CC La Web de Datos Primavera 2016 Lecture 7: SPARQL (1.0)
Grid Computing 7700 Fall 2005 Lecture 18: Semantic Grid
New Perspectives on XML
JSON for Linked Data: a standard for serializing RDF using JSON
Logics for Data and Knowledge Representation
Semantic-Web, Triple-Strores, and SPARQL
Presentation transcript:

SPARQL: A query language for RDF Matthew Yau C.Y.Yau@warwick.ac.uk

What is SPARQL RDF is a format for representing general data about resources. RDF is based on a graph, where subject and object nodes are related by predicate arcs. RDF can be written in XML, or as triples. RDF schema is a method for defining structures for RDF files. It allows RDF resources to be grouped into classes, and allows subclass, subproperty and domain/range descriptions to be specified. SPARQL is a query language for RDF. It provides a standard format for writing queries that target RDF data and a set of standard rules for processing those queries and returning the results.

Object RDF Statements Predicate Subject author Jan Egil Refsnes http://www.w3schools.com/RDF ?subject ?predicate ?object SPARQL searches for all subgraphs that match the graph described by the triples in the query.

A sample of SPARQL SELECT ?student WHERE { ?student b:studies bmod:CS328 }

Prefixes & namespaces <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:cd="http://www.recshop.fake/cd#"> b:studies bmod:CS328 PREFIX b: http://... PREFIX bmod: http://www2.warwick.ac.uk/fac/sci/dcs/teaching/material/ The PREFIX keyword is SPARQL’s version of an xmlns:namespace declaration and works in basically the same way. The xmlns:cd namespace, specifies that elements with the cd prefix are from the namespace "http://www.recshop.fake/cd#".

Try-out at: This link

SPARQL basics SPARQL is not based on XML, so it does not follow the syntax conventions seen before. Names beginning with a ? or a $ are variables. Graph patterns are given as a list of triple patterns enclosed within braces {} The variables named after the SELECT keyword are the variables that will be returned as results. (~SQL)

Combining conditions PREFIX b: http://www2.warwick.ac.uk/rdf/ PREFIX bmod: http://www2.warwick.ac.uk/fac/sci/dcs/teaching/material/ PREFIX foaf: http://xmlns.com/foaf/0.1/ SELECT ?name WHERE { ?student b:studiesbmod:CS328 . ?student foaf:name ?name }

FOAF FOAF (Friend Of A Friend) is an experimental project using RDF, which also defines a standardised vocabulary. The goal of FOAF is to make personal home pages machine-readable, and machine-understandable, thereby creating an internet-wide connected database of people. Friend-of-a-friend: http://xmlns.com/foaf/0.1/

FOAF FOAF is based on the idea that most personal home pages contain similar sets of information. For example, the name of a person, the place they live, the place they work, details on what they are working on at the moment, and links to their friends. FOAF defines RDF predicates that can be used to represent these things. These pages can then be understood by a computer and manipulated. In this way, a database can be created to answer questions such as “what projects are my friends working on?”, “do any of my friends know the director of BigCorp?” and similar.

A sample FOAF document <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:foaf="http://xmlns.com/foaf/0.1"> <foaf:Person> <foaf:name>Matthew Yau</foaf:name> <foaf:mbox rdf:resource="mailto:C.Y.Yau@warwick.ac.uk " /> </foaf:Person> </rdf:RDF>

FOAF You can see that FOAF stores details about a person's name, workplace, school, and people they know. Note that the meaning of “knows” is deliberately ambiguous. In spite of the “friend of a friend” title, the presence of a “knows” relation does not mean that the people are friends. Likewise, it does not mean that the relationship is reciprocal. Equally, the lack of a “knows” relation does not mean the people do not know each other Thus, unless extra rules are applied, it is not possible to check for reciprocation by looking for a knows relation from the other person.

Multiple results PREFIX b: < http://www2.warwick.ac.uk/rdf/> PREFIXbmod: < http://www2.warwick.ac.uk/fac/sci/dcs/teaching/material/ > PREFIX foaf: http://xmlns.com/foaf/0.1/ SELECT ?module ?name WHERE { ?student b:studies ?module . ?student foaf:name ?name }

Abbreviating the same subject PREFIX b: < http://www2.warwick.ac.uk/rdf/> PREFIXbmod: < http://www2.warwick.ac.uk/fac/sci/dcs/teaching/material/ > PREFIX foaf: http://xmlns.com/foaf/0.1/ SELECT ?module ?name WHERE { ?student b:studies ?module ; foaf:name ?name }

Abbreviating multiple objects SELECT ?module ?name WHERE { ?student b:studies ?module . ?student b:studies bmod:CS328 ; foaf:name ?name } is identical to WHERE { ?student b:studies ?module , bmod:CS328 ;

Optional graph components SELECT ?student ?email WHERE { ?student b:studies mod:CS328 . ?student foaf:mbox ?email } The above query returns the names and e-mail addresses of students studying CS328. However, if a student does not have an e-mail address registered, with a foaf:mbox predicate, then the query will not match and that student will not be included in the list. This may be undesirable.

Optional graph components SELECT ?student ?email WHERE { ?student b:studies mod:CS328 . OPTIONAL { ?student foaf:mbox ?email } } Because the email match is declared as OPTIONAL in the query above, SPARQL will match it if it can, but if it can’t, it will not reject the overall pattern because of it. So if a student has no registered e-mail address, they will still appear in the result list with a blank (unbound) value for the ?email result.

Optional graph components SELECT ?module ?name ?phone WHERE { ?student b:studies ?module . ?student foaf:name ?name . OPTIONAL { ?student b:contactpermission true . ?student b:phone ?phone } } SELECT ?module ?name ?age OPTIONAL { ?student b:age ?age . FILTER (?age > 25) } }

Further optional components SELECT ?student ?email ?home WHERE { ?student b:studies mod:CS328 . OPTIONAL { ?student foaf:mbox ?email . ?student foaf:homepage ?home } } OPTIONAL { ?student foaf:mbox ?email } . OPTIONAL { ?student foaf:homepage ?home } } The first query returns the e-mail and the homepage if the student has both, but if the student lacks either, the optional component does not match and is disregarded as a whole. The above returns the e-mail if the student has one, and the homepage if the student has one. It may return both, neither, or either.

Combining matches SELECT ?student ?email WHERE { ?student foaf:mbox ?email . { ?student b:studies mod:CS328 } UNION { ?student b:studies mod:CS909 } } When patterns are combined using the UNION keyword, the resulting combined pattern will match if either of the subpatterns is matched.

Multiple graphs and the dataset All the queries we have seen so far have operated on single RDF graphs. A SPARQL query actually runs on an RDF dataset which may include multiple RDF graphs. RDF graphs are identified, like everything else, by URI. As with other resources, the URI that represents the graph does not have to be the actual URI of the graph file, although the program processing the query will need to somehow relate the URI to an actual RDF graph stored somewhere.

Stating the dataset SELECT ?student ?email ?home FROM <http://www2.warwick.ac.uk/rdf/student> WHERE { ?student b:studies mod:CS909 . OPTIONAL { ?student foaf:mbox?email . ?student foaf:homepage ?home } } By using several FROM declarations, you can combine several graphs in the dataset: FROM <http://www2.warwick.ac.uk/rdf/foaf>

Multiple graphs The dataset can be composed of one (optional) default graph and any number of named graphs. The FROM keyword specifies the default graph. If you use several FROM keywords, the specified graphs are merged to create the default graph. You can also add graphs to the dataset as named graphs, using the FROM NAMED keyword. However, to match patterns in a named graph you must use the GRAPH keyword to explicitly state which graph they must match in.

Named graphs SELECT ?student ?email ?home FROM NAMED <http://www2.warwick.ac.uk/rdf/student> http://www2.warwick.ac.uk/rdf/foaf WHERE { GRAPH <http://www2.warwick.ac.uk/rdf/student> { ?student b:studies mod:CS909 } . GRAPH <http://www2.warwick.ac.uk/rdf/foaf> { OPTIONAL { ?student foaf:mbox ?email . ?student foaf:homepage ?home } }

Abbreviation using prefixes PREFIX brdf: <http://www2.warwick.ac.uk/rdf/> SELECT ?student ?email ?home FROM NAMED <http://www2.warwick.ac.uk/rdf/student> <http://www2.warwick.ac.uk/rdf/foaf> WHERE { GRAPH brdf:student { ?student b:studies mod:CS909 } . GRAPH brdf:foaf{ OPTIONAL { ?student foaf:mbox ?email . ?student foaf:homepage ?home } }

Named and default graph PREFIX brdf: <http://www2.warwickac.uk/rdf/> SELECT ?student ?email ?home FROM <http://www2.warwickac.uk/rdf/student> FROM NAMED <http://www2.warwickac.uk/rdf/foaf> WHERE { ?student b:studies mod:CS909 . GRAPH brdf:foaf{ OPTIONAL { ?studentfoaf:mbox?email . ?studentfoaf:homepage ?home } }

Graph as a query As well as being a bound resource, the parameter of the GRAPH keyword can also be a variable. By making use of this, it is possible to query which graph in the dataset holds a particular relationship, or to determine which graph to search based on data in another graph. It is not mandatory that all graphs referenced by a SPARQL query be declared using FROM and FROM NAMED. It need not specify any at all, and even if specified, the dataset can be overridden on a per-query basis.

Which graph is it in? PREFIX brdf: <http://www2.warwickac.uk/rdf/> SELECT ?student ?graph WHERE { ?student b:studies mod:CS909 . GRAPH ?graph { ?student foaf:mbox ?email } The output variable graph will hold the URL of the graph which matches the student to an e-mail address. Note that we presume that the query processor will have existing knowledge of some finite set of graphs and their locations, through which it will search.

Re-using the graph reference PREFIXbrdf: <http://www2.warwickac.uk/rdf/> SELECT ?student ?email WHERE { ?student b:studies mod:CS909 . ?student rdfs:seeAlso ?graph . GRAPH ?graph { ?student foaf:mbox ?email } In this case we collect the graph URL from the rdfs:seeAlso property of the student, and then look in that graph for their e-mail address. Note that if the student does not have a rdfs:seeAlso property which points to a graph holding their e-mail address, they will not appear in the result at all.

Sorting results of a query SELECT ?name ?module WHERE { ?student b:studies ?module . ?student foaf:name ?name . } ORDER BY ?name SELECT ?name ?age ?student b:age ?age . ORDER BY DESC (?age) ASC (?name)

Limiting the number of results SELECT ?name ?module WHERE { ?student b:studies ?module . ?student foaf:name ?name . } LIMIT 20

Extracting subsets of the results SELECT ?name ?module WHERE { ?student b:studies ?module . ?student foaf:name ?name . } ORDER BY ?name OFFSET 20 LIMIT 20 Note that if no ORDER BY is specified, the order of results is random, and may vary through multiple executions of the same query. Thus, extracting a subset with OFFSET and LIMIT is only useful if an ORDER BY is also used.

Obtaining a Boolean result Is any student studying any module? ASK { ?student b:studies ?module } Is any student studying CS909? ASK { ?student b:studies bmod: CS909 } Is student 029389 studying CS909? ASK { bstu:029389 b:studies bmod: CS909 } Is anyone whom 029389 knows, studying CS909? ASK {bstu:029389 foaf:knows ?x . ?x b:studies bmod: CS909 } Is any student aged over 30 studying CS909? Is any student aged over 30 studying CS909? ASK { ?student b:studies bmod: CS909 . ?student b:age ?age . FILTER { ?age > 30 } }

Obtaining unique results SELECT ?student WHERE { ?student b:studies ?module } The above query would return each student several times, because the pattern above will match once for each module a student is taking. To avoid this: SELECT DISTINCT ?student

Constructing an RDF result { ?student b:studyFriend ?friend } WHERE { ?student b:studies ?module . ?student foaf:knows ?friend . ?friend b:studies ?module } } The section after the CONSTRUCT keyword is a specification, in triples, of an RDF graph that is constructed to hold the search result. If there is more than one search result, the triples from each result are combined.

Summary The SPARQL language is used for constructing queries that extract data from RDF specifications. SPARQL is not based on XML. It is based on a roughly SQL-like syntax, and represents RDF graphs as triples. The building blocks of a SPARQL queries are graph patterns that include variables. The result of the query will be the values that these variables must take to match the RDF graph. A SPARQL query can return results in several different ways, as determined by the query. SPARQL queries can be used for OWL querying.

References [1] Dean Allemang & Jim Hendler. 2008 Semantic Web for the working ontologist. Morgan Kaufmann publishers. ISBN 978-0-12-373556-0 [2] Eric Prud'hommeaux ; & Andy Seaborne . 2008 SPARQL Query Language for RDF , [Online] http://www.w3.org/TR/rdf-sparql-query/ (accessed 20 April 2009)

Tools to process Download Jena from http://jena.sourceforge.net/ Protégé 3.4 http://protege.stanford.edu/

Questions?