/ department of mathematics and computer science, TU/e eindhoven university of technology, ISA, April 17, 2003
RDF Query Languages
Flavius Frasincar

1 RDF Query Languages
Flavius Frasincar (flaviusf@win.tue.nl)

2 Contents
Why RDF Query Languages?
RDF Features (Recap)
RDF Query Language Requirements
RDF Query Languages
RQL (RDF Query Language):
–Select: variables
–From: path expressions
–Where: condition
Summary

3 Why RDF QLs?
RDF is the standard representation language for Web metadata (the foundation of the Semantic Web)
RDF is already used in:
–Large description schemas: ODP (Open Directory Project), a web site classification with 385,965 topics; UNSPSC (United Nations Standard Products and Services Code), a product classification with 16,506 classes
–Large description bases: ODP classifies 3,339,355 sites
RDF QLs are needed in order to access data from (large) RDF representations

4 RDF Primitive
Semantics: Subject Predicate Object (one statement)
Three alternative notations:
–Graph (figure): the nodes http://example.com/sb.jpg and Rembrandt, joined by the edge painted_by
–Triple: (http://example.com/sb.jpg, painted_by, “Rembrandt”)
–RDF/XML (markup not preserved in this transcript)

5 RDF Features
RDF:
–Data Model: Directed Labeled Graph
  Nodes: Resources (with or without URIs) or Literals
  Edges: Properties (attributes or relationships)
  Labels: Nodes (URI) or Edges (Property URI)
RDF Schema:
–Multiple classification of resources
–Specialization of both classes/properties (simple and multiple)
–Unordered, optional, and multivalued properties
–Domain and range polymorphism of properties

6 RDF vs. XML
Different Data Models:
–RDF data model: a directed graph with labels on both edges and nodes
–XML data model: a tree with labels on edges or nodes
Different Semantics:
–RDF is able to model complex semantic relations (e.g. class/property hierarchies based on specialization)
–XML has only one type of semantics, inclusion semantics (an element contains another element)
RDF has an XML syntax (RDF/XML), but XML QLs do not support RDF semantics: we need an RDF QL

7 Requirements for an RDF QL
Understand the RDF Data Model (RDF graph or RDF triples)
Path expressions can use labels from both nodes and edges
Compose queries: the output of one query can be used as input for the next query
Declarative: not bound to any implementation (closer to human language!)
Support RDF Schema

8 RDF Query Languages
Triple-based: querying the structure
–RDQL
–Triple [successor of SiLRI] (Horn logic)
e.g. Find statements whose subject is … and object is …
XML-based: querying the syntax
–RDF Query
–RQuery (XQuery)
e.g. Find description elements whose attribute value contains …
Graph-based (but not graphical): querying the semantics
–RQL (OQL)
e.g. Find resources classified under … whose property value is …

9 RDF Query Language (RQL)
Declarative query language for RDF
Language proposal (not yet a standard)
Based on the RDF-graph representation
Supports RDF Schema (only a few of the existing RDF QLs do that)
References (with small differences between them):
–RQL from ICS-FORTH (Greece) (http://139.91.183.30:9090/RDF/RQL/)
–Sesame from Aidministrator (Holland) (http://sesame.aidministrator.nl/)
The rest of the presentation refers to the Sesame implementation.

10 RQL Input
The input to an RQL query is a complete RDF model, i.e. a model that contains its RDFS-closure (defined in RDF Semantics). Note that the RDFS-closure includes the RDF-closure.
[RDF-closure] e.g. rdf1: if (xxx aaa yyy) then add (aaa rdf:type rdf:Property)
[RDFS-closure] e.g. rdfs9: if (xxx rdfs:subClassOf yyy) and (aaa rdf:type xxx) then add (aaa rdf:type yyy)
There are operator variants (with ^ appended) that discard this new data (intensional data) and consider only the given statements (extensional data) of an RDF model
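
The two closure rules quoted on this slide can be sketched in a few lines of Python. This is only an illustration of rdf1 and rdfs9, not the Sesame inferencer: the full RDFS-closure has more rules, and the resource ex:rembrandt and the statement that cult:Painter is a subclass of cult:Artist are made-up inputs.

```python
RDF_TYPE = "rdf:type"
RDF_PROPERTY = "rdf:Property"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"


def closure(triples):
    """Apply rules rdf1 and rdfs9 repeatedly until nothing new is derived."""
    inferred = set(triples)
    while True:
        new = set()
        for (s, p, o) in inferred:
            # rdf1: every predicate that is used is an rdf:Property
            new.add((p, RDF_TYPE, RDF_PROPERTY))
            # rdfs9: an instance of a subclass is an instance of the superclass
            if p == RDFS_SUBCLASS_OF:
                for (s2, p2, o2) in inferred:
                    if p2 == RDF_TYPE and o2 == s:
                        new.add((s2, RDF_TYPE, o))
        if new <= inferred:  # fixpoint reached
            return inferred
        inferred |= new


# Hypothetical extensional data (the "given statements").
given = {
    ("cult:Painter", RDFS_SUBCLASS_OF, "cult:Artist"),
    ("ex:rembrandt", RDF_TYPE, "cult:Painter"),
}
full = closure(given)
intensional = full - given  # what the ^ operator variants would ignore
```

In this sketch the plain operators would see `full`, while the ^ variants would see only `given`.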

11 Example: RDF Input

12 Example: Web Resources
&r1 http://www.european-history.com/rembrandt.html
&r2 http://www.artchive.com/rembrandt/abraham.jpg

13 Select-From-Where
  select X, Y
  from {X}cult:paints{Y}, {X}cult:first_name{Xfname}
  where Xfname like "Rembrandt"
  using namespace cult=http://www.icom.com/schema.rdf#
The clauses are: select, a list of variables; from, a list of path expressions; where, a condition (optional)
Variables range over graph labels
Path expressions/conditions use variables and constants
The RQL result is a table of tuples (a relation): each variable is a column, and each row assigns a value to every variable

14 RQL Result
X | Y
http://www.european-history.com/rembrandt.html | http://www.artchive.com/rembrandt/artist_at_his_easel.jpg
http://www.european-history.com/rembrandt.html | http://www.artchive.com/rembrandt/abraham.jpg
… | …

15 Why Is the RQL Result a Bag?
  select X
  from {X}cult:paints{Y}, {X}cult:first_name{Xfname}
  where Xfname like "Rembrandt"
  using namespace cult=http://www.icom.com/schema.rdf#
X
http://www.european-history.com/rembrandt.html
If only one variable is returned, there may be multiple bindings of this variable with the same value, so the result has to be a bag rather than a set
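
A small Python sketch of why projecting onto X yields duplicates. The two paintings are the ones listed in the result table of the previous query; modelling each path expression as a list of pairs is an assumption made for illustration, not how Sesame stores data.

```python
from collections import Counter

REMBRANDT = "http://www.european-history.com/rembrandt.html"

# Bindings of {X}cult:paints{Y} and {X}cult:first_name{Xfname}.
paints = [
    (REMBRANDT, "http://www.artchive.com/rembrandt/artist_at_his_easel.jpg"),
    (REMBRANDT, "http://www.artchive.com/rembrandt/abraham.jpg"),
]
first_name = [(REMBRANDT, "Rembrandt")]

# Join the two path expressions on X, apply the where condition, project X.
result = [
    x
    for (x, y) in paints
    for (x2, fname) in first_name
    if x == x2 and fname == "Rembrandt"
]

bag = Counter(result)   # keeps the multiplicity of each binding
as_set = set(result)    # a set would lose it
```

Here `result` holds the same resource once per painting, which is exactly the information a set would throw away.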

16 Namespaces
All the labels for nodes and edges are associated with a certain namespace
  using namespace
    cult=http://www.icom.com/schema.rdf#,
    adm=http://www.oclc.org/schema.rdf#
cult contains information intended for museum specialists (e.g. descriptions of artists, artifacts, museums)
adm contains information for portal administrators (e.g. title, file_size, mime-type of a certain external resource)
(Web) Resources are classified orthogonally using the two schemas above

17 Select: Variables
There are three kinds of variables:
–Instance: e.g. X
–Class: e.g. $C
–Property: e.g. @P
“Find all resources together with their associated classes, properties, and property values”:
  select X, $C, @P, Y
  from {X : $C}@P{Y}
is equivalent to
  select *
  from {X : $C}@P{Y}
(* = all variables)
“A resource X has type C” has two syntaxes: X : C (not standalone) or C{X} (a path expression that limits a node)

18 From: Path Expressions
Path expressions specify a linear path through the RDF data model
Each variable used in a path expression is bound to labels from the model
“Find all painters and their associated paintings”
  select Painter, Painting
  from {Painter}cult:paints{Painting}
  using namespace cult=http://www.icom.com/schema.rdf#

19 The ‘.’ in Path Expressions
Path expressions can be arbitrarily long
The ‘.’ is used to specify a join condition between the object and the subject of two consecutive properties
  select Painter, Painting, Technique
  from {Painter}cult:paints{Painting}. cult:technique{Technique}
  using namespace cult=http://www.icom.com/schema.rdf#
In the above example Painting is the object of cult:paints and the subject of cult:technique
If Painting is not of interest, it can be omitted:
  from {Painter}cult:paints. cult:technique{Technique}
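
The join expressed by ‘.’ can be pictured as follows in Python. The helper names follow and dot_join are hypothetical, and the statement that abraham.jpg uses the technique "oil on canvas" is invented for the example; only the property names come from the slides.

```python
REMBRANDT = "http://www.european-history.com/rembrandt.html"
ABRAHAM = "http://www.artchive.com/rembrandt/abraham.jpg"
EASEL = "http://www.artchive.com/rembrandt/artist_at_his_easel.jpg"

triples = [
    (REMBRANDT, "cult:paints", ABRAHAM),
    (REMBRANDT, "cult:paints", EASEL),
    (ABRAHAM, "cult:technique", "oil on canvas"),  # assumed statement
]


def follow(triples, prop):
    """Return the (subject, object) pairs of one property."""
    return [(s, o) for (s, p, o) in triples if p == prop]


def dot_join(left, right):
    """Join two steps: the object of the left step is the subject of the right."""
    return [(a, b, c) for (a, b) in left for (b2, c) in right if b == b2]


# {Painter}cult:paints{Painting}. cult:technique{Technique}
rows = dot_join(follow(triples, "cult:paints"), follow(triples, "cult:technique"))

# {Painter}cult:paints. cult:technique{Technique}: same join, Painting not returned
short = [(painter, technique) for (painter, _, technique) in rows]
```

The painting without a technique statement drops out of `rows`, because the join needs a match on both sides.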

20 List of Path Expressions
Since path expressions are linear, it is not possible to express two paths with the same origin in one path expression
Instead, use a list of path expressions that share variables
  select Painter, Painting, Painter_lname
  from {Painter}cult:paints{Painting}, {Painter}cult:last_name{Painter_lname}
  using namespace cult=http://www.icom.com/schema.rdf#

21 Class of a Resource
Q1 (better):
  select Painter, $Painter, Painting
  from {Painter : $Painter}cult:paints{Painting}
  using namespace cult=http://www.icom.com/schema.rdf#
Q2:
  select Painter, Painter_type, Painting
  from {Painter}rdf:type{Painter_type}, {Painter}cult:paints{Painting}
  using namespace rdf = http://www.w3.org/1999/02/22-rdf-syntax-ns#, cult = http://www.icom.com/schema.rdf#
Q1 returns the most specific type (class) of a resource, while Q2 returns all types of this resource

22 Class Restriction for Resources
Q1:
  select Painter
  from {Painter :cult:Flemish}cult:paints{Painting}
  using namespace cult=http://www.icom.com/schema.rdf#
Note that cult:Flemish must be part of the domain of cult:paints, otherwise the query returns 0 results.
Q2 (better):
  select Painter
  from cult:Flemish{Painter}
  using namespace cult=http://www.icom.com/schema.rdf#
Q1 returns a Flemish painter who has more than one painting multiple times, while Q2 does not.

23 Domain and Range
Q1 (better):
  select $Domain, $Range
  from {:$Domain}cult:has_style{:$Range}
  using namespace cult=http://www.icom.com/schema.rdf#
Q2:
  select domain(@P), @P, range(@P)
  from {}@P{}
  where @P = cult:has_style
  using namespace cult=http://www.icom.com/schema.rdf#
Q1 returns data from the schema with RDFS-closure, while Q2 returns data present in the schema without RDFS-closure (both are independent of the model instance)

24 Where: Condition
The where clause is optional
The condition constrains the values of variables bound in the from clause. It uses two kinds of operators:
–Comparison: <, <=, =, >, >=, !=, like (with *) [lexical], in [set]
–Logical: and, or, not
The first 5 comparison operators are overloaded for sets and for single values (classes, properties, reals, integers, and literals/resources); they perform set comparison or single-value comparison (subClassOf, subPropertyOf, real comparison, integer comparison, and lexical comparison)

25 Comparison Operators
“Select all artists that have a painting resource containing the string ‘abraham’, together with their type and first name”
  select Artist, $Artist, ArtistFName
  from {Artist : $Artist} cult:first_name {ArtistFName}
  where Artist in select Painter
                  from {Painter} cult:paints {Painting}
                  where Painting like "*abraham*"
  using namespace cult = http://www.icom.com/schema.rdf#

26 Logical Operators
“Select all painters with a first name that starts with R and all sculptors with a first name that does not start with M”
  select Artist, ArtistFName
  from {Artist :$Artist} cult:first_name {ArtistFName}
  where ($Artist <= cult:Painter and ArtistFName like "R*")
     or ($Artist <= cult:Sculptor and not (ArtistFName like "M*"))
  using namespace cult = http://www.icom.com/schema.rdf#
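
A rough Python reading of this where clause, assuming that <= on classes means "is the same class as or a subclass of" and that like with * behaves like a shell-style wildcard. The class hierarchy (stored with a single parent per class for brevity), the ex: resources and the first names are all invented for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical hierarchy: child class -> parent class.
SUBCLASS_OF = {
    "cult:Flemish": "cult:Painter",
    "cult:Painter": "cult:Artist",
    "cult:Sculptor": "cult:Artist",
}


def is_subclass_or_same(cls, ancestor):
    """Mimic `$C <= ancestor` by walking up the hierarchy."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


# Bindings of {Artist : $Artist} cult:first_name {ArtistFName} (made-up data).
artists = [
    ("ex:a1", "cult:Flemish", "Rik"),
    ("ex:a2", "cult:Painter", "Pablo"),
    ("ex:a3", "cult:Sculptor", "Marie"),
    ("ex:a4", "cult:Sculptor", "Auguste"),
]

selected = [
    (artist, fname)
    for (artist, cls, fname) in artists
    if (is_subclass_or_same(cls, "cult:Painter") and fnmatchcase(fname, "R*"))
    or (is_subclass_or_same(cls, "cult:Sculptor") and not fnmatchcase(fname, "M*"))
]
```

The Flemish painter qualifies through the subclass test even though the query names only cult:Painter.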

27 Standard Functions
Standard functions are used to retrieve standard RDFS relationships
We have already seen: domain() and range()
Other examples: Class, Property, subClassOf(), subPropertyOf(), typeOf() etc.
The standard functions can also be used as standalone queries:
  Class
  subClassOf ( http://www.icom.com/schema.rdf#Artist )
  typeOf( http://www.european-history.com/rembrandt.html )
  etc.

28 Strict Interpretation with ‘^’
“Retrieve the direct subclasses of Artist”
  subClassOf^ ( http://www.icom.com/schema.rdf#Artist )
“Retrieve all subclasses of Artist”
  subClassOf ( http://www.icom.com/schema.rdf#Artist )
“Retrieve the most specific classes to which the resource http://www.european-history.com/rembrandt.html belongs”
  typeOf^ ( http://www.european-history.com/rembrandt.html )
“Retrieve the classes to which the resource http://www.european-history.com/rembrandt.html belongs”
  typeOf ( http://www.european-history.com/rembrandt.html )
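
The difference between subClassOf^ and subClassOf can be sketched in Python as direct lookup versus transitive traversal. The function names and the small hierarchy below are assumptions for illustration; they are not part of the museum schema as shown in the slides.

```python
# Hypothetical schema: class -> its direct subclasses.
DIRECT_SUBCLASSES = {
    "cult:Artist": ["cult:Painter", "cult:Sculptor"],
    "cult:Painter": ["cult:Flemish"],
}


def sub_class_of_strict(cls):
    """Like subClassOf^: only the direct subclasses."""
    return set(DIRECT_SUBCLASSES.get(cls, []))


def sub_class_of(cls):
    """Like subClassOf: direct and indirect subclasses."""
    found = set()
    stack = [cls]
    while stack:
        for child in DIRECT_SUBCLASSES.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


direct = sub_class_of_strict("cult:Artist")
everything = sub_class_of("cult:Artist")
```

The same contrast applies to typeOf^ and typeOf, with the traversal running from a resource's most specific classes up to their superclasses.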

29 Standalone Queries
The standard functions: Class, subClassOf, Property, subPropertyOf etc.
Any class (resource of type rdfs:Class): returns the extension (resources) of this class
  http://www.icom.com/schema.rdf#Artist
Any property (resource of type rdf:Property): returns the extension (subject-object pairs) of this property
  http://www.icom.com/schema.rdf#creates

30 Set Operations
The query results can be combined using the following operators: union, intersect, and minus
“Retrieve the first name and the last name of all painters”
  (select PainterR, PainterLName, PainterFName
   from cult:Painter{PainterR}. cult:last_name{PainterLName},
        {PainterR}cult:first_name{PainterFName})
  union
  (select PainterR, PainterLName, NULL
   from cult:Painter{PainterR}. cult:last_name{PainterLName}
   where not (PainterR in select PainterR from {PainterR}cult:first_name ))
  using namespace cult = http://www.icom.com/schema.rdf#
Note that not all painters have a first name in the input model, so the union of the two queries acts as an outer join
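
The same outer-join pattern in Python, as a sketch: one branch for painters that have a first name, one branch that pads the missing value with None, and their union. The painters ex:p1 and ex:p2 and their names are invented, and using dictionaries assumes each property has at most one value per painter.

```python
# Hypothetical bindings: painter -> property value.
last_name = {"ex:p1": "van Rijn", "ex:p2": "Picasso"}
first_name = {"ex:p1": "Rembrandt"}  # ex:p2 has no first name in the model

# First branch: painters with both a last name and a first name.
with_first = [
    (p, lname, first_name[p]) for p, lname in last_name.items() if p in first_name
]

# Second branch: painters with no first name, padded with None (NULL).
without_first = [
    (p, lname, None) for p, lname in last_name.items() if p not in first_name
]

# The union of both branches keeps every painter.
rows = with_first + without_first
```

Without the second branch, ex:p2 would silently disappear from the result.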

31 Summary
There is a need for RDF query languages (XML query languages cannot handle RDF semantics)
RQL: declarative query language for uniformly querying RDF schemas and RDF descriptions
Select: list of variables (variables to be returned)
From: list of path expressions (variables are bound)
Where: condition (constrains the values of variables)
–Compositional (in and set operations)
–Very expressive
–Well-defined semantics; syntax can be improved …
… but not yet a standard!

32 Appendix
Try your own queries at: http://sesame.aidministrator.nl/sesame/actionFrameset.jsp?repository=museum
The result of the query:
–HTML Table
–RDF-Bag
–XML
Explore the Museum example (with or without inferred statements):
–Schema (ontology)
–Instance (data statements)

33 Exercise 1
“Find the first names of painters that have paintings using the ‘oil on canvas’ technique, and also return these paintings”
  select Painter_fname, Painting
  from {Painter}cult:paints{Painting}. cult:technique{Painting_technique},
       {Painter}cult:first_name{Painter_fname}
  where Painting_technique like "oil on canvas"
  using namespace cult=http://www.icom.com/schema.rdf#, adm=http://www.oclc.org/schema.rdf#

34 Exercise 2
“Find the first name of the painters that have a painting stored in a file with size greater than 5”
  select Painter_fname
  from {Painter}cult:paints{Painting}. adm:file_size{Painting_fsize},
       {Painter}cult:first_name{Painter_fname}
  where Painting_fsize > 15
  using namespace cult=http://www.icom.com/schema.rdf#, adm=http://www.oclc.org/schema.rdf#

35 Exercise 3
“Find the resources which are not of type ExtResource”
First solution:
  select R
  from rdfs:Resource{R}
  where not (R in select R from adm:ExtResource{R})
  using namespace rdfs=http://www.w3.org/2000/01/rdf-schema#, adm=http://www.oclc.org/schema.rdf#

36 Exercise 3 (cont’d)
Second solution:
  (select R from rdfs:Resource{R})
  minus
  (select R from adm:ExtResource{R})
  using namespace rdfs=http://www.w3.org/2000/01/rdf-schema#, adm=http://www.oclc.org/schema.rdf#
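
Both solutions of Exercise 3 compute the same difference, which a short Python sketch can make concrete. The resources ex:r1 to ex:r3 are invented, and the sketch uses plain Python sets for minus, so it ignores how RQL treats duplicates in bags.

```python
# Hypothetical extensions of rdfs:Resource and adm:ExtResource.
resources = ["ex:r1", "ex:r2", "ex:r3"]
ext_resources = ["ex:r2"]

# First solution: filter with "not in" over a nested query result.
first = [r for r in resources if r not in ext_resources]

# Second solution: set difference between two query results.
second = set(resources) - set(ext_resources)
```

With these inputs both variants keep ex:r1 and ex:r3 and drop ex:r2.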

