Semantic Web. Andrejs Lesovskis.
This lecture’s agenda Metadata; Introduction to Resource Description Framework; RDF triples; RDF serialization formats; RDF Schema; RDF and relational databases.
Metadata The term "meta" comes from a Greek word that denotes something of a higher or more fundamental nature. Metadata, then, is data about other data. The term refers to any data used to aid the identification, description and location of networked electronic resources.
For example, suppose a file contains image data. Its metadata could include: the author's name, the date and time the picture was taken, the location where it was taken, the model of the camera that was used, etc.
Metadata types Metadata can describe: data contents (short summary, limitations, etc); data access history; access rights; relations between data.
Uses of metadata Metadata can be used in: content ranking; resource searching; resource integration; defining relations between intelligent agents.
Metadata facts to remember Metadata does not have to be digital. Metadata relates to more than the description of an object. Metadata can come from a variety of sources. Metadata continues to accrue during the life of an information object or system. One information object's metadata can simultaneously be another information object's data.
W3C Semantic Web Activity Statement "The Resource Description Framework (RDF) is a language for representing information about resources in the World Wide Web. It is particularly intended for representing metadata about Web resources, such as the title, author, and modification date of a Web page, copyright and licensing information about a Web document, or the availability schedule for some shared resource." W3C
Resource Description Framework A framework (not a language) for representing information in the Web. RDF is a standard model for data interchange on the Web. It provides a syntax that allows information stored in various locations to be exchanged and used. The point is to facilitate reading and correct use of information by machines, not necessarily by people.
What is a resource? A resource is anything that can be identified and described. A resource can be identified by a URI or it can be a blank node. A resource can be abstract. The first precise definition of a resource can be found in RFC 2396: http://tools.ietf.org/html/rfc2396
Main goals of RDF Integrate data from multiple sources. Allow the re-use of data across different projects and organizations. Decentralize data so that no single party "owns" all of it.
XML and RDF (1) An XML document (Yangtze.xml) can be converted to RDF (Yangtze.rdf). [Slide figure: the XML and RDF versions side by side; both carry the values "6300 kilometers", "western China's Qinghai-Tibet Plateau" and "East China Sea".]
XML and RDF (2) [Slide figure: the RDF version with the values "6300 kilometers", "western China's Qinghai-Tibet Plateau" and "East China Sea", annotated with three callouts.] 1. RDF provides an ID attribute for identifying the resource being described. 2. The ID attribute is in the RDF namespace. 3. Add the "fragment identifier symbol" to the namespace.
XML and RDF (3) [Slide figure: the RDF version with the values "6300 kilometers", "western China's Qinghai-Tibet Plateau" and "East China Sea", annotated with four callouts.] 1. Identifies the type (class) of the resource being described. 2. Identifies the resource being described. This resource is an instance of River. 3. These are properties, or attributes, of the type (class). 4. Values of the properties.
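The conversion described on the last three slides can be sketched in Python. This is only an illustration under stated assumptions: the slide markup did not survive, so the element names (River, length, startingLocation, endingLocation), the id attribute and the namespace http://example.org/river# are hypothetical stand-ins.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record: element names and the "id" attribute are assumptions.
RIVER_XML = """
<River id="Yangtze">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>
"""

# Hypothetical namespace; the trailing "#" is the fragment identifier symbol.
NS = "http://example.org/river#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def xml_record_to_triples(xml_text, namespace):
    """Turn one XML record into (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text)
    subject = namespace + root.attrib["id"]                 # the resource described
    triples = [(subject, RDF_TYPE, namespace + root.tag)]   # its type (class)
    for child in root:
        # Each child element becomes a property; its text is the value.
        triples.append((subject, namespace + child.tag, child.text.strip()))
    return triples


river_triples = xml_record_to_triples(RIVER_XML, NS)
for triple in river_triples:
    print(triple)
```

The sketch yields one rdf:type triple plus one triple per property, mirroring the four callouts: type, resource, properties, values.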
RDF triple structure (1) The RDF data model is based upon the idea of making statements about resources in the form of subject-predicate-object expressions (triples). [Slide figure: a resource linked to a value by a property.]
RDF triple structure (2) Every triple expresses a statement. [Slide figure labels: Resource, Property, Value, Resource, Statement.]
RDF triple structure (3) In RDF, the English statement "The owner of the web-site Gmail at http://www.gmail.com is Google." would look like this: [Slide figure labels: Gmail, owned by, url, http://www.gmail.com.]
Binary predicates RDF offers only binary predicates. Think of them as P(x, y), where P is the relationship between the objects x and y. From the example: x = http://www.w3schools.com/RDF, y = Jan Egil Refsnes, P = author. [Slide figure: http://www.w3schools.com/RDF linked to Jan Egil Refsnes by author.]
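A minimal Python sketch of the P(x, y) view of a triple. The predicate URI http://example.org/terms#author is a hypothetical placeholder, since the slide only names the predicate "author"; the helper holds() is likewise an illustrative name.

```python
from collections import namedtuple

# A triple is a binary predicate P(x, y): P relates subject x to object y.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

AUTHOR = "http://example.org/terms#author"  # hypothetical predicate URI

statement = Triple(
    subject="http://www.w3schools.com/RDF",
    predicate=AUTHOR,
    object="Jan Egil Refsnes",
)


def holds(graph, predicate, x, y):
    """Return True if P(x, y) is asserted in the graph (a set of triples)."""
    return Triple(x, predicate, y) in graph


graph = {statement}
print(holds(graph, AUTHOR, "http://www.w3schools.com/RDF", "Jan Egil Refsnes"))
```

Relations with more than two arguments do not fit this shape directly; they have to be broken into several binary statements.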
RDF triple example (3) RDF/XML code that corresponds to the graph on the previous slide: [Slide figure: RDF/XML listing; the surviving literal values are "Eric Miller" and "Dr.".]
URIs and RDF RDF uses URI references to define its subjects, predicates, and objects. A URI reference (or URIref) is a URI, together with an optional fragment identifier at the end. E.g., the URI reference http://www.example.org/index.html#section2 consists of: the URI http://www.example.org/index.html and the fragment identifier section2. A resource is identifiable by a URI reference.
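The split of a URIref into URI and fragment identifier can be checked with Python's standard library. This is only a quick illustration; the variable names are arbitrary.

```python
from urllib.parse import urldefrag

uriref = "http://www.example.org/index.html#section2"

# urldefrag separates the URI from the optional fragment identifier.
uri, fragment = urldefrag(uriref)
print(uri)
print(fragment)
```

For a URIref without a "#" part, the fragment comes back as an empty string.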
RDF/XML RDF/XML is a syntax, defined by the W3C, to express (serialize) an RDF graph as an XML document; according to the W3C, "RDF/XML is the normative syntax for writing RDF"; it became a W3C Recommendation on February 10, 2004.
RDF/XML example [Slide figure: RDF/XML listing; the surviving literal values are 40.35, -74.66 and "Department of Computer Science".]
Notation3 (N3) A much more compact and readable format that does not use XML syntax; it is being developed by Tim Berners-Lee and Semantic Web community members; N3 files use UTF-8 encoding.
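To give a feel for a non-XML serialization, here is a rough Python sketch that writes triples in N-Triples style, a much simpler line-based relative of N3 (it is not N3 itself). The URI-detection heuristic and the example URIs are assumptions made for illustration; the sketch ignores blank nodes, datatypes, language tags and escaping of newlines.

```python
def format_term(term):
    """Format one term: URIs go in angle brackets, anything else is a plain literal."""
    # Assumption: a term is a URI if it starts with one of these schemes.
    if term.startswith(("http://", "https://", "mailto:")):
        return "<" + term + ">"
    # Escape backslashes and double quotes inside the literal.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    lines = []
    for s, p, o in triples:
        lines.append(f"{format_term(s)} {format_term(p)} {format_term(o)} .")
    return "\n".join(lines)


example = [
    ("http://example.org/dept#cs",       # hypothetical subject URI
     "http://example.org/terms#name",    # hypothetical predicate URI
     "Department of Computer Science"),
]
ntriples_text = to_ntriples(example)
print(ntriples_text)
```

Each statement ends with " ." and carries its full URIs, which is why richer formats such as N3 add prefixes and other shorthand.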
What is schema? The word schema comes from the Greek word "σχήμα" (skhēma), which means shape, or more generally, plan. In computer science, a schema is a logical description of the data in a database, including definitions and relationships of data.
RDF Schema (RDFS) RDF Schema provides a way to express: simple statements defining classes of resources, including subclass relationships; statements defining properties, including subproperty relationships; statements about the domain and range of a property.
RDF Schema: A meta-language RDF Schema's type system is similar to those of object-oriented programming languages. RDF Schema allows resources to be defined as instances of one or more classes. Classes can be organized in a hierarchical fashion; for example, a class ex:Dog can be defined as a subclass of ex:Mammal, meaning that any resource which is in class ex:Dog is also in class ex:Mammal. The RDF Schema vocabulary (prefix rdfs:) is defined in a namespace whose URI is http://www.w3.org/2000/01/rdf-schema#.
Sample case: MotorVehicle To say that ex:MotorVehicle is a class, write: ex:MotorVehicle rdf:type rdfs:Class. To create an instance of ex:MotorVehicle, write: exthings:companyCar rdf:type ex:MotorVehicle. Naming convention: class names start with an uppercase letter; property and instance names start with a lowercase letter. A resource may be an instance of more than one class.
Defining Subclasses Using the rdfs:subClassOf property, we can define specialized kinds of motor vehicles (e.g., passenger vehicles, vans, minivans, etc.). ex:Van rdf:type rdfs:Class. ex:Van rdfs:subClassOf ex:MotorVehicle. ex:Truck rdf:type rdfs:Class. ex:Truck rdfs:subClassOf ex:MotorVehicle.
Meaning of Subclass rdfs:subClassOf means that if ex:myVan is an instance of ex:Van, then ex:myVan is also, by inference, an instance of ex:MotorVehicle. rdfs:subClassOf is transitive: if ex:Van rdfs:subClassOf ex:MotorVehicle and ex:MiniVan rdfs:subClassOf ex:Van, then ex:MiniVan is implicitly a subclass of ex:MotorVehicle. A class may be a subclass of more than one class. All classes are implicitly subclasses of class rdfs:Resource.
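The two inferences on this slide (instances inherit superclasses, and subClassOf is transitive) can be sketched in plain Python. This is a toy illustration, not a full RDFS reasoner; it leaves out the implicit rdfs:Resource superclass, and the function names are arbitrary.

```python
# Asserted (subclass, superclass) pairs, taken from the slides.
SUBCLASS_OF = {
    ("ex:Van", "ex:MotorVehicle"),
    ("ex:Truck", "ex:MotorVehicle"),
    ("ex:MiniVan", "ex:Van"),
}


def superclasses(cls, subclass_of):
    """Return every class reachable from cls by following subClassOf links."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in found:
                found.add(sup)
                stack.append(sup)
    return found


def inferred_types(asserted_type, subclass_of):
    """An instance of a class is also an instance of all its superclasses."""
    return {asserted_type} | superclasses(asserted_type, subclass_of)


print(sorted(superclasses("ex:MiniVan", SUBCLASS_OF)))
print(sorted(inferred_types("ex:Van", SUBCLASS_OF)))
```

So ex:myVan, asserted only as an ex:Van, would also count as an ex:MotorVehicle, and ex:MiniVan picks up ex:MotorVehicle through ex:Van.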
A Full Class Hierarchy [Slide figure: class hierarchy graph.] The (ex:Truck rdf:type rdfs:Class) part of the graph is not shown. Notice that MiniVan is a subclass of two classes.
Class Naming Class names such as MotorVehicle are given as fragment identifiers; using rdf:ID gives the effect of "assigning" URIrefs relative to the schema document. Relative URIrefs based on these names can then be used in other class definitions within the same schema, e.g., #MotorVehicle. The full URIref of this class would be: http://example.org/schemas/vehicles#MotorVehicle. We could also include an explicit declaration: xml:base="http://example.org/schemas/vehicles"
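How a relative URIref resolves against the base can be illustrated with Python's standard urljoin. This only demonstrates generic URI resolution; an RDF/XML parser performs the equivalent step internally.

```python
from urllib.parse import urljoin

base = "http://example.org/schemas/vehicles"   # the xml:base value
relative_ref = "#MotorVehicle"                 # relative URIref used in the schema

# A fragment-only reference keeps the base URI and appends the fragment.
full_uriref = urljoin(base, relative_ref)
print(full_uriref)
```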
Properties All properties in RDF are described as instances of class rdf:Property, e.g. exterms:weightInKg rdf:type rdf:Property. RDF Schema provides rdfs:range to define valid fillers for a triple’s Object. RDF Schema provides rdfs:domain to define valid fillers for a triple’s Subject.
rdfs:range property If the property ex:author has values that are instances of class ex:Person, we would write: ex:Person rdf:type rdfs:Class. ex:author rdf:type rdf:Property. ex:author rdfs:range ex:Person. If a property has more than one range, then its filler must be an instance of all of the classes specified as the ranges: ex:hasMother rdf:type rdf:Property. ex:hasMother rdfs:range ex:Person. ex:hasMother rdfs:range ex:Female. Given ex:Sally ex:hasMother exstaff:frances, exstaff:frances must be both a Female and a Person.
Typed literals as ranges To say that the range of ex:age is an integer: ex:age rdf:type rdf:Property. ex:age rdfs:range xsd:integer. The datatype xsd:integer is identified by its URIref (http://www.w3.org/2001/XMLSchema#integer). It is optional, but “useful” to declare: xsd:integer rdf:type rdfs:Datatype. This statement documents the existence of the datatype, and indicates explicitly that it is being used in this schema.
rdfs:domain property rdfs:domain indicates that a particular property applies to a class. Suppose books have authors. In RDF: ex:Book rdf:type rdfs:Class. ex:author rdf:type rdf:Property. ex:author rdfs:domain ex:Book. If a property has more than one domain, then any subject instance of that property must be an instance of each named domain.
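A small Python sketch of how domain and range statements can be applied. It treats them as inference rules (the subject is inferred to belong to every domain class, the object to every range class), which is how RDFS entailment reads them; a validating tool might instead report violations. The resource ex:someBook is a hypothetical example.

```python
RDF_TYPE = "rdf:type"

# Schema statements taken from the slides.
DOMAINS = {"ex:author": {"ex:Book"}}
RANGES = {
    "ex:author": {"ex:Person"},
    "ex:hasMother": {"ex:Person", "ex:Female"},
}


def infer_types(triples, domains, ranges):
    """Derive rdf:type triples implied by rdfs:domain and rdfs:range."""
    inferred = set()
    for s, p, o in triples:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))   # subject belongs to every domain
        for cls in ranges.get(p, ()):
            inferred.add((o, RDF_TYPE, cls))   # object belongs to every range
    return inferred


data = {
    ("ex:Sally", "ex:hasMother", "exstaff:frances"),
    ("ex:someBook", "ex:author", "exstaff:frances"),  # ex:someBook is hypothetical
}
types = infer_types(data, DOMAINS, RANGES)
for t in sorted(types):
    print(t)
```

With two ranges on ex:hasMother, exstaff:frances ends up typed as both ex:Person and ex:Female, matching the "instance of all of the classes" rule.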
Specializing Properties Like rdfs:subClassOf for classes, the property rdfs:subPropertyOf is used to define a property hierarchy. For example, to say that the property ex:primaryDriver is a kind of ex:driver, write: ex:driver rdf:type rdf:Property. ex:primaryDriver rdf:type rdf:Property. ex:primaryDriver rdfs:subPropertyOf ex:driver. This means that if ex:fred is the ex:primaryDriver of the instance ex:companyVan, then ex:fred is also an ex:driver of ex:companyVan.
subPropertyOf property A property may be a subPropertyOf zero, one or more properties. All rdfs:range and rdfs:domain properties that apply to an RDF property also apply to each of its subproperties. Therefore, if ex:driver has an rdfs:domain of ex:MotorVehicle, then ex:primaryDriver, because of its subproperty relationship to ex:driver, implicitly also has an rdfs:domain of ex:MotorVehicle.
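The subproperty inference can be sketched in Python as follows. The triple direction (vehicle as subject, driver as object) is an assumption chosen to agree with ex:MotorVehicle being the domain of ex:driver; the function names are arbitrary.

```python
# Asserted property hierarchy: subproperty -> set of direct superproperties.
SUBPROPERTY_OF = {"ex:primaryDriver": {"ex:driver"}}


def super_properties(prop, subproperty_of):
    """Return all properties reachable from prop via subPropertyOf."""
    found = set()
    stack = [prop]
    while stack:
        current = stack.pop()
        for parent in subproperty_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def expand_subproperties(triples, subproperty_of):
    """Add the triples implied by the property hierarchy."""
    result = set(triples)
    for s, p, o in triples:
        for parent in super_properties(p, subproperty_of):
            result.add((s, parent, o))
    return result


# Assumed direction: the vehicle is the subject, the driver is the object.
data = {("ex:companyVan", "ex:primaryDriver", "ex:fred")}
expanded = expand_subproperties(data, SUBPROPERTY_OF)
for t in sorted(expanded):
    print(t)
```

Because the ex:driver triple is inferred, any domain or range declared on ex:driver then applies to uses of ex:primaryDriver as well.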
RDF Container Elements RDF containers are used to describe groups of things (resources or literals): rdf:Bag: used to describe a list of values that do not have to be in a specific order. rdf:Seq: used to describe an ordered list of values (for example, in alphabetical order). rdf:Alt: used to describe a list of alternative values (the user can select only one of the values).
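RDF Container Elements (rdf:Bag) [Slide figure: RDF/XML listing; members: John, Paul, George, Ringo.]
RDF Container Elements (rdf:Seq) [Slide figure: RDF/XML listing; members in order: George, John, Paul, Ringo.]
RDF Container Elements (rdf:Alt) [Slide figure: RDF/XML listing; alternatives: CD, Record, Tape.]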
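At the triple level, a container is a node typed as rdf:Bag, rdf:Seq or rdf:Alt whose members are attached with the numbered properties rdf:_1, rdf:_2, and so on. A minimal Python sketch, assuming a hypothetical blank-node label _:artists for the container:

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def container_triples(node, kind, members):
    """Describe a container as triples: a type plus rdf:_1, rdf:_2, ... members."""
    if kind not in ("Bag", "Seq", "Alt"):
        raise ValueError("kind must be 'Bag', 'Seq' or 'Alt'")
    triples = [(node, RDF_NS + "type", RDF_NS + kind)]
    for index, member in enumerate(members, start=1):
        triples.append((node, f"{RDF_NS}_{index}", member))
    return triples


# "_:artists" is a hypothetical blank-node label for the container itself.
bag = container_triples("_:artists", "Bag", ["John", "Paul", "George", "Ringo"])
for triple in bag:
    print(triple)
```

The three kinds share this structure; only the declared type tells a consumer whether the numbering is meaningful (Seq), irrelevant (Bag), or a list of alternatives (Alt).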
RDF storage An RDF triple store is a system that provides a mechanism for persistent storage of, and access to, RDF graphs.
Different Architectures Based on their implementation, triple stores can be divided into 3 broad categories: in-memory; native; and non-native, non-memory. In-memory: the RDF graph is stored as triples in main memory, for example, when storing an RDF graph using the Jena API or the Sesame API. Native: persistent storage systems with their own database implementation, e.g., Sesame Native, Virtuoso, AllegroGraph, Oracle 11g. Non-native, non-memory: persistent storage systems set up to run on third-party databases, e.g., Jena SDB.
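The non-native idea (triples kept in a third-party relational database) can be sketched with Python's built-in sqlite3 module. This is a toy layout with one three-column table, not how Jena SDB or any product named above is implemented; the class and method names are hypothetical.

```python
import sqlite3


class SqlTripleStore:
    """Toy triple store kept in a single relational table."""

    def __init__(self, path=":memory:"):
        # ":memory:" keeps the demo self-contained; a file path would persist data.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS triples ("
            "s TEXT NOT NULL, p TEXT NOT NULL, o TEXT NOT NULL, "
            "UNIQUE (s, p, o))"
        )

    def add(self, s, p, o):
        # INSERT OR IGNORE keeps the graph a set: duplicates are skipped.
        self.conn.execute("INSERT OR IGNORE INTO triples VALUES (?, ?, ?)", (s, p, o))
        self.conn.commit()

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        clauses, params = [], []
        for column, value in (("s", s), ("p", p), ("o", o)):
            if value is not None:
                clauses.append(f"{column} = ?")
                params.append(value)
        sql = "SELECT s, p, o FROM triples"
        if clauses:
            sql += " WHERE " + " AND ".join(clauses)
        return sorted(self.conn.execute(sql, params).fetchall())


store = SqlTripleStore()
store.add("ex:Van", "rdfs:subClassOf", "ex:MotorVehicle")
store.add("ex:Truck", "rdfs:subClassOf", "ex:MotorVehicle")
store.add("ex:Van", "rdfs:subClassOf", "ex:MotorVehicle")  # duplicate, ignored
print(store.match(p="rdfs:subClassOf", o="ex:MotorVehicle"))
```

Pattern matching with wildcards is the basic access operation a triple store offers; real systems add indexes over the different column orders and a query language on top.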