RDFa: Embedding RDF Knowledge in HTML. Some content from a presentation by Ivan Herman of the W3C, Introduction to RDFa, given at the 2011 Semantic Technologies Conference.



What is RDFa?
Simple idea: a serialization of RDF embedded in XHTML, HTML or XML
Provides a set of attributes (the "a" in RDFa) to use with existing tags to carry the RDF metadata
2004: work on developing standards began
2008: RDFa 1.0 became a recommendation
– Worked only in XHTML, which did not catch on
2012: RDFa 1.1 became a recommendation
– Works in HTML4, HTML5 and XHTML

Principles of RDFa
RDF content is specified in XML attributes of tags rather than in elements
The XML/HTML tree structure is used as context, when appropriate
Some new attributes are introduced and some existing ones are reused
When possible, HTML text content is used for literal values
The same file is used by the browser and the RDF extractor

Web page viewed by a person

The source

Source and generated RDF…
<p about="…" property="…">Unique identifier for RDFS Entailment.</p>
yields
<…> <…> "Unique identifier for RDFS Entailment." .
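To make the mapping on this slide concrete, here is a minimal sketch of a toy extractor that turns one element carrying both `about` and `property` into a literal triple. It is an illustration only, not a conforming RDFa processor; the class name `LiteralExtractor` and the `http://example.org/` subject URI are assumptions made up for the example, since the URIs on the slide did not survive in the transcript.

```python
from html.parser import HTMLParser


class LiteralExtractor(HTMLParser):
    """Toy extractor: handles only @about + @property on the same element.

    Nested markup inside the element is not supported: the first end tag
    seen closes the pending statement.
    """

    def __init__(self):
        super().__init__()
        self.triples = []     # collected (subject, predicate, literal) tuples
        self._pending = None  # (subject, predicate) waiting for text content
        self._text = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "about" in a and "property" in a:
            self._pending = (a["about"], a["property"])
            self._text = []

    def handle_data(self, data):
        if self._pending is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._pending is not None:
            subject, predicate = self._pending
            literal = "".join(self._text).strip()
            self.triples.append((subject, predicate, literal))
            self._pending = None


# Hypothetical subject URI; the predicate is the Dublin Core terms namespace.
HTML = ('<p about="http://example.org/entailment/RDFS" '
        'property="http://purl.org/dc/terms/description">'
        'Unique identifier for RDFS Entailment.</p>')

extractor = LiteralExtractor()
extractor.feed(HTML)
extractor.close()
for s, p, o in extractor.triples:
    print(f'<{s}> <{p}> "{o}" .')
```

The element's text content becomes the literal, which is why the same file can serve both the browser and the extractor.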

The Web page viewed by a person

The source

Source and generated RDF…
<a about="…" rel="…" href="…">RDF Semantics</a>
yields
<…> <…> <…> .
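The link case can be sketched the same way: when an element has `about`, `rel` and `href`, the three attribute values give subject, predicate and object directly. This is a toy under stated assumptions (the `LinkExtractor` name and the `http://example.org/` URIs are invented for illustration), not a full RDFa implementation.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Toy: collect (subject, predicate, object URI) from @about/@rel/@href."""

    def __init__(self):
        super().__init__()
        self.triples = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        # Only elements that carry all three attributes produce a triple here.
        if {"about", "rel", "href"} <= a.keys():
            self.triples.append((a["about"], a["rel"], a["href"]))


# Hypothetical subject and object URIs; rdfs:seeAlso is written out in full.
HTML = ('<a about="http://example.org/entailment/RDFS" '
        'rel="http://www.w3.org/2000/01/rdf-schema#seeAlso" '
        'href="http://example.org/rdf-semantics">RDF Semantics</a>')

links = LinkExtractor()
links.feed(HTML)
links.close()
for s, p, o in links.triples:
    print(f"<{s}> <{p}> <{o}> .")
```

Unlike the literal case, the object here is a URI, so the link text ("RDF Semantics") plays no part in the triple.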

Maybe we can do better
So far we have N-Triples in HTML:
<…> <…> "Unique identifier for RDFS Entailment." .
<…> <…> <…> .
Instead of this, allow URI prefixes and a shared subject, like
@prefix dcterms: <…> .
<…> rdfs:seeAlso <…> ;
    dcterms:description "Unique identifier for RDFS Entailment." .

Turtlizing RDFa
Turtle supports several simplifying ideas
Use compact URIs when possible
– A CURIE, or compact URI, is typically a URI with a prefix defined elsewhere, e.g., foaf:mbox
Make use of the natural structure for
– shared subjects
– shared predicates
– creating blank nodes
– …
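The two simplifications named on this slide, compact URIs and shared subjects, can be sketched on plain triples before looking at how RDFa does it. The helper names (`abbreviate`, `to_turtle`) and the `http://example.org/` URIs are assumptions for illustration; literal escaping and local-name validation are deliberately left out.

```python
from collections import defaultdict

PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def abbreviate(uri):
    """Return prefix:local when a known namespace matches, else <uri>."""
    for prefix, namespace in PREFIXES.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return f"<{uri}>"


def to_turtle(triples):
    """Group triples by subject; objects are ('uri', value) or ('lit', value)."""
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((p, o))
    lines = [f"@prefix {p}: <{ns}> ." for p, ns in PREFIXES.items()]
    for s, pairs in by_subject.items():
        parts = []
        for p, (kind, value) in pairs:
            obj = abbreviate(value) if kind == "uri" else f'"{value}"'
            parts.append(f"{abbreviate(p)} {obj}")
        # A shared subject is written once; ';' separates its statements.
        lines.append(f"<{s}> " + " ;\n    ".join(parts) + " .")
    return "\n".join(lines)


SUBJECT = "http://example.org/entailment/RDFS"  # hypothetical
triples = [
    (SUBJECT, PREFIXES["rdfs"] + "seeAlso",
     ("uri", "http://example.org/rdf-semantics")),
    (SUBJECT, PREFIXES["dcterms"] + "description",
     ("lit", "Unique identifier for RDFS Entailment.")),
]
turtle = to_turtle(triples)
print(turtle)
```

RDFa applies the same two ideas inside HTML: prefixes are declared once in an attribute, and the subject is shared through the element tree.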

CURIE definition and usage
… <p about="…" property="…">Unique identifier for RDFS Entailment.</p> …
can be replaced by:
… <p about="…" property="dcterms:description">Unique identifier for RDFS Entailment.</p> …

Details in RDFa
The prefix attribute can be anywhere in the tree and is valid for the whole sub-tree
– i.e., the html element is not the only place to have it
The attribute can hold several definitions:
– prefix="dcterms: … foaf: …"
CURIEs and “real” URIs can usually be mixed
In some attributes, CURIEs cannot be used
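As a rough sketch of what a processor does with these declarations, the functions below parse a `prefix` attribute value into a mapping and expand a CURIE against it. The function names are hypothetical, and the handling of unknown prefixes (returning the value unchanged, so that "real" URIs pass through) is a simplifying assumption rather than the exact rule of the RDFa specification.

```python
def parse_prefix_attr(value):
    """Parse a prefix attribute value made of 'name: URI' pairs."""
    tokens = value.split()
    if len(tokens) % 2:
        raise ValueError("expected 'name: URI' pairs")
    mapping = {}
    for name, uri in zip(tokens[0::2], tokens[1::2]):
        if not name.endswith(":"):
            raise ValueError(f"prefix {name!r} must end with ':'")
        mapping[name[:-1]] = uri
    return mapping


def expand_curie(value, prefixes):
    """Expand prefix:reference when the prefix is declared, else pass through."""
    prefix, sep, reference = value.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + reference
    return value  # not a declared prefix: treat it as a full URI


prefixes = parse_prefix_attr(
    "dcterms: http://purl.org/dc/terms/ foaf: http://xmlns.com/foaf/0.1/"
)
print(expand_curie("dcterms:description", prefixes))
print(expand_curie("http://example.org/x", prefixes))
```

Because "http" is not a declared prefix, the second call returns the URI untouched, which is how CURIEs and full URIs can share one attribute.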

Sharing subjects
<html prefix="dcterms: … rdfs: …"> … Unique identifier for RDFS Entailment. … <a rel="rdfs:seeAlso" href="…">RDFS Semantics</a> …
The subject set by @about is inherited by child nodes, so there is no reason to repeat it

…
@prefix dcterms: <…> .
<…> rdfs:seeAlso <…> ;
    dcterms:description "Unique identifier for RDFS Entailment." .
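Subject sharing through the element tree can be sketched with a stack: each element inherits the subject in scope unless it sets its own `about`. The class below is a toy under assumptions (the name `InheritingExtractor` and the example URIs are made up; void elements such as `<br>` and nested markup inside a literal are not handled, and CURIEs are left unexpanded).

```python
from html.parser import HTMLParser


class InheritingExtractor(HTMLParser):
    """Toy: @about sets a subject that child elements inherit."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._subjects = [None]  # stack; the top is the subject in scope
        self._literal = None     # (subject, predicate, text parts) while open

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        subject = a.get("about", self._subjects[-1])
        self._subjects.append(subject)
        if "rel" in a and "href" in a:
            self.triples.append((subject, a["rel"], a["href"]))
        if "property" in a:
            self._literal = (subject, a["property"], [])

    def handle_data(self, data):
        if self._literal:
            self._literal[2].append(data)

    def handle_endtag(self, tag):
        self._subjects.pop()
        if self._literal:
            s, p, parts = self._literal
            self.triples.append((s, p, "".join(parts).strip()))
            self._literal = None


# Hypothetical URIs; only the outer element names the subject.
HTML = """
<div about="http://example.org/entailment/RDFS">
  <p property="dcterms:description">Unique identifier for RDFS Entailment.</p>
  <a rel="rdfs:seeAlso" href="http://example.org/rdf-semantics">RDF Semantics</a>
</div>
"""

shared = InheritingExtractor()
shared.feed(HTML)
shared.close()
for triple in shared.triples:
    print(triple)
```

Both statements end up with the subject from the enclosing `div`, mirroring the shared-subject form in Turtle.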

On reusing literals
Reusing literals is a plus, but you don’t always want to do it
The basic rule says: the (RDF) literal is the enclosed text from the HTML content
This is fine in 80% of the cases, but…
…it may not be natural in many cases!

Example: dates
<body about=".." prefix="dcterms: … xsd: …"> …
This leads to:
@prefix xsd: <…> .
<…> dcterms:date "…"^^xsd:date .
The value is in the official ISO format (for xsd:date), but “July 5, 2010” is preferred by people

Usage
<body about=".." prefix="dcterms: … xsd: …">
<p property="dcterms:date" datatype="xsd:date" content="…">July 5, 2010</p>
Also leads to:
@prefix xsd: <…> .
<…> dcterms:date "…"^^xsd:date .
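One way to picture the `content` attribute is from the authoring side: the page keeps the human-friendly text while the attribute carries the machine-friendly value. The sketch below derives an ISO date from the visible text and builds the paragraph; the helper names are hypothetical, and the `%B` month parsing assumes an English locale.

```python
from datetime import datetime
from html import escape


def date_paragraph(human_text, fmt="%B %d, %Y"):
    """Build a <p> whose text is for people and whose @content is ISO 8601."""
    iso = datetime.strptime(human_text, fmt).date().isoformat()
    return ('<p property="dcterms:date" datatype="xsd:date" '
            f'content="{iso}">{escape(human_text)}</p>')


def literal_value(attrs, text):
    """When @content is present it is used instead of the element's text."""
    return attrs.get("content", text)


markup = date_paragraph("July 5, 2010")
print(markup)
print(literal_value({"content": "2010-07-05"}, "July 5, 2010"))
```

The second helper shows the consuming side of the same rule: the literal comes from `content` when it is there, and from the enclosed text otherwise.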

On subjects and objects
Here is our rule so far: @about sets the subject, @href sets the object
But that is not always good enough
– We may not want to introduce an active link (i.e., an "a" element) on the web page
– What about other links in HTML?

We may not always want links…
The @resource attribute is equivalent to @href: it sets the object, just like @href, but is ignored by browsers, e.g.:
<span rel="rdfs:seeAlso" resource="…">… Lead</span>
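A small sketch of how an extractor might pick the object of a `rel` statement when several candidate attributes are present. The function name is hypothetical, and the lookup order (`resource`, then `href`, then `src`) reflects my reading of RDFa 1.1 and should be treated as an assumption.

```python
def object_resource(attrs):
    """Return the object URI for a @rel statement, or None if there is none."""
    # Assumed precedence: @resource first, then @href, then @src.
    for name in ("resource", "href", "src"):
        if name in attrs:
            return attrs[name]
    return None


# A <span> with @resource yields an object without creating a clickable link.
print(object_resource({"rel": "rdfs:seeAlso",
                       "resource": "http://example.org/a"}))
```

Since browsers ignore `resource`, the statement can be attached to any element, not just `a`.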

More features
RDFa 1.1 has more features that make it easier to represent knowledge compactly in HTML
These take advantage of the HTML tree context
See the hidden slides if you are interested

Chaining
Here is what we would like to have in RDFa:
<…> dcterms:creator <…> .
<…> foaf:mailbox <…> ;
    foaf:workplaceHomepage <…> .

Chaining
A straightforward way:
… <span rel="dcterms:creator" resource="…"> <a rel="foaf:mailbox" …> <a rel="foaf:workplaceHomepage" href="…"> …

Chaining: when objects become subjects
An alternative:
… <span rel="dcterms:creator" resource="…"> <a rel="foaf:mailbox" …> <a rel="foaf:workplaceHomepage" href="…"> …

Chaining means
@resource becomes the subject for the sub-tree
RDF/XML has a similar feature
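Chaining can be sketched by extending the subject stack idea: the `resource` of an element is the object of that element's statement and, at the same time, the subject for everything nested inside it. This is a toy under assumptions (the class name, the base subject and all `example.org` URIs are invented; real RDFa processing has many more rules).

```python
from html.parser import HTMLParser


class ChainingExtractor(HTMLParser):
    """Toy: an element's @resource becomes the subject for its sub-tree."""

    def __init__(self, base_subject):
        super().__init__()
        self.triples = []
        self._subjects = [base_subject]  # stack; the top is the subject in scope

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        subject = a.get("about", self._subjects[-1])
        obj = a.get("resource", a.get("href"))
        if "rel" in a and obj is not None:
            self.triples.append((subject, a["rel"], obj))
        # Chaining: children see this element's @resource as their subject.
        self._subjects.append(a.get("resource", subject))

    def handle_endtag(self, tag):
        self._subjects.pop()


# Hypothetical person, mailbox and homepage URIs.
HTML = """
<span rel="dcterms:creator" resource="http://example.org/people#ivan">
  <a rel="foaf:mailbox" href="mailto:ivan@example.org">mail</a>
  <a rel="foaf:workplaceHomepage" href="http://example.org/work">workplace</a>
</span>
"""

chained = ChainingExtractor("http://example.org/doc")
chained.feed(HTML)
chained.close()
for triple in chained.triples:
    print(triple)
```

The document gets one `dcterms:creator` statement, and the two nested links describe the creator rather than the document.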

Some extra features
Blank nodes can be created using “_:XX”
Shorthand for RDF types
Helping single-vocabulary cases
Profiles

Typing
Typing can of course be done with an explicit rdf:type statement
But that is a widely used combination, so there is a @typeof attribute for that

Typing example
<span about="…" typeof="foaf:Person"> … Ivan Herman, …
yields
<…> a foaf:Person ;
    foaf:name "Ivan Herman" .

Single-vocabulary case
In many cases the content is dominated by one vocabulary, e.g., dcterms, foaf, etc.
Using CURIEs and URIs is intuitive for RDF people but not for average HTML authors!
Solution:
– define a vocabulary URI for a sub-tree
– for that sub-tree, simple terms are automatically expanded into full URIs using the vocabulary

@vocab and terms: this…
… <address about="…" typeof="foaf:Person"> … Ivan Herman, <a rel="foaf:mailbox" …> <a rel="foaf:workplaceHomepage" href="…"> …

…becomes
… <address about="…" typeof="Person"> … Ivan Herman, <a rel="mailbox" …> <a rel="workplaceHomepage" href="…"> …
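The effect of a vocabulary on bare terms can be sketched as a small expansion function. The name `expand_term` and its fallback behaviour (returning an unknown value unchanged) are assumptions for illustration; the FOAF namespace URI is the real one.

```python
FOAF = "http://xmlns.com/foaf/0.1/"


def expand_term(value, vocab=None, prefixes=None):
    """Expand a CURIE via prefixes, or a bare term via the vocabulary in scope."""
    prefixes = prefixes or {}
    prefix, sep, reference = value.partition(":")
    if sep:
        # Looks like prefix:reference; expand only if the prefix is declared.
        return prefixes[prefix] + reference if prefix in prefixes else value
    if vocab is not None:
        return vocab + value  # bare term: prepend the vocabulary URI
    return value  # no vocabulary in scope; this sketch leaves it unchanged


print(expand_term("Person", vocab=FOAF))
print(expand_term("foaf:Person", prefixes={"foaf": FOAF}))
```

Both calls print the same URI, which is the point of the slide: with a vocabulary in scope, authors can write `typeof="Person"` instead of `typeof="foaf:Person"`.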

Profile files
Prefix and term declarations can be collected in a separate file and referred to via the @profile attribute pointing to the file
Say, file “…” defines
– prefix mappings: "foaf" → "…", "rdfs" → "…"
– term mapping: "desc" → "…"

Profile usage example: this…
<html prefix="dcterms: … rdfs: …"> … Unique identifier for RDFS Entailment. … <a rel="rdfs:seeAlso" href="…">RDFS Semantics</a> … Ivan Herman, …

…becomes
… Unique identifier for RDFS Entailment. … <a rel="rdfs:seeAlso" href="…">RDFS Semantics</a> … Ivan Herman, …

Default profiles
Even usage of profiles might be “too much” for many HTML authors
– authors will forget to add the declaration
RDFa defines default profiles:
– RDFa clients include these profiles automatically

Default profiles
Default for RDFa in general
– includes some widely used prefixes (rdf, rdfs, vcard, og, foaf, dc, or dcterms are typical candidates)
– the profile is to be updated regularly by adding new prefixes
Default for (X)HTML
– includes the link relation values (next, up, license, …)
– the profile is to be updated regularly with new values as they evolve in the HTML world

So this…
… Unique identifier for RDFS Entailment. … <a rel="rdfs:seeAlso" href="…">RDFS Semantics</a> …

…becomes:
… Unique identifier for RDFS Entailment. … <a rel="rdfs:seeAlso" href="…">RDFS Semantics</a> …

Authoring RDFa
Some tools already have RDFa facilities:
– e.g., it is possible to add the right DTD to Dreamweaver, Amaya has it at its core, etc.
There are plugins for, e.g., WordPress, to generate RDFa markup
CMS systems (like Drupal 7) may have RDFa built into their publication system
– users generate RDFa whether they know about it or not…

Consuming RDFa
Major search engines (Google, Yahoo) process RDFa for vocabularies they understand and can use
There are libraries, distillers, etc., to extract RDFa information
– may be part of RDF development environments like Redland, RDFLib
– see … for further references
Facebook’s “social graph” is based on RDFa

Publishing RDFa
An RDFa+HTML file can just be on a server
– the client extracts the RDF content
Content negotiation can be set up on the server side
– the client gets the format it asks for
– the RDF content can either be generated on the fly or stored on the server statically
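The server-side choice behind content negotiation can be sketched as a function from the client's `Accept` header to one of the stored representations. Everything named here is an assumption for illustration (the `AVAILABLE` table, the file names, the `negotiate` function), and wildcards such as `*/*` are deliberately not handled.

```python
# Hypothetical static representations of the same resource.
AVAILABLE = {
    "text/html": "page.html",            # the RDFa+HTML page itself
    "text/turtle": "page.ttl",
    "application/rdf+xml": "page.rdf",
}


def parse_accept(header):
    """Return (media type, q value) pairs from an Accept header."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media, q = fields[0], 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        if media:
            prefs.append((media, q))
    return prefs


def negotiate(header, default="text/html"):
    """Pick the available media type with the highest q, else the default."""
    best, best_q = default, 0.0
    for media, q in parse_accept(header):
        if media in AVAILABLE and q > best_q:
            best, best_q = media, q
    return best


choice = negotiate("text/html;q=0.3, application/rdf+xml;q=0.9")
print(choice, "->", AVAILABLE[choice])
```

A client that asks only for formats the server does not have falls back to the RDFa+HTML page, from which it can still extract the RDF itself.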

Google’s rich snippets
Embedded metadata (microdata or RDFa) is used to improve the search result page
– at the moment only a few vocabularies are recognized, but that will evolve over the years

Effects of, e.g., Google or Facebook
A number of popular sites publish RDFa as part of their normal pages:
– Tesco, BestBuy, Slideshare, The London Gazette, Newsweek, MSNBC, O’Reilly Catalog, the White House…
– Creative Commons snippets are in RDFa (e.g., on Flickr)

BestBuy example of RDFa use
Courtesy of Jay Myers, BestBuy, SemTech2010 Presentation

Effects on BestBuy
Reported in a BestBuy blog:
– GoodRelations+RDFa improved Google rank tremendously
– 30% increase in traffic on BestBuy store pages
– Yahoo observes a 15% increase in click-through rate
Today, BestBuy uses RDFa for much more than just snippets
– E.g., to locate shops that have certain products in stock…

Library of Congress RDFa use

Overstock.com example

Drupal content management system
RDF support in Drupal v. 7
Major CMS
Has RDF at its core; pages contain RDFa
In one step, millions of pages of additional RDF data!

The Examiner.com

Extracting the data
rdfa> python getdata.py …
@prefix ent: <…> .
…
ent:RDFS a ent:Entailment ;
    dc:creator <…> ;
    dc:date "…"^^xsd:date ;
    dc:description "Unique identifier for RDFS Entailment" ;
    rdfs:comment "The specification for the RDFS entailment is … Semantics W3C Recommendation." ;
    rdfs:isDefinedBy <…> ;
    rdfs:seeAlso <…> .
<…> dc:title "Information Resource RDFS Entailment" ;
    xhv:stylesheet <…> .
<…> a foaf:Person ;
    rdfs:seeAlso <…> ;
    foaf:mbox <…> ;
    foaf:name "Ivan Herman" ;
    foaf:title "Semantic Web Activity Lead" ;
    foaf:workplaceHomepage <…> .

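getdata.py is very simple

    # Assumes an rdflib release that still bundles the RDFa/microdata parser plugins.
    import sys
    import rdflib

    if not (1 < len(sys.argv) < 4):
        print('usage: python getdata.py url [ rdfa | rdfa1.1 | microdata | html ]')
        print('   eg: python getdata.py "…"')
        sys.exit(1)

    url = sys.argv[1]
    # Default to RDFa 1.1 when no format is given on the command line.
    fmt = sys.argv[2] if len(sys.argv) == 3 else 'rdfa1.1'

    g = rdflib.Graph()
    g.parse(url, format=fmt)  # fetch the page and extract its triples
    out = g.serialize(format='n3')
    # Older rdflib versions return bytes from serialize(); newer ones return str.
    print(out.decode('utf-8') if isinstance(out, bytes) else out)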

Greenturtle Chrome plugin

Conclusions
Web developers want content providers to add structured data to HTML pages
Content providers are incentivized to do so because their content will be better understood, ranked higher, more useful, etc.
RDFa is the most powerful and flexible of the knowledge markup standards understood by search engines
RDFa is also an alternative serialization of full RDF