Presentation on theme: "OCLC Research TAI CHI Webinar 5/27/2010 A Gentle Introduction to Linked Data Ralph LeVan Sr. Research Scientist OCLC Research."— Presentation transcript:
OCLC Research TAI CHI Webinar 5/27/2010 A Gentle Introduction to Linked Data Ralph LeVan Sr. Research Scientist OCLC Research
What is Linked Data? Simple answer: “The term Linked Data is used to describe a method of exposing, sharing, and connecting data via dereferenceable URIs on the Web.” – Wikipedia (emphasis added)
What is Linked Data? Opaque answer: Linked Data is a protocol component of the Semantic Web, using URIs, Real World Objects, Content Negotiation and RDF documents to share information on the web. My goal is to try to make that statement a little less opaque.
Why Linked Data? (Aren’t we sharing our data already?) The simplest answer is that Linked Data is a way to share data that we’ve always been eager to share, but in a non- library-centered exchange format. Up until now, our choices for sharing our data have been MARC (not very popular with the Web community) or Dublin Core (not very semantically rich). Linked Data, using RDF as a description syntax, provides a framework for sharing semantically rich data in a Web-friendly way. (Remember, this is the Semantic Web we’re talking about.)
The Semantic Web Dream “The semantic web is a vision of information that is understandable by computers, so computers can perform more of the tedious work involved in finding, combining, and acting upon information on the web.” – Wikipedia “I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers.” – Tim Berners-Lee
The Semantic Web Challenge Finding a web-acceptible forum to develop the markup syntax that lets us expose the semantic richness of MARC- 21 and the structural richness of AACR2 The forum is the W3C Library Linked Data Incubator GroupLibrary Linked Data Incubator Group The Syntax is the Resource Description Framework (RDF)
RDF RDF is a general framework for describing things A statement in RDF has three parts: subject, predicate, object The subject and predicate must be URIs The object can be a value or a URI Eric Miller
Why URIs? In the old days, we’d assign a control number to something important, like an LCCN and associate our data with that control number. URI’s are the modern equivalent of a control number with several bonuses The domain name tells you who created the URI Since the URI is usually an HTTP URL, there’s often something to click through to
URLs: Pages or Concepts? When is a URL a reference to a concept and when is it just a pointer to a page? Does that statement mean I like that web page or I like that service? How do we tell the difference?
Real World Objects A rule has been agreed upon by the W3C on how to tell concept URLs from page URLs Concept URLs, called “Real World Objects, will return an HTTP status of 303 (See Other) with the URL of a page in the Location header That 303 redirect is the clue to Semantic Web applications that they had a pointer to a concept, not just a page
Content Negotiation How does a Semantic Web application find the RDF we’re making? It asks for RDF automatically, whenever it follows a link. HTTP applications, (e.g., browsers) can tell servers what form of a page they would prefer to get back: HTML, PDF, JPEG… They do that by sending an Accept header along with their request for the page. When servers decide what to send back based on that Accept header, we call that Content Negotiation
Content Negotiation A browser going to would send the Accept header “*.*” (meaning that it would take anything) and typically get HTML back.http://viaf.org/viaf/ / A Semantic Web application going to that exact same URL would send the Accept header “application/rdf+xml” and would get RDF back.
What is Linked Data? Linked Data is a component of the Semantic Web, using URIs, Real World Objects, Content Negotiation and RDF documents to share information on the web. Hopefully, that’s a little less opaque now. Maybe some examples would help.
Real World Objects in VIAF is a Real World Object. How do we know this? Because clicking on that URL gets us a 303 redirect to curl --include HTTP/ See Other Server: Apache-Coyote/1.1 Location: /viaf/ /