OCLC Research TAI CHI Webinar 7/1/2010 OCLC Open Source Linked Data Framework Ralph LeVan Sr. Research Scientist OCLC Research.


Goal: Expose Text Database Content as Linked Data
Technique: Using a combination of the urlrewritefilter from tuckey.org, the content negotiation component from the Freie Universität Berlin’s Pubby server, and our Open Source SRW/U server, you can expose the records in your database as Linked Data.

Roadmap
1. SRW/U Server
2. URIs for records
3. Real World Objects for records
4. Multiple record formats
5. RDF needs to be returned
6. Content Negotiation

SRW/U Server
- The SRW/U server sits in front of text databases.
- We have interfaces for DSpace, Lucene and Pears.
- It is easy to write your own interface:
  - Convert the CQL query to the native query language.
  - Do the search and return a resultset object.
  - Return records from the resultset.
- (The Lucene interface is a good, simple example of how to build your own database interface.)
- I expose my SRW/U service as /search, but you can put it wherever you want.
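The three steps of a database interface can be sketched as follows. This is a toy Python illustration only, not the SRW/U server's actual (Java) interface: the class name `ListDatabase`, its method names, and the in-memory list of records are all hypothetical, and the CQL handling covers only the single-clause form `index exact "term"` used in this talk.

```python
import re

# Only the single-clause form used in this talk is handled: index exact "term"
SIMPLE_CQL = re.compile(r'^\s*([\w.]+)\s+exact\s+"([^"]*)"\s*$')


class ListDatabase:
    """Toy in-memory 'database' (hypothetical): a list of dicts keyed by index name."""

    def __init__(self, records):
        self._records = list(records)

    def translate(self, cql):
        """Step 1: convert the CQL query to a 'native' query, here an (index, term) pair."""
        m = SIMPLE_CQL.match(cql)
        if m is None:
            raise ValueError("unsupported CQL in this sketch: " + cql)
        return m.group(1), m.group(2)

    def search(self, native_query):
        """Step 2: run the search and return a resultset (here a plain list)."""
        index, term = native_query
        return [r for r in self._records if r.get(index) == term]

    def records(self, resultset, start=1, maximum=10):
        """Step 3: return records from the resultset; start is 1-based, like SRU's startRecord."""
        return resultset[start - 1:start - 1 + maximum]
```

A real interface would hand the translated query to the underlying search engine instead of filtering a list.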

URIs for records
- urlrewritefilter implements Apache mod_rewrite patterns for Java servlets.
- It sees the URI and converts it to an SRU search:
    ^/([0-9][0-9]+)/$    /search?query=local.viafID+exact+%22$1%22
- E.g., viaf/123 becomes viaf/search?query=viafID+exact+%22123%22
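The effect of the rewrite rule on this slide can be sketched as a plain regular-expression substitution. This is a minimal Python sketch, not urlrewritefilter itself: the function name `rewrite_record_uri` is hypothetical, the pattern and target are copied from the rule as shown, and paths are taken relative to the web application (viaf in the example).

```python
import re

# Pattern and target copied from the slide's rule; urlrewritefilter's $1
# backreference is written \1 in Python's re module.
RECORD_PATTERN = re.compile(r"^/([0-9][0-9]+)/$")
SRU_TEMPLATE = r"/search?query=local.viafID+exact+%22\1%22"


def rewrite_record_uri(path):
    """Return the SRU search URL for a record path, or None if the path does not match."""
    if RECORD_PATTERN.match(path) is None:
        return None
    return RECORD_PATTERN.sub(SRU_TEMPLATE, path)
```

Note that the pattern as written requires at least two digits and a trailing slash, so `/123/` is rewritten while `/123` and `/7/` are left alone.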

Aside: What to Return?
- An SRU query returns a searchRetrieveResponse.
- A smart client can pick its record out of that response, but that seems wrong.
- A bad URI will result in “no records found”, but a 404 (record not found) is more appropriate.
- Solution: add a new parameter (service=APP) to signal that this was a request for a single record.
- E.g., viaf/123 becomes viaf/search?query=viafID+exact+%22123%22&service=APP
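The decision that service=APP enables might look like the sketch below. The function name and signature are hypothetical, and how the SRW/U server actually implements this is not shown in the talk; the sketch only captures the rule that a single-record request with no hits should be a 404 rather than an empty search response.

```python
def single_record_status(service, number_of_records):
    """Pick an HTTP status for a search response (illustrative only).

    service: value of the 'service' request parameter, or None if absent
    number_of_records: how many records the search matched
    """
    if service == "APP" and number_of_records == 0:
        return 404  # a record URI that matches nothing is "not found"
    return 200      # ordinary searches report zero hits inside the response body
```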

Real World Objects for records
- urlrewritefilter can generate 303 (See Other) redirects based on URI patterns:
    ^/([0-9][0-9]+)$    /viaf/$1/
- E.g., viaf/123 redirects to viaf/123/
- The target, viaf/123/, is called the Generic Record.
- (Note: we now use viaf/123/ as the URI that gets turned into the SRU search.)
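The redirect rule can be sketched the same way. This is an illustrative Python function, not urlrewritefilter configuration; `redirect_for` and its `context` argument are hypothetical names, and the pattern and target come from the rule on this slide.

```python
import re

# Pattern copied from the slide: a record number with no trailing slash.
REAL_WORLD_OBJECT = re.compile(r"^/([0-9][0-9]+)$")


def redirect_for(path, context="/viaf"):
    """Return (303, location) for a Real World Object URI, or None if no redirect applies."""
    m = REAL_WORLD_OBJECT.match(path)
    if m is None:
        return None
    # 303 See Other points the client at the Generic Record, e.g. /viaf/123/
    return 303, context + "/" + m.group(1) + "/"
```

Because the Generic Record path ends in a slash, it does not match this pattern and is not redirected again.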

Multiple record formats
- urlrewritefilter plus the new httpAccept parameter in SRU.
- E.g., viaf/123/marc21.xml becomes viaf/search?query=viafID+exact+123&httpAccept=application/marc21+xml
- SRU is configured with a list of supported media types and the XSL stylesheets that render them.
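A sketch of the suffix-to-httpAccept mapping follows. The table `SUFFIX_MEDIA_TYPES`, the pattern, and the function name are assumptions for illustration; only the two suffix and media type pairs (marc21.xml and rdf.xml) are taken from the talk. The media type is inserted as written on the slide; a real deployment may need to percent-encode the `+`, since many query-string parsers decode it as a space.

```python
import re

# Hypothetical lookup table; the two pairs shown are the ones named in the talk.
SUFFIX_MEDIA_TYPES = {
    "marc21.xml": "application/marc21+xml",
    "rdf.xml": "application/rdf+xml",
}
# Assumed pattern: record number, slash, then a format suffix.
FORMAT_PATTERN = re.compile(r"^/([0-9][0-9]+)/([A-Za-z0-9.]+)$")


def rewrite_format_uri(path):
    """Turn /123/marc21.xml into an SRU search carrying an httpAccept parameter."""
    m = FORMAT_PATTERN.match(path)
    if m is None or m.group(2) not in SUFFIX_MEDIA_TYPES:
        return None
    record_id, suffix = m.groups()
    media_type = SUFFIX_MEDIA_TYPES[suffix]
    return "/search?query=viafID+exact+" + record_id + "&httpAccept=" + media_type
```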

MimeType Configuration
    XML.mimeTypes=application/sru+xml;q=0.85, application/xml, text/xml
    HTML.mimeTypes=text/html;q=0.9, application/xhtml+xml
    RSS.mimeTypes=application/rss+xml;q=0.8
    RSS.styleSheet=viaf2rss.xsl
    M21.mimeTypes=application/marc21+xml;q=0.7
    M21.styleSheet=viaf2marc21.xsl
    marc21HTML.mimeTypes=application/marc21+html;q=0.7
    marc21HTML.styleSheet=viaf2marc21.xsl
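To show how such a configuration could be turned into a lookup table, here is a small Python sketch. It is not the SRW/U server's own parser; `parse_media_config` and the shape of the returned dictionary are hypothetical. Where a media type carries no q value, the sketch records `None` rather than guessing how the server treats it.

```python
CONFIG_TEXT = """\
XML.mimeTypes=application/sru+xml;q=0.85, application/xml, text/xml
HTML.mimeTypes=text/html;q=0.9, application/xhtml+xml
RSS.mimeTypes=application/rss+xml;q=0.8
RSS.styleSheet=viaf2rss.xsl
M21.mimeTypes=application/marc21+xml;q=0.7
M21.styleSheet=viaf2marc21.xsl
"""


def parse_media_config(text):
    """Return {format_name: {"types": [(media_type, q or None), ...], "styleSheet": str or None}}."""
    formats = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue
        # Split on the first '=' only, so 'q=0.85' inside the value is untouched.
        key, value = line.split("=", 1)
        name, _, prop = key.partition(".")
        entry = formats.setdefault(name, {"types": [], "styleSheet": None})
        if prop == "mimeTypes":
            for item in value.split(","):
                media_type, _, param = item.strip().partition(";")
                param = param.strip()
                q = float(param[2:]) if param.startswith("q=") else None
                entry["types"].append((media_type.strip(), q))
        elif prop == "styleSheet":
            entry["styleSheet"] = value.strip()
    return formats
```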

Aside: 123/marc21.xml NOT 123.m21
- The Generic Record being at viaf/123/ seems to imply that it is a collection of records.
- How do I ask for the HTML version of the MARC21 version of viaf/123 if suffix mangling is all I have?

RDF needs to be returned
- viaf/123/rdf.xml
- Making good RDF is tricky and beyond the scope of this presentation (but I think we’re getting close to agreement on sensible basics).

Content Negotiation on Generic Record
- Pubby has a really nice Content Negotiation module.
- It is configured with the list of supported media types, with optional quality measures.
- It takes an HTTP Accept header and returns the supported media type that matches best.
- SRU is configured with a list of supported media types, their quality measures, and the XSL stylesheets that render them (see the MimeType Configuration slide).
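The idea of matching an Accept header against a list of supported types can be sketched as below. This is not Pubby's module or its algorithm: the function names are hypothetical, and the scoring rule (client q times server q, with the most specific matching range deciding the client's q) is one common approach chosen here as an assumption. The `SUPPORTED` table reuses three media types and quality values from the MimeType Configuration slide.

```python
# Example server-side table: media type -> server quality (values from the config slide).
SUPPORTED = {
    "text/html": 0.9,
    "application/rss+xml": 0.8,
    "application/marc21+xml": 0.7,
}


def parse_accept(header):
    """Parse an Accept header into a list of (media_range, q) pairs."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # HTTP default when the client gives no q parameter
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        ranges.append((parts[0], q))
    return ranges


def specificity(media_range, media_type):
    """Return 2 for an exact match, 1 for type/*, 0 for */*, -1 for no match."""
    if media_range == media_type:
        return 2
    if media_range.endswith("/*") and media_range != "*/*":
        return 1 if media_type.startswith(media_range[:-1]) else -1
    return 0 if media_range == "*/*" else -1


def best_match(accept_header, supported):
    """Return the supported media type with the highest combined quality, or None."""
    accepted = parse_accept(accept_header)
    best_type, best_score = None, 0.0
    for media_type, server_q in supported.items():
        # The most specific matching range decides the client's q for this type.
        client_q, best_spec = 0.0, -1
        for media_range, q in accepted:
            spec = specificity(media_range, media_type)
            if spec > best_spec:
                client_q, best_spec = q, spec
        score = client_q * server_q
        if score > best_score:
            best_type, best_score = media_type, score
    return best_type
```

When nothing matches, the function returns `None`, and a caller could then fall back to a default representation.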

Result
- viaf/123 is redirected to viaf/123/
- viaf/123/ is turned into viaf/search?query=viafID+exact+123
- The VIAF record is returned by the SRW/U server.
- Content Negotiation causes viaf/123/viaf.html to be returned to googlebot and browsers, viaf/123/rdf.xml to applications that ask for application/rdf+xml, and viaf/123/viaf.xml when no preference is provided.

Lucene Database Demonstration TBD

Questions?