KIT – University of the State of Baden-Wuerttemberg and National Laboratory of the Helmholtz Association Institute AIFB www.kit.edu Linked Data and Services.

Presentation transcript:

KIT – University of the State of Baden-Wuerttemberg and National Laboratory of the Helmholtz Association Institute AIFB Linked Data and Services Andreas Harth and Barry Norton

Outline
  Motivation
  Linked Data Principles
  Query Processing over Linked Data
  Linked Data Services (LIDS) and Linked Open Services (LOS)
  Conclusion

Motivation
  Semantic Web/Linked Data technologies are well suited for data integration.
  [Figure labels: Data Integration; Interactive Data Exploration; Common Data Format/Access Protocol]
  (Slide footer: Taking the LIDS off Data Silos, Andreas Harth)

Linked Data Principles*
  1. Use URIs to name things: not only documents, but also people, locations, concepts, etc.
  2. To enable agents (human users and machine agents alike) to look up those names, use HTTP URIs.
  3. When someone looks up a URI, we provide useful information; by 'useful' in the strict sense we usually mean structured data in RDF.
  4. Include links to other URIs, allowing agents (machines and humans) to discover more things.
  (*)

Correspondence between thing-URI and source-URI
  [Figure: the user agent sends an HTTP GET to the web server, which returns RDF]

Correspondence between thing-URI and source-URI
  [Figure: the user agent sends an HTTP GET to the web server, which answers 303; a second HTTP GET returns RDF]
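The two slides on thing-URIs and source-URIs can be read as two ways of finding the document that describes a thing: either the lookup returns RDF directly (presumably the hash-URI case, where the client drops the fragment before the request), or the server answers 303 and names the source. A minimal sketch of that mapping, assuming hypothetical example.org URIs and a simulated redirect table instead of real HTTP requests:

```python
from urllib.parse import urldefrag

# Hypothetical redirect table standing in for a server's 303 responses.
REDIRECTS_303 = {
    "http://example.org/resource/Karlsruhe": "http://example.org/data/Karlsruhe",
}


def source_uri(thing_uri: str) -> str:
    """Return the URI of the document expected to describe thing_uri."""
    base, fragment = urldefrag(thing_uri)
    if fragment:
        # Hash URI: the fragment is never sent to the server,
        # so the source is the URI without the fragment.
        return base
    # Otherwise follow the (simulated) 303 redirect if there is one;
    # without a redirect, thing-URI and source-URI coincide.
    return REDIRECTS_303.get(thing_uri, thing_uri)


print(source_uri("http://example.org/people#ah"))
print(source_uri("http://example.org/resource/Karlsruhe"))
```

A real client would issue the HTTP GET and read the Location header of the 303 response; the table only keeps the sketch runnable offline.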

Queries over Linked Data
  Query 1 (result columns: ?f, ?n):
    SELECT ?f ?n WHERE {
      an:f#ah foaf:knows ?f.
      ?f foaf:name ?n.
    }
  Query 2:
    SELECT ?x1 ?x2 WHERE {
      dblppub:HoganHP08 dc:creator ?a1.
      ?x1 owl:sameAs ?a1.
      ?x2 foaf:knows ?x1.
    }

Querying Data Across Sources
  Data warehousing or materialisation-based approaches (MAT)
    [Figure labels: CRAWL, INDEX, SERVE; SELECT * FROM…; RS]
  Distributed query processing approaches (DQP)
    [Figure label: RS]
  (Slide footer: Data Summaries for On-Demand Queries over Linked Data, Andreas Harth)

DQP on Linked Data
  [Figure labels: SELECT * FROM…; RS; SELECT ?s WHERE…; TP; HTTP GET; ODBC]

Query Processing Overview
  Query:
    SELECT ?f ?n WHERE { an:f#ah foaf:knows ?f. ?f foaf:name ?n. }
  Triple patterns:
    TP (an:f#ah foaf:knows ?f)
    TP (?f foaf:name ?n)
  [Figure: select source(s) for the triple patterns, retrieve them via HTTP GET as RDF, and fill the result table with columns ?f and ?n (visible value: Brickley)]

Barry

Problem: Source Selection for Triple Patterns
  Given a triple pattern, which source can contribute bindings for the triple pattern?
  Possible triple patterns:
    (?s ?p ?o) (#s ?p ?o) (?s #p ?o) (?s ?p #o)
    (#s #p ?o) (#s ?p #o) (?s #p #o) (#s #p #o)
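The notation on this slide appears to use ?x for a variable and #x for a bound term. The source selection question can be stated precisely with a brute-force baseline: a source contributes if at least one of its triples matches the pattern. The sketch below assumes hypothetical source URIs and abbreviated terms; it needs every source to be downloaded first, which is exactly the cost the index structures on the following slides try to avoid.

```python
def is_var(term):
    """Variables are written with a leading question mark, as on the slide."""
    return term.startswith("?")


def match(pattern, triple):
    """Return the variable bindings if triple matches pattern, else None."""
    bindings = {}
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # A variable used twice must be bound to the same term.
            if bindings.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return bindings


def contributing_sources(pattern, sources):
    """Brute force: all sources holding at least one matching triple."""
    return sorted(
        uri
        for uri, triples in sources.items()
        if any(match(pattern, t) is not None for t in triples)
    )


# Hypothetical sources and data, for illustration only.
SOURCES = {
    "http://example.org/a": {("ex:ah", "foaf:knows", "ex:bn")},
    "http://example.org/b": {("ex:bn", "foaf:name", "Barry")},
}

print(contributing_sources(("ex:ah", "foaf:knows", "?f"), SOURCES))
```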

Schema-Level Indices [Stuckenschmidt et al. 2004]
  Keep index of properties and/or classes contained in sources
  Supported patterns: (?s #p ?o), (?s rdf:type #o)
  Covers only queries containing schema-level elements
  Commonly used properties select potentially too many sources
  (Running example: the two queries from "Queries over Linked Data")
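A schema-level index of this kind can be sketched as two dictionaries, one from properties to sources and one from classes to sources. This is an illustrative reading of the slide, not the structure from the cited work; the source URIs and abbreviated terms are hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


def build_schema_index(sources):
    """Map each property and each class to the sources that use it."""
    by_property = defaultdict(set)
    by_class = defaultdict(set)
    for uri, triples in sources.items():
        for _s, p, o in triples:
            by_property[p].add(uri)
            if p == RDF_TYPE:
                by_class[o].add(uri)
    return by_property, by_class


def select_sources(pattern, index, all_sources):
    """Pick candidate sources for one triple pattern."""
    by_property, by_class = index
    _s, p, o = pattern
    if p == RDF_TYPE and not o.startswith("?"):
        return set(by_class.get(o, set()))       # (?s rdf:type #o)
    if not p.startswith("?"):
        return set(by_property.get(p, set()))    # (?s #p ?o)
    # No schema-level element in the pattern: the index cannot prune.
    return set(all_sources)


# Hypothetical sources and data, for illustration only.
SOURCES = {
    "http://example.org/a": [("ex:ah", "foaf:knows", "ex:bn"),
                             ("ex:ah", "foaf:name", "Andreas")],
    "http://example.org/b": [("ex:bn", "foaf:name", "Barry"),
                             ("ex:bn", "rdf:type", "foaf:Person")],
}
INDEX = build_schema_index(SOURCES)

print(select_sources(("ex:ah", "foaf:knows", "?f"), INDEX, SOURCES))
print(select_sources(("?f", "foaf:name", "?n"), INDEX, SOURCES))
```

The second call returns both sources: a property as common as foaf:name prunes little, which is the weakness named on the slide.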

Direct Lookup (DL) [Hartig et al. 2009]
  Exploits correspondence between thing-URI and source-URI
  Linked Data sources (aka RDF files) typically return triples with a subject corresponding to the source:
    (#s ?p ?o), (#s #p ?o), (#s #p #o)
  Sometimes the sources return triples with an object corresponding to the source:
    (?s ?p #o), (?s #p #o)
  Incomplete wrt. patterns, but also wrt. URI reuse across sources
  Limited parallelism, unclear how to schedule lookups
  (Running example: the two queries from "Queries over Linked Data")
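Direct lookup needs no index: the sources to fetch are derived from the URIs bound in the pattern itself. A minimal sketch, assuming full HTTP URIs in subject and object position (the URIs below are hypothetical) and ignoring 303 redirects:

```python
from urllib.parse import urldefrag


def is_http_uri(term):
    """True for terms that can be dereferenced; variables and literals are skipped."""
    return term.startswith(("http://", "https://"))


def direct_lookup_sources(pattern):
    """Source URIs to dereference for one triple pattern.

    Only bound subjects and objects are looked up; the predicate is ignored.
    """
    s, _p, o = pattern
    sources = []
    for term in (s, o):
        if is_http_uri(term):
            source = urldefrag(term).url  # strip the fragment, if any
            if source not in sources:
                sources.append(source)
    return sources


print(direct_lookup_sources(
    ("http://example.org/an/f#ah", "http://xmlns.com/foaf/0.1/knows", "?f")))
print(direct_lookup_sources(("?f", "http://xmlns.com/foaf/0.1/name", "?n")))
```

The second pattern yields no source until ?f has been bound by the first, which illustrates why lookups are hard to parallelise in this approach.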

Approximate Data Summaries
  Combined description of schema level and instance level
  Use approximation to reduce index size (incurs false positives)
  Possible to use the entire query for source selection
  Parallel lookups, since sources can be determined for the entire query
  Covers (?s ?p ?o), (#s ?p ?o), (?s #p ?o), (?s ?p #o), (#s #p ?o), (#s ?p #o), (?s #p #o), (#s #p #o) and combinations of triple patterns
  (Running example: the two queries from "Queries over Linked Data")
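The trade-off named here (smaller index, false positives, no missed sources) can be illustrated with a deliberately simple stand-in: hash every term into a small number of buckets and store only the bucket triples per source. This is an assumption made for illustration, not the summary structure used in the talk, and it handles single patterns only.

```python
import zlib


def bucket(term, buckets):
    """Deterministic hash of a term into one of `buckets` buckets."""
    return zlib.crc32(term.encode("utf-8")) % buckets


def summarise(triples, buckets):
    """Approximate summary of a source: the set of hashed triples."""
    return {tuple(bucket(term, buckets) for term in triple) for triple in triples}


def may_contribute(pattern, summary, buckets):
    """True if the source may hold a triple matching the pattern.

    Hash collisions can cause false positives, but a source that really
    holds a matching triple is never missed.
    """
    wanted = [None if term.startswith("?") else bucket(term, buckets)
              for term in pattern]
    return any(
        all(w is None or w == h for w, h in zip(wanted, hashed))
        for hashed in summary
    )


TRIPLES = [("ex:ah", "foaf:knows", "ex:bn")]  # hypothetical source content
summary = summarise(TRIPLES, 64)
print(may_contribute(("ex:ah", "foaf:knows", "?f"), summary, 64))
```

Fewer buckets shrink the summary and raise the false-positive rate; with a single bucket every non-empty source is selected for every pattern.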

Implementation
  Deploy wrappers "in the cloud"
  Google App Engine: hosting of Java and Python webapps on Google's Cloud infrastructure
    Limited amount of processing time (6hrs/day)
    Single-threaded applications
  Suited for deploying wrappers, e.g. one that converts Twitter user data to RDF: http://twitter2foaf.appspot.com/

Linking Open Data Cloud 2007

Linking Open Data Cloud 2008

Linking Open Data Cloud 2009

Linking Open Data Cloud 2010

Geonames Services
  Example JSON output:
    {"weatherObservation": {
      "clouds":"broken clouds",
      "weatherCondition":"drizzle",
      "observation":"LESO Z 03007KT 340V040 CAVOK 23/15 Q1010",
      "windDirection":30,
      "ICAO":"LESO",...

Linked Open Service Principles
  REST Principles
    1. Application state and functionality is divided into resources
    2. Every resource is uniquely addressable
    3. All resources share a uniform interface:
       a) A constrained set of well-defined operations
       b) A constrained set of content types
  Linked Data Principles
    1. Use URIs as names for things
    2. Use HTTP URIs so that people can look up those names
    3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
    4. Include links to other URIs, so that they can discover more things
  Linked Open Service Principles
    1. Describe services as LOD prosumers with input and output descriptions as SPARQL graph patterns
    2. Communicate RDF by RESTful content negotiation
    3. The output should make explicit its relation with the input

LOS Weather Service

LOS Geo Resources

Resource-Based Linked Open Services
  Linked Data:
    GET, Accept: text/html -> 303 REDIRECT /page
    GET, Accept: application/rdf+xml (or text/n3) -> 303 REDIRECT /data
  Linked Service:
    GET /weather, Accept: application/rdf+xml (or text/n3) -> 200
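The exchange on this slide can be written down as a small dispatch function. This is a sketch under stated assumptions: the resource path /thing is hypothetical (the slide gives no path for the Linked Data case), q-values in the Accept header are ignored, RDF is preferred when both kinds are accepted, and the 406 answer for unsupported types is an addition not shown on the slide.

```python
RDF_TYPES = {"application/rdf+xml", "text/n3"}


def accepted_types(accept_header):
    """Media types from an Accept header; parameters such as q-values are dropped."""
    return [part.split(";")[0].strip().lower()
            for part in accept_header.split(",") if part.strip()]


def respond(path, accept_header):
    """Return (status, redirect target) for a GET request."""
    types = accepted_types(accept_header)
    wants_rdf = any(t in RDF_TYPES for t in types)
    wants_html = "text/html" in types
    if path == "/weather":
        # Linked Service: the RDF is returned directly with 200.
        return (200, None) if wants_rdf else (406, None)
    # Linked Data resource: redirect to the matching representation.
    if wants_rdf:
        return (303, "/data")
    if wants_html:
        return (303, "/page")
    return (406, None)


print(respond("/thing", "text/html"))
print(respond("/thing", "application/rdf+xml"))
print(respond("/weather", "text/n3"))
```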

Interlinking Data with Data from Services?

Data Services
  Given input, provide output
  Input and output are related in a service-specific way
  Do not change the state of the world
  E.g. GeoNames findNearbyWikipedia service
    Input: lat/lon
    Output: places
    Relation: output places that are nearby input place
  [Figure: the service defines the relation between input and output]

Linked Data Services
  We'd like to integrate data services with Linked Data
    1. LIDS need to adhere to Linked Data principles
  We'd like to use data services in software programs
    2. LIDS need machine-readable descriptions of input and output
  Compared to naïve approach (assign URI to service output):
    Relationship between input and output is explicitly described
    Dynamicity is supported
    Multiple or no output resources can be linked to input

1. Data Services as Linked Data
  Input is given as URI:
    ?lat=37.416&lng= #point
  Resolving the URI yields:
    :point foaf:based_near dbp:Palo_Alto%2C_California ;
           foaf:based_near dbp:Packard%27s_garage.
  [Figure labels: Service Endpoint; Parameters; Input Identifier; Input; Output; Relation]

2. LIDS Descriptions
  LIDS characterised by:
    Endpoint URI ep, which is the base for all input entities
    Local identifier i of input entity
    List of parameters X_i
    Basic graph pattern T_i describing conditions on parameters
    Basic graph pattern T_o describing minimum output data
  Example:
    ep =
    i = point
    X_i = {?lat, ?lng}
    T_i = ?point a Point. ?point geo:lat ?lat. ?point geo:long ?lng
    T_o = ?point foaf:based_near ?feature

Interlink LIDS and Linked Data
  Generate service URIs with input bindings, obtained by evaluating: select X_i where T_i
  sameAs: binding for i
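The interlinking step can be sketched as follows: evaluate the input pattern over the data, build one service URI per binding, and link the bound entity to it with owl:sameAs. The endpoint URI, the abbreviated terms, and the coordinate values below are hypothetical, and the tiny evaluator assumes one latitude and one longitude per point.

```python
from urllib.parse import urlencode

EP = "http://example.org/lids/findNearby"  # hypothetical endpoint URI (ep)
INPUT_ID = "point"                         # local identifier i


def input_bindings(triples):
    """Evaluate T_i: ?point a Point . ?point geo:lat ?lat . ?point geo:long ?lng"""
    lat = {s: o for s, p, o in triples if p == "geo:lat"}
    lng = {s: o for s, p, o in triples if p == "geo:long"}
    points = {s for s, p, o in triples if p == "rdf:type" and o == "Point"}
    return [{"point": s, "lat": lat[s], "lng": lng[s]}
            for s in sorted(points) if s in lat and s in lng]


def service_links(triples):
    """One owl:sameAs link per binding, pointing at the generated service URI."""
    links = []
    for b in input_bindings(triples):
        query = urlencode({"lat": b["lat"], "lng": b["lng"]})
        links.append((b["point"], "owl:sameAs", f"{EP}?{query}#{INPUT_ID}"))
    return links


# Hypothetical data; the longitude value is illustrative only.
TRIPLES = [
    ("ex:office", "rdf:type", "Point"),
    ("ex:office", "geo:lat", "37.416"),
    ("ex:office", "geo:long", "-122.152"),
    ("ex:nolat", "rdf:type", "Point"),  # no coordinates: T_i does not match
]

for link in service_links(TRIPLES):
    print(link)
```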

Scale-Up Experiment: Link BTC to GeoNames
  3 billion triples from the Billion Triple Challenge (BTC) 2010 data set
  Annotated with the LIDS wrapper of the GeoNames findNearby service
  Annotation time: < 12 hours on a laptop!
  ~ 12 hours for uncompressing the data set, cleaning results, and gathering statistics
  Original BTC data: 74 different domains that linked to GeoNames URIs
  Interlinking process added 891 new domains, now linked to LIDS geowrap
  In total, 2,448,160 new links were added

Query Answering using LIDS and Linked Data
  Query execution resolves URIs => enlarges data set
  LIDS are interlinked
  Query is executed again on new data set
  Repeat until no new links or no new data
  Combine results
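The loop described above is a fixpoint computation. The sketch below simplifies it in two labelled ways: it resolves every HTTP URI in the data set rather than only those relevant to the query, and it evaluates the query once on the final data set, which gives the same combined result for queries whose answers only grow as data is added. The callables lookup, interlink, and evaluate, and all URIs, are hypothetical stand-ins.

```python
def answer(seed, lookup, interlink, evaluate):
    """Enlarge the data set until nothing new arrives, then evaluate the query."""
    data, resolved = set(seed), set()
    while True:
        size_before = len(data)
        data |= set(interlink(data))             # add sameAs links to LIDS URIs
        for s, _p, o in list(data):
            for term in (s, o):
                if term.startswith("http") and term not in resolved:
                    resolved.add(term)
                    data |= set(lookup(term))    # resolving URIs enlarges the data set
        if len(data) == size_before:             # no new links and no new data
            return evaluate(data)


# Simulated web: URI -> triples returned when the URI is resolved.
LIDS_URI = "http://example.org/lids?lat=49.0#point"
WEB = {
    "http://example.org/u1": [("http://example.org/u1", "geo:lat", "49.0")],
    LIDS_URI: [(LIDS_URI, "foaf:based_near", "http://example.org/Karlsruhe")],
}
SEED = [("http://example.org/list", "foaf:topic", "http://example.org/u1")]


def lookup(uri):
    return WEB.get(uri, [])


def interlink(data):
    return [(s, "owl:sameAs", f"http://example.org/lids?lat={o}#point")
            for s, p, o in data if p == "geo:lat"]


def evaluate(data):
    return sorted(o for _s, p, o in data if p == "foaf:based_near")


print(answer(SEED, lookup, interlink, evaluate))
```

The nearby place only becomes reachable after the second round, once the LIDS link generated in that round has been resolved.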

Experiment: Query Answering
  Input: list of 562 (potential) universities from the Facebook Graph API
  Output: Facebook fans and DBpedia student numbers for 104 universities
  Query:
    PREFIX u:
    SELECT ?n ?f ?s WHERE {
      u:list foaf:topic ?u.
      ?u foaf:name ?n.
      ?u og:fan_count ?f.
      ?u d:numberOfStudents ?s
    }

Linked Services and PlanetData
  Several areas seem likely to produce services:
    Stream, inc. Sensor, resources (latest values)
    Any others exposing dynamic resources
    Dynamic computations, inc. on-the-fly quality assessments
  Other areas seem likely to consider service technologies and move towards more service-like HTTP interactions:
    Access control (OpenID, OAuth, etc.)
  Finally, remaining areas could serve to complement LIDS/LOS alignment:
    Provenance