RDF as a Lingua Franca: Key Architectural Strategies David Booth, Ph.D. Cleveland Clinic (contractor) Semantic Technology Conference 15-June-2009 Latest.

Presentation transcript:

RDF as a Lingua Franca: Key Architectural Strategies David Booth, Ph.D. Cleveland Clinic (contractor) Semantic Technology Conference 15-June-2009 Latest version of these slides:

About the speaker Senior Software Architect, Cleveland Clinic's SemanticDB project Senior research architect, HP Software – W3C GRDDL standard W3C Fellow – W3C Web Services Architecture document – W3C WSDL 2.0 standard AT&T Bell Labs Ph.D. Computer Science, UCLA

3 Outline Part 1: The Problem – Babelization – SOA and RDF Part 2: Architectural Strategies 1. RDF message semantics 2. GRDDL transformations from XML to RDF 3. REST-based SPARQL endpoints 4. Semantic Data Federation 5. Named graphs 6. Monotonicity Part 3: Example: Cleveland Clinic SemanticDB

4 PART 1 The Problem

5 Problem 1: Babelization Proliferation of data models (XML schemas, etc.) Parsing issues influence data models No consistent semantics Data chaos [Image: Tower of Babel, Abel Grimmer]

6 Problem 2: Integration complexity Many data producers, many data consumers Producers and consumers interact in complex ways Tight coupling hampers independent versioning...

7 Problem 3: Client/service versioning Need to version clients and services independently Data models evolve No such thing as the data model: – There are several, slightly different but related models [Diagram: Client v1, v2, v3; Service v1, v2, v3]

8 RDF and SOA RDF can help: – Bridge vocabularies / data formats – Looser data coupling – Consistent semantics across applications SOA can help: – Looser process coupling How?

9 PART 2 Architectural Strategies

10 1. RDF message semantics Interface contract can specify RDF, regardless of serialization RDF pins the semantics [Diagram: Client, Service, RDF]

11 But Web services use XML! XML is well known and used Existing apps may require specific XML or other formats that cannot be changed How can we gain the benefits of RDF message semantics while still accommodating XML?

12 Custom XML serializations of RDF Recall: RDF is syntax independent – Specifies info model -- not syntax! – Can be serialized in any agreed-upon way Therefore: – Can view existing XML formats as custom serialization of RDF! How? GRDDL...

13 What is GRDDL? "Gleaning Resource Descriptions from Dialects of Languages" W3C standard Permits RDF to be "gleaned" from XML XML document or schema specifies GRDDL transformation GRDDL transformation produces RDF from XML document – Transformation is typically written in XSLT

14 2. GRDDL transformations from XML to RDF Therefore: Same XML document can be consumed by: −Legacy XML app −RDF app App interface contract can specify RDF −Serializations can vary −Semantics are pinned by RDF Helps bridge XML and RDF worlds

15 Bridging XML and RDF Input: Accept whatever formats are required – Use GRDDL to transform XML to RDF Output: Serialize to whatever formats are required – Generate XML/other directly (or even RDF!), or – SPARQL query can generate specific view first [Diagram: Client; XML/other; Service; Normalize to RDF; Core App Processing; Serialize as XML/other/RDF]
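As a rough illustration of the input side of this slide, the sketch below gleans RDF-style triples from a small XML message. GRDDL transformations are typically written in XSLT, so this Python stand-in only mimics the idea and is not a GRDDL implementation. The XML layout, the `http://example.org/` namespace, and the function name `glean_triples` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace, for illustration only.
EX = "http://example.org/"

# Hypothetical legacy XML message; a legacy XML app would read it as-is.
XML_MESSAGE = """
<patients>
  <patient id="Patient123">
    <name>Jane Doe</name>
    <highBloodPressure>true</highBloodPressure>
  </patient>
</patients>
"""

def glean_triples(xml_text):
    """Return (subject, predicate, object) triples gleaned from the XML."""
    root = ET.fromstring(xml_text)
    triples = []
    for patient in root.findall("patient"):
        subject = EX + patient.get("id")
        # Each child element becomes one property of the patient.
        for child in patient:
            triples.append((subject, EX + child.tag, (child.text or "").strip()))
    return triples

triples = glean_triples(XML_MESSAGE)
for s, p, o in triples:
    # Print in an N-Triples-like form; objects are treated as plain literals.
    print(f'<{s}> <{p}> "{o}" .')
```

The same XML text stays usable by an XML-only consumer, while an RDF consumer works from the triples.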

16 3. REST-based SPARQL endpoints Why REST and why SPARQL? [Diagram: Consumer; Producer; SPARQL; RDF; HTTP]

17 What is REST? REST: Representational State Transfer Architectural style Identified by Roy Fielding in PhD thesis Based on uniform interface – HTTP GET, PUT, POST, DELETE

18 Why REST? HTTP is ubiquitous Simpler than SOAP-based Web services (WS*) Looser process coupling – Easier to change/version the process flow

19 What is SPARQL? W3C standard Query language for RDF Modeled after SQL: SELECT... WHERE...

20 Why SPARQL? RDF gives looser data coupling Insulates consumers from internal model changes – Inferencing can transform data to consumer's desired model One endpoint supports multiple consumer needs – Each consumer gets what it wants Simpler interface for consumers – Uniform SPARQL interface instead of a different set of parameters for each REST endpoint
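To make the "uniform SPARQL interface" point concrete, here is a minimal sketch of how a consumer might form a request to a REST-based SPARQL endpoint. It builds the HTTP GET request without sending it. The endpoint URL and the `ex:highBloodPressure` vocabulary are hypothetical; the `query` parameter and the results media type come from the W3C SPARQL protocol and results specifications.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint URL; the talk does not name one.
ENDPOINT = "http://example.org/sparql"

# Hypothetical vocabulary (ex:highBloodPressure), for illustration only.
QUERY = """
PREFIX ex: <http://example.org/>
SELECT ?patient
WHERE { ?patient ex:highBloodPressure "true" }
"""

def build_sparql_request(endpoint, query):
    """Build (but do not send) an HTTP GET request carrying a SPARQL query."""
    # The SPARQL protocol passes the query text in the "query" parameter.
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the SPARQL JSON results format.
    headers = {"Accept": "application/sparql-results+json"}
    return Request(url, headers=headers, method="GET")

request = build_sparql_request(ENDPOINT, QUERY)
print(request.get_method(), request.full_url)
```

Every consumer uses the same request shape; only the query text changes, which is the contrast the slide draws with per-endpoint parameter sets.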

21 4. Semantic Data Federation Get data from multiple sources Provide data to consumers Model transformation, caching, etc. Conceptual component -- not necessarily a separate service [Diagram: Semantic Data Federation; A1, A2, A3; B1, B2; C1, C2; X, Y, Z; Ontologies & Rules; SPARQL; Adapters]

22 Key features of semantic data federation REST-based SPARQL endpoint – Client gets just the data it wants Support for a variety of data sources – E.g., SQL, SPARQL(!), etc. – Easy to add a new data source adapter, e.g., HTTP Caching – Not multiple masters Inferencing Provides loose coupling at both data and process levels

23 Why inferencing? Allows new data sources to be more readily connected to existing data Allows new output vocabularies to be more readily supported in response to client needs Easier versioning with both clients and data sources – Inferencing can help bridge across versions

24 Data source adapters Responsible for: – Mechanics of getting the data – Transforming from native format to RDF May involve custom code or reusable tools – E.g., Gloze performs XML-to-RDF lift and RDF-to-XML drop [Diagram: Semantic Data Federation; Ontologies & Rules; SPARQL; Adapters]

25 Add a new data source Strategy: 1. Adapter transforms native format to corresponding RDF (not directly to hub ontology!) 2. Bridging rules transform to hub ontology [Diagram: Data Source; Adapter; Ontologies & Rules; SPARQL; Semantic Data Federation]
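The two-step strategy on this slide can be sketched as follows. The adapter lifts native records into triples that use the source's own vocabulary, and separate bridging rules then map those to the hub ontology. All names here (`SRC`, `HUB`, the column names, and the predicate names) are hypothetical, and a simple predicate-to-predicate mapping stands in for real ontology rules.

```python
# Hypothetical vocabularies: SRC for the data source, HUB for the hub ontology.
SRC = "http://example.org/source#"
HUB = "http://example.org/hub#"

def adapter(rows):
    """Step 1: lift native records to triples in the source's own vocabulary."""
    triples = set()
    for row in rows:
        subject = SRC + "patient/" + str(row["pid"])
        for column, value in row.items():
            if column != "pid":
                triples.add((subject, SRC + column, value))
    return triples

# Step 2: bridging rules, kept separate from the adapter.
# Each rule maps a source predicate to a hub predicate (hypothetical names).
BRIDGING_RULES = {
    SRC + "sys_bp": HUB + "systolicBloodPressure",
    SRC + "dia_bp": HUB + "diastolicBloodPressure",
}

def apply_bridging_rules(triples, rules):
    """Add hub-ontology triples for every source triple that a rule covers."""
    inferred = {(s, rules[p], o) for (s, p, o) in triples if p in rules}
    # Keep the source triples too, so nothing already asserted is lost.
    return triples | inferred

native_rows = [{"pid": 123, "sys_bp": 150, "dia_bp": 96}]
hub_view = apply_bridging_rules(adapter(native_rows), BRIDGING_RULES)
print(sorted(hub_view, key=str))
```

Keeping the adapter ignorant of the hub ontology means a change to the hub model touches only the bridging rules, not the code that fetches the data.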

26 Adding a new output vocabulary Strategy: 1. Bridging rules transform from hub ontologies to new output vocabulary 2. Client can query using desired vocabulary [Diagram: Data Source; Adapter; Ontologies & Rules; SPARQL; Client; Semantic Data Federation]

27 5. Named graphs Different queries require different subsets of data Entire data may be too big to process all at once So... Sets of RDF data can be bundled as named graphs Query strategy can pull in only the named graphs that are needed, i.e., a working set – Graphs can be freely merged – Contents can overlap

28 Using named graphs for data subsets Examples: Specific longitudinal data across patients Detailed data for each surgical event Data on a particular group of patients
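A minimal sketch of the working-set idea from the two slides above: data is bundled into named graphs, and a query strategy merges only the graphs it needs. The graph names, the triples, and the helper functions are hypothetical, and plain Python sets stand in for an RDF store.

```python
# Hypothetical graph names and triples, loosely following the slide's examples.
NAMED_GRAPHS = {
    "graph:longitudinal-bp": {
        ("Patient123", "systolicBP", 150),
        ("Patient456", "systolicBP", 118),
    },
    "graph:surgical-event-789": {
        ("Event789", "patient", "Patient123"),
        ("Event789", "procedure", "valve repair"),
    },
    "graph:cohort-a": {
        ("Patient123", "memberOf", "CohortA"),
        # Contents can overlap with other graphs.
        ("Patient123", "systolicBP", 150),
    },
}

def working_set(graphs, names):
    """Merge only the requested named graphs into one set of triples."""
    merged = set()
    for name in names:
        merged |= graphs[name]
    return merged

def subjects_with(triples, predicate):
    """Tiny stand-in for a query: subjects that have the given predicate."""
    return sorted({s for (s, p, _) in triples if p == predicate})

ws = working_set(NAMED_GRAPHS, ["graph:longitudinal-bp", "graph:cohort-a"])
print(len(ws), subjects_with(ws, "systolicBP"))
```

Because merging is a set union, overlapping triples collapse to one, and the surgical-event graph is never loaded for this query.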

29 6. Monotonicity Monotonicity: Old conclusions remain true when new facts are added System design choice – not automatic Without monotonicity: – Data change invalidates everything downstream – System is more tightly coupled – Different components must be versioned in lock step With monotonicity: – New data can be added freely – Easier versioning – More robust

30 Monotonicity is valuable, but not free! Data models can be simpler without monotonicity – Engineering trade-off Non-monotonic design: – “Patient123 highBloodPressure true” Monotonic design: – “Patient123 highBloodPressure true at 12:22PM 23-Aug-2007” – “Patient123 highBloodPressure false at 04:05PM 24-Aug-2007” How to get the best of both worlds?
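The contrast between the two designs can be sketched in a few lines. In the non-monotonic store a new fact overwrites the old one. In the monotonic store each assertion is qualified by its time and facts are only ever added. The store layout and the helper names `assert_fact` and `value_at` are assumptions for the illustration.

```python
from datetime import datetime

# Non-monotonic design: one mutable fact per patient; updates overwrite history.
non_monotonic = {("Patient123", "highBloodPressure"): True}
non_monotonic[("Patient123", "highBloodPressure")] = False  # old conclusion is gone

# Monotonic design: every assertion carries its time and is only ever added.
monotonic = set()

def assert_fact(store, subject, predicate, value, at):
    """Add a time-qualified assertion; existing assertions are never changed."""
    store.add((subject, predicate, value, at))

assert_fact(monotonic, "Patient123", "highBloodPressure", True,
            datetime(2007, 8, 23, 12, 22))
before = set(monotonic)  # snapshot taken before the newer fact arrives
assert_fact(monotonic, "Patient123", "highBloodPressure", False,
            datetime(2007, 8, 24, 16, 5))

def value_at(store, subject, predicate, when):
    """Latest asserted value at or before `when`, or None if nothing is known."""
    known = [(at, v) for (s, p, v, at) in store
             if s == subject and p == predicate and at <= when]
    return max(known, key=lambda pair: pair[0])[1] if known else None

print(value_at(monotonic, "Patient123", "highBloodPressure",
               datetime(2007, 8, 23, 18, 0)))
```

The cost the slide mentions is visible here: the monotonic store needs a timestamp on every assertion and a helper to answer what was a single dictionary lookup.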

31 Distilling data to simplify queries Detailed raw data can be distilled into simpler assertion sets – Easier for specific queries Example raw data: – “Patient123 BP: 150/96 at 12:22PM 23-Aug-2007” – “Patient123 BP: 155/97 at 06:32PM 23-Aug-2007” Distilled for “23-Aug-2007”: – “Patient123 highBloodPressure true” Meaning: “Patient123 had high blood pressure at some time”
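The distillation step might look like the sketch below, which turns the slide's raw readings into one simple assertion for a given day. The cut-offs for "high" (systolic 140, diastolic 90) are assumptions, since the slides do not state a threshold. The function name `distill_high_bp` is also hypothetical.

```python
from datetime import date, datetime

# Raw readings from the slide: (patient, systolic, diastolic, time taken).
RAW_BP = [
    ("Patient123", 150, 96, datetime(2007, 8, 23, 12, 22)),
    ("Patient123", 155, 97, datetime(2007, 8, 23, 18, 32)),
]

# Assumed cut-offs for "high"; the slides do not state a threshold.
SYSTOLIC_LIMIT = 140
DIASTOLIC_LIMIT = 90

def distill_high_bp(readings, day):
    """Return simple assertions for one day: high BP at some time that day."""
    distilled = set()
    for patient, systolic, diastolic, taken in readings:
        is_high = systolic >= SYSTOLIC_LIMIT or diastolic >= DIASTOLIC_LIMIT
        if taken.date() == day and is_high:
            distilled.add((patient, "highBloodPressure", True))
    return distilled

distilled_23_aug = distill_high_bp(RAW_BP, date(2007, 8, 23))
print(distilled_23_aug)
```

Two raw readings collapse into one assertion, which is the information loss that the day-level context has to account for.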

32 Using named graphs for distilled data Distilled data: – Easier for specific queries – Less general than raw data – May involve information loss Named graph can act as context – Semantics are qualified (or loosened) – E.g. Named graph for 23-Aug-2007 indicates “Patient123 had high blood pressure at some time” SPARQL update language (SPARUL) will make named graphs easy to create from queries Raw data should also be kept (in separate named graphs)

33 Adding named graphs for distilled data [Diagram: raw data, plus named graphs of distilled data: “Is obese”; “Had high blood pressure prior to admission”; “Has condition X”]

34 Abandoning unneeded named graphs Unneeded named graphs can be ignored – And eventually discarded [Diagram: raw data; named graphs of distilled data]

35 Summary of monotonicity strategy Don't change data! – Create new named graphs instead – Use named graphs to compartmentalize data But if you must change data: – Use named graphs to limit downstream impact – Only regenerate those that are affected Retain both raw data and distilled data (in separate named graphs)
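For the "if you must change data" case, one possible sketch is to record which raw named graph each distilled graph was derived from, and to rebuild only the distilled graphs that depend on the changed one. The graph names, the `derived_from` bookkeeping, and the high-blood-pressure rule are all assumptions for the example.

```python
# Hypothetical raw named graphs, keyed by graph name.
raw_graphs = {
    "raw:bp-2007-08-23": [("Patient123", 150, 96)],
    "raw:bp-2007-08-24": [("Patient123", 118, 76)],
}

# Which raw graph each distilled graph was derived from (assumed bookkeeping).
derived_from = {
    "distilled:high-bp-2007-08-23": "raw:bp-2007-08-23",
    "distilled:high-bp-2007-08-24": "raw:bp-2007-08-24",
}

def distill(readings):
    """Assumed rule: systolic >= 140 or diastolic >= 90 counts as high."""
    return {(patient, "highBloodPressure", True)
            for (patient, systolic, diastolic) in readings
            if systolic >= 140 or diastolic >= 90}

distilled_graphs = {name: distill(raw_graphs[src])
                    for name, src in derived_from.items()}

def regenerate_affected(changed_raw_graph):
    """Rebuild only the distilled graphs derived from the changed raw graph."""
    rebuilt = []
    for name, src in derived_from.items():
        if src == changed_raw_graph:
            distilled_graphs[name] = distill(raw_graphs[src])
            rebuilt.append(name)
    return rebuilt

# A correction to one day's raw data touches one distilled graph only.
raw_graphs["raw:bp-2007-08-24"] = [("Patient123", 148, 92)]
rebuilt = regenerate_affected("raw:bp-2007-08-24")
print(rebuilt)
```

Consumers of the 23-Aug graph see no change at all, which is how the named-graph boundary limits downstream impact.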

36 Summary of architectural strategies 1. RDF message semantics 2. GRDDL transformations from XML to RDF 3. REST-based SPARQL endpoints 4. Semantic Data Federation 5. Named graphs 6. Monotonicity

37 PART 3 Example: Cleveland Clinic SemanticDB

38 SemanticDB Project Applies semantic web technology to: – Clinical research – Outcomes reporting – Quality reporting Sponsored by Cleveland Clinic's Heart and Vascular Institute

39 Cleveland Clinic SemanticDB Project [Architecture diagram labels: SPARQL interface; Patient registry; Genetic patient registry; Tagged literature, e.g., PUBMED; Cyc natural language processing; Patient-centric systems; Semantic wiki; Structured query; Natural language query; Instance data; User interfaces; Data-source adaptors; Semantic Data Federation; Gene Ontology (GO); Ontology of Medicine; Domain-specific Ontologies; Data-source Ontologies; SQL, SPARQL; Cyc upper ontology; Ontologies; Patient Data Entry]

40 More information Cleveland Clinic SemanticDB project: RDF and SOA: SPARQL: GRDDL:

41 Questions?