Web Services as Integrators of Public Chemistry Databases Gary Wiggins School of Informatics Indiana University

Slides:



Advertisements
Similar presentations
웹 서비스 개요.
Advertisements

18 Copyright © 2005, Oracle. All rights reserved. Distributing Modular Applications: Introduction to Web Services.
Web Service Architecture
Siebel Web Services Siebel Web Services March, From
Overview of Web Services
31242/32549 Advanced Internet Programming Advanced Java Programming
Web Service Ahmed Gamal Ahmed Nile University Bioinformatics Group
1 Understanding Web Services Presented By: Woodas Lai.
Web Services Darshan R. Kapadia Gregor von Laszewski 1http://grid.rit.edu.
Web Services Nasrullah. Motivation about web service There are number of programms over the internet that need to communicate with other programms over.
WEB SERVICES DAVIDE ZERBINO.
Interactive Systems Technical Design Seminar work: Web Services Janne Ojanaho.
1 Introduction to XML. XML eXtensible implies that users define tag content Markup implies it is a coded document Language implies it is a metalanguage.
1 Introduction to SOA. 2 The Service-Oriented Enterprise eXtensible Markup Language (XML) Web services XML-based technologies for messaging, service description,
Presentation 7 part 2: SOAP & WSDL. Ingeniørhøjskolen i Århus Slide 2 Outline Building blocks in Web Services SOA SOAP WSDL (UDDI)
Latest techniques and Applications in Interprocess Communication and Coordination Xiaoou Zhang.
A New Computing Paradigm. Overview of Web Services Over 66 percent of respondents to a 2001 InfoWorld magazine poll agreed that "Web services are likely.
Grid Computing, B. Wilkinson, 20043a.1 WEB SERVICES Introduction.
Web Services Seppo Heikkinen MITA seminar/TUT
Ch 12 Distributed Systems Architectures
Web Service What exactly are Web Services? To put it quite simply, they are yet another distributed computing technology (like CORBA, RMI, EJB, etc.).
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
Web Services Michael Smith Alex Feldman. What is a Web Service? A Web service is a message-oriented software system designed to support inter-operable.
Web service testing Group D5. What are Web Services? XML is the basis for Web services Web services are application components Web services communicate.
Processing of structured documents Spring 2003, Part 6 Helena Ahonen-Myka.
Introduction to UDDI From: OASIS, Introduction to UDDI: Important Features and Functional Concepts.
Web Services Mohamed Fahmy Dr. Sherif Aly Hussein.
Web Services Architecture1 - Deepti Agarwal. Web Services Architecture2 The Definition.. A Web service is a software system identified by a URI, whose.
Web services: Why and How OOPSLA 2001 F. Curbera, W.Nagy, S.Weerawarana Nclab, Jungsook Kim.
The Semantic Web Service Shuying Wang Outline Semantic Web vision Core technologies XML, RDF, Ontology, Agent… Web services DAML-S.
OASIS ebXML Registry Standard Open Forum 2003 on Metadata Registries 10:30 – 11:15 January 20, 2003 Kathryn Breininger The Boeing Company Chair, OASIS.
DISTRIBUTED COMPUTING
UDDI ebXML(?) and such Essential Web Services Directory and Discovery.
Outline  Enterprise System Integration: Key for Business Success  Key Challenges to Enterprise System Integration  Service-Oriented Architecture (SOA)
1 HKU CSIS DB Seminar: HKU CSIS DB Seminar: Web Services Oriented Data Processing and Integration Speaker: Eric Lo.
1 Technologies for distributed systems Andrew Jones School of Computer Science Cardiff University.
Dodick Zulaimi Sudirman Lecture 14 Introduction to Web Service Pengantar Teknologi Internet Introduction to Internet Technology.
Web Services (SOAP, WSDL, UDDI) SNU OOPSLA Lab. October 2005.
Dr. Bhavani Thuraisingham October 2006 Trustworthy Semantic Webs Lecture #16: Web Services and Security.
Web Services based e-Commerce System Sandy Liu Jodrey School of Computer Science Acadia University July, 2002.
EU Project proposal. Andrei S. Lopatenko 1 EU Project Proposal CERIF-SW Andrei S. Lopatenko Vienna University of Technology
© DATAMAT S.p.A. – Giuseppe Avellino, Stefano Beco, Barbara Cantalupo, Andrea Cavallini A Semantic Workflow Authoring Tool for Programming Grids.
Web Services Standards. Introduction A web service is a type of component that is available on the web and can be incorporated in applications or used.
1 Advanced Software Architecture Muhammad Bilal Bashir PhD Scholar (Computer Science) Mohammad Ali Jinnah University.
Web Services. Abstract  Web Services is a technology applicable for computationally distributed problems, including access to large databases What other.
Introduction to Server-Side Web Development Introduction to Server-Side Web Development using JSP and Web Services JSP and Web Services 18 th March 2005.
Web Services Presented By : Noam Ben Haim. Agenda Introduction What is a web service Basic Architecture Extended Architecture WS Stacks.
RSISIPL1 SERVICE ORIENTED ARCHITECTURE (SOA) By Pavan By Pavan.
Kemal Baykal Rasim Ismayilov
1 Registry Services Overview J. Steven Hughes (Deputy Chair) Principal Computer Scientist NASA/JPL 17 December 2015.
1 G52IWS: Web Services Chris Greenhalgh. 2 Contents The World Wide Web Web Services example scenario Motivations Basic Operational Model Supporting standards.
Providing web services to mobile users: The architecture design of an m-service portal Minder Chen - Dongsong Zhang - Lina Zhou Presented by: Juan M. Cubillos.
Introduction to Web Services Presented by Sarath Chandra Dorbala.
OASIS ebXML Registry Standard Open Forum 2003 on Metadata Registries 10:30 – 11:15 January 20, 2003 Kathryn Breininger The Boeing Company Chair, OASIS.
Copyright 2007, Information Builders. Slide 1 iWay Web Services and WebFOCUS Consumption Michael Florkowski Information Builders.
XML and Distributed Applications By Quddus Chong Presentation for CS551 – Fall 2001.
A service Oriented Architecture & Web Service Technology.
Grid Services for Digital Archive Tao-Sheng Chen Academia Sinica Computing Centre
Service Oriented Architecture (SOA) Prof. Wenwen Li School of Geographical Sciences and Urban Planning 5644 Coor Hall
Java Web Services Orca Knowledge Center – Web Service key concepts.
Sabri Kızanlık Ural Emekçi
Service-Oriented Computing: Semantics, Processes, Agents
Implementing a service-oriented architecture using SOAP
Wsdl.
Service-centric Software Engineering
WEB SERVICES DAVIDE ZERBINO.
OWL-S: Bringing Services to the Semantic Web
Presentation transcript:

Web Services as Integrators of Public Chemistry Databases Gary Wiggins School of Informatics Indiana University

Interdisciplinary Science  “The boundary between the known and the unknown is where science flourishes.” --Michael Shermer  “There are known knowns… There are known unknowns… There are unknown unknowns…” --Donald Rumsfeld  BUT….  What about UNKNOWN knowns?

zzzzzCAS ““““In an industry first, Chemical Abstracts Service (CAS) has unveiled a revolutionary new literature searching tool which will permit scientists to search and retrieve the world’s chemical literature— including patents and obscure technical reports—in their sleep.” --Author unknown

Overview of the Talk  Introduction to Web Services  Public Databases and Data Repositories for Chemistry  Projects Underway or Planned at Indiana University

Web Services Introduction  What are “Web Services”? A distributed invocation system built on Grid computing A distributed invocation system built on Grid computing Independent of platform and programming languageIndependent of platform and programming language Built on existing Web standardsBuilt on existing Web standards A service oriented architecture with A service oriented architecture with Interfaces based on Internet protocolsInterfaces based on Internet protocols Messages in XML (except for binary data attachments)Messages in XML (except for binary data attachments)

Service Oriented Architecture (SOA)  Goal is to achieve loose coupling among interacting software agents  Define service: a unit of work done by a service provider to achieve desired end results for a service consumer  Both provider and consumer are roles played by software agents on behalf of their owners.

Web Services Architectures  Individual services are registered globally Broken down into individual services with inputs and outputs specified Broken down into individual services with inputs and outputs specified  Services are published  Services are requested  Open registry, publishing, and requesting

Service-Oriented Architecture  From Curcin et al. DDT, 2005, 10(12),867

Web Services for Science  Invisible Services, Semantic Web, and Grid  Easy-to-use tools for any scientist  High throughput, resource intensive computing done for low cost/resources  Shared community Collaborations between labs and fields Collaborations between labs and fields Shared data Shared data Shared tools Shared tools

e-Science and the Grid 1  e-Science: Major UK Program global collaboration in key areas of science and the next generation of infrastructure that will enable it global collaboration in key areas of science and the next generation of infrastructure that will enable it reflects growing importance of international laboratories, satellites and sensors and their integrated analysis by distributed teamsreflects growing importance of international laboratories, satellites and sensors and their integrated analysis by distributed teams total investment of some £200M over the five-year period from 2001 to 2006total investment of some £200M over the five-year period from 2001 to 2006  CyberInfrastructure: the analogous US initiative  Grid Technology: supports e-Science & Cyberinfrastructure

e-Science and the Grid 2

Basic Architectures: Servlets/CGI and Web Services Browser Web Server HTTP GET/POST DB or MPI Appl. JDBC Web Server DB or MPI Appl. JDBC Browser Web Server SOAP GUI Client SOAP WSDL

When To Use Web Services?  Applications do not have severe restrictions on reliability and speed.  Two or more organizations need to cooperate. One needs to write an application that uses another’s service. One needs to write an application that uses another’s service.  Services can be upgraded independently of clients.  Services can be easily expressed with simple request/response semantics and simple state.

Web Services Benefits  Web services provide a clean separation between a capability and its user interface.  Increase in productivity  Increase in flexibility  Rapid return on investment  Integration across multiple applications

Web Services Advantages  Output in human- and computer-readable formats  I/O formats based on standard Internet protocols  Resources accessible server to server allow automated I/O  Integration based on specific services: you select services or data needed without downloading the entire data set

Web Services Advantages  Description protocols provide details of service provided and interface components  Semantic Web standards increase efficiency  Use a central registry and standardized description of services  Quality and status of the information is dynamically available

Web Services Drawbacks  Based on new technologies  Time and commitment required to learn  Standards still in a state of flux  Issues with quality of data, (and for chemistry, quantity of open data), security, and privacy

Components of Web Services  Protocols WSDL WSDL SOAP SOAP UDDI UDDI  XML as a basis for the protocols  Ontologies OWL: Ontology Web Language OWL: Ontology Web Language  Semantic Web

WSDL: Web Service Definition Language  Describes a service’s interface to clients  Services register themselves with Web Services  WSDL describes how to contact and interact with services I/O, operations and messages to aid interaction with client I/O, operations and messages to aid interaction with client

SOAP: Simple Object Access Protocol  Maps abstract WSDL service descriptions to concrete implementations  Flexible protocol to communicate information between server and server or client and server using XML  Supports Remote Procedure Calls  Allows layers (security, authentication, transactions) over the basic SOAP elements

UDDI: Universal Description, Discovery, and Integration  Provides ways for clients and services to interact with other services  Uses XML  Defines the means of access, e.g., URL URL  Defines services hosted by an entity  Business-oriented tags  Uses SOAP for communicating

XML and Web services  XML lends itself to distributed computing: It’s just a data description. It’s just a data description. Platform, programming language independent Platform, programming language independent  Web Services Description Language (WSDL) Describes how to invoke a service Describes how to invoke a service Can bind to SOAP, other protocols for actual invocation Can bind to SOAP, other protocols for actual invocation  Simple Object Access Protocol (SOAP) Wire protocol extension for conveying RPC calls Wire protocol extension for conveying RPC calls Can be carried over HTTP, SMTP Can be carried over HTTP, SMTP

RDF: Resource Description Framework  A standard for statements describing resources  RDF statements consist of a Subject: the resource being described Subject: the resource being described Predicate: the property ascribed to the resource Predicate: the property ascribed to the resource Object: the entity to which the resource bears that property Object: the entity to which the resource bears that property  An alternative to UDDI: said to more flexible and extensible than UDDI

RDFS: Resource Description Framework Schema  RDF statements ascribe properties to resources, but RDF provides no means for describing those properties.  RDFS defines classes and properties that are used to describe classes, properties, and other resources.  RDFS vocabulary descriptions are written in RDF.  RDFS provides a means of specifying a vocabulary for a particular domain.

OWL: Web Ontology Language  Builds on RDF and RDFS and adds a means for richer descriptions of properties and classes Disjoint classes Disjoint classes Cardinality of classes Cardinality of classes Characteristics of relations, like symmetry Characteristics of relations, like symmetry

Standards for Web Services  Business Process Execution Language for Web Services (BPEL4WS)  Ontology Web Language Semantics (OWL-S)  Web Service Modeling Ontology (WSMO)

Standards Setting Bodies: OASIS  OASIS: Organization for Advancement of Structured Information Standards ebXML: e-business XML ebXML: e-business XML UDDI: Universal Description, Discovery and Integration UDDI: Universal Description, Discovery and Integration  Global Grid Forum community of users, developers, and vendors leading the global standardization effort for grid computing community of users, developers, and vendors leading the global standardization effort for grid computing

Standards Setting Bodies: W3C  W3C: World Wide Web Consortium OWL: Ontology Web Language OWL: Ontology Web Language RDF/RDFS: Resource Description Framework/Schema RDF/RDFS: Resource Description Framework/Schema SOAP: Simple Object Access Protocol SOAP: Simple Object Access Protocol URI/URL/URN: Universal Resource Identifier/Locator/Name URI/URL/URN: Universal Resource Identifier/Locator/Name WSDL: Web Service Definition Language WSDL: Web Service Definition Language XML: eXtensible Markup Language XML: eXtensible Markup Language

Web Services Integration Projects: Biosciences  myGrid  BIOPIPE  BioMOBY

Web Services for Chemistry: Problems  Performance and scalability  Proprietary data  Competition from high-performance desktop applications -- Geoff Hutchison, it’s a puzzle blog,  ALSO: Lack of a substantial body of trustworthy Open Access databases Lack of a substantial body of trustworthy Open Access databases Non-standard chemical data formats (over 40 in regular use and requiring normalization to one another) Non-standard chemical data formats (over 40 in regular use and requiring normalization to one another)

Necessary Ingredients in Chemistry  Chemical communities to assemble Open Access databases Well-defined quality assurance procedures performed by distributed peer-review systems Well-defined quality assurance procedures performed by distributed peer-review systems Software underlying the databases needs to be open source. Software underlying the databases needs to be open source.

BlueObelisk.org  A group of chemists, programmers, and informaticians working collaboratively on projects such as: Chemistry Development Kit (CDK) Chemistry Development Kit (CDK) JChemPaint JChemPaint Jmol Jmol JUMBO JUMBO NMRShiftDB NMRShiftDB Octet Octet Open Babel Open Babel QSAR QSAR World Wide Molecular Matrix (WWMM) World Wide Molecular Matrix (WWMM)

Components of the Semantic Web for Chemistry  XML – eXtensible Markup Language  RDF – Resource Description Framework  RSS – Rich Site Summary  Dublin Core – allows metadata-based newsfeeds  OWL – for ontologies  BPEL4WS – for workflow and web services Murray-Rust et al. Org. Biomol. Chem. 2004, 2, Murray-Rust et al. Org. Biomol. Chem. 2004, 2,

Chemistry Databases on the Web  Marc Nicklaus lists 37 databases as of October 2001 Must have structure searching and at least 100 molecules Must have structure searching and at least 100 molecules  SoaringBear’s List has 15 databases

Institutional Repositories  NARSTO Quality Systems Science Center Pollutant species in the troposphere over North America Pollutant species in the troposphere over North America Part of the Carbon Dioxide Information Analysis Center at ORNL Part of the Carbon Dioxide Information Analysis Center at ORNL NARSTO Data and Information Sharing Tool NARSTO Data and Information Sharing Tool

Public Databases  Developmental Therapeutics Program/NCI Some assay data for download Some assay data for download Structures for over 200,000 compounds Structures for over 200,000 compounds  Zinc and other screening databases  NIST computational chemistry database  Environmental fate and exposure databases

Other Public Databases 1  ChemExper Chemical Directory > 200,000 substances; > 10,000 IR spectra > 200,000 substances; > 10,000 IR spectra  HIC-Up; Hetero-Compound Identification Centre – Uppsala 5384 substances as of 1/15/ substances as of 1/15/  Chemicals with Pharmaceutical Activity; a 3D Structural Database 400 3D structures 400 3D structures

Isatin in ChemExper

Isatin ChemExper IR Spectrum

Isatin from HIC-Up

Other Public Databases 2  Cheminformatics.org 41 data sets in 9 categories as of 8/18/05 41 data sets in 9 categories as of 8/18/  WebReactions

Other Public Databases 3  MolTable  MatWeb Materials Property Data  Spectral Database for Organic Compounds (SDBS) Over 32,000 compounds Over 32,000 compounds Has EI-MS, FT-IR, 1 H NMR, 13 C NMR, Raman, ESR Has EI-MS, FT-IR, 1 H NMR, 13 C NMR, Raman, ESR  NMRShiftDB (Christoph Steinbeck) 14,753 structures as of 8/19/05 14,753 structures as of 8/19/05 Features peer-reviewed submission of data sets Features peer-reviewed submission of data sets

Comment on Link to PubChem  “It is great to know that - with PubChem - there will be something like a single point of entry for people interested in structures and their properties. We are looking forward to link our datasets to PubChem structures.” --Christoph Steinbeck, commenting on the NMRShiftDB, CHMINF-L, 19 April 2005

Other Public Databases: Commercial Teasers  FTIRsearch.com (Thermo Electron) Demo file of 575 spectra from 87,000 in the full database Demo file of 575 spectra from 87,000 in the full database  ChemACX 30 of >350 suppliers catalog data 30 of >350 suppliers catalog data  Sunset Molecular Discovery, LLC Wombat (World of Molecular BioAcTivity) Wombat (World of Molecular BioAcTivity) 117,007 entries with over 230,000 biological activities117,007 entries with over 230,000 biological activities Wombat PK Wombat PK Database for Clinical Pharmacokinetics: 643 substances with 4668 measurementsDatabase for Clinical Pharmacokinetics: 643 substances with 4668 measurements Three sample files from Wombat containing 341 Histamine-1 receptor antagonists Three sample files from Wombat containing 341 Histamine-1 receptor antagonists

Indiana University Existing Projects  System for the Integration of Bioinformatics Services (SIBIOS)  PlatCom: A Platform for Computational Comparative Genomics  Reciprocal Net

Indiana University Planned Projects  Design of a Grid-based distributed data architecture  Development of tools for HTS data analysis and virtual screening  Database for quantum mechanical simulation data  Chemical prototype projects Novel routes to enzymatic reaction mechanisms Novel routes to enzymatic reaction mechanisms Mechanism-based drug design Mechanism-based drug design Data-inquiry-based development of new methods in natural product synthesis Data-inquiry-based development of new methods in natural product synthesis

Community Grids Lab

Open Systems Lab

Web Services Future  Depends on Adoption of standards Adoption of standards Incorporation of WS in current and newly developed applications Incorporation of WS in current and newly developed applications Security, privacy, quality of data issues Security, privacy, quality of data issues Development of WS tools and resources for e-Science Development of WS tools and resources for e-Science

“Tripos Embraces Web Services”  Goal: to make it easier to incorporate Tripos’ high-throughput workflows Tripos’ Service-Oriented Informatics (SOI) Tripos’ Service-Oriented Informatics (SOI)  In partnership with SciTegic: Web services ensure software compatibility between the companies’ products.

Provocative Thought “While much bioscience is published with the knowledge that machines will be expected to understand at least part of it, almost all chemistry is published purely for humans to read.” Murray-Rust et al. Org. Biomol. Chem. 2004, 2, Murray-Rust et al. Org. Biomol. Chem. 2004, 2, 3201.

Closing quote “The future of chemistry depends on the automated analysis of chemical knowledge, combining disparate data sources in a single resource, such as the World-Wide Molecular Matrix, which can be analysed using computational techniques to assess and build on these data.” Townsend et al. Org. Biomol. Chem. 2004, 2, Townsend et al. Org. Biomol. Chem. 2004, 2, 3299.

Acknowledgements  Christoph Steinbeck  Geoffrey Fox, Marlon Pierce  Peter Murray-Rust  Michael D. Jensen, Timothy B. Patrick, Joyce A. Mitchell  Elsevier

Bibliography  Coles, Simon J.; Day, Nick E.; Murray-Rust, Peter; Rzepa, Henry S.; Zhang, Yong. “Enhancement of the chemical semantic web through InChIfication.” Organic & Biomolecular Chemistry 2005, 3,  Curcin, Vera; Ghanem, Moustafa; Guo, Yike. "Web services in the life sciences." Drug Discovery Today 2005, 10(12),  Day, Michael. Metadata: Mapping between metadata formats. (accessed 8/6/2005)  De Roure, D.; Jennings, N. R.; Shadbolt, N. R. “The Semantic Grid: Past, present and future.” Proceedings of the IEEE 2005, 93(3),  Jensen, Michael; Patrick, Timothy B.; Mitchell, Joyce A. “Web services for bioinformatics.” PSB 2004 Tutorial  Murray-Rust, Peter S.; Mitchell, John B.O.; Rzepa, Henry S. “Communication and re-use of chemical information in bioscience.” BMC Bioinformatics 2005, 6, 180.  Murray-Rust, Peter; Mitchell, John B.O.; Rzepa, Henry S. “Chemistry in bioinformatics.” BMC Bioinformatics 2005, 6,

Bibliography  Murray-Rust, Peter; Rzepa, Henry S.; Tyrrell, Simon M.; Zhang, Yong. “Representation and use of chemistry in the globabl electronic age.” Organic & Biomolecular Chemistry 2004, 2,  Salamone, Salvatore. “Tripos embraces Web Services. Bio-IT World, 2005, 4(8), 8.  Shermer, Michael. “Rumsfeld’s Wisdom.” Scientific American 2005, 293(3), 38.  Townsend, Joe A.; Adams, Sam E.; Waudby, Christopher A.; de Souza, Vanessa K.; Goodman, Jonathan M; Murray-Rust, Peter. “Chemical documents: machine understanding and automated information extraction.” Organic & Biomolecular Chemistry 2004, 2,  World Wide Molecular Matrix.  Summary Report – W3C Workshop on Semantic Web for Life Sciences, October 27-28,