
1 26/04/2015 BioAID e-science is…

2 Legos “Science is built up of facts, as a house is built of stones; but an accumulation of facts is no more a science than a heap of stones is a house.” – Henri Poincaré, Science and Hypothesis, 1905

3 Who will annotate the annotators themselves? facilitating resource management with (semantic) web services M. Scott Marshall

4 Examples on web Example of less accessible: WSDL list for AIDA services (these services “annotate”) Human-readable service info: But not machine-readable..

5 Outline Vision – an e-science virtual laboratory Some definitions Some requirements Essential concepts of semantic web facets for interfaces Conclusions

6 The Vision: Scientist as knowledge worker For Knowledge Workers: –Knowledge is the data (i.e. rules, relations, properties, hypotheses, etc.) For Today's Biologist: –Numbers, sequences, organisms(!), and images are the data Manipulate knowledge instead of data –Find support for relations between concepts instead of discovering table and column names and numbers. In the virtual laboratory, everything is a resource that can be described and manipulated with semantics

7 User....? –End users – scientists using our applications –API users – programmers extending and using our code –System administrators – setting up services, grids etc. –Other classes... If you’re not sure which one someone means please shout and ask them! Slide courtesy of Tom Oinn, OMII-EBI Workshop

8 Service Oriented Architecture (SOA) A way of doing computing where services are somehow combined to perform some overall function Implies a communication framework between the services Used because it’s easier to reconfigure the arrangement of a set of services than to rewrite a script –Services as LEGO bricks Slide courtesy of Tom Oinn, OMII-EBI Workshop
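A minimal Python sketch of the "services as LEGO bricks" idea on this slide. The three service functions and the `compose` helper are hypothetical stand-ins invented for illustration; in a real SOA each would be a remote call over some communication framework.

```python
from functools import reduce
from typing import Callable, List

Service = Callable[[str], str]


def compose(services: List[Service]) -> Service:
    """Chain services so that each one consumes the previous one's output."""
    def pipeline(payload: str) -> str:
        return reduce(lambda data, service: service(data), services, payload)
    return pipeline


# Hypothetical services; each stands in for a remote call.
def fetch_abstract(query: str) -> str:
    return f"abstract about {query}"


def extract_terms(text: str) -> str:
    return text.replace("abstract about ", "")


def annotate(term: str) -> str:
    return f"<annotated>{term}</annotated>"


# Rearranging the bricks means editing a list, not rewriting a script.
workflow = compose([fetch_abstract, extract_terms, annotate])
reordered = compose([fetch_abstract, annotate])
```

Calling `workflow("histone")` runs all three steps in order, while `reordered` skips term extraction without touching any service code.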

9 Grid Not just Globus, or EGEE, or Naregi... No such thing as ‘the grid’ –Unlike ‘the internet’ which does exist! We mean : –A computational facility, normally comprising multiple computers, which provides some combination of compute and data storage capacity and which can abstract over its inner workings in some fashion –Very loose definition! Can be part of a Service Oriented Architecture Slide courtesy of Tom Oinn, OMII-EBI Workshop

10 Knowledge “data”, “information”, “facts”, “knowledge” Knowledge is a statement that can be tested for truth. (by a machine)

11 RDF : a web format for knowledge RDF is a W3C language to express statements. RDF Triple: Subject Predicate Object Graph of Knowledge: Node Edge Node
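The subject/predicate/object structure can be sketched in plain Python without any RDF library. The `ex:` names below are illustrative placeholders, not real URIs, and `match` is a hypothetical helper rather than a standard query API.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical statements: each triple is one edge in the knowledge graph.
graph: List[Triple] = [
    ("ex:p53", "ex:participatesIn", "ex:apoptosis"),
    ("ex:p53", "ex:locatedIn", "ex:cytosol"),
    ("ex:histoneModification", "ex:relatedTo", "ex:transcription"),
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]
```

For example, `match(graph, subject="ex:p53")` returns every statement made about that node.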

12 OWL : The Web Ontology Language A W3C standard for ontology representation based on description logic.

13 Resources are shared on the web Shared: – CPU time – network bandwidth – memory – storage space But also: –Data –Knowledge –Services

14 Computational experiment: what we want to do with the resources Database Computational experiment in workflow environment Database...

15 What are the tasks? Search – discovering resources that match our needs Workflow composition Data integration Enactment/Deployment Access control Registry of a resource

16 Issues raised by computational experimentation How will we find relevant data? How will we automatically integrate such data into our experiment? How will we find appropriate services? How will we integrate our results as usable data for a new (computational) experiment? -> annotation

17 Finding the stone… Where is the piece that is red, has a triangular top, and was previously used to build a roof?

18 Computational Experiments Anticipated needs of the data consumer Data integration - combining different types of data –Data annotation: beyond formats Not only: –Data types (integer, string, etc.) But also: –Data semantics: What do the data represent? »Determined by the experimental design –Provenance: What has been done to the data? »Description of the procedure(s) that produced/transformed the data Discover and enact appropriate (web) services with appropriate data Reuse results from a computational experiment as data in another computational experiment –derived data is “tagged” and put into the repository
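One way to picture "derived data is tagged" is a dataset record that carries its semantics and a running provenance list. This is a sketch under assumptions: the `Dataset` class, the `apply_step` helper, and the sample values are all hypothetical, not the repository design from the talk.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Dataset:
    values: List[float]
    semantics: str  # what the data represent
    provenance: List[str] = field(default_factory=list)  # what has been done to the data


def apply_step(
    data: Dataset, name: str, func: Callable[[List[float]], List[float]]
) -> Dataset:
    """Run one procedure and return derived data tagged with its history."""
    return Dataset(
        values=func(data.values),
        semantics=data.semantics,
        provenance=data.provenance + [name],
    )


# Hypothetical input values, made up for illustration.
raw = Dataset(values=[4.0, 2.0, 6.0], semantics="expression level (hypothetical)")
scaled = apply_step(raw, "scale_to_max", lambda xs: [x / max(xs) for x in xs])
ranked = apply_step(scaled, "sort_descending", lambda xs: sorted(xs, reverse=True))
```

Because every step returns a new record, `ranked` can be reused as input to another experiment with its full history attached, while `raw` stays untouched.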

19 Anticipated needs of the data supplier (and consumer) Data in: –Simple submission/registration of data to e-science repository Semi-automatic annotation Data out: –Easy search and retrieval of previous datasets (my personal and my group’s data) –Easy search and retrieval of relevant datasets from public repository Combining data: –Different types and different sources Example: Intersecting views of data –data mapped to physical or semantic space (Examples follow..)

20 The Semantic Gap User Resources Middleware Application

21 The Model in the middle User Resources Middleware Application My Model Model

22 Why semantic annotation? We want annotation to be “machine-readable”: Free text – arbitrary text tags generated by users won’t always match up –Simplest problem: Finding a “named” object Synonyms - Different names exist for the same object in different contexts and roles. Homonyms - The same name is used for different objects. Which name should I use? Standardized vocabulary list –can only find literal matches Example: Using data types to search for services will find too many! Semantic tags –allow searching for similar items: “Find items like this one.” –allow searching with a description: “Find items with these properties.” –semantic description of service (SA-WSDL) as well as data (OWL)
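The two search styles named on this slide can be sketched over a small table of annotated resources. Everything here is assumed for illustration: the resource names, the property keys, and both helper functions are hypothetical, and "similar" is reduced to sharing at least one property value.

```python
from typing import Dict, List

# Hypothetical annotations: each resource is described by property/value pairs.
resources: Dict[str, Dict[str, str]] = {
    "serviceA": {"kind": "service", "input": "gene identifier", "output": "GO terms"},
    "serviceB": {"kind": "service", "input": "text", "output": "gene identifier"},
    "dataset1": {"kind": "data", "represents": "histone modification"},
}


def find_by_description(description: Dict[str, str]) -> List[str]:
    """'Find items with these properties': every listed property must match."""
    return sorted(
        name
        for name, props in resources.items()
        if all(props.get(key) == value for key, value in description.items())
    )


def find_similar(name: str) -> List[str]:
    """'Find items like this one': other items sharing at least one property value."""
    target = set(resources[name].items())
    return sorted(
        other
        for other, props in resources.items()
        if other != name and target & set(props.items())
    )
```

Searching on `{"kind": "service"}` alone returns both services, which mirrors the slide's point that a coarse description finds too many; adding `"output"` narrows the result.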

23 What is an ontology? Definitions: –A collection of things that are defined in terms of their properties and relations to other things. –A specification of a conceptualization that is designed for reuse across multiple applications and implementations (Gruber ’93, ’95; Guarino ’96; Guarino and Giaretta ’95) General applications: –Searching for objects that are resources, documents, concepts, experimental data, or collections of these things. –Knowledge capture Example: Biological model with hypothetical knowledge Common applications in bioinformatics: –Annotation of database entries (e.g. gene products) –Categorization of clustered elements (e.g. genes)

24 Inheritance in ontologies Often represented as DAGs (Directed Acyclic Graphs) or hierarchies (trees) Power of inheritance –Subsumption relations (ISA) apply transitivity to create inheritance of class and properties downward along chains in the hierarchy. Use an element as a metadata tag for semantic annotation (ontotag) –An ontotag serves as a pointer into a “semantic space” Diagram labels, by level: Animal; Mammal, Bird; Robin, Heron, Penguin
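A small Python sketch of transitive subsumption using the class names from the slide's diagram. Placing Robin, Heron and Penguin under Bird is an assumption about the original figure, and `ancestors`, `is_a` and `tagged_with` are hypothetical helpers, not part of any ontology toolkit.

```python
from typing import Dict, List, Optional, Set

# Child -> parent (ISA) links. Putting the three birds under Bird is an assumption.
IS_A: Dict[str, Optional[str]] = {
    "Animal": None,
    "Mammal": "Animal",
    "Bird": "Animal",
    "Robin": "Bird",
    "Heron": "Bird",
    "Penguin": "Bird",
}


def ancestors(cls: str) -> Set[str]:
    """Follow ISA links upward; transitivity gives every superclass."""
    found: Set[str] = set()
    parent = IS_A[cls]
    while parent is not None:
        found.add(parent)
        parent = IS_A[parent]
    return found


def is_a(cls: str, candidate: str) -> bool:
    """True if cls is the candidate class or one of its descendants."""
    return cls == candidate or candidate in ancestors(cls)


def tagged_with(annotations: Dict[str, str], query: str) -> List[str]:
    """Return items whose ontotag is the query class or any subclass of it."""
    return sorted(item for item, tag in annotations.items() if is_a(tag, query))
```

With items tagged `Robin` and `Penguin`, a query for `Bird` finds both even though neither carries the literal tag "Bird"; that is the practical gain over a flat vocabulary list.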

25 Gene Ontology Mouse p53: {List of GO identifiers} Process: apoptosis, DNA damage response, signal transduction by p53 class mediator... Component: cytoplasm, cytosol... Function: DNA binding, protein binding... Cluster of genes X from microarray analysis Collection of {List of GO identifiers} per gene in cluster => Most prevalent GO identifiers: => Apoptosis, Cytosol, Protein Binding => Significant relationships between GO classes (e.g. cell death and DNA damage response)
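The "most prevalent GO identifiers" step can be sketched as a simple count over the per-gene term lists. The gene names and their annotations below are made up for illustration, term labels are used in place of real GO identifiers, and plain counting is a stand-in: it does not test whether a term is statistically significant.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical cluster: gene names and term lists are invented for illustration.
cluster: Dict[str, List[str]] = {
    "geneA": ["apoptosis", "cytosol", "protein binding"],
    "geneB": ["apoptosis", "cytosol", "DNA binding"],
    "geneC": ["apoptosis", "protein binding", "cytoplasm"],
}


def most_prevalent(
    annotations: Dict[str, List[str]], top_n: int
) -> List[Tuple[str, int]]:
    """Count in how many genes each term occurs and return the top_n terms."""
    counts = Counter(
        term for terms in annotations.values() for term in set(terms)
    )
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]
```

On this toy cluster, `most_prevalent(cluster, 3)` ranks apoptosis first, followed by cytosol and protein binding.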


27 Semantic annotation - ontotags Workflow provenance Author Evidence Scientific Model Data type Data value(s) Metadata Evidence Ontology Gene Ontology Author Provenance

28 KSinBIT’06 Resource management use case: data integration Finding a basis for relation Epigenetic Mechanisms Transcription Chromatin Histone Modification Transcription Factors Transcription Factor Binding Sites position “There is a relation” Common Domain Instances Classes Hypothesis

29 Scenario: A Use Case is born E-scientist explains benefits of semantic web to (wet lab) biologist Biologist wants to see a demonstration with actual data => Use Case: Find evidence of a relation between transcription and histone modifications Our approach: Annotate data with our own semantic types so that we can issue a query using our own terms

30 E-science perspective on data integration: From cartoon to model to semantic data integration Biological concepts (‘myModel’) Data Biologist readable model Computer readable model

31 Some of the pieces we need: knowledge representation – triples; pointing at things: EPRs and URIs; not just the things but the statements about the things; unification and reasoning; annotation: linking knowledge to resources

32 Provenance – example in Taverna

33 Computational experiment Database Some provenance should be added by the module/service itself Database...

34 The AIDA toolbox for knowledge extraction and knowledge management reusable components to enhance science

35 Living examples: dynamic interfaces Yahoo Pipes interface to AIDA medline search: NOLJphxuA MeSH facet interface from Exhibit: W3C Health Care and Life Sciences KB (unofficial URL):

36 Conclusions The Web is a collection of resources: resource sharing Disclosure of semantic models can greatly enhance resource sharing and resource management Semantic annotation can be applied to any type of resource: data and (web)services. Semantic annotation and provenance can be added by the (web)services themselves. Need text mining for web services (to support semantic annotation) Need web services for text mining

37 Acknowledgements AIDA developers team: Sophia Katrenko, Edgar Meij, Willem van Hage, Frans Verster, (Machiel Jansen), Scott Marshall. Guus Schreiber, Maarten de Rijke, Pieter Adriaans Jan Top, Nicole Koenderink, Food informatics, Wageningen University Martijn Schuemie, Erasmus University Rotterdam OMII-UK and myGrid team, especially Tom Oinn, Katy Wolstencroft, Stian Soiland, Stuart Owen, Andy Gibson, Alan Rector, Robert Stevens, Carole Goble W3C Semantic Web Health Care and Life Sciences Interest Group Hideaki Sugawara, Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics. This work was supported by the Dutch Ministry of Economic Affairs via VL-e and BioRange (BSIK grants)

38 The End “Science is built up of facts, as a house is built of stones; but an accumulation of facts is no more a science than a heap of stones is a house.” – Henri Poincaré, Science and Hypothesis, 1905
