1 The OBO Foundry Barry Smith Center of Excellence in Bioinformatics & Life Sciences, University at Buffalo IFOMIS, Saarland University

Presentation transcript:

1 The OBO Foundry. Barry Smith, Center of Excellence in Bioinformatics & Life Sciences, University at Buffalo; IFOMIS, Saarland University. Standards and Ontology

2 we are accumulating huge amounts of sequence data, image data, pharma data, ...
• how do we know what data we have?
• how do I know what data you have?
• how do we know what data we don’t have?
• how do we make different sorts of data combinable, as we need to do in large domains such as neurodevelopment, immunology, cancer ...?

3 genomic medicine, molecular medicine, translational medicine, personalized medicine ... need methods for data integration, to enable reasoning across data at multiple granularities and to identify biomedically relevant relations on the side of the entities themselves

4

5 where in the body? what kind of disease process? we need semantic annotation of data = we need ontologies

6 how do we create broad-coverage semantic annotation systems for biomedicine?
Semantic Web, Moby, wikis, etc.
• let a million flowers (and weeds) bloom
• to create integration, rely on (automatically generated?) post hoc mappings

7 most successful, thus far: UMLS
built by trained experts
massively useful for information retrieval and information integration
UMLS Metathesaurus: a system of post hoc mappings between separately built source vocabularies

8

9 UMLS-based mappings fall short of creating interoperability because local usage is respected, regimentation is frowned upon, and there is no concern for cross-framework consistency
UMLS terminologies have different grades of formal rigor, different degrees of completeness, different update policies

10 with UMLS-based annotations
we can know what data we have (via term searches), but it is noisy
we can map between data at single granularities (via ‘synonyms’), but synonymy information is noisy
how do we know what data we don’t have?
how do we reason with data (as at the molecular level) when there is no common logical backbone?

11 what is to be done? (for science)
to develop high-quality annotation resources in a collaborative community effort?
create an evolutionary path towards improvement of terminologies, of the sort we find elsewhere in science
find ways to reward early adopters of the results

12 science works out from a consensus core, and strives to isolate and resolve inconsistencies as it extends at the fringes
we need to create a consensus core
start with what for human beings are trivialities (low-hanging fruit) and work out from there
for science, consistency is a sine qua non

13 Foundational Model of Anatomy (FMA) [diagram: is_a and part_of hierarchy over the classes Anatomical Structure, Anatomical Space, Organ, Organ Part, Organ Subdivision, Organ Component, Tissue, Organ Cavity, Organ Cavity Subdivision, Serous Sac, Serous Sac Cavity, Serous Sac Cavity Subdivision, Pleural Sac, Pleural Cavity, Pleura (Wall of Sac), Parietal Pleura, Visceral Pleura, Mediastinal Pleura, Mesothelium of Pleura, Interlobar recess]

14 include ontologies corresponding to the basic biomedical sciences in the core
clinical medicine relies on anatomy and molecular biology to provide integration across medical specialisms

15 where do we find scientifically validated information linking gene products and other entities represented in biochemical databases to semantically meaningful terms pertaining to disease, anatomy, development, histology in different model organisms?
but we need more

16

17 what makes GO so wildly successful?

18 The methodology of annotations
science basis of the GO: trained experts curating peer-reviewed literature
different model organism databases employ scientific curators who use the experimental observations reported in the biomedical literature to associate GO terms with gene products in a coordinated way

19 A set of standardized textual descriptions of
• cellular locations
• molecular functions
• biological processes
used to annotate the entities represented in the major biochemical databases, thereby creating integration across these databases and making them available to semantic search
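
The integration described on this slide can be pictured with a small sketch: once records from separate databases carry the same standardized term, a single lookup by term reaches all of them. Everything in the data below (database names, gene product identifiers, term labels) is an invented placeholder, not real GO content, and the index is only one simple way to illustrate the idea.

```python
from collections import defaultdict

# Hypothetical annotation records: (database, gene product, ontology term).
# All names and labels here are invented placeholders for illustration.
ANNOTATIONS = [
    ("db_mouse", "geneprod_m1", "molecular function: example kinase activity"),
    ("db_fly", "geneprod_f7", "molecular function: example kinase activity"),
    ("db_mouse", "geneprod_m2", "cellular location: example membrane"),
    ("db_yeast", "geneprod_y3", "biological process: example cell division"),
]


def build_term_index(annotations):
    """Group (database, gene product) pairs under the term that annotates them."""
    index = defaultdict(list)
    for database, gene_product, term in annotations:
        index[term].append((database, gene_product))
    return index


def search(index, term):
    """Return every annotated entity for a term, regardless of source database."""
    return sorted(index.get(term, []))


index = build_term_index(ANNOTATIONS)
hits = search(index, "molecular function: example kinase activity")
print(hits)
```

The shared term is what does the integrating work here: neither database needs to know about the other.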

20 what cellular component? what molecular function? what biological process?

21 This process leads to improvements and extensions of the ontology, which in turn lead to better annotations
→ a virtuous cycle of improvement in the quality and reach of both future annotations and the ontology itself
RESULT: a slowly growing computer-interpretable map of biological reality within which major databases are automatically integrated in semantically searchable form

22 Five bangs for your GO buck
• science base
• cross-species database integration
• cross-granularity database integration
• through links to the things which are of biomedical relevance
• semantic searchability links people to software

23 but now
need to improve the quality of GO to support more rigorous logic-based reasoning across the data annotated in its terms
need to extend the GO by engaging ever broader community support for the addition of new terms and for the correction of errors

24 but also need to extend the methodology to other domains, including clinical domains
→ need for disease ontology, immunology ontology, symptom (phenotype) ontology, clinical trial ontology, ...

25 the problem
existing clinical vocabularies are of variable quality and low mutual consistency
need for prospective standards to ensure mutual consistency and high quality of clinical counterparts of GO
need to ensure consistency of the new clinical ontologies with the basic biomedical sciences
if we do not start now, the problem will only get worse

26 the solution
establish common rules governing best practices for creating ontologies and for using these in annotations
apply these rules to create a complete suite of orthogonal, interoperable biomedical reference ontologies
this solution is already being implemented

27 First step (2003)
a shared portal for (so far) 58 ontologies (low regimentation) → NCBO BioPortal

28

29 Second step (2004)
reform efforts initiated, e.g. linking GO to other OBO ontologies to ensure orthogonality
GO + Cell type = New Definition
Cell type:
id: CL:
name: osteoblast
def: "A bone-forming cell which secretes an extracellular matrix. Hydroxyapatite crystals are then deposited into the matrix to form bone."
is_a: CL:
relationship: develops_from CL:
relationship: develops_from CL:
New Definition:
Osteoblast differentiation: Processes whereby an osteoprogenitor cell or a cranial neural crest cell acquires the specialized features of an osteoblast, a bone-forming cell which secretes extracellular matrix.
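
The cell type entry on this slide follows the tag: value layout of an OBO flat-file term stanza. As a minimal sketch of how such an entry could be read by a program, the snippet below parses one stanza into a dictionary. The identifiers `CL:EXAMPLE_A` to `CL:EXAMPLE_D` are hypothetical placeholders standing in for the identifiers that the transcript does not preserve, and the parser handles only the simple tags shown here, not the full OBO format.

```python
# Hypothetical stanza: the CL:EXAMPLE_* identifiers are placeholders,
# not real Cell Ontology identifiers.
STANZA = """
[Term]
id: CL:EXAMPLE_A
name: osteoblast
def: "A bone-forming cell which secretes an extracellular matrix."
is_a: CL:EXAMPLE_B
relationship: develops_from CL:EXAMPLE_C
relationship: develops_from CL:EXAMPLE_D
"""


def parse_stanza(text):
    """Parse the 'tag: value' lines of one term stanza into a dict of lists."""
    term = {}
    for raw in text.strip().splitlines():
        line = raw.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and the [Term] header
        tag, _, value = line.partition(":")  # split at the first colon only
        term.setdefault(tag.strip(), []).append(value.strip())
    return term


def relationships(term):
    """Split each 'relationship' value into a (relation, target) pair."""
    pairs = []
    for value in term.get("relationship", []):
        relation, target = value.split(None, 1)
        pairs.append((relation, target))
    return pairs


term = parse_stanza(STANZA)
print(term["name"], relationships(term))
```

Values are kept as lists because a tag such as `relationship` may occur more than once in a stanza, as it does on the slide.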

30 The OBO Foundry Third step (2006)

31 The OBO Foundry
a family of interoperable gold standard biomedical reference ontologies to serve the annotation of, inter alia,
• scientific literature
• model organism databases
• clinical trial data

32 A prospective standard designed to guarantee interoperability of ontologies from the very start (contrast to: post hoc mapping)
established March
initial candidate OBO ontologies, focused primarily on basic science domains
several being constructed ab initio by influential consortia who have the authority to impose their use on large parts of the relevant communities.

33 (legend: undergoing rigorous reform; new)
GO: Gene Ontology
ChEBI: Chemical Ontology
CL: Cell Ontology
FMA: Foundational Model of Anatomy
PaTO: Phenotype Quality Ontology
SO: Sequence Ontology
CARO: Common Anatomy Reference Ontology
CTO: Clinical Trial Ontology
FuGO: Functional Genomics Investigation Ontology
PrO: Protein Ontology
RnaO: RNA Ontology
RO: Relation Ontology

34 Ontology | Scope | URL | Custodians
Cell Ontology (CL) | cell types from prokaryotes to mammals | obo.sourceforge.net/cgi-bin/detail.cgi?cell | Jonathan Bard, Michael Ashburner, Oliver Hofman
Chemical Entities of Biological Interest (ChEBI) | molecular entities | ebi.ac.uk/chebi | Paula Dematos, Rafael Alcantara
Common Anatomy Reference Ontology (CARO) | anatomical structures in human and model organisms | (under development) | Melissa Haendel, Terry Hayamizu, Cornelius Rosse, David Sutherland
Foundational Model of Anatomy (FMA) | structure of the human body | fma.biostr.washington.edu | JLV Mejino Jr., Cornelius Rosse
Functional Genomics Investigation Ontology (FuGO) | design, protocol, data instrumentation, and analysis | fugo.sf.net | FuGO Working Group
Gene Ontology (GO) | cellular components, molecular functions, biological processes | | Ontology Consortium
Phenotypic Quality Ontology (PaTO) | qualities of anatomical structures | obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value | Michael Ashburner, Suzanna Lewis, Georgios Gkoutos
Protein Ontology (PrO) | protein types and modifications | (under development) | Protein Ontology Consortium
Relation Ontology (RO) | relations | obo.sf.net/relationship | Barry Smith, Chris Mungall
RNA Ontology (RnaO) | three-dimensional RNA structures | (under development) | RNA Ontology Consortium
Sequence Ontology (SO) | properties and features of nucleic sequences | song.sf.net | Karen Eilbeck

35 Annotations plus ontologies yield an ever-growing computer-interpretable map of biological reality.
(rows: GRANULARITY; columns: RELATION TO TIME)
GRANULARITY | CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy?); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) | Cellular Process (GO)
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

36 Building out from the original GO
(rows: GRANULARITY; columns: RELATION TO TIME)
GRANULARITY | CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy?); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

37 Under consideration:
• Disease Ontology (DO)
• Biomedical Image and Image Process Ontology (BiiO)
• Upper Biomedical Ontology (OBO UBO)
• Ontology of Biomedical Investigations (OBI)
• Clinical Trial Ontology (CTO)

38 OBO Foundry = a subset of OBO ontologies whose developers have agreed in advance to accept a common set of principles reflecting best practice in ontology development, designed to ensure
• tight connection to the biomedical basic sciences
• compatibility
• interoperability, common relations
• formal robustness
• support for logic-based reasoning

39 CRITERIA
• The ontology is OPEN and available to be used by all.
• The ontology is in, or can be instantiated in, a COMMON FORMAL LANGUAGE.
• The developers of the ontology agree in advance to COLLABORATE with developers of other OBO Foundry ontologies where domains overlap.

40 CRITERIA
• UPDATE: The developers of each ontology commit to its maintenance in light of scientific advance, and to soliciting community feedback for its improvement.
• ORTHOGONALITY: They commit to working with other Foundry members to ensure that, for any particular domain, there is community convergence on a single controlled vocabulary.

41 if we annotate a database or body of literature with one high-quality biomedical ontology, we should be able to add annotations from a second such ontology without conflicts
orthogonality of ontologies implies additivity of annotations
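
The additivity claim on this slide can be illustrated with a rough sketch: if two ontologies cover non-overlapping domains, the annotations made with each can simply be unioned per record. The check below uses disjoint identifier prefixes as a crude stand-in for orthogonality, which is an assumption of this example and much weaker than the Foundry criterion itself. The record names and term identifiers are hypothetical placeholders.

```python
def prefixes(annotations):
    """Collect the identifier prefixes (the part before ':') used in an annotation map."""
    return {term.split(":", 1)[0] for terms in annotations.values() for term in terms}


def merge_annotations(first, second):
    """Union two annotation maps (record -> set of term IDs).

    Raises ValueError when both maps draw on the same identifier space,
    used here only as a simple proxy for a failure of orthogonality.
    """
    shared = prefixes(first) & prefixes(second)
    if shared:
        raise ValueError(f"overlapping identifier spaces: {sorted(shared)}")
    merged = {}
    for source in (first, second):
        for record, terms in source.items():
            merged.setdefault(record, set()).update(terms)
    return merged


# Hypothetical records and placeholder term identifiers.
process_annotations = {"record_1": {"GO:EX1"}, "record_2": {"GO:EX2"}}
cell_annotations = {"record_1": {"CL:EX9"}}

combined = merge_annotations(process_annotations, cell_annotations)
print(combined)
```

Because the two sources never describe the same kind of entity, the merge needs no conflict resolution: each record just accumulates terms.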

42 CRITERIA
• IDENTIFIERS: The ontology possesses a unique identifier space within OBO.
• VERSIONING: The ontology provider has procedures for identifying distinct successive versions, to ensure BACKWARDS COMPATIBILITY with annotation resources already in common use.
• The ontology includes TEXTUAL DEFINITIONS and, where possible, equivalent formal definitions of its terms.

43 CRITERIA
• CLEARLY BOUNDED: The ontology has a clearly specified and clearly delineated content.
• DOCUMENTATION: The ontology is well-documented.
• USERS: The ontology has a plurality of independent users.

44 CRITERIA
• COMMON ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.*
* Smith et al., Genome Biology 2005, 6:R46

45 OBO Relation Ontology
Foundational: is_a, part_of
Spatial: located_in, contained_in, adjacent_to
Temporal: transformation_of, derives_from, preceded_by
Participation: has_participant, has_agent
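
One reason to fix unambiguous definitions for relations such as is_a and part_of is that they then license mechanical inference. The sketch below assumes that both relations are transitive and that they compose in the way indicated in the comments, which follows from reading class-level part_of as "every instance of A is part of some instance of B". The class names are invented placeholders and the fixed-point loop is just one straightforward way to compute the inferred pairs.

```python
from itertools import product

# Placeholder class names and assertions, invented for illustration only.
IS_A = {("specific_part", "general_part")}
PART_OF = {("general_part", "organ"), ("organ", "organism")}


def compose(left, right):
    """Return pairs (x, z) such that (x, y) is in left and (y, z) is in right."""
    return {(x, z) for (x, y), (y2, z) in product(left, right) if y == y2}


def infer(is_a, part_of):
    """Saturate is_a and part_of under transitivity and their two compositions."""
    is_a, part_of = set(is_a), set(part_of)
    while True:
        new_is_a = is_a | compose(is_a, is_a)  # is_a is transitive
        new_part_of = (
            part_of
            | compose(part_of, part_of)  # part_of is transitive
            | compose(is_a, part_of)     # A is_a B, B part_of C => A part_of C
            | compose(part_of, is_a)     # A part_of B, B is_a C => A part_of C
        )
        if new_is_a == is_a and new_part_of == part_of:
            return is_a, part_of  # nothing new was derived: fixed point reached
        is_a, part_of = new_is_a, new_part_of


inferred_is_a, inferred_part_of = infer(IS_A, PART_OF)
print(sorted(inferred_part_of - PART_OF))
```

If each ontology defined part_of differently, none of these derived pairs could be trusted across ontologies, which is the point of the common-architecture criterion.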

46 IT WILL GET HARDER
Further criteria will be added over time in light of lessons learned, in order to bring about a gradual improvement in the quality of Foundry ontologies
ALL FOUNDRY ONTOLOGIES WILL BE SUBJECT TO CONSTANT UPDATE IN LIGHT OF SCIENTIFIC ADVANCE

47 IT WILL GET HARDER
But not everyone needs to join
The Foundry is not seeking to serve as a check on flexibility or creativity
ALL FOUNDRY ONTOLOGIES WILL ENCOURAGE COMMUNITY CRITICISM, CORRECTION AND EXTENSION WITH NEW TERMS

48 GOALS
• to introduce some of the features of SCIENTIFIC PEER REVIEW into biomedical ontology development
• CREDIT for high quality ontology development work
• KUDOS for early adopters of high quality ontologies / terminologies, e.g. in reporting clinical trial results

49 GOALS
• to provide a FRAMEWORK OF RULES to counteract the current policy of ad hoc creation of new annotation schemas by each clinical research group
• REUSABILITY: if data schemas are formulated using a single well-integrated framework ontology system in widespread use, then the data will to that degree become more widely accessible and usable

50 GOALS
• to serve as BENCHMARK FOR IMPROVEMENTS in discipline-focused terminology resources
• once a system of interoperable reference ontologies is there, it will make sense to calibrate existing terminologies in its terms in order to achieve more robust alignment and greater domain coverage
• exploit the avenue of EVIDENCE-BASED MEDICINE (NIH CLINICAL RESEARCH NETWORKS) to foster their use by clinicians

51 the vision is spreading
June 2006: establishment of MICheck, which reflects the growing need for prescriptive checklists specifying the key information to include when reporting experimental results (concerning methods, data, analyses and results).

52 MICheck Foundry
• MICheck: ‘a common resource for minimum information checklists’, analogous to OBO / NCBO BioPortal
• MICheck Foundry: will create ‘a suite of self-consistent, clearly bounded, orthogonal, integrable checklist modules’ *
* Taylor CF, et al. Nature Biotech, in press

53 MICheck/Foundry communities
Transcriptomics (MIAME Working Group)
Proteomics (Proteomics Standards Initiative)
Metabolomics (Metabolomics Standards Initiative)
Genomics and Metagenomics (Genomic Standards Consortium)
In Situ Hybridization and Immunohistochemistry (MISFISHIE Working Group)
Phylogenetics (Phylogenetics Community)
RNA Interference (RNAi Community)
Toxicogenomics (Toxicogenomics WG)
Environmental Genomics (Environmental Genomics WG)
Nutrigenomics (Nutrigenomics WG)
Flow Cytometry (Flow Cytometry Community)

54 Fourth Step (the future)
how to replicate the successes of the GO in clinical medicine?
choose two or three representative disease domains
work out reasoning challenges for those domains
work with specialists to create ontologies interoperable with OBO Foundry basic science ontologies to address these reasoning challenges
work with leaders of professional associations and of clinical trial initiatives to foster the collection of clinical data annotated in their terms

55 OBO Foundry coverage (canonical ontologies)
(rows: GRANULARITY; columns: RELATION TO TIME)
GRANULARITY | CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy?); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) | Cellular Process (GO)
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

56 INDEPENDENT CONTINUANTS: organism, system, organ, organ part, tissue, cell, acellular anatomical structure, biological molecule, genome
DEPENDENT CONTINUANTS: physiology (functions), pathology, acute stage, progressive stage, resolution stage

57 Draft Ontology for Acute Respiratory Distress Syndrome

58 Draft Ontology for Multiple Sclerosis
what data do we have? what data do the others have? what data do we not have?

59 Draft Ontology for Multiple Sclerosis
to apprehend what is unknown requires a complete demarcation of the relevant space of alternatives

60 Goal: to advance ontology as science. National Center for Ontological Research

61