1 The OBO Foundry Barry Smith University at Buffalo

Presentation on theme: "1 The OBO Foundry Barry Smith University at Buffalo"— Presentation transcript:

1 1 The OBO Foundry Barry Smith University at Buffalo http://ontology.buffalo.edu/smith

2 2 Semantic Web, Moby, wikis, etc. → let a million flowers (weeds) bloom → to create integration, rely on (automatically generated?) post hoc mappings. How to create broad-coverage semantic annotation systems for biomedicine?

3 3 most successful, thus far: UMLS. Built by trained experts; massively useful for information retrieval and information integration. MeSH creates out of literature a semantically searchable space. UMLS Metathesaurus: a system of post hoc mappings between independent source vocabularies.

4 4 for UMLS: local usage respected; regimentation frowned upon; cross-framework consistency not important; no concern to establish consistency with basic science; different grades of formal rigor, different degrees of completeness, different update policies.

5 5 with UMLS-based annotations: we can know what data we have (via term searches); we can map between data at single granularities (via ‘synonyms’). But: how do we combine data across granularities? how do we resolve logical conflicts? how do we know what data we don’t have? how do we reason with data?

6 6 for science, a new approach: how to ensure current annotation efforts are not wasted because the terminologies used become obsolete? Develop high quality annotation resources in a collaborative, community effort; create an evolutionary path towards improvement, of the sort we find elsewhere in science.

7 7 for science: science works out from a consensus core, and strives to isolate and resolve inconsistencies as it extends its outer fringes. We need to create a consensus core, starting with what for human beings are trivialities (low hanging fruit) and working outwards from there. For science, consistency is the sine qua non.

8 8 Foundational Model of Anatomy (FMA) [Figure: is_a and part_of hierarchy relating Anatomical Structure, Anatomical Space, Organ, Organ Part, Organ Subdivision, Organ Component, Tissue, Organ Cavity, Organ Cavity Subdivision, Serous Sac, Serous Sac Cavity, Serous Sac Cavity Subdivision, Pleural Sac, Pleural Cavity, Pleura (Wall of Sac), Parietal Pleura, Visceral Pleura, Mediastinal Pleura, Mesothelium of Pleura, Interlobar recess]

9 9 a new approach for science: where do you find scientifically validated information linking gene products and other entities represented in biochemical databases to semantically meaningful terms pertaining to disease, anatomy, development in different model organisms?

10 10 The methodology of annotations. Science basis of the GO: trained experts curating peer-reviewed literature. Model organism databases employ scientific curators who use the experimental observations reported in the biomedical literature to associate GO terms with gene products in a regimented way.

11 11 A set of standardized textual descriptions of cellular locations, molecular functions and biological processes, used to annotate the entities represented in the major biochemical databases, thereby creating integration across these databases.

12 12 what cellular component? what molecular function? what biological process?

13 13 This process leads to improvements and extensions of the ontology, which in turn leads to better annotations → a virtuous cycle of improvement in the quality and reach of both future annotations and the ontology itself. RESULT: a slowly growing computer-interpretable map of biological reality within which major databases are automatically integrated in semantically searchable form.

14 14 But now: need to improve the quality of the GO to support more rigorous logic-based reasoning across the data annotated in its terms; need to improve the extent of the GO by engaging ever broader community support for the addition of new terms and for the correction of errors.

15 15 But also need to extend the methodology to other domains, including: clinical trials, clinical records, literature of clinical medicine → need disease, symptom (phenotype) ontologies

16 16 the problem: existing clinical vocabularies are of variable quality and low mutual consistency; need for prospective standards to ensure mutual consistency and high quality of clinical counterparts of GO; need to ensure consistency of the new clinical ontologies with the basic biomedical sciences. If we do not start now, the problem will only get worse.

17 17

18 18 the solution: establish common rules governing best practices for creating ontologies and for using these in annotations; apply these rules to create a complete suite of orthogonal interoperable biomedical reference ontologies. This solution is already being implemented.

19 19 First step (2003): a shared portal for (so far) 58 ontologies (low regimentation): http://obo.sourceforge.net → NCBO BioPortal

20 20

21 21 Second step (2004): reform efforts initiated, e.g. linking GO to other OBO ontologies to ensure orthogonality
GO + Cell type = New Definition
Cell type:
    id: CL:0000062
    name: osteoblast
    def: "A bone-forming cell which secretes an extracellular matrix. Hydroxyapatite crystals are then deposited into the matrix to form bone."
    is_a: CL:0000055
    relationship: develops_from CL:0000008
    relationship: develops_from CL:0000375
New definition: Osteoblast differentiation: Processes whereby an osteoprogenitor cell or a cranial neural crest cell acquires the specialized features of an osteoblast, a bone-forming cell which secretes extracellular matrix.
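The osteoblast record on this slide is written as tag-value lines, so it can be read mechanically. The following is a minimal sketch, not the official OBO tooling: it assumes each line has the form `tag: value` and that a `relationship` value is a relation name followed by a target identifier. The function names `parse_stanza` and `relationships` are made up for this illustration.

```python
from collections import defaultdict

# The record as shown on the slide, one tag-value pair per line.
STANZA = '''\
id: CL:0000062
name: osteoblast
def: "A bone-forming cell which secretes an extracellular matrix. Hydroxyapatite crystals are then deposited into the matrix to form bone."
is_a: CL:0000055
relationship: develops_from CL:0000008
relationship: develops_from CL:0000375
'''


def parse_stanza(text):
    """Map each tag to the list of values it carries (a tag may repeat)."""
    record = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first ": " only, so identifiers such as CL:0000062 stay whole.
        tag, _, value = line.partition(": ")
        record[tag].append(value)
    return dict(record)


def relationships(record):
    """Split each 'relationship' value into a (relation, target id) pair."""
    return [tuple(value.split(" ", 1)) for value in record.get("relationship", [])]


term = parse_stanza(STANZA)
print(term["id"][0], term["name"][0])
print(relationships(term))
```

With the record in this form, the cross-ontology link the slide describes (a GO process definition that refers to cell types) becomes a lookup on the `develops_from` targets rather than a reading of free text.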

22 22 Third step (2006): The OBO Foundry http://obofoundry.org/

23 23 The OBO Foundry: a family of interoperable gold standard biomedical reference ontologies to serve the annotation of
- scientific literature
- model organism databases
- clinical data
- experimental results

24 24 A subset of OBO ontologies, whose developers have agreed in advance to accept a common set of principles designed to ensure
- tight connection to the biomedical basic sciences
- compatibility
- interoperability
- formal robustness
- support for logic-based reasoning

25 25 A prospective standard designed to guarantee interoperability of ontologies from the very start (contrast to: post hoc mapping). Established March 2006. 12 initial candidate OBO ontologies, focused primarily on basic science domains; several being constructed ab initio.

26 26 undergoing rigorous reform / new:
- GO: Gene Ontology
- ChEBI: Chemical Ontology
- CL: Cell Ontology
- FMA: Foundational Model of Anatomy
- PaTO: Phenotype Quality Ontology
- SO: Sequence Ontology
- CARO: Common Anatomy Reference Ontology
- CTO: Clinical Trial Ontology
- FuGO (OBI): Functional Genomics Investigation Ontology
- PrO: Protein Ontology
- RnaO: RNA Ontology
- RO: Relation Ontology

27 27
Ontology | Scope | URL | Custodians
Cell Ontology (CL) | cell types from prokaryotes to mammals | obo.sourceforge.net/cgi-bin/detail.cgi?cell | Jonathan Bard, Michael Ashburner, Oliver Hofman
Chemical Entities of Biological Interest (ChEBI) | molecular entities | ebi.ac.uk/chebi | Paula Dematos, Rafael Alcantara
Common Anatomy Reference Ontology (CARO) | anatomical structures in human and model organisms | (under development) | Melissa Haendel, Terry Hayamizu, Cornelius Rosse, David Sutherland
Foundational Model of Anatomy (FMA) | structure of the human body | fma.biostr.washington.edu | JLV Mejino Jr., Cornelius Rosse
Functional Genomics Investigation Ontology (FuGO) | design, protocol, data instrumentation, and analysis | fugo.sf.net | FuGO Working Group
Gene Ontology (GO) | cellular components, molecular functions, biological processes | www.geneontology.org | Gene Ontology Consortium
Phenotypic Quality Ontology (PaTO) | qualities of anatomical structures | obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value | Michael Ashburner, Suzanna Lewis, Georgios Gkoutos
Protein Ontology (PrO) | protein types and modifications | (under development) | Protein Ontology Consortium
Relation Ontology (RO) | relations | obo.sf.net/relationship | Barry Smith, Chris Mungall
RNA Ontology (RnaO) | three-dimensional RNA structures | (under development) | RNA Ontology Consortium
Sequence Ontology (SO) | properties and features of nucleic sequences | song.sf.net | Karen Eilbeck

28 28 Same Ontology / Scope / URL / Custodians table as slide 27, with the Functional Genomics Investigation Ontology (FuGO) row now labeled OBI: design, protocol, data instrumentation, and analysis | fugo.sf.net | FuGO Working Group

29 29 Annotations plus ontologies yield an ever-growing computer-interpretable map of biological reality.
(rows: GRANULARITY; columns: RELATION TO TIME)
GRANULARITY | INDEPENDENT CONTINUANT | DEPENDENT CONTINUANT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) | Cellular Process (GO)
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

30 30 Building out from the original GO
(rows: GRANULARITY; columns: RELATION TO TIME)
GRANULARITY | INDEPENDENT CONTINUANT | DEPENDENT CONTINUANT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) | (no entry)
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

31 31 OBO Foundry coverage (canonical ontologies)
(rows: GRANULARITY; columns: RELATION TO TIME)
GRANULARITY | INDEPENDENT CONTINUANT | DEPENDENT CONTINUANT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) | Cellular Process (GO)
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

32 32 CARO Common Anatomy Reference Ontology FMA Homologies Canonical Anatomies vs. Pathoanatomies

33 33 Under consideration:
- Disease Ontology (DO)
- Biomedical Image Ontology (BIO)
- Upper Biomedical Ontology (OBO UBO)
- Environment Ontology (EnvO)
- Systems Biology Ontology (SBO)

34 34 CRITERIA
- The ontology is open and available to be used by all.
- The ontology is in, or can be instantiated in, a common formal language.
- The developers of the ontology agree in advance to collaborate with developers of other OBO Foundry ontologies where domains overlap.

35 35 CRITERIA
- UPDATE: The developers of each ontology commit to its maintenance in light of scientific advance, and to soliciting community feedback for its improvement.
- ORTHOGONALITY: They commit to working with other Foundry members to ensure that, for any particular domain, there is community convergence on a single controlled vocabulary.

36 36 GLOSS ON ORTHOGONALITY: for science, communities must work together to ensure consistency → orthogonality → additivity of annotation frameworks. ADDITIVITY: if we annotate a database or body of literature with one high-quality biomedical ontology, we should be able to add annotations from a second such ontology without conflicts.

37 37 CRITERIA
- IDENTIFIERS: The ontology possesses a unique identifier space within OBO.
- VERSIONING: The ontology provider has procedures for identifying distinct successive versions.
- The ontology includes textual definitions for all terms.
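The IDENTIFIERS criterion can be illustrated with a small check that no two ontologies share an identifier prefix. This is only a sketch under stated assumptions: it assumes identifiers have the shape `PREFIX:digits`, as in the CL:0000062 record shown earlier in the deck (the criterion itself does not spell out a pattern). The two "hypothetical" ontologies, their `XA` identifiers, and the function names are invented for the example.

```python
import re
from collections import defaultdict

# Assumed identifier shape: alphabetic prefix, colon, digits (e.g. CL:0000062).
ID_PATTERN = re.compile(r"^([A-Za-z]+):(\d+)$")


def prefix_of(term_id):
    """Return the prefix part of an identifier, or raise if it has another shape."""
    match = ID_PATTERN.match(term_id)
    if match is None:
        raise ValueError(f"not a PREFIX:digits identifier: {term_id!r}")
    return match.group(1)


def shared_prefixes(ontologies):
    """Return {prefix: sorted ontology names} for prefixes used by more than one ontology."""
    users = defaultdict(set)
    for name, term_ids in ontologies.items():
        for term_id in term_ids:
            users[prefix_of(term_id)].add(name)
    return {prefix: sorted(names) for prefix, names in users.items() if len(names) > 1}


ontologies = {
    # Identifiers taken from the osteoblast record in this deck.
    "Cell Ontology": ["CL:0000062", "CL:0000055", "CL:0000008", "CL:0000375"],
    # Invented entries that deliberately collide on the XA prefix.
    "Hypothetical ontology A": ["XA:0000001", "XA:0000002"],
    "Hypothetical ontology B": ["XA:0000003"],
}
print(shared_prefixes(ontologies))
```

An empty result would mean every prefix belongs to exactly one ontology, which is one simple reading of "a unique identifier space".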

38 38 CRITERIA
- CLEARLY BOUNDED: The ontology has a clearly specified and clearly delineated content.
- DOCUMENTATION: The ontology is well-documented.
- USERS: The ontology has a plurality of independent users.

39 39 CRITERIA
- COMMON ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.*
* Smith et al., Genome Biology 2005, 6:R46

40 40 OBO Relation Ontology
Foundational: is_a, part_of
Spatial: located_in, contained_in, adjacent_to
Temporal: transformation_of, derives_from, preceded_by
Participation: has_participant, has_agent
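One reason to fix the meaning of relations such as part_of is that software can then draw inferences from asserted links. The sketch below treats part_of as transitive, as the cited Relation Ontology paper does, and computes what follows from a handful of assertions. The three part_of edges are illustrative assumptions built from term names on the FMA slide; they are not quoted from the FMA itself, and `transitive_closure` is a name chosen for this example.

```python
def transitive_closure(edges):
    """Return every (part, whole) pair implied by chaining the given pairs."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # a part_of b and b part_of d together imply a part_of d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Assumed part_of assertions (illustrative only).
asserted = {
    ("Mesothelium of Pleura", "Visceral Pleura"),
    ("Visceral Pleura", "Pleura (Wall of Sac)"),
    ("Pleura (Wall of Sac)", "Pleural Sac"),
}

inferred = transitive_closure(asserted) - asserted
for part, whole in sorted(inferred):
    print(f"{part} part_of {whole}")
```

The same routine would not be appropriate for a relation that is not transitive, which is why a shared, unambiguous definition of each relation matters before any reasoning is attempted.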

41 41 IT WILL GET HARDER. Further criteria will be added over time in light of lessons learned in order to bring about a gradual improvement in the quality of the ontologies in the Foundry. ALL ONTOLOGIES WILL BE SUBJECT TO CONSTANT UPDATE IN LIGHT OF SCIENTIFIC ADVANCE.

42 42 IT WILL GET HARDER. But not everyone needs to join. The Foundry is not seeking to serve as a check on flexibility or creativity. ALL ONTOLOGIES WILL WELCOME CONSTANT COMMUNITY CRITICISM, CORRECTION AND EXTENSION.

43 43 GOALS
- to introduce some of the features of SCIENTIFIC PEER REVIEW into biomedical ontology development
- REUSABILITY: if data-schemas are formulated using a single well-integrated framework ontology system in widespread use, then this data will, to this degree, itself become more widely accessible and usable

44 44 GOALS
- to create controlled vocabularies for semantic annotation of clinical trial records, scientific journal articles...
- to help in creating better mappings, e.g. between human and model organism phenotypes*
* Zhang and Bodenreider, AMIA 2005

45 45 GOALS
- to aid literature search: http://www.gopubmed.org/
- to counteract the current policy of ad hoc creation of new annotation schemas by each clinical research group by providing a common shared framework
- to end the terminology wars by advancing community-based regimentation of clinical and other vocabularies in a scientific spirit

46 46 GOALS
- to serve as a benchmark for improvements in discipline-focused terminology resources
- once interoperable reference ontologies are there, it will make sense to calibrate existing terminologies in order to achieve greater domain coverage and alignment of different but veridical views

47 47 The vision is spreading. June 2006: establishment of MICheck, which reflects the growing need for prescriptive checklists specifying the key information to include when reporting experimental results (concerning methods, data, analyses and results).

48 48 The vision is spreading
- MICheck: ‘a common resource for minimum information checklists’, analogous to OBO / NCBO BioPortal
- MICheck Foundry: will create ‘a suite of self-consistent, clearly bounded, orthogonal, integrable checklist modules’*
* Taylor CF, et al. Nature Biotech, under review

49 49 MICheck/Foundry communities
- Transcriptomics (MIAME Working Group)
- Proteomics (Proteomics Standards Initiative)
- Metabolomics (Metabolomics Standards Initiative)
- Genomics and Metagenomics (Genomic Standards Consortium)
- In Situ Hybridization and Immunohistochemistry (MISFISHIE Working Group)
- Phylogenetics (Phylogenetics Community)
- RNA Interference (RNAi Community)
- Toxicogenomics (Toxicogenomics WG)
- Environmental Genomics (Environmental Genomics WG)
- Nutrigenomics (Nutrigenomics WG)
- Flow Cytometry (Flow Cytometry Community)

50 50 next step: repertoire of disease ontologies built out of OBO Foundry elements
INDEPENDENT CONTINUANTS: organism, system, organ, organ part, tissue, cell, acellular anatomical structure, biological molecule, genome
DEPENDENT CONTINUANTS: physiology (functions), pathology, acute stage, progressive stage, resolution stage

51 51 Ontology for Acute Respiratory Distress Syndrome

52 52 Draft Ontology for Multiple Sclerosis what data do we have? what data do the others have? what data do we not have?

53 53 Draft Ontology for Multiple Sclerosis to apprehend what is unknown requires a complete demarcation of the relevant space of alternatives

54 54 The Arguments Against: Will it scale? Are we ready? Is medical classification not entirely conventional (folklore)? How will we avoid semantic drift? Do we have the reasoning tools? Application ontologies? Where will we get the data?

