Introduction to Biomedical Ontology. Barry Smith, University at Buffalo

1 Introduction to Biomedical Ontology. Barry Smith, University at Buffalo. http://ontology.buffalo.edu/smith

2 NCBO: National Center for Biomedical Ontology (NIH Roadmap Center): Stanford Medical Informatics; University of San Francisco Medical Center; The Mayo Clinic; University of Washington, Department of Structural Biology; University of Pittsburgh, Department of Biomedical Informatics; Biomedical Informatics Research Network (BIRN); University at Buffalo, Department of Philosophy

3 On June 22, 1799, in Paris, everything changed

4 International System of Units

5 Multiple kinds of data in multiple kinds of silos: Lab / pathology data; EHR data; Clinical trial data; Patient histories; Medical imaging; Microarray data; Model organism data; Flow cytometry; Mass spec; Genotype / SNP data

6 How to find data? How to find other people’s data? How to reason with data when you find it? How to work out what data does not yet exist?

7 How to solve the problem of making the data we find queryable and re-usable by others? Part of the solution must involve: standardized terminologies and coding schemes

8 But there are multiple kinds of standardization for biomedical data, and they do not work well together: Terminologies (SNOMED, UMLS); CDEs (Clinical research); Information Exchange Standards (HL7 RIM); LIMS (LOINC); MGED standards for microarray data, etc.; top-down grid frameworks (caBIG)

9 most successful, thus far: UMLS (Unified Medical Language System): a collection of separate terminologies built by trained experts; massively useful for information retrieval and information integration. UMLS Metathesaurus: a system of post hoc mappings between overlapping source vocabularies developed according to different and sometimes conflicting standards

10 for UMLS: local usage respected; regimentation frowned upon; cross-framework consistency not important; no concern to establish consistency with basic science; different grades of formal rigor, different degrees of completeness, different update policies, capricious policies for empirical testing

11 A good solution to the silo problem must be: modular, incremental, bottom-up, evidence-based, revisable; and must incorporate a strategy for motivating potential developers and users

12 ontologies = standardized labels designed for use in annotations, to make the data cognitively accessible to human beings and algorithmically accessible to computers

13 ontologies = high quality controlled structured vocabularies for the annotation (description) of data

14 Ramirez et al. Linking of Digital Images to Phylogenetic Data Matrices Using a Morphological Ontology. Syst. Biol. 56(2):283–294, 2007

15 what cellular component? what molecular function? what biological process? ontologies used in curation of literature

16 Ontologies: help integrate complex representations of reality; help human beings find things in complex representations of reality; help computers reason with complex representations of reality

17 The Gene Ontology

18 Ontologies facilitate grouping of annotations: brain 20; hindbrain 15; rhombomere 10. Query brain without ontology: 20. Query brain with ontology: 45. But they succeed in this only if there is one consensus ontology for each domain
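
The arithmetic on slide 18 can be sketched in a few lines of Python. This is only an illustration, assuming that hindbrain is a child of brain and rhombomere a child of hindbrain (the slide implies this nesting but does not state the relation used); the dictionary and function names are hypothetical.

```python
# Direct annotation counts taken from the slide.
DIRECT_ANNOTATIONS = {"brain": 20, "hindbrain": 15, "rhombomere": 10}

# Assumed child -> parent links (hypothetical; the slide only implies them).
ANATOMY_PARENT = {"hindbrain": "brain", "rhombomere": "hindbrain"}


def term_and_descendants(term):
    """Return the term together with every term below it in the hierarchy."""
    found = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in ANATOMY_PARENT.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def query_without_ontology(term):
    """Count only the annotations made directly to the term."""
    return DIRECT_ANNOTATIONS.get(term, 0)


def query_with_ontology(term):
    """Count annotations to the term and to all of its descendants."""
    return sum(DIRECT_ANNOTATIONS.get(t, 0) for t in term_and_descendants(term))


print(query_without_ontology("brain"))  # 20
print(query_with_ontology("brain"))     # 45
```

The grouping only works because all three counts are attached to terms from one shared hierarchy, which is the point the slide makes about needing one consensus ontology per domain.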

21 People are extending the GO methodology to other domains of biology and of clinical and translational medicine?

22 The standard engineering methodology: It is easier to write useful software if one works with a simplified model (“…we can’t know what reality is like in any case; we only have our concepts…”). This looks like a useful model to me. (One week goes by:) This other thing looks like a useful model to him. Data in Pittsburgh does not interoperate with data in Vancouver. Science is siloed

23 an analogue of the UMLS problem: proliferation of tiny ontologies by different groups with urgent annotation needs

25 the solution: establish common rules governing best practices for creating ontologies in coordinated fashion, with an evidence-based pathway to incremental improvement

26 a shared portal for (so far) 58 ontologies (low regimentation): http://obo.sourceforge.net → NCBO BioPortal. First step (2001)

28 OBO builds on the principles successfully implemented by the GO, recognizing that ontologies need to be developed in tandem

29 The methodology of cross-products: compound terms in ontologies are to be defined as cross-products of simpler terms. E.g., elevated blood glucose is a cross-product of PATO: increased concentration with FMA: blood and ChEBI: glucose. = factoring out of ontologies into discipline-specific modules (orthogonality)
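
One way to picture a cross-product definition is as a small record that points at terms owned by other ontologies instead of redefining them. The sketch below is an assumption-laden illustration: the class names and the field names `quality`, `bearer`, and `towards` are hypothetical, and terms are identified by label only because the slide gives no identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """A term identified by its source ontology and its label."""
    ontology: str
    label: str


@dataclass(frozen=True)
class CrossProduct:
    """A compound term defined from simpler terms (hypothetical field names)."""
    label: str
    quality: Term   # the quality being asserted
    bearer: Term    # the anatomical entity in which it is found
    towards: Term   # the chemical entity the quality concerns

    def source_ontologies(self):
        """List the ontologies this definition draws on, in sorted order."""
        parts = (self.quality, self.bearer, self.towards)
        return sorted({part.ontology for part in parts})


elevated_blood_glucose = CrossProduct(
    label="elevated blood glucose",
    quality=Term("PATO", "increased concentration"),
    bearer=Term("FMA", "blood"),
    towards=Term("ChEBI", "glucose"),
)

print(elevated_blood_glucose.source_ontologies())  # ['ChEBI', 'FMA', 'PATO']
```

Because the compound term stores references only, a revision to "blood" in FMA or "glucose" in ChEBI is picked up without editing the compound definition, which is the modularity the slide describes.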

30 The methodology of cross-products: enforcing use of common relations in linking terms drawn from Foundry ontologies serves to ensure that the ontologies are maintained and revised in tandem; logically defined relations serve to bind terms in different ontologies together to create a network

31 The OBO Foundry http://obofoundry.org/ Third step (2006)

32 Building out from the original GO
RELATION TO TIME (columns): CONTINUANT (INDEPENDENT, DEPENDENT); OCCURRENT
GRANULARITY (rows):
ORGAN AND ORGANISM. Independent continuant: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO). Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO). Occurrent: Biological Process (GO).
CELL AND CELLULAR COMPONENT. Independent continuant: Cell (CL), Cellular Component (FMA, GO). Dependent continuant: Cellular Function (GO).
MOLECULE. Independent continuant: Molecule (ChEBI, SO, RnaO, PrO). Dependent continuant: Molecular Function (GO). Occurrent: Molecular Process (GO).

33 initial OBO Foundry coverage
RELATION TO TIME (columns): CONTINUANT (INDEPENDENT, DEPENDENT); OCCURRENT
GRANULARITY (rows):
ORGAN AND ORGANISM. Independent continuant: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO). Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO). Occurrent: Organism-Level Process (GO).
CELL AND CELLULAR COMPONENT. Independent continuant: Cell (CL), Cellular Component (FMA, GO). Dependent continuant: Cellular Function (GO). Occurrent: Cellular Process (GO).
MOLECULE. Independent continuant: Molecule (ChEBI, SO, RnaO, PrO). Dependent continuant: Molecular Function (GO). Occurrent: Molecular Process (GO).

34 CRITERIA: openness; common formal language; collaborative development; evidence-based maintenance; identifiers; versioning; textual and formal definitions

35 Orthogonality = modularity: one ontology for each domain; no need for mappings (which are in any case too expensive, too fragile, too difficult to keep up-to-date as mapped ontologies change); everyone knows where to look to find out how to annotate each kind of data

36 CRITERIA: COMMON ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the Basic Formal Ontology (BFO)

37 OBO Foundry provides guidelines (traffic laws) to new groups of ontology developers in ways which can counteract current dispersion of effort

38 Building out from the original GO
RELATION TO TIME (columns): CONTINUANT (INDEPENDENT, DEPENDENT); OCCURRENT
GRANULARITY (rows):
ORGAN AND ORGANISM. Independent continuant: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO). Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO). Occurrent: Biological Process (GO).
CELL AND CELLULAR COMPONENT. Independent continuant: Cell (CL), Cellular Component (FMA, GO). Dependent continuant: Cellular Function (GO).
MOLECULE. Independent continuant: Molecule (ChEBI, SO, RnaO, PrO). Dependent continuant: Molecular Function (GO). Occurrent: Molecular Process (GO).

39 RELATION TO TIME (columns): CONTINUANT (INDEPENDENT, DEPENDENT); OCCURRENT
GRANULARITY (rows):
ORGAN AND ORGANISM. Independent continuant: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO). Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO). Occurrent: Organism-Level Process (GO).
CELL AND CELLULAR COMPONENT. Independent continuant: Cell (CL), Cellular Component (FMA, GO). Dependent continuant: Cellular Function (GO). Occurrent: Cellular Process (GO).
MOLECULE. Independent continuant: Molecule (ChEBI, SO, RnaO, PrO). Dependent continuant: Molecular Function (GO). Occurrent: Molecular Process (GO).

40 Basic Formal Ontology: continuant (independent continuant: cellular component; dependent continuant: molecular function); occurrent (biological processes)

41 BFO: The Very Top: continuant (independent continuant; dependent continuant: quality, function, role, disposition); occurrent
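
The top-level categories on slide 41 can be written down as a small parent map. The nesting here is an assumption read off the order of the slide text (quality, function, role, and disposition placed under dependent continuant); the names `BFO_PARENT`, `ancestors`, and `is_a` are hypothetical and are not part of any BFO software.

```python
# Assumed child -> parent links for the categories named on the slide.
# "continuant" and "occurrent" are the two roots and have no parent entry.
BFO_PARENT = {
    "independent continuant": "continuant",
    "dependent continuant": "continuant",
    "quality": "dependent continuant",
    "function": "dependent continuant",
    "role": "dependent continuant",
    "disposition": "dependent continuant",
}


def ancestors(category):
    """Return the chain of parents of a category, nearest first."""
    chain = []
    while category in BFO_PARENT:
        category = BFO_PARENT[category]
        chain.append(category)
    return chain


def is_a(category, candidate):
    """True if category equals candidate or falls under it."""
    return category == candidate or candidate in ancestors(category)


print(ancestors("role"))                # ['dependent continuant', 'continuant']
print(is_a("function", "continuant"))   # True
print(is_a("occurrent", "continuant"))  # False
```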

42 function: of liver: to store glycogen; of birth canal: to enable transport; of eye: to see; of mitochondrion: to produce ATP. Not optional; reflection of physical makeup of bearer

43 role: optional; exists because the bearer is in some special natural, social, or institutional set of circumstances in which the bearer does not have to be

44 role: bearers can have more than one role (person as student and staff member); roles often form systems of mutual dependence: husband / wife; first in queue / last in queue; doctor / patient; host / pathogen

45 role: of some chemical compound: to serve as analyte in an experiment; of a dose of penicillin in this human child: to treat a disease; of this bacterium in a primary host: to cause infection

46 A good solution to the silo problem must be: modular, incremental, bottom-up, evidence-based, revisable; and must incorporate a strategy for motivating potential developers and users

47 Because the ontologies in the Foundry are built as orthogonal modules which form an incrementally evolving network: scientists are motivated to commit to developing ontologies because they will need in their own work ontologies that fit into this network; users are motivated by the assurance that the ontologies they turn to are maintained by experts

48 More benefits of orthogonality: helps those new to ontology to find what they need; to find models of good practice; ensures mutual consistency of ontologies (trivially) and thereby ensures additivity of annotations

49 More benefits of orthogonality: it rules out the sorts of simplification and partiality which may be acceptable under more pluralistic regimes; thereby brings an obligation on the part of ontology developers to commit to scientific accuracy and domain-completeness

