Biomedical Ontologies: The State of the Art. Barry Smith and Werner Ceusters. MIE, Sarajevo, August 30


1 Biomedical Ontologies: The State of the Art Barry Smith and Werner Ceusters MIE, Sarajevo, August 30 1

2 Part 1 (Barry Smith): Ontologies are Representations of What is General in Reality. Part 2 (Werner Ceusters): Referent Tracking: Pinning Ontologies to Instances in Reality 2

3 Uses of ‘ontology’ in PubMed abstracts 3

4 By far the most successful: GO (Gene Ontology) 4

5 You’re interested in which genes control heart muscle development 17,536 results 5

6 Microarray data shows changed expression of thousands of genes. How will you spot the patterns? [Figure: expression data labeled attacked, time, control, with gene groups: puparial adhesion; molting cycle; hemocyanin; defense response; immune response; response to stimulus; Toll regulated genes; JAK-STAT regulated genes; amino acid catabolism; lipid metabolism; peptidase activity; protein catabolism] 6

7 You’re interested in which of your hospital’s patient data is relevant to understanding how genes control heart muscle development 7

8 Lab / pathology data EHR data Clinical trial data Family history data Medical imaging Microarray data Model organism data Flow cytometry Mass spec Genotype / SNP data How will you spot the patterns? How will you find the data you need? 8

9 One strategy for bringing order into this huge conglomeration of data is through the use of Common Data Elements (CDEs). But CDEs:
– are discipline-specific (cancer, NIAID, …)
– do not solve the problems of balkanization (data silos)
– do not evolve gracefully as knowledge advances
– support data cumulation, but do not readily support data integration and computation

10 An ontology is not a terminology. Existing term lists and CDEs are built to serve specific data-processing needs in ad hoc ways. Ontologies are designed from the start to ensure integratability and reusability of data by incorporating a common logical structure. 10

11 How does the Gene Ontology work? with thanks to Jane Lomax, Gene Ontology Consortium 11

12 GO provides a controlled system of representations for use in annotating data:
– multi-species, multi-disciplinary, open source
– contributing to the cumulativity of scientific results obtained by distinct research communities
– compare the use of kilograms, meters, seconds … in formulating experimental results

13 13

14 Definitions 14

15 Gene products involved in cardiac muscle development in humans 15

16 GO provides answers to three types of questions for each gene product:
– in what parts of the cell has it been identified?
– exercising what types of molecular functions?
– with what types of biological processes?
When is a particular gene product involved:
– in the course of normal development?
– in the process leading to abnormality?
With what functions is the gene product associated in other biological processes?

17 Some pain-related terms in GO
GO:0048265 response to pain
GO:0019233 sensory perception of pain
GO:0048266 behavioral response to pain
GO:0019234 sensory perception of fast pain
GO:0019235 sensory perception of slow pain
GO:0051930 regulation of sensory perception of pain
GO:0050967 detection of electrical stimulus during sensory perception of pain
GO:0050968 detection of chemical stimulus involved in sensory perception of pain
GO:0050966 detection of mechanical stimulus involved in sensory perception of pain

18 GO:0050968 detection of chemical stimulus involved in sensory perception of pain 18

19 GO provides a tool for algorithmic reasoning 19

20 Hierarchical view representing relations between represented types 20
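
The kind of algorithmic reasoning that a term hierarchy supports can be sketched in a few lines of Python. The sketch assumes a tiny hand-written graph over some of the pain-related GO identifiers from slide 17; the edges and the `ancestors` helper are illustrative assumptions, not an extract from a GO release or from any GO tool.

```python
from collections import deque

# Illustrative edges only (assumed for this sketch, not copied from GO):
# child term -> list of (relation, parent term)
EDGES = {
    "GO:0019234": [("is_a", "GO:0019233")],     # fast pain -> sensory perception of pain
    "GO:0019235": [("is_a", "GO:0019233")],     # slow pain -> sensory perception of pain
    "GO:0048266": [("is_a", "GO:0048265")],     # behavioral response -> response to pain
    "GO:0050968": [("part_of", "GO:0019233")],  # detection of chemical stimulus
}


def ancestors(term, edges=EDGES):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for _relation, parent in edges.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


print(ancestors("GO:0019234"))
```

Under these assumed edges, a gene product annotated with GO:0019234 (sensory perception of fast pain) would also be retrieved by a query for GO:0019233 (sensory perception of pain).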

21 GO allows a new kind of biological research, based on analysis and comparison of the massive quantities of annotations linking GO terms to gene products 21
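
One common way to analyze annotations in bulk is a term enrichment test: given a list of genes of interest, ask whether a GO term is annotated to more of them than chance would predict. The following is a minimal sketch using a hypergeometric upper tail; the function name `enrichment_p_value` and all numbers are hypothetical, and nothing here is claimed to be the method of any study cited in these slides.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """P(X >= hits) where X is hypergeometric.

    population: total number of genes considered
    annotated:  genes in the population annotated with the GO term
    sample:     size of the gene list of interest
    hits:       genes in the list that carry the annotation
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical numbers, chosen only to exercise the function.
p = enrichment_p_value(population=10000, annotated=200, sample=100, hits=10)
print(f"p = {p:.3g}")
```

In practice such p-values are computed for many terms at once, so a multiple-testing correction would normally follow; that step is omitted here.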

22 One standard method: Sjöblom T, et al. analyzed 13,023 genes in 11 breast and 11 colorectal cancers. Using functional information captured by GO for given gene product types, they identified 189 as being mutated at significant frequency and thus as providing targets for diagnostic and therapeutic intervention. Science. 2006 Oct 13;314(5797):268-74. 22

23 Uses of GO in studies of:
– Biomedical discovery acceleration, with applications to craniofacial development (PMID 19325874)
– Persistent changes in spinal cord gene expression after recovery from inflammatory hyperalgesia: a preliminary study on pain memory (PMID 18366630)
– Spinal cord transcriptional profile analysis reveals protein trafficking and RNA processing as prominent processes regulated by tactile allodynia (PMID 17069981)
– Immune system involvement in abdominal aortic aneurysms (PMID 17634102)

24 $100 million invested in literature curation using GO:
– over 11 million annotations relating gene products described in the UniProt, Ensembl and other databases to terms in the GO
– experimental results reported in 52,000 scientific journal articles manually annotated by expert biologists using GO
– ontologies provide the basis for capturing biological theories in computable form

25 GO is amazingly successful in overcoming problems of balkanization but it covers only generic biological entities of three sorts: –cellular components –molecular functions –biological processes and it does not provide representations of diseases, symptoms, … 25

26 Extending the GO methodology to other domains of biology and medicine 26

27 The Open Biomedical Ontologies (OBO) Foundry
Columns: RELATION TO TIME (continuant, independent or dependent; occurrent). Rows: GRANULARITY.
ORGAN AND ORGANISM
– independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
– dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
– occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
– independent continuant: Cell (CL); Cellular Component (FMA, GO)
– dependent continuant: Cellular Function (GO)
MOLECULE
– independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
– dependent continuant: Molecular Function (GO)
– occurrent: Molecular Process (GO)

28 Ontology | Scope | URL | Custodians
Cell Ontology (CL) | cell types from prokaryotes to mammals | obo.sourceforge.net/cgi-bin/detail.cgi?cell | Jonathan Bard, Michael Ashburner, Oliver Hofman
Chemical Entities of Biological Interest (ChEBI) | molecular entities | ebi.ac.uk/chebi | Paula Dematos, Rafael Alcantara
Common Anatomy Reference Ontology (CARO) | anatomical structures in human and model organisms | (under development) | Melissa Haendel, Terry Hayamizu, Cornelius Rosse, David Sutherland
Foundational Model of Anatomy (FMA) | structure of the human body | fma.biostr.washington.edu | JLV Mejino Jr., Cornelius Rosse
Functional Genomics Investigation Ontology (FuGO) | design, protocol, data instrumentation, and analysis | fugo.sf.net | FuGO Working Group
Gene Ontology (GO) | cellular components, molecular functions, biological processes | www.geneontology.org | Gene Ontology Consortium
Phenotypic Quality Ontology (PaTO) | qualities of anatomical structures | obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value | Michael Ashburner, Suzanna Lewis, Georgios Gkoutos
Protein Ontology (PrO) | protein types and modifications | (under development) | Protein Ontology Consortium
Relation Ontology (RO) | relations | obo.sf.net/relationship | Barry Smith, Chris Mungall
RNA Ontology (RnaO) | three-dimensional RNA structures | (under development) | RNA Ontology Consortium
Sequence Ontology (SO) | properties and features of nucleic sequences | song.sf.net | Karen Eilbeck

29 OBO Foundry recognized by NIH as framework to address mandates for re-usability of data collected through Federally funded research see NIH PAR-07-425: Data Ontologies for Biomedical Research (R01) 29

30 Initial Candidate Members –GO Gene Ontology –CL Cell Ontology –SO Sequence Ontology –ChEBI Chemical Ontology –PATO Phenotype (Quality) Ontology –FMA Foundational Model of Anatomy –ChEBI Chemical Entities of Biological Interest –CARO Common Anatomy Reference Ontology –PRO Protein Ontology The OBO Foundry 30

31 Under development –Disease Ontology –Infectious Disease Ontology –Mammalian Phenotype Ontology –Plant Trait Ontology –Environment Ontology –Ontology for Biomedical Investigations –Behavior Ontology –RNA Ontology –RO Relation Ontology The OBO Foundry 31

32 The Open Biomedical Ontologies (OBO) Foundry
Columns: RELATION TO TIME (continuant, independent or dependent; occurrent). Rows: GRANULARITY.
ORGAN AND ORGANISM
– independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
– dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
– occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
– independent continuant: Cell (CL); Cellular Component (FMA, GO)
– dependent continuant: Cellular Function (GO)
MOLECULE
– independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
– dependent continuant: Molecular Function (GO)
– occurrent: Molecular Process (GO)

33 OBO Foundry is organized in terms of Basic Formal Ontology Each Foundry ontology can be seen as an extension of a single upper level ontology (BFO) 33

34 Basic Formal Ontology (BFO) Continuant Occurrent (Process, Event) Independent Continuant Dependent Continuant http://ifomis.uni-saarland.de/bfo/ 34

35 Fundamental Dichotomy Continuants preserve their identity through change vs. Occurrents (aka processes) –have temporal parts –unfold themselves in successive phases –exist only in their phases –have all their parts of necessity 35

36 Ontology and Referent Tracking [Diagram: the BFO hierarchy at the level of types: Continuant, divided into Independent Continuant (thing) and Dependent Continuant (quality), and Occurrent (process, event); beneath the types, the level of instances.] 36

37 Rationale of OBO Foundry coverage (homesteading principle)
Columns: RELATION TO TIME (continuant, independent or dependent; occurrent). Rows: GRANULARITY.
ORGAN AND ORGANISM
– independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
– dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
– occurrent: Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT
– independent continuant: Cell (CL); Cellular Component (FMA, GO)
– dependent continuant: Cellular Function (GO)
– occurrent: Cellular Process (GO)
MOLECULE
– independent continuant: Molecule (ChEBI, SO, RNAO, PRO)
– dependent continuant: Molecular Function (GO)
– occurrent: Molecular Process (GO)

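38 Columns: RELATION TO TIME (continuant, independent or dependent; occurrent). Rows: GRANULARITY. Additional label: ENVIRONMENT.
COMPLEX OF ORGANISMS
– independent continuant: Family, Community, Deme, Population
– dependent continuant: Population Phenotype
– occurrent: Population Process
ORGAN AND ORGANISM
– independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
– dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
– occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
– independent continuant: Cell (CL); Cell Component (FMA, GO)
– dependent continuant: Cellular Function (GO)
MOLECULE
– independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
– dependent continuant: Molecular Function (GO)
– occurrent: Molecular Process (GO)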

39 The Gene Ontology (GO) Continuant Occurrent Independent Continuant Dependent Continuant cell component biological process molecular function Kumar A., Smith B, Borgelt C. Dependence relationships between Gene Ontology terms based on TIGR gene product annotations. CompuTerm 2004, 31-38. Bada M, Hunter L. Enrichment of OBO Ontologies. J Biomed Inform. 2006 Jul 26 39

40 Users of BFO
– GO / OBO Foundry
– NCI BiomedGT
– SNOMED CT
– ACGT Clinical Genomics Trials on Cancer – Master Ontology / Formbuilder (Case Report Forms for Cancer Clinical Trials)
– Ontology for Risks Against Patient Safety (RAPS) (EU)

41 Users of BFO
– MediCognos / Microsoft Healthvault
– Cleveland Clinic Semantic Database in Cardiothoracic Surgery
– Major Histocompatibility Complex (MHC) Ontology (NIAID)
– Neuroscience Information Framework Standard (NIFSTD)

42 IDO Infectious Disease Ontology
– MITRE, Mount Sinai, UT Southwestern: Influenza
– IMBB/VectorBase: Vector borne diseases (A. gambiae, A. aegypti, I. scapularis, C. pipiens, P. humanus)
– Colorado State University: Dengue Fever
– Duke University: Tuberculosis, Staph. aureus
– Case Western Reserve: Infective Endocarditis
– University of Michigan: Brucellosis

43 Users of BFO
– Interdisciplinary Prostate Ontology (IPO)
– Nanoparticle Ontology (NPO): Ontology for Cancer Nanotechnology Research
– Neural Electromagnetic Ontologies (NEMO): Ontology-based Tools for Representation and Integration of Event-related Brain Potentials
– Ontology for General Medical Science

44 depends_on [Diagram: BFO hierarchy with Continuant, divided into Independent Continuant (thing) and Dependent Continuant (quality), and Occurrent (process, event). A quality depends_on its bearer.] 44

45 Specifically dependent continuants: the quality of whiteness of this cheese; your role as lecturer; the disposition of this patient to experience diarrhea 45

46 depends_on [Diagram: BFO hierarchy with Continuant, divided into Independent Continuant (thing) and Dependent Continuant (quality), and Occurrent (process). A temperature depends_on its bearer.] 46

47 Realizable dependent continuants: plan, function, role, disposition, capability, tendency (continuants) 47

48 Their realizations: execution, expression, exercise, realization, application, course (occurrents) 48

49 Continuant: Independent Continuant; Dependent Continuant, divided into Non-realizable Dependent Continuant (quality) and Realizable Dependent Continuant (function, role, disposition) 49

50 realization depends_on disposition [Diagram: under Continuant, Independent Continuant (bearer) and Dependent Continuant (disposition); under Occurrent, Process of realization. The realization depends_on the disposition.] 50

51 Dependence: a is dependent on b =def. a is necessarily such that if b ceases to exist then a ceases to exist 51

52 Specifically Dependent Continuants
Specifically Dependent Continuant: Quality, Pattern; Realizable Dependent Continuant
If any bearer ceases to exist, then the quality or function ceases to exist.
Examples: the color of my skin; the function of my heart to pump blood; my weight

53 Generically Dependent Continuants
Generically Dependent Continuant: Information Object; Gene Sequence
If one bearer ceases to exist, then the entity can survive, because there are other bearers (copyability).
Examples: the pdf file on my laptop; the DNA (sequence) in this chromosome
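
The contrast between specific and generic dependence can be made concrete with a small Python sketch. The class names and the `exists` flag are assumptions introduced for illustration; they are not part of BFO or of any BFO software. The second bearer of the pdf file is likewise hypothetical.

```python
class IndependentContinuant:
    """A bearer, e.g. a heart or a laptop."""

    def __init__(self, name):
        self.name = name
        self.exists = True


class SpecificallyDependentContinuant:
    """Exists only as long as its one bearer exists."""

    def __init__(self, label, bearer):
        self.label = label
        self.bearer = bearer

    @property
    def exists(self):
        return self.bearer.exists


class GenericallyDependentContinuant:
    """Survives as long as at least one bearer exists (copyability)."""

    def __init__(self, label, bearers):
        self.label = label
        self.bearers = list(bearers)

    @property
    def exists(self):
        return any(bearer.exists for bearer in self.bearers)


heart = IndependentContinuant("my heart")
pumping = SpecificallyDependentContinuant("function to pump blood", heart)

laptop = IndependentContinuant("my laptop")
server = IndependentContinuant("backup server")  # hypothetical second bearer
pdf = GenericallyDependentContinuant("pdf file", [laptop, server])

heart.exists = False
laptop.exists = False
print(pumping.exists)  # False: the function ceases with its bearer
print(pdf.exists)      # True: another copy still has a bearer
```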

54 Four distinct classificatory tasks
1. of people (patients, carriers, …)
2. of diseases (cases, instances, problems, …)
3. of courses of disease (symptoms, treatments, …)
4. of representations (records, observations, data, diagnoses, …)
ICD confuses 1. and 2. HL7, and most standard terminologies, confuse 2. and 4.

55 Four distinct BFO categories
1. person (patient, carrier, …) – independent continuant
2. disease (case, instance, problem, …) – specifically dependent continuant
3. course of disease (symptom, treatment, …) – occurrent
4. representation (record, datum, diagnosis, …) – generically dependent continuant

56 Four distinct BFO categories
1. people (patients, carriers, …) – independent continuants
2. diseases (cases, instances, problems, …) – dispositions
3. courses of disease (symptoms, treatments, …) – realizations of dispositions
4. representations (records, data, diagnoses, …) – generically dependent continuants

57 Big Picture (with thanks to Richard Scheuermann) 57

58 A disease is a disposition rooted in a physical disorder in the organism and realized in pathological processes.
etiological process produces disorder
disorder bears disposition
disposition realized_in pathological process
pathological process produces abnormal bodily features
abnormal bodily features recognized_as signs & symptoms
signs & symptoms used_in interpretive process
interpretive process produces diagnosis
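
The chain of relations on this slide can be written down as subject, relation, object triples and walked programmatically. This is a minimal sketch: the `TRIPLES` list transcribes the slide, while the `trace` helper is a hypothetical name and assumes each node has at most one outgoing link.

```python
# Relations transcribed from the slide as (subject, relation, object).
TRIPLES = [
    ("etiological process", "produces", "disorder"),
    ("disorder", "bears", "disposition"),
    ("disposition", "realized_in", "pathological process"),
    ("pathological process", "produces", "abnormal bodily features"),
    ("abnormal bodily features", "recognized_as", "signs & symptoms"),
    ("signs & symptoms", "used_in", "interpretive process"),
    ("interpretive process", "produces", "diagnosis"),
]


def trace(start, triples=TRIPLES):
    """Follow outgoing links from `start` and return the nodes visited."""
    # Assumption of this sketch: one outgoing link per subject.
    next_node = {subject: obj for subject, _relation, obj in triples}
    path = [start]
    while path[-1] in next_node and next_node[path[-1]] not in path:
        path.append(next_node[path[-1]])
    return path


print(" -> ".join(trace("etiological process")))
```

Keeping the relation names as data makes it straightforward to instantiate the same pattern for a particular disease by substituting concrete entities for the generic ones.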

59 Elucidation of Primitive Terms ‘bodily feature’ - an abbreviation for a physical component, a bodily quality, or a bodily process. disposition - an attribute describing the propensity to initiate certain specific sorts of processes when certain conditions are satisfied. clinically abnormal - some bodily feature that (1) is not part of the life plan for an organism of the relevant type (unlike aging or pregnancy), (2) is causally linked to an elevated risk either of pain or other feelings of illness, or of death or dysfunction, and (3) is such that the elevated risk exceeds a certain threshold level.* *Compare: baldness 59

60 Definitions - Foundational Terms Disorder =def. – A causally linked combination of physical components that is clinically abnormal. Pathological Process =def. – A bodily process that is a manifestation of a disorder and is clinically abnormal. Disease =def. – A disposition (i) to undergo pathological processes that (ii) exists in an organism because of one or more disorders in that organism. 60

61 Dispositions and Predispositions All diseases are dispositions; not all dispositions are diseases. A predisposition is a disposition. Predisposition to Disease of Type X =def. – A disposition in an organism that constitutes an increased risk of the organism’s subsequently developing the disease X. HNPCC is caused by a disorder (mutation) in a DNA mismatch repair gene that disposes to the acquisition of additional mutations from defective DNA repair processes, and thus is a predisposition to the development of colon cancer. 61

62 Definitions - Clinical Evaluation Terms Sign =def. – A bodily feature of a patient that is observed in a physical examination and is deemed by the clinician to be of clinical significance. (Objectively observable features) Symptom =def. – An experienced bodily feature of a patient that is observed by and observable only by the patient and is of the type that can be hypothesized by a patient to be a realization of a disease. (A restricted family of phenomena including pain, nausea, anger, drowsiness, which are of their nature experienced in the first person) Symptoms are subjective. But this does not mean that there is no objective fact of the matter whether a given symptom exists. 62

63 Cirrhosis - environmental exposure
Etiological process - phenobarbital-induced hepatic cell death
produces Disorder - necrotic liver
bears Disposition (disease) - cirrhosis
realized_in Pathological process - abnormal tissue repair with cell proliferation and fibrosis that exceed a certain threshold; hypoxia-induced cell death
produces Abnormal bodily features
recognized_as Symptoms - fatigue, anorexia; Signs - jaundice, splenomegaly
Symptoms & Signs used_in Interpretive process
produces Hypothesis - rule out cirrhosis
suggests Laboratory tests
produces Test results - elevated liver enzymes in serum
used_in Interpretive process
produces Result - diagnosis that patient X has a disorder that bears the disease cirrhosis

64 Influenza - infectious
Etiological process - infection of airway epithelial cells with influenza virus
produces Disorder - viable cells with influenza virus
bears Disposition (disease) - flu
realized_in Pathological process - acute inflammation
produces Abnormal bodily features
recognized_as Symptoms - weakness, dizziness; Signs - fever
Symptoms & Signs used_in Interpretive process
produces Hypothesis - rule out influenza
suggests Laboratory tests
produces Test results - elevated serum antibody titers
used_in Interpretive process
produces Result - diagnosis that patient X has a disorder that bears the disease flu
But the disorder also induces normal physiological processes (immune response) that can result in the elimination of the disorder (transient disease course).

65 Huntington’s Disease - genetic
Etiological process - inheritance of >39 CAG repeats in the HTT gene
produces Disorder - chromosome 4 with abnormal mHTT
bears Disposition (disease) - Huntington’s disease
realized_in Pathological process - accumulation of mHTT protein fragments, abnormal transcription regulation, neuronal cell death in striatum
produces Abnormal bodily features
recognized_as Symptoms - anxiety, depression; Signs - difficulties in speaking and swallowing
Symptoms & Signs used_in Interpretive process
produces Hypothesis - rule out Huntington’s
suggests Laboratory tests
produces Test results - molecular detection of the HTT gene with >39 CAG repeats
used_in Interpretive process
produces Result - diagnosis that patient X has a disorder that bears the disease Huntington’s disease

66 HNPCC - genetic pre-disposition
Etiological process - inheritance of a mutant mismatch repair gene
produces Disorder - chromosome 3 with abnormal hMLH1
bears Disposition (disease) - Lynch syndrome
realized_in Pathological process - abnormal repair of DNA mismatches
produces Disorder - mutations in proto-oncogenes and tumor suppressor genes with microsatellite repeats (e.g. TGF-beta R2)
bears Disposition (disease) - non-polyposis colon cancer
realized_in Symptoms (including pain)

67 Definition: Etiology Etiological Process =def. – A process in an organism that leads to a subsequent disorder. Example: toxic chemical exposure resulting in a mutation in the genomic DNA of a cell; infection of a human with a pathogenic virus; inheritance of two defective copies of a metabolic gene The etiological process creates the physical basis of that disposition to pathological processes which is the disease. 67

68 Definitions - Diagnosis Clinical Picture =def. – A representation of a clinical phenotype that is inferred from the combination of laboratory, image and clinical findings about a given patient. Diagnosis =def. – A representation of a conclusion of an interpretive process that has as input a clinical picture of a given patient and as output an assertion to the effect that the patient has a disease of such and such a type. 68

69 Definitions - Qualities Manifestation of a Disease =def. – A bodily feature of a patient that is (a) a deviation from clinical normality that exists in virtue of the realization of a disease and (b) is observable. Observability includes observable through elicitation of response or through the use of special instruments. Preclinical Manifestation of a Disease =def. – A manifestation of a disease that exists prior to its becoming detectable in a clinical history taking or physical examination. Clinical Manifestation of a Disease =def. – A manifestation of a disease that is detectable in a clinical history taking or physical examination. Phenotype =def. – A (combination of) bodily feature(s) of an organism determined by the interaction of its genetic make-up and environment. Clinical Phenotype =def. – A clinically abnormal phenotype. 69
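
The distinction between preclinical and clinical manifestations can be sketched as a simple classification over bodily features. The `BodilyFeature` fields and the `classify` function are hypothetical names, and the sketch simplifies "prior to its becoming detectable" to "not currently detectable in history taking or physical examination". The two example features are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class BodilyFeature:
    description: str
    deviates_from_clinical_normality: bool
    realizes_disease: bool               # exists in virtue of the realization of a disease
    observable: bool                     # includes elicitation and special instruments
    detectable_in_history_or_exam: bool  # clinical history taking or physical examination


def classify(feature):
    """Apply the slide's definitions in order; a simplified sketch."""
    is_manifestation = (
        feature.deviates_from_clinical_normality
        and feature.realizes_disease
        and feature.observable
    )
    if not is_manifestation:
        return "not a manifestation"
    if feature.detectable_in_history_or_exam:
        return "clinical manifestation"
    return "preclinical manifestation"


# Illustrative examples only.
jaundice = BodilyFeature("jaundice", True, True, True, True)
early_fibrosis = BodilyFeature("early fibrosis seen only on imaging", True, True, True, False)

print(classify(jaundice))        # clinical manifestation
print(classify(early_fibrosis))  # preclinical manifestation
```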

