What is an ontology? Barry Smith 1.


Presentation on theme: "What is an ontology? Barry Smith 1."— Presentation transcript:

1 What is an ontology? Barry Smith http://ontology.buffalo.edu/smith 1

2 You’re interested in which genes control heart muscle development 17,536 results 2

3 attacked time control Puparial adhesion Molting cycle hemocyanin Defense response Immune response Response to stimulus Toll regulated genes JAK-STAT regulated genes Immune response Toll regulated genes Amino acid catabolism Lipid metabolism Peptidase activity Protein catabolism Immune response Microarray data shows changed expression of thousands of genes. How will you spot the patterns? 3

4 Lab / pathology data EHR data Clinical trial data Family history data Medical image data Microarray data Model organism data Flow cytometry Mass spec Genotype / SNP data How will you find the data you need? 4

5 −Human −Mouse −Rat −Fish −Yeast −E. coli How will you compare the data? How will you integrate the data? 5

6 The GO Idea MouseEcotope GlyProt DiabetInGene GluChem sphingolipid transporter activity

7 annotation using common ontologies yields integration of databases MouseEcotope GlyProt DiabetInGene GluChem Holliday junction helicase complex
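
A minimal sketch of how annotation with a shared ontology term can integrate records from separate databases. The database names and term labels come from the slides; the record identifiers and the pairing of records with terms are invented for illustration.

```python
from collections import defaultdict

# Hypothetical annotation records: (database, record id, ontology term label).
# Record ids and the record/term pairings are assumptions, not real data.
ANNOTATIONS = [
    ("MouseEcotope", "me-001", "Holliday junction helicase complex"),
    ("GlyProt", "gp-17", "Holliday junction helicase complex"),
    ("DiabetInGene", "dg-42", "sphingolipid transporter activity"),
    ("GluChem", "gc-3", "Holliday junction helicase complex"),
]


def index_by_term(annotations):
    """Group records from all databases under the ontology term they share."""
    index = defaultdict(list)
    for database, record_id, term in annotations:
        index[term].append((database, record_id))
    return dict(index)


integrated = index_by_term(ANNOTATIONS)
```

Looking up a term in `integrated` returns every record annotated with it, whichever database it came from. No database-to-database mapping is needed because the common term acts as the join key.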

8 ontologies are legends for data 8

9 they provide a growing set of natural language labels to make the data cognitively accessible to human beings and algorithmically accessible to reasoning systems 9

10 compare: legends for maps 10

11 11

12 legends for textbook diagrams

13 ontologies as legends for images 13

14 what lesion ? what brain function ? 14

15 legends for literature 15

16 ontologies as legends for mathematical equations: x_i = vector of measurements of gene i; k = the state of the gene (“on” or “off”); θ_i = set of parameters of the Gaussian model ... 16

17 17

18 Pathway diagrams as ontologically annotated dynamic cartoons 18

19 two kinds of annotations 19

20 names of instances 20

21 names of types 21

22 Ontologies are representations of types 22

23 ... types which are instantiated e.g. in the lab or clinic 23

24 multiple kinds of relations between represented types provide a tool for algorithmic reasoning 24
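
A minimal sketch of algorithmic reasoning over one kind of relation between types, assuming that `is_a` is transitive. The type names appear in the slides; the dictionary layout and the function names are hypothetical.

```python
# Hypothetical is_a edges (child type -> parent type) using type names from the slides.
IS_A = {
    "temperature": "quality",
    "quality": "dependent continuant",
    "dependent continuant": "continuant",
    "organism": "independent continuant",
    "independent continuant": "continuant",
}


def ancestors(term, is_a=IS_A):
    """Return all types reachable from `term` by following is_a upward, nearest first."""
    found = []
    while term in is_a:
        term = is_a[term]
        found.append(term)
    return found


def is_subtype(child, parent, is_a=IS_A):
    """True if `parent` is reachable from `child` through one or more is_a steps."""
    return parent in ancestors(child, is_a)
```

With these edges, a query such as `is_subtype("temperature", "continuant")` is answered by following the chain, even though that fact is never stated directly.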

25 Gene Ontology: The Very Top cellular component molecular function biological process 25

26 Gene Ontology: The Very Top continuant cellular component molecular function occurrent biological process 26

27 BFO: The Very Top continuant occurrent biological processes independent continuant cellular component dependent continuant molecular function 27

28 Basic Formal Ontology continuant occurrent independent continuant dependent continuant organism 28

29 Basic Formal Ontology continuant occurrent independent continuant dependent continuant anatomical structure 29

30 Continuants continue to exist through time, preserving their identity while undergoing different sorts of changes independent continuants – objects, things,... dependent continuants – qualities, attributes, shapes, potentialities... 30

31 Qualities temperature blood pressure mass... are continuants they exist through time while undergoing changes 31

32 Qualities temperature / blood pressure / mass... are dimensions of variation within the structure of the entity; a quality is something which can change while its bearer remains one and the same 32

33 Qualities temperature / blood pressure / mass... are dimensions of variation within the structure of the entity; a quality is something which can change while its bearer remains one and the same hence only independent continuants may have qualities 33

34 A Chart representing how John’s temperature changes 34

35 John’s temperature the temperature he has throughout his entire life, cycles through different determinate temperatures from one time to the next John’s temperature is a physiology variable which, in thus changing, exerts an influence on other physiology variables through time 35

36 BFO: The Very Top continuant independent continuant dependent continuant quality occurrent temperature 36

37 Blinding Flash of the Obvious independent continuant dependent continuant quality temperature types instances organism John John’s temperature 37

38 Blinding Flash of the Obvious independent continuant dependent continuant quality temperature types instances organism John John’s temperature 38

39 Blinding Flash of the Obvious temperature types instances organism John John’s temperature 39 inheres_in

40 temperature types instances John’s temperature instantiates a determinate temperature type at each of t1, t2, t3, t4, t5, t6; types shown: 37ºC, 37.1ºC, 37.5ºC, 37.2ºC, 37.3ºC, 37.4ºC 40
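
A minimal sketch of time-indexed instantiation: one quality instance keeps its identity while instantiating different determinate types at different times. The class, its field names, and the pairing of particular temperatures with particular times are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class QualityInstance:
    """One quality instance that persists while its determinate type changes."""
    label: str
    bearer: str
    # Maps a time label to the determinate type instantiated at that time.
    history: dict = field(default_factory=dict)

    def instantiate(self, time, determinate_type):
        self.history[time] = determinate_type

    def type_at(self, time):
        return self.history.get(time)


johns_temperature = QualityInstance("John's temperature", bearer="John")
# Assumed pairing of times with temperature types; the slide does not fix it.
for time, value in zip(["t1", "t2", "t3"], ["37 C", "37.1 C", "37.5 C"]):
    johns_temperature.instantiate(time, value)
```

The instance `johns_temperature` is the same object throughout. Only the entry recorded for each time changes, which mirrors the point that a quality can change while its bearer, and the quality itself, remain one and the same.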

41 human types instances John instantiates a subtype of human at each of t1, t2, t3, t4, t5, t6; types shown: embryo, fetus, neonate, infant, child, adult 41

42 the lower level of types does not ‘carry identity’ in OntoClean terms; these types are threshold divisions (hence we do not have sharp boundaries, and we have a certain degree of choice, e.g. in how many subtypes to distinguish, though not in their ordering) 42

43 independent continuant dependent continuant quality temperature types instances organism John John’s temperature 43

44 independent continuant dependent continuant quality temperature organism John John’s temperature occurrent process course of temperature changes John’s temperature history 44

45 independent continuant dependent continuant quality temperature organism John John’s temperature occurrent process life of an organism John’s life 45

46 BFO/GO: The Very Top continuant occurrent biological processes independent continuant cellular component dependent continuant molecular function 46

47 BFO: The Very Top continuant occurrent independent continuant dependent continuant quality function role disposition 47

48 Function - of liver: to store glycogen - of birth canal: to enable transport - of eye: to see - of mitochondrion: to produce ATP. Not optional; reflection of physical makeup of bearer; can malfunction 48

49 Role: optional; exists because the bearer is in some special natural, social, or institutional set of circumstances in which the bearer does not have to be 49

50 Role - bearers can have more than one role: person as student / as staff member - roles often form systems of mutual dependence: husband / wife, first in queue / last in queue, doctor / patient, host / pathogen 50

51 Role - of some chemical compound: to serve as analyte in an experiment - of a dose of penicillin in this human child: to treat a disease - of this bacterium in a primary host: to cause infection 51

52 Qualities are categorical features of reality – you just have them. Functions, roles and dispositions are potential features of reality: they are realizable dependent continuants, realized in certain associated processes 52

53 independent continuant dependent continuant role drug role portion of chemical compound this portion of aspirin role of this portion of aspirin occurrent process process of drug administration John’s taking this portion of aspirin 53

54 independent continuant dependent continuant role drug role portion of chemical compound this portion of aspirin role of this portion of aspirin occurrent process process of drug administration John’s taking this portion of aspirin 54 inheres_in realized_in
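
A minimal sketch of the aspirin example as subject-relation-object triples. The relation names `inheres_in` and `realized_in` and all entity labels come from the slide; the `instance_of` relation name and the triple layout are assumptions made for illustration.

```python
ROLE = "role of this portion of aspirin"
BEARER = "this portion of aspirin"
PROCESS = "John's taking this portion of aspirin"

# (subject, relation, object). "instance_of" is a hypothetical name for the
# link between an instance and its type.
TRIPLES = [
    (ROLE, "instance_of", "drug role"),
    (BEARER, "instance_of", "portion of chemical compound"),
    (PROCESS, "instance_of", "process of drug administration"),
    (ROLE, "inheres_in", BEARER),
    (ROLE, "realized_in", PROCESS),
]


def objects(subject, relation, triples=TRIPLES):
    """Return every object linked to `subject` by `relation`."""
    return [o for s, r, o in triples if s == subject and r == relation]
```

Here `objects(ROLE, "inheres_in")` finds the independent continuant that bears the role, and `objects(ROLE, "realized_in")` finds the process in which the role is realized.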

55 The Open Biomedical Ontologies (OBO) Foundry. Columns (RELATION TO TIME): CONTINUANT (INDEPENDENT, DEPENDENT), OCCURRENT. Rows (GRANULARITY):
ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Biological Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO)
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO) 55

56 The Road to Convergence. All ontologies for each given domain (anatomy, chemistry…): should be part of a single suite of interoperable ontologies; should use a common top-level core; for subdomains with many variants, should follow the strategy of canonical ontologies with extensions; should require acceptance of common, tested guidelines by all subscribing ontology developers 56

57 Initial OBO Foundry coverage, ontologies automatically semantically coupled. Columns (RELATION TO TIME): CONTINUANT (INDEPENDENT, DEPENDENT), OCCURRENT. Rows (GRANULARITY):
ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO); Cellular Process (GO)
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO) 57

58 Disposition (Internally-Grounded Realizable Entity) disposition =def. a realizable entity which if it ceases to exist, then its bearer is physically changed, and whose realization occurs when this bearer is in some special physical circumstances, in virtue of the bearer’s physical make-up 58

59 Function A Disposition (Internally-Grounded Realizable Entity) that is designed or selected for 59

60 OGMS Ontology for General Medical Science http://code.google.com/p/ogms 60

61 Physical Disorder – independent continuant fiat object part 61

62 Big Picture 62

63 A disease is a disposition rooted in a physical disorder in the organism and realized in pathological processes. etiological process produces disorder; disorder bears disposition; disposition realized_in pathological process; pathological process produces abnormal bodily features; abnormal bodily features recognized_as signs & symptoms; signs & symptoms used_in interpretive process; interpretive process produces diagnosis 63
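
A minimal sketch of the disease model on this slide as a chain of typed links, with a helper that follows the chain from any starting node. The link labels are taken from the slide; the list layout and the `trace` function are hypothetical.

```python
# Links of the disease model as (source, relation, target), read off the slide.
CHAIN = [
    ("etiological process", "produces", "disorder"),
    ("disorder", "bears", "disposition"),
    ("disposition", "realized_in", "pathological process"),
    ("pathological process", "produces", "abnormal bodily features"),
    ("abnormal bodily features", "recognized_as", "signs & symptoms"),
    ("signs & symptoms", "used_in", "interpretive process"),
    ("interpretive process", "produces", "diagnosis"),
]


def trace(start, chain=CHAIN):
    """Follow the chain from `start` and return the nodes visited, in order."""
    next_node = {source: target for source, _relation, target in chain}
    path = [start]
    while path[-1] in next_node:
        path.append(next_node[path[-1]])
    return path
```

Calling `trace("etiological process")` walks the whole model from cause to diagnosis. Starting at `"disposition"` instead gives only the part of the chain that follows from the disease itself.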

64 Elucidation of Primitive Terms. ‘bodily feature’ - an abbreviation for a physical component, a bodily quality, or a bodily process. disposition - an attribute describing the propensity to initiate certain specific sorts of processes when certain conditions are satisfied. clinically abnormal - some bodily feature that (1) is not part of the life plan for an organism of the relevant type (unlike aging or pregnancy), (2) is causally linked to an elevated risk either of pain or other feelings of illness, or of death or dysfunction, and (3) is such that the elevated risk exceeds a certain threshold level.* *Compare: baldness 64

65 Definitions - Foundational Terms Disorder =def. – A causally linked combination of physical components that is clinically abnormal. Pathological Process =def. – A bodily process that is a manifestation of a disorder and is clinically abnormal. Disease =def. – A disposition (i) to undergo pathological processes that (ii) exists in an organism because of one or more disorders in that organism. 65

66 Dispositions and Predispositions All diseases are dispositions; not all dispositions are diseases. A predisposition is a disposition. Predisposition to Disease of Type X =def. – A disposition in an organism that constitutes an increased risk of the organism’s subsequently developing the disease X. HNPCC is caused by a –disorder (mutation) in a DNA mismatch repair gene that –disposes to the acquisition of additional mutations from defective DNA repair processes, and thus is a –predisposition to the development of colon cancer. 66

67 Cirrhosis - environmental exposure Etiological process - phenobarbital-induced hepatic cell death –produces Disorder - necrotic liver –bears Disposition (disease) - cirrhosis –realized_in Pathological process - abnormal tissue repair with cell proliferation and fibrosis that exceed a certain threshold; hypoxia-induced cell death –produces Abnormal bodily features –recognized_as Symptoms - fatigue, anorexia Signs - jaundice, splenomegaly Symptoms & Signs used_in Interpretive process produces Hypothesis - rule out cirrhosis suggests Laboratory tests produces Test results - elevated liver enzymes in serum used_in Interpretive process produces Result - diagnosis that patient X has a disorder that bears the disease cirrhosis 67

68 Influenza - infectious Etiological process - infection of airway epithelial cells with influenza virus –produces Disorder - viable cells with influenza virus –bears Disposition (disease) - flu –realized_in Pathological process - acute inflammation –produces Abnormal bodily features –recognized_as Symptoms - weakness, dizziness Signs - fever Symptoms & Signs used_in Interpretive process produces Hypothesis - rule out influenza suggests Laboratory tests produces Test results - elevated serum antibody titers used_in Interpretive process produces Result - diagnosis that patient X has a disorder that bears the disease flu. But the disorder also induces normal physiological processes (immune response) that can result in the elimination of the disorder (transient disease course). 68

69 Huntington’s Disease - genetic Etiological process - inheritance of >39 CAG repeats in the HTT gene –produces Disorder - chromosome 4 with abnormal mHTT –bears Disposition (disease) - Huntington’s disease –realized_in Pathological process - accumulation of mHTT protein fragments, abnormal transcription regulation, neuronal cell death in striatum –produces Abnormal bodily features –recognized_as Symptoms - anxiety, depression Signs - difficulties in speaking and swallowing Symptoms & Signs used_in Interpretive process produces Hypothesis - rule out Huntington’s suggests Laboratory tests produces Test results - molecular detection of the HTT gene with >39 CAG repeats used_in Interpretive process produces Result - diagnosis that patient X has a disorder that bears the disease Huntington’s disease 69

70 HNPCC - genetic predisposition Etiological process - inheritance of a mutant mismatch repair gene –produces Disorder - chromosome 3 with abnormal hMLH1 –bears Disposition (disease) - Lynch syndrome –realized_in Pathological process - abnormal repair of DNA mismatches –produces Disorder - mutations in proto-oncogenes and tumor suppressor genes with microsatellite repeats (e.g. TGF-beta R2) –bears Disposition (disease) - non-polyposis colon cancer –realized_in Symptoms (including pain) 70

71 The OBO Foundry Initiative 71

72 A good solution to the data integration problem must: be modular, incremental, bottom-up, evidence-based, and revisable; incorporate a strategy for motivating potential developers and users 72

73 GO is amazingly successful – but covers only three sorts of biological entities: –cellular components –molecular functions –biological processes and does not provide representations of disease-related phenomena 73

74 The Open Biomedical Ontologies (OBO) Foundry. Columns (RELATION TO TIME): CONTINUANT (INDEPENDENT, DEPENDENT), OCCURRENT. Rows (GRANULARITY):
ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Biological Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO)
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO) 74

75 OBO Foundry provides: tested guidelines enabling new groups to develop the ontologies they need in ways which counteract forking and dispersion of effort; an incremental bottom-up approach to evidence-based terminology practices in medicine that is rooted in basic biology; automatic web-based linkage between medical terminologies and biological knowledge resources; traffic laws and traffic police 75

76 the strategy: establish common rules governing best practices for creating ontologies in coordinated fashion, with an evidence-based pathway to incremental improvement 76

77 The methodology of cross-products: compound terms in ontologies to be defined as cross-products of simpler terms. E.g. elevated blood glucose is a cross-product of PATO: increased concentration with FMA: blood and ChEBI: glucose. = factoring out of ontologies into discipline-specific modules (orthogonality) 77
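
A minimal sketch of a compound term defined as a cross-product of simpler terms from separate ontologies. The example term and its three components come from the slide; the `Term` and `CrossProduct` classes are hypothetical, and no real ontology identifiers are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    ontology: str  # source ontology, e.g. "PATO"
    label: str     # human-readable label; identifiers are deliberately omitted


@dataclass(frozen=True)
class CrossProduct:
    label: str
    components: tuple  # the simpler terms the compound term is built from

    def ontologies(self):
        """Return the sorted names of the ontologies this term draws on."""
        return sorted({term.ontology for term in self.components})


elevated_blood_glucose = CrossProduct(
    label="elevated blood glucose",
    components=(
        Term("PATO", "increased concentration"),
        Term("FMA", "blood"),
        Term("ChEBI", "glucose"),
    ),
)
```

Because the compound term only references its components, each discipline-specific ontology can be maintained on its own while the compound definition stays valid.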

78 The methodology of cross-products enforcing use of common relations in linking terms drawn from Foundry ontologies serves to ensure that the ontologies are maintained and revised in tandem logically defined relations serve to bind terms in different ontologies together to create a network 78

79 CRITERIA: openness; common formal language; collaborative development; evidence-based maintenance; identifiers; versioning; textual and formal definitions 79

80 Orthogonality = modularity one ontology for each domain no need for mappings (which are in any case too expensive, too fragile, too difficult to keep up-to-date as mapped ontologies change) everyone knows where to look to find out how to annotate each kind of data 80

81 Ontologies and research groups using BFO and RO –OBO Foundry (60 biomedical ontologies, including GO, OBI, Protein Ontology, Cell Ontology, IDO …) –National Cancer Institute (BiomedGT) –NIF (NIH Neuroscience Information Framework) –Cleveland Clinic Semantic Database –Siemens –AstraZeneca –EU (ACGT Cancer Ontology, RAPS, …) 81

82 Because the ontologies in the Foundry are built as orthogonal modules which form an incrementally evolving network: scientists are motivated to commit to developing ontologies because they will need in their own work ontologies that fit into this network; users are motivated by the assurance that the ontologies they turn to are maintained by experts 82

83 More benefits of orthogonality: helps those new to ontology to find what they need and to find models of good practice; ensures mutual consistency of ontologies (trivially) and thereby ensures additivity of annotations 83

84 More benefits of orthogonality it rules out the sorts of simplification and partiality which may be acceptable under more pluralistic regimes thereby brings an obligation on the part of ontology developers to commit to scientific accuracy and domain-completeness 84

85 More criteria of a successful standard 1. intelligibility to users (consistent use of terms like ‘term’, ‘class’, ‘entity’, ‘object’ …) 2. track record of lessons learned (GO has 10 years of hard user testing) 3. lots of existing users (ontologies are like telephone networks) 85

86 COMMON ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the Basic Formal Ontology (BFO) including the Relation Ontology (RO) http://ifomis.org/bfo http://www.obofoundry.org/ro/ 86

87 Anatomy Ontology (FMA*, CARO) Environment Ontology (EnvO) Infectious Disease Ontology (IDO*) Biological Process Ontology (GO*) Cell Ontology (CL) Cellular Component Ontology (FMA*, GO*) Phenotypic Quality Ontology (PaTO) Subcellular Anatomy Ontology (SAO) Sequence Ontology (SO*) Molecular Function (GO*) Protein Ontology (PRO*) OBO Foundry Modular Organization top level mid-level domain level Information Artifact Ontology (IAO) Ontology for Biomedical Investigations (OBI) Spatial Ontology (BSPO) Basic Formal Ontology (BFO) 87

88 continuant independent continuant portion of material object fiat object part object aggregate object boundary site dependent continuant generically dependent continuant information artifact specifically dependent continuant quality realizable entity function role disposition spatial region 0D-region 1D-region 2D-region 3D-region BFO:continuant

89 occurrent processual entity process fiat process part process aggregate process boundary processual context spatiotemporal region scattered spatiotemporal region connected spatiotemporal region spatiotemporal instant spatiotemporal interval temporal region scattered temporal region connected temporal region temporal instant temporal interval BFO:occurrent

90 Example: The Cell Ontology

91 Anatomy Ontology (FMA*, CARO) Environment Ontology (EnvO) Infectious Disease Ontology (IDO*) Biological Process Ontology (GO*) Cell Ontology (CL) Cellular Component Ontology (FMA*, GO*) Phenotypic Quality Ontology (PaTO) Subcellular Anatomy Ontology (SAO) Sequence Ontology (SO*) Molecular Function (GO*) Protein Ontology (PRO*) OBO Foundry Modular Organization top level mid-level domain level Information Artifact Ontology (IAO) Ontology for Biomedical Investigations (OBI) Spatial Ontology (BSPO) Basic Formal Ontology (BFO) 91

