1 An Introduction to Ontology for Scientists Barry Smith University at Buffalo


2 1 An Introduction to Ontology for Scientists Barry Smith University at Buffalo http://ontology.buffalo.edu/smith

3 Multiple kinds of data in multiple kinds of silos: Lab / pathology data, Electronic Health Record data, Clinical trial data, Patient histories, Medical imaging, Microarray data, Protein chip data, Flow cytometry, Mass spec, Genotype / SNP data 2

4 How to find your data? How to find other people’s data? How to reason with data when you find it? How to work out what data does not yet exist? 3

5 4 how to solve the problem of data re-use to address NIH mandates? part of the solution must involve: standardized terminologies and coding schemes

6 5 NLM’s proposal: the Unified Medical Language System: a collection of separate terminologies built by trained experts, useful for legacy information retrieval and information integration. UMLS Metathesaurus: a system of post hoc mappings between overlapping source vocabularies

7 6 SNOMED DEMONS U M L S

8 New York State Center of Excellence in Bioinformatics & Life Sciences R T U

9 8 for UMLS: local usage respected; regimentation frowned upon; cross-framework consistency not important; no concern to establish consistency with basic science; different grades of formal rigor, different degrees of completeness, different update policies

10 In the olden days people measured lengths using inches, ulnas, perches, king’s feet, Swiss feet, leagues of Paris, etc., etc. 9

11 Data was not comparable 10

12 on June 22 1799 everything changed 11

13 we now have the International System of Units 12

14 UMLS Can we create something like the SI system of units for biomedical terminology? 13

15 Uses of ‘ontology’ in PubMed abstracts 14

16 15

17 By far the most successful: GO (Gene Ontology) 16

18 17

19 Hierarchical view of GO representing relations between represented types 18
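
A hierarchical view like this one can be read as a graph in which each term points to its more general parent terms. The following is a minimal Python sketch of that idea, not the GO's own tooling: the `IS_A` table and the `ancestors` helper are hypothetical names, and the edges are a toy fragment chosen for illustration rather than copied from the actual GO.

```python
# Toy is_a hierarchy: each term maps to its more general parent terms.
# Assumption: these edges are illustrative and are NOT taken from the real GO.
IS_A = {
    "sphingolipid transporter activity": ["lipid transporter activity"],
    "lipid transporter activity": ["transporter activity"],
    "transporter activity": ["molecular function"],
    "molecular function": [],
}


def ancestors(term, is_a=IS_A):
    """Return every term reachable from `term` by following is_a links."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen
```

Data tagged with a specific term can then also be retrieved by a query for any of that term's ancestors, which is one reason the hierarchy matters for annotation.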

20 Gene Ontology: $100 million invested in literature and database curation using the Gene Ontology (GO), based on the idea of annotation; over 11 million annotations relating gene products (proteins) described in the UniProt, Ensembl and other databases to terms in the GO; multiple secondary uses, because the ontology was not built to meet one specific set of requirements 19

21 GO provides a controlled system of terms for use in annotating (describing, tagging) data: multi-species, multi-disciplinary, open source; contributing to the cumulativity of scientific results obtained by distinct research communities 20

22 semantic annotation of data: where in the cell? what kind of molecular function? what kind of biological process? 21
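
The three questions on this slide correspond to the three aspects of the GO: cellular component, molecular function and biological process. A minimal Python sketch of what one annotation record might look like follows. The `Annotation` class, the `terms_by_aspect` helper and the gene product `ProteinX` are hypothetical and used only for illustration; real GO annotation files carry more fields, such as evidence codes and references.

```python
from collections import defaultdict
from dataclasses import dataclass

# The three GO aspects, matching the three questions on the slide.
ASPECTS = ("cellular component", "molecular function", "biological process")


@dataclass(frozen=True)
class Annotation:
    gene_product: str  # what is being described
    aspect: str        # which of the three questions is answered
    term: str          # the controlled term that answers it


def terms_by_aspect(annotations):
    """Group annotation terms under the aspect they belong to."""
    grouped = defaultdict(list)
    for annotation in annotations:
        if annotation.aspect not in ASPECTS:
            raise ValueError(f"unknown aspect: {annotation.aspect!r}")
        grouped[annotation.aspect].append(annotation.term)
    return dict(grouped)


# Hypothetical gene product annotated once under each aspect.
EXAMPLE = [
    Annotation("ProteinX", "cellular component", "nucleus"),
    Annotation("ProteinX", "molecular function", "DNA helicase activity"),
    Annotation("ProteinX", "biological process", "DNA repair"),
]
```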

23 natural language labels + definitions to make the data cognitively accessible to human beings and algorithmically accessible to computers 22

24 OBO (Open Biomedical Ontology) Foundry proposal (Gene Ontology in yellow) 23
Columns (RELATION TO TIME): CONTINUANT, INDEPENDENT / CONTINUANT, DEPENDENT / OCCURRENT
Rows (GRANULARITY):
ORGAN AND ORGANISM
  independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  independent continuant: Cell (CL); Cellular Component (FMA, GO)
  dependent continuant: Cellular Function (GO)
MOLECULE
  independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  dependent continuant: Molecular Function (GO)
  occurrent: Molecular Process (GO)

25 compare: legends for maps 24

26 ontologies are legends for data 25

27 ontologies are legends for databases: MouseEcotope, GlyProt, DiabetInGene, GluChem; term: sphingolipid transporter activity 26

28 annotation using common ontologies yields integration of databases: MouseEcotope, GlyProt, DiabetInGene, GluChem; term: Holliday junction helicase complex 27
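
The integration claim on this slide can be sketched in a few lines of Python: once records in separate databases are tagged with the same ontology term, that term works as a join key across them. The database names and terms below come from the slides; the record identifiers, the `RECORDS` list and the `index_by_term` helper are hypothetical.

```python
from collections import defaultdict

# (database, record id, ontology term). Record ids are made up for illustration.
RECORDS = [
    ("MouseEcotope", "rec-1", "Holliday junction helicase complex"),
    ("GlyProt", "rec-7", "Holliday junction helicase complex"),
    ("DiabetInGene", "rec-3", "sphingolipid transporter activity"),
    ("GluChem", "rec-9", "Holliday junction helicase complex"),
]


def index_by_term(records):
    """Map each ontology term to the (database, record id) pairs tagged with it."""
    index = defaultdict(list)
    for database, record_id, term in records:
        index[term].append((database, record_id))
    return dict(index)
```

Looking up a single term in the resulting index returns matching records from every database at once, without any pairwise mapping between the databases' own schemas.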

29 annotation using common ontologies can support comparison of data 28

30 The goal: virtual science; consistent (non-redundant) annotation; cumulative (additive) annotation; yielding, by incremental steps, a virtual map of the entirety of reality that is accessible to computational reasoning 29

31 This goal is realizable if we have a common ontology framework: data is retrievable, data is comparable, data is error-checkable, data is integratable, data is capable of being reasoned with only to the degree that it is annotated using a common controlled vocabulary 30
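
As one illustration of the "error-checkable" point, a controlled vocabulary lets a program flag annotations whose terms are not in the agreed list, for example misspellings or ad hoc local labels. This is a minimal Python sketch under stated assumptions: the two-term `VOCABULARY`, the record identifiers and the `unrecognized_terms` helper are hypothetical.

```python
# Assumption: a tiny stand-in for a controlled vocabulary; a real one is far larger.
VOCABULARY = frozenset({
    "sphingolipid transporter activity",
    "Holliday junction helicase complex",
})


def unrecognized_terms(annotations, vocabulary=VOCABULARY):
    """Return the (record id, term) pairs whose term is not in the vocabulary."""
    return [
        (record_id, term)
        for record_id, term in annotations
        if term not in vocabulary
    ]


# Hypothetical annotations; the second one uses a misspelled term.
ANNOTATIONS = [
    ("rec-1", "sphingolipid transporter activity"),
    ("rec-2", "sphingolipid transport activity"),
    ("rec-3", "Holliday junction helicase complex"),
]
```

With free-text labels there would be no fixed list to check against, so the misspelled entry would pass unnoticed and fail to match in later queries.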

