New York State Center of Excellence in Bioinformatics & Life Sciences Biomedical Ontology in Buffalo Part I: The Gene Ontology Barry Smith and Werner Ceusters.

1 New York State Center of Excellence in Bioinformatics & Life Sciences Biomedical Ontology in Buffalo Part I: The Gene Ontology Barry Smith and Werner Ceusters

2 Biomedical data is siloed: Lab / pathology data; Electronic Health Record data; Clinical trial data; Patient histories; Medical imaging; Microarray data; Protein chip data; Flow cytometry; Genotype / SNP data

3 Biomedical data is siloed: Data in Pittsburgh; Data owned by Medicare; Data owned by the NIH; Data owned by HIV researchers; Data owned by the Cleveland Clinic; Data owned by regional health organizations; Data owned by mouse biologists; Data owned by Dr McFritz. NIH mandates for data reusability

4 Ontology: An antidote to silos. Department of Philosophy, 135 Park Hall, University at Buffalo, Buffalo NY 14260. Promoting: information retrieval; information consistency, and thus continuity and cumulation; information integration; reasoning

5 Uses of ‘ontology’ in PubMed abstracts

6 By far the most successful: GO (The Gene Ontology)

7 You’re interested in which genes control heart muscle development: 17,536 results

8 Microarray data shows changed expression of thousands of genes. How will you spot the patterns? [Figure labels: attacked, time, control; Puparial adhesion; Molting cycle; hemocyanin; Defense response; Immune response; Response to stimulus; Toll regulated genes; JAK-STAT regulated genes; Amino acid catabolism; Lipid metabolism; Peptidase activity; Protein catabolism]

9 You’re interested in which of your hospital’s patient data is relevant to understanding how genes control heart muscle development

10 Lab / pathology data; EHR data; Clinical trial data; Family history data; Medical imaging; Microarray data; Model organism data; Flow cytometry; Mass spec; Genotype / SNP data. How will you spot the patterns? How will you find the data you need?

11 GO provides a controlled system of 25,000 categories for use in annotating data: multi-species (model organism research); multi-disciplinary; open source

13 Definitions

14 Gene products involved in cardiac muscle development in humans

15 Hierarchical view representing relations between represented types. The GO categorizations are organized in a way that provides a tool for algorithmic reasoning.
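To make the idea of algorithmic reasoning over the hierarchy concrete, here is a minimal Python sketch. It assumes a toy set of is_a links and hypothetical gene products (geneA, geneB, geneC); neither the term labels nor the links are taken from the actual GO graph. It shows how an annotation made to a specific term can be found by a query on a more general term.

```python
# Toy is_a links (child -> parents). Illustrative only, NOT the real GO graph.
IS_A = {
    "cardiac muscle development": ["muscle development", "heart development"],
    "skeletal muscle development": ["muscle development"],
    "muscle development": ["developmental process"],
    "heart development": ["developmental process"],
}

# Hypothetical annotations: gene product -> terms it is directly annotated to.
ANNOTATIONS = {
    "geneA": {"cardiac muscle development"},
    "geneB": {"skeletal muscle development"},
    "geneC": {"heart development"},
}


def ancestors(term, is_a):
    """Return every term reachable from `term` via is_a links (excluding itself)."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(is_a.get(current, []))
    return seen


def annotated_to(term, annotations, is_a):
    """Gene products annotated to `term` directly or to any more specific term."""
    return sorted(
        gene
        for gene, terms in annotations.items()
        if any(t == term or term in ancestors(t, is_a) for t in terms)
    )


muscle_genes = annotated_to("muscle development", ANNOTATIONS, IS_A)
heart_genes = annotated_to("heart development", ANNOTATIONS, IS_A)
```

In this toy setup, geneA is returned for both queries even though it was annotated only to the more specific term, because the query follows the is_a links upward.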

16 $100 million invested in literature curation using GO: over 11 million annotations relating gene products described in the UniProt, Ensembl and other databases to terms in the GO; experimental results reported in 52,000 scientific journal articles manually annotated by expert biologists using GO

17 One standard method: Sjöblom T, et al. analyzed 13,023 genes in 11 breast and 11 colorectal cancers and, using baseline functional information captured by GO for given gene product types, identified 189 genes as being mutated at significant frequency and thus as providing targets for diagnostic and therapeutic intervention. Science. 2006 Oct 13;314(5797):268-74.

18 Uses of GO in studies of:
  Persistent changes in spinal cord gene expression after recovery from inflammatory hyperalgesia: a preliminary study on pain memory. PMID: 18366630
  Spinal cord transcriptional profile analysis reveals protein trafficking and RNA processing as prominent processes regulated by tactile allodynia. PMID: 17069981
  Immune system involvement in abdominal aortic aneurysms. PMID: 17634102
  Biomedical discovery acceleration, with applications to craniofacial development. PMID: 19325874

19 Ontology in Buffalo Part 2: Problems of Clinical Ontologies

20 Source of all data: Reality!

21 Ultimate goal: A digital copy of the world

22 Requirements for this digital copy
  R1: A faithful representation of reality
  R2: … of everything that is digitally registered
    what is generic: scientific theories
    what is specific: what individual entities exist and how they relate
  R3: … which is computable, in order to …
    … allow queries over the world’s past and present
    … make predictions (diagnostic support, early warnings …)
    … fill in gaps
    … identify mistakes ...

23 … the ultimate crystal ball

24 The ‘binding’ wall. How to do it right? A cartoon of the world

25 “Better Information” must cover … EHR-EMR-ENR-…; PHR; Various modality-related databases (Lab, imaging, …); Textbooks; Classification systems; Terminologies; Ontologies. [Figure labels: Patient-specific information; Scientific “knowledge”; 1, 2, 3]

26 Key question: How to extend to clinical medicine the standard of quality of the GO and other ontologies based in biological science?

27 NCI Thesaurus (April 2008)

28 NCI Thesaurus (April 2008)?

29 SNOMED CT (July 2007): “fractured nasal bones”

30 Cause: coding / classification confusion. A patient with a fractured nasal bone = A patient with a broken nose = A patient with a fracture of the nose

31 Cause: coding / classification confusion. A patient with a fractured nasal bone = A patient with a broken nose = A patient with a fracture of the nose

32 MeSH: some paths from top to Wolfram Syndrome. Nodes shown: Wolfram Syndrome; All MeSH Categories; Diseases Category; Nervous System Diseases; Cranial Nerve Diseases; Optic Nerve Diseases; Optic Atrophy; Optic Atrophies, Hereditary; Neurodegenerative Diseases; Heredodegenerative Disorders, Nervous System; Eye Diseases; Eye Diseases, Hereditary; Male Urogenital Diseases; Urologic Diseases; Kidney Diseases; Diabetes Insipidus; Female Urogenital Diseases and Pregnancy Complications; Female Urogenital Diseases
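The multiple-path structure can be explored programmatically. The Python sketch below enumerates every path from the top of the hierarchy down to Wolfram Syndrome. The parent links are an assumption: the arrows of the original figure are lost in this transcript, so the links here are a plausible reconstruction from the node names on the slide and may not match the MeSH release that was shown.

```python
# Child -> parents. ASSUMED links, reconstructed from the node names on the
# slide for illustration; the real MeSH tree may differ.
PARENTS = {
    "Wolfram Syndrome": ["Optic Atrophies, Hereditary", "Diabetes Insipidus"],
    "Optic Atrophies, Hereditary": [
        "Optic Atrophy",
        "Eye Diseases, Hereditary",
        "Heredodegenerative Disorders, Nervous System",
    ],
    "Optic Atrophy": ["Optic Nerve Diseases"],
    "Optic Nerve Diseases": ["Cranial Nerve Diseases", "Eye Diseases"],
    "Cranial Nerve Diseases": ["Nervous System Diseases"],
    "Heredodegenerative Disorders, Nervous System": ["Neurodegenerative Diseases"],
    "Neurodegenerative Diseases": ["Nervous System Diseases"],
    "Eye Diseases, Hereditary": ["Eye Diseases"],
    "Diabetes Insipidus": ["Kidney Diseases"],
    "Kidney Diseases": ["Urologic Diseases"],
    "Urologic Diseases": ["Male Urogenital Diseases", "Female Urogenital Diseases"],
    "Female Urogenital Diseases": [
        "Female Urogenital Diseases and Pregnancy Complications"
    ],
    "Nervous System Diseases": ["Diseases Category"],
    "Eye Diseases": ["Diseases Category"],
    "Male Urogenital Diseases": ["Diseases Category"],
    "Female Urogenital Diseases and Pregnancy Complications": ["Diseases Category"],
    "Diseases Category": ["All MeSH Categories"],
}


def paths_to_root(term, parents):
    """Return all paths to `term`, each listed from the root downward."""
    direct_parents = parents.get(term, [])
    if not direct_parents:
        return [[term]]  # a term without parents is a root
    return [
        path + [term]
        for parent in direct_parents
        for path in paths_to_root(parent, parents)
    ]


paths = paths_to_root("Wolfram Syndrome", PARENTS)
# Index 2 of each path is the top-level disease category it passes through.
top_level_categories = {path[2] for path in paths}

for path in paths:
    print(" > ".join(path))
```

Under these assumed links, the single term is reached through several unrelated top-level categories, including both male and female urogenital diseases, which is what makes a naive "the patient has every ancestor category" reading problematic.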

33 What would it mean if used in the context of a patient? [Figure: the same MeSH nodes as on slide 32, overlaid with the labels “has”, “… has” and “???”]

34 Biomedical Ontology in Buffalo Part 3: What we do

35 The GO is amazingly successful in overcoming silo problems, but it covers only generic biological entities of three sorts (cellular components, molecular functions, biological processes), and it does not provide representations of diseases, symptoms, …

36 The core of biomedical ontology in Buffalo: extending the methodology of high quality ontologies to other domains of biology and medicine, and to EHRs and coding systems; combining ontology with referent tracking

37 The Open Biomedical Ontologies (OBO) Foundry, organized by RELATION TO TIME (CONTINUANT: INDEPENDENT, DEPENDENT; OCCURRENT) and by GRANULARITY:
  ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Biological Process (GO)
  CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO)
  MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO)

38 National Center for Biomedical Ontology (NCBO): NIH Roadmap Center for Biomedical Computing. Collaboration of: Stanford Biomedical Informatics Research; Mayo Clinic; University at Buffalo

39 National Center for Ontological Research (NCOR). Army Net-Centric Data Strategy Center of Excellence: Biometrics Ontology; Command and Control Ontology; Universal Core Semantic Layer

40 Current funded biomedical ontology projects: Protein Ontology (PRO) (NIH/NIGMS); Infectious Disease Ontology (IDO) (NIH/NIAID); Realism-Based Versioning for Biomedical Ontologies (SNOMED) (NIH/NLM); Ontology for Risks Against Patient Safety (RAPS) (EU); DSM Ontology (to support work on revision of the Diagnostic and Statistical Manual of Mental Disorders); Cleveland Clinic Semantic Database in Cardiothoracic Surgery

41 IDO Consortium: MITRE, Mount Sinai, UT Southwestern (Influenza); IMBB/VectorBase (Vector borne diseases: A. gambiae, A. aegypti, I. scapularis, C. pipiens, P. humanus); Colorado State University (Dengue Fever); Duke University (Tuberculosis); Cleveland Clinic (Infective Endocarditis); University of Michigan (Brucellosis)

42 “Better Information” must cover … EHR-EMR-ENR-…; PHR; Various modality-related databases (Lab, imaging, …); Textbooks; Classification systems; Terminologies; Ontologies. [Figure labels: Patient-specific information; Scientific “knowledge”; 1, 2, 3]

43 Ontologies: Keeping track of what is general (diabetes, malaria, nasal bone, nose …)

44 Referent tracking: Keeping track of what is particular (this particular nasal bone, this particular fracture, this particular swimming pool, this particular image …)
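The contrast between the general and the particular can be pictured with a small Python sketch. Everything in it is hypothetical: the `Particular` class, the sequential integer identifiers, and the `located_in` relation are illustration devices, not the scheme used by any referent tracking system. The point is only that two particulars of the same general type remain distinguishable because each carries its own identifier.

```python
from collections import Counter
from dataclasses import dataclass, field
from itertools import count

# Hypothetical identifier scheme: sequential integers, one per particular.
_ids = count(1)


@dataclass(frozen=True)
class Particular:
    """One individual entity, tagged with the general type it instantiates."""

    instance_of: str  # the general type, e.g. an ontology term label
    uid: int = field(default_factory=lambda: next(_ids))


# One particular nasal bone and two distinct fractures of it.
nasal_bone = Particular("nasal bone")
first_fracture = Particular("fracture")
second_fracture = Particular("fracture")

# Statements about particulars refer to identifiers, not to type labels.
# Each tuple is (subject uid, relation, object uid); the relation name is assumed.
facts = [
    (first_fracture.uid, "located_in", nasal_bone.uid),
    (second_fracture.uid, "located_in", nasal_bone.uid),
]

# Counting by type works on the general level; the uids keep instances apart.
type_counts = Counter(
    p.instance_of for p in (nasal_bone, first_fracture, second_fracture)
)
```

A record that stored only the label "fracture" twice could not say whether one fracture was recorded twice or two fractures occurred; with per-instance identifiers that question has a definite answer.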

45 eyeGENE

46 Ontology for Risks Against Patient Safety

47 REMINE: RT-based adverse event analysis
