How to integrate data. Barry Smith.

Presentation transcript:

How to integrate data Barry Smith

The problem: many, many silos
- DoD spends more than $6B annually developing a portfolio of more than 2,000 business systems and Web services
- these systems are poorly integrated
- deliver redundant capabilities, make data hard to access, foster error and waste
- prevent secondary uses of data
Based on FY11 Defense Information Technology Repository (DITPR) data

what is missing here

Syntactic and semantic interoperability
- Syntactic interoperability = systems can exchange messages (realized by XML).
- Semantic interoperability = messages are interpreted in the same way by senders and receivers.
- Experience shows that specifying meanings via natural-language strings is not a viable route to achieving semantic interoperability.
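The difference between the two kinds of interoperability can be sketched in a few lines of Python. Everything here is invented for illustration: the two XML messages, the system names, and the identifier EX:0001 are assumptions, not taken from any DoD system or published ontology. Both messages parse (syntactic interoperability), but their free-text labels only match once each is resolved to a shared identifier.

```python
import xml.etree.ElementTree as ET

# Hypothetical messages from two systems; both are well-formed XML.
MSG_A = "<report><site>Basra</site><type>facility</type></report>"
MSG_B = "<report><site>Basra</site><type>installation</type></report>"

# Hypothetical mapping from each system's local strings to a shared term ID.
LOCAL_TO_SHARED = {
    "system_a": {"facility": "EX:0001"},
    "system_b": {"installation": "EX:0001"},
}


def local_type(xml_text):
    """Syntactic step: any XML parser can read the message."""
    return ET.fromstring(xml_text).findtext("type")


def shared_type(system, xml_text):
    """Semantic step: resolve the local string to the shared identifier."""
    return LOCAL_TO_SHARED[system][local_type(xml_text)]


# Comparing raw strings fails even though both systems mean the same thing.
same_strings = local_type(MSG_A) == local_type(MSG_B)
# Comparing shared identifiers succeeds.
same_meaning = shared_type("system_a", MSG_A) == shared_type("system_b", MSG_B)
print(same_strings, same_meaning)
```

The mapping table stands in for the role a shared ontology would play; in practice the identifiers would come from a governed ontology rather than a hand-written dictionary.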

Instance data vs. data about types
instances: Bill Clinton, Bill Clinton’s dog, the planet Earth
types: human being, dog, planet

DoD Enterprise Ontology Dennis Wisnosky, Chief Architect & Chief Technical Officer, Business Mission Area, Office of the Deputy Chief Management Officer, US Department of Defense

Instance data vs. data about types
instances: Iraq, Basra, Abu Ghraib
types: country, city, prison
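As a minimal sketch of this distinction, the Python below keeps the set of types separate from the assertions that link each instance to its type. The pairing of each instance with a type follows the order in which the slide lists them; the names TYPES, INSTANCE_OF and instances_of are assumptions made for this example.

```python
# Types are general terms; they would live in the ontology.
TYPES = {"country", "city", "prison"}

# Instance data: each particular thing is asserted to be an instance of a type.
INSTANCE_OF = {
    "Iraq": "country",
    "Basra": "city",
    "Abu Ghraib": "prison",
}


def instances_of(type_name):
    """Return the known instances of a type, rejecting unknown types."""
    if type_name not in TYPES:
        raise KeyError(f"unknown type: {type_name}")
    return sorted(name for name, t in INSTANCE_OF.items() if t == type_name)


print(instances_of("city"))
```

Keeping the two layers apart means new instance data can be added without touching the list of types, and a typo in a type name is caught instead of silently creating a new category.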

compare: legends for maps
maps vs. legends for maps

compare: legends for maps
common legends allow (cross-border) integration

The Gene Ontology
[Figure labels: MouseEcotope, GlyProt, DiabetInGene, GluChem; term: sphingolipid transporter activity]

The Gene Ontology
[Figure labels: MouseEcotope, GlyProt, DiabetInGene, GluChem; term: Holliday junction helicase complex]


Common legends
- help human beings use and understand complex representations of reality
- help human beings create useful complex representations of reality
- help computers process complex representations of reality
- help glue data together
But common legends serve these purposes only if the legends are developed in a coordinated, non-redundant fashion

International System of Units

The Open Biomedical Ontologies (OBO) Foundry
Columns: RELATION TO TIME. Rows: GRANULARITY.

GRANULARITY | CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

rationale of OBO Foundry coverage
Columns: RELATION TO TIME. Rows: GRANULARITY.

GRANULARITY | CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) | Cellular Process (GO)
MOLECULE | Molecule (ChEBI, SO, RNAO, PRO) | Molecular Function (GO) | Molecular Process (GO)

Population-level ontologies
Columns: RELATION TO TIME. Rows: GRANULARITY.

GRANULARITY | CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
COMPLEX OF ORGANISMS | Family, Community, Deme, Population | Population Phenotype | Population Process
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

Environment Ontology
Columns: RELATION TO TIME. Rows: GRANULARITY.

GRANULARITY | CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

environments

Columns: RELATION TO TIME. Rows: GRANULARITY.

GRANULARITY | CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
COMPLEX OF ORGANISMS | Family, Community, Deme, Population | Population Phenotype | Population Process
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cell Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

ENVIRONMENT

OBO Foundry approach being applied in the following biology domains
- NIF Standard: Neuroscience Information Framework
- ISF Ontologies: Integrated Semantic Framework
- OGMS and Extensions: Ontology for General Medical Science
- IDO Consortium: Infectious Disease Ontology
- cROP: Common Reference Ontologies for Plants

What do ontology annotations do?
- make data retrievable even by those not involved in their creation
- allow integration of data deriving from heterogeneous sources
- break down the walls of roach motels

Applying the annotations approach to military data via Semantic Enhancement
- data remain in their original state (are treated at ‘arm’s length’)
- data are ‘tagged’ using interoperable ontologies created in a coordinated way
- allows flexible response to new needs, adjustable in real time
- can be as complete as needed, lossless, long-lasting because flexible and responsive
- big bang for buck: measurable benefit even from first small investments
The strategy works only to the degree that it rests on shared governance and training
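The arm's-length idea can be sketched as follows. This is an illustration under stated assumptions, not the Semantic Enhancement implementation: the two source tables, their field names, and the term EX:City are all invented. The source records are never edited; annotations are stored separately and point at (source, record, field) triples, so a single query over the shared term reaches both sources.

```python
# Hypothetical source records from two systems with different field names.
SOURCE_A = {"a1": {"loc": "Basra"}}
SOURCE_B = {"b7": {"place_name": "Basra"}}
SOURCES = {"A": SOURCE_A, "B": SOURCE_B}

# Annotations live outside the sources: (source, record id, field) -> term.
# "EX:City" is an invented identifier standing in for a shared ontology term.
ANNOTATIONS = {
    ("A", "a1", "loc"): "EX:City",
    ("B", "b7", "place_name"): "EX:City",
}


def find_by_term(term):
    """Return (source, record id, value) for every field tagged with the term."""
    hits = []
    for (source, record_id, field), tagged_term in ANNOTATIONS.items():
        if tagged_term == term:
            hits.append((source, record_id, SOURCES[source][record_id][field]))
    return sorted(hits)


print(find_by_term("EX:City"))
```

Because the annotation table is the only thing that changes, a new source can be brought in by adding rows to it, which is one way to read the claim that the approach is incremental and does not interfere with source content.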

Benefits of the Approach
- Does not interfere with the source content
- Enables content to evolve in a cumulative fashion as it accommodates new kinds of data
- Does not depend on the data resources and can be developed independently from them in an incremental and distributed fashion
- Provides a more consistent, homogeneous, and well-articulated presentation of the content which originates in multiple internally inconsistent and heterogeneous systems

Benefits of the Approach
- Makes management and exploitation of the content more cost-effective
- Allows graceful integration with other government initiatives and brings the system closer to the federally mandated net-centric data strategy
- Creates incrementally an integrated content that is effectively searchable and that provides content to which more powerful analytics can be applied

Building the Shared Semantic Resource
- Methodology of distributed incremental development
- Training
- Governance
- Common Architecture of Ontologies to support consistency, non-redundancy, modularity
  - Upper Level Ontology (BFO)
  - Mid-Level Ontologies
  - Low Level Ontologies

Goal: To realize Horizontal Integration (HI) of intelligence data
HI =Def. the ability to exploit multiple data sources as if they are one
- Problem: the data coming onstream are out of our control
- Any strategy for HI must be agile in the sense that it can be quickly extended to new zones of emerging data according to need

Army Intelligence and Information Warfare Directorate (I2WD)
- Create an agile strategy for building ontologies within a Shared Semantic Resource (SSR), and apply and extend these ontologies to annotate new source data as they come onstream
- Problem: given the immense and growing variety of data sources, the development methodology must be applied by multiple different groups. How to manage collaboration?

Why do large-scale ontology projects fail?
- focus on vocabularies, lexicons, with no logical structure, no attention to life cycle
- failure of housekeeping yields redundancy and therefore forking
- the same data is annotated in different ways by users of different ontology fragments
- data is siloed as before
HOW TO BUILD THE NEEDED LOGIC INTO THE ARCHITECTURE OF THE ONTOLOGIES?

MeSH (Medical Subject Headings)
MeSH Descriptors
  Anthropology, Education, Sociology and Social Phenomena
    Social Sciences
      Political Systems
        National Socialism
National Socialism is_a Political Systems
National Socialism is_a Anthropology...
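The inference on this slide can be reproduced with a short Python sketch. It assumes the headings form a single parent chain in the order listed, and it treats each parent link as is_a, which is exactly the reading the slide calls into question; the names PARENT and ancestors are invented for the example.

```python
# Parent links read off the slide, from most specific to most general.
PARENT = {
    "National Socialism": "Political Systems",
    "Political Systems": "Social Sciences",
    "Social Sciences": "Anthropology, Education, Sociology and Social Phenomena",
    "Anthropology, Education, Sociology and Social Phenomena": "MeSH Descriptors",
}


def ancestors(term):
    """Follow parent links upward; is_a is transitive, so each one is implied."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


for ancestor in ancestors("National Socialism"):
    print(f"National Socialism is_a {ancestor}")
```

If the hierarchy is read as is_a, every line printed is a formal consequence, including the one that places National Socialism under the anthropology heading. A hierarchy built for document indexing does not guarantee that such consequences are true.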

Examples of Principles
- All terms in all ontologies should be singular nouns
- Same relations between terms should be reused in every ontology
- Reference ontologies should be based on single inheritance
- All definitions should be of the form
  an S = Def. a G which Ds
  where ‘G’ (for: genus) is the parent term of S (for: species) in the corresponding reference ontology
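Two of these principles lend themselves to a mechanical check, sketched below in Python. The Term class, both helper functions, and the sample terms ("prison", "facility", "amphibious vehicle") are assumptions made for illustration and are not drawn from any published ontology. The first helper builds a definition in the "an S = Def. a G which Ds" form from a term's single parent; the second lists terms that break single inheritance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    name: str
    parents: tuple = ()      # names of asserted parent terms
    differentia: str = ""    # the "which Ds" part of the definition


def article(noun):
    """Pick 'a' or 'an' by first letter (a rough heuristic)."""
    return "an" if noun[0].lower() in "aeiou" else "a"


def genus_differentia(term):
    """Render 'an S = Def. a G which Ds'; requires exactly one parent."""
    if len(term.parents) != 1:
        raise ValueError(f"{term.name!r} needs exactly one parent (genus)")
    genus = term.parents[0]
    return (f"{article(term.name)} {term.name} = Def. "
            f"{article(genus)} {genus} which {term.differentia}")


def multiple_inheritance(terms):
    """Return names of terms with more than one asserted parent."""
    return [t.name for t in terms if len(t.parents) > 1]


# Hypothetical terms, invented for this example.
prison = Term("prison", ("facility",), "is used to confine persons")
amphibious = Term("amphibious vehicle", ("vehicle", "watercraft"))

print(genus_differentia(prison))
print(multiple_inheritance([prison, amphibious]))
```

Tying the definition to the single asserted parent means the text of a definition and the hierarchy cannot drift apart, which is one practical argument for the single-inheritance rule.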

Extension Strategy + Modular Organization
top level: Basic Formal Ontology (BFO)
mid-level: Information Artifact Ontology (IAO); Ontology for Biomedical Investigations (OBI); Spatial Ontology (BSPO)
domain level: Anatomy Ontology (FMA*, CARO); Environment Ontology (EnvO); Infectious Disease Ontology (IDO*); Biological Process Ontology (GO*); Cell Ontology (CL); Cellular Component Ontology (FMA*, GO*); Phenotypic Quality Ontology (PaTO); Subcellular Anatomy Ontology (SAO); Sequence Ontology (SO*); Molecular Function (GO*); Protein Ontology (PRO*)

Ontologies are built as orthogonal modules which form an incrementally evolving network
- developers and SMEs are motivated to commit to developing ontologies because they will need in their own work ontologies that fit into this network
- users are motivated by the assurance that the ontologies they turn to are maintained by experts

More benefits of orthogonality
- helps those new to ontology to find what they need
- to find models of good practice
- ensures mutual consistency of ontologies (trivially)
- and thereby ensures additivity of annotations

More benefits of orthogonality
- No need to reinvent the wheel for each new domain
- Can profit from storehouse of lessons learned
- Can more easily reuse what is made by others
- Can more easily reuse training
- Can more easily inspect and criticize results of others’ work
- Leads to innovations (e.g. MIREOT, OntoFox) in strategies for combining ontologies