Ontology and the Semantic Web, Barry Smith, August 26, 2013

1 Ontology and the Semantic Web, Barry Smith, August 26, 2013

2 Ontologies
– are computer-tractable representations of types in specific areas of reality
– are more and less general (upper and lower ontologies)
  – upper = organizing ontologies
  – lower = domain ontologies

3 Foundational Model of Anatomy (FMA)
[Figure: graph of FMA terms connected by is_a and part_of relations. Node labels: Anatomical Structure, Anatomical Space, Organ, Organ Part, Organ Subdivision, Organ Component, Tissue, Serous Sac, Pleural Sac, Pleura (Wall of Sac), Parietal Pleura, Visceral Pleura, Mediastinal Pleura, Mesothelium of Pleura, Organ Cavity, Organ Cavity Subdivision, Serous Sac Cavity, Serous Sac Cavity Subdivision, Pleural Cavity, Interlobar recess]

4 ontologies = standardized labels designed for use in annotations to make the data cognitively accessible to human beings and algorithmically accessible to computers

5 Ontologies facilitate retrieval of data by allowing grouping of annotations
Annotation counts: brain 20, hindbrain 15, rhombomere 10
Query brain without ontology: 20
Query brain with ontology: 45
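The arithmetic behind the two query results can be sketched in a few lines of Python. The annotation counts come from the slide; the parent links (rhombomere under hindbrain, hindbrain under brain) and the function names are assumptions made for illustration, not part of the presentation.

```python
from collections import deque

# Counts taken from the slide.
ANNOTATION_COUNTS = {"brain": 20, "hindbrain": 15, "rhombomere": 10}

# Assumed hierarchy (child -> parent); the slide does not spell out the relation.
PARENT = {"hindbrain": "brain", "rhombomere": "hindbrain"}


def descendants(term, parent=PARENT):
    """Return every term that sits below `term` in the hierarchy."""
    children = {}
    for child, par in parent.items():
        children.setdefault(par, []).append(child)
    found, queue = [], deque([term])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            found.append(child)
            queue.append(child)
    return found


def query(term, use_ontology):
    """Count annotations for `term`, optionally including its descendants."""
    terms = [term] + (descendants(term) if use_ontology else [])
    return sum(ANNOTATION_COUNTS.get(t, 0) for t in terms)
```

Without the hierarchy, a query for "brain" sees only the 20 records tagged with that exact label; with it, the records tagged "hindbrain" and "rhombomere" are grouped in as well, giving 45.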

6 ontologies = high-quality controlled structured vocabularies used for the annotation (description, tagging) of data, images, emails, documents, …

7 Ontology’s greatest successes around net-centricity
– You build a site
– Others discover the site and they link to it
– The more they link, the more well known the page becomes (Google …)
– Your data becomes discoverable
– Your data becomes more easily discoverable the more you use common vocabularies

8 The roots of Semantic Technology
1. Each group creates a controlled vocabulary of the terms commonly used in its domain, and creates an ontology out of these terms using OWL (Web Ontology Language) syntax
4. Binds this ontology to its data and makes these data available on the Web
5. The ontologies are linked e.g. through their use of some common terms
6. These links create links among all the datasets, thereby creating a ‘web of data’
7. We can all share the same tags – they are called internet addresses
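A minimal sketch of how shared tags link datasets into a 'web of data', assuming each dataset is reduced to the set of term addresses it uses. The dataset names and the example.org addresses are hypothetical.

```python
from itertools import combinations

# Hypothetical datasets, each reduced to the set of term addresses it is tagged with.
DATASETS = {
    "lab_a": {"http://example.org/onto/brain", "http://example.org/onto/neuron"},
    "lab_b": {"http://example.org/onto/neuron", "http://example.org/onto/protein"},
    "lab_c": {"http://example.org/onto/soil"},
}


def links_between(datasets):
    """Map each pair of datasets to the terms they share (pairs with none are omitted)."""
    links = {}
    for name_a, name_b in combinations(sorted(datasets), 2):
        shared = datasets[name_a] & datasets[name_b]
        if shared:
            links[(name_a, name_b)] = shared
    return links
```

Here lab_a and lab_b become linked through the one address they both use, while lab_c, which shares no term with the others, stays isolated.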

9 Audio Features Ontology

10

11 Where we stand today
– increasing availability of semantically enhanced data and semantic software
– increasing use of OWL (Web Ontology Language) in attempts to create useful integration of on-line data and information
– “Linked Open Data” the New Big Thing

12 as of September 2010

13 The problem: the more this sort of Semantic Technology is successful, the more it fails
– The original idea was to break down silos via common controlled vocabularies for the tagging of data
– The very success of the approach leads to the creation of ever new controlled vocabularies – semantic silos – as ever more ontologies are created in ad hoc ways
– Every organization and sub-organization now wants to have its own “ontology”
– The Semantic Web framework as currently conceived and governed by the W3C yields minimal standardization

14 Divided we fail

15 United we also fail

16 The problem: many, many silos
– DoD spends more than $6B annually developing a portfolio of more than 2,000 business systems and Web services
– these systems are poorly integrated
– deliver redundant capabilities, make data hard to access, foster error and waste
– prevent secondary uses of data
Based on FY11 Defense Information Technology Repository (DITPR) data, https://ditpr.dod.mil/

17 what is missing here

18 Syntactic and semantic interoperability
– Syntactic interoperability = systems can exchange messages (realized by XML).
– Semantic interoperability = messages are interpreted in the same way by senders and receivers.
– In UCore, meanings are specified via natural-language strings.
– Experience shows that this is not a viable route to achieving semantic interoperability.
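The distinction can be illustrated with a small Python sketch, offered as an illustration rather than anything taken from UCore. The message format, the field name and the two unit conventions are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical message: well-formed XML, but the unit of <distance> is not stated.
MESSAGE = "<report><distance>100</distance></report>"


def read_distance(xml_text):
    """Parse the message and return the raw number it carries."""
    return float(ET.fromstring(xml_text).findtext("distance"))


# Both systems can parse the message: syntactic interoperability holds.
value = read_distance(MESSAGE)

# Assumed local conventions that the message itself does not carry.
sender_metres = value * 1000.0       # the sender meant kilometres
receiver_metres = value * 1609.344   # the receiver assumes miles

# The two sides read the same number but mean different things by it.
semantically_interoperable = sender_metres == receiver_metres
```

Parsing succeeds on both sides, yet the interpretations differ, so the exchange is syntactically but not semantically interoperable.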

19 How to avoid the problem of semantic silos
– Distributed Development of a Shared Semantic Resource
– Pilot testing to demonstrate feasibility for I2WD

20 An alternative solution: Semantic Enhancement
A distributed incremental strategy of coordinated annotation
– data remain in their original state (are treated at ‘arm’s length’)
– ‘tagged’ using interoperable ontologies created in tandem
– allows flexible response to new needs, adjustable in real time
– can be as complete as needed, lossless, long-lasting because flexible and responsive
– big bang for buck – measurable benefit even from first small investments
The strategy works only to the degree that it rests on shared governance and training
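One way to picture annotation that leaves the data in their original state is a separate tag layer keyed by record identifier. The sketch below assumes that design; the records, the term addresses and the function names are hypothetical.

```python
import copy

# Hypothetical source records; they are never modified by the annotation layer.
SOURCE_RECORDS = [
    {"id": 1, "text": "patient reports fever"},
    {"id": 2, "text": "vehicle sighted near bridge"},
]
ORIGINAL_SNAPSHOT = copy.deepcopy(SOURCE_RECORDS)

# Annotations live in their own structure: record id -> set of ontology terms.
ANNOTATIONS = {}


def tag(record_id, term):
    """Attach an ontology term to a record without touching the record itself."""
    ANNOTATIONS.setdefault(record_id, set()).add(term)


def find(term):
    """Return the source records annotated with `term`."""
    return [r for r in SOURCE_RECORDS if term in ANNOTATIONS.get(r["id"], set())]


tag(1, "http://example.org/onto/Fever")
tag(2, "http://example.org/onto/Vehicle")
```

Because the tags sit outside the records, new terms can be added or revised later without rewriting the source, which is one reading of the incremental, lossless character described on the slide.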

21 compare: legends for maps

22 compare: legends for maps
common legends allow (cross-border) integration

23 The Gene Ontology
[Figure labels: MouseEcotope, GlyProt, DiabetInGene, GluChem; term shown: sphingolipid transporter activity]

24 The Gene Ontology
[Figure labels: MouseEcotope, GlyProt, DiabetInGene, GluChem; term shown: Holliday junction helicase complex]

25 The Gene Ontology
[Figure labels: MouseEcotope, GlyProt, DiabetInGene, GluChem; term shown: sphingolipid transporter activity]

26 Common legends
– help human beings use and understand complex representations of reality
– help human beings create useful complex representations of reality
– help computers process complex representations of reality
– help glue data together
But common legends serve these purposes only if the legends are developed in a coordinated, non-redundant fashion

27 International System of Units

28 The Open Biomedical Ontologies (OBO) Foundry
Rows: GRANULARITY. Columns: RELATION TO TIME.
GRANULARITY | INDEPENDENT CONTINUANT | DEPENDENT CONTINUANT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

29 rationale of OBO Foundry coverage
Rows: GRANULARITY. Columns: RELATION TO TIME.
GRANULARITY | INDEPENDENT CONTINUANT | DEPENDENT CONTINUANT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) | Cellular Process (GO)
MOLECULE | Molecule (ChEBI, SO, RNAO, PRO) | Molecular Function (GO) | Molecular Process (GO)

30 Population-level ontologies
Rows: GRANULARITY. Columns: RELATION TO TIME.
GRANULARITY | INDEPENDENT CONTINUANT | DEPENDENT CONTINUANT | OCCURRENT
COMPLEX OF ORGANISMS | Family, Community, Deme, Population | Population Phenotype | Population Process
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

31 Environment Ontology
Rows: GRANULARITY. Columns: RELATION TO TIME.
GRANULARITY | INDEPENDENT CONTINUANT | DEPENDENT CONTINUANT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)
[Figure label: environments]

32 http://obofoundry.org
Rows: GRANULARITY. Columns: RELATION TO TIME.
GRANULARITY | INDEPENDENT CONTINUANT | DEPENDENT CONTINUANT | OCCURRENT
COMPLEX OF ORGANISMS | Family, Community, Deme, Population | Population Phenotype | Population Process
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cell Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)
[Figure label: ENVIRONMENT]

33 http://obofoundry.org
Rows: GRANULARITY. Columns: RELATION TO TIME (independent continuants only).
GRANULARITY | INDEPENDENT CONTINUANT | ENVIRONMENT
COMPLEX OF ORGANISMS | Family, Community, Deme, Population | Environment of population
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); (FMA, CARO) | Environment of single organism
CELL AND CELLULAR COMPONENT | Cell (CL); Cell Component (FMA, GO) | Environment of cell
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular environment

34 The OBO Foundry
– based on the idea of annotation = semantic enhancement of data across all of biology
– $200 mill. spent so far on using the GO to annotate (tag) biomedical research data through manual effort of PhD biologists

35 OBO Foundry approach extended into other domains
– NIF Standard: Neuroscience Information Framework
– ISF Ontologies: Integrated Semantic Framework
– OGMS and Extensions: Ontology for General Medical Science
– IDO Consortium: Infectious Disease Ontology
– cROP: Common Reference Ontologies for Plants

36 What these annotations do
– make data retrievable even by those not involved in their creation
– allow integration of data deriving from heterogeneous sources
– break down the walls of roach motels

37 Benefits of the Approach
– Does not interfere with the source content
– Enables content to evolve in a cumulative fashion as it accommodates new kinds of data
– Does not depend on the data resources and can be developed independently from them in an incremental and distributed fashion
– Provides a more consistent, homogeneous, and well-articulated presentation of the content which originates in multiple internally inconsistent and heterogeneous systems

38 Benefits of the Approach
– Makes management and exploitation of the content more cost-effective
– Allows graceful integration with other government initiatives and brings the system closer to the federally mandated net-centric data strategy
– Creates incrementally an integrated content that is effectively searchable and that provides content to which more powerful analytics can be applied

39 Building the Shared Semantic Resource
– Methodology of distributed incremental development
– Training
– Governance
– Common Architecture of Ontologies to support consistency, non-redundancy, modularity
  – Upper Level Ontology (BFO)
  – Mid-Level Ontologies
  – Low Level Ontologies

40 Goal: To realize Horizontal Integration (HI) of intelligence data
HI =Def. the ability to exploit multiple data sources as if they are one
– Problem: the data coming onstream are out of our control
– Any strategy for HI must be agile in the sense that it can be quickly extended to new zones of emerging data according to need

41 I2WD Strategy
Create an agile strategy for building ontologies within a Shared Semantic Resource (SSR) and apply and extend these ontologies to annotate new source data as they come onstream
– Problem: Given the immense and growing variety of data sources, the development methodology must be applied by multiple different groups
– How to manage collaboration?

42 Why do large-scale ontology projects fail?
– focus on vocabularies, lexicons, with no logical structure, no attention to life cycle
– failure of housekeeping yields redundancy and therefore forking
– the same data is annotated in different ways by users of different ontology fragments
– data is siloed as before
– HOW TO BUILD THE NEEDED LOGIC INTO THE ARCHITECTURE OF THE ONTOLOGIES?

43 Examples of Principles
– All terms in all ontologies should be singular nouns
– Same relations between terms should be reused in every ontology
– Reference ontologies should be based on single inheritance
– All definitions should be of the form: an S = Def. a G which Ds, where ‘G’ (for: genus) is the parent term of S (for: species) in the corresponding reference ontology
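Two of these principles lend themselves to a short Python sketch: rendering a definition in the genus-differentia form, and checking a set of is_a links for single inheritance. The `Term` class, the helper names and the square/rectangle example are hypothetical and are not drawn from any actual reference ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    label: str        # S, the species term (a singular noun)
    genus: str        # G, the parent term of S
    differentia: str  # D, what marks S out among the Gs


def article(word):
    """Pick 'a' or 'an' from the first letter (a rough rule, enough for a sketch)."""
    return "an" if word[0].lower() in "aeiou" else "a"


def definition(term):
    """Render 'an S = Def. a G which Ds' for a term."""
    return (f"{article(term.label)} {term.label} = Def. "
            f"{article(term.genus)} {term.genus} which {term.differentia}")


def multiple_parents(is_a_pairs):
    """Return the terms that have more than one distinct parent, sorted by label."""
    parents = {}
    for child, parent in is_a_pairs:
        parents.setdefault(child, set()).add(parent)
    return sorted(child for child, found in parents.items() if len(found) > 1)
```

For example, `definition(Term("square", "rectangle", "has four sides of equal length"))` yields "a square = Def. a rectangle which has four sides of equal length", and an empty result from `multiple_parents` indicates that every term has at most one parent.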

44 Extension Strategy + Modular Organization
– top level: Basic Formal Ontology (BFO)
– mid-level: Information Artifact Ontology (IAO); Ontology for Biomedical Investigations (OBI); Spatial Ontology (BSPO)
– domain level: Anatomy Ontology (FMA*, CARO); Environment Ontology (EnvO); Infectious Disease Ontology (IDO*); Biological Process Ontology (GO*); Cell Ontology (CL); Cellular Component Ontology (FMA*, GO*); Phenotypic Quality Ontology (PaTO); Subcellular Anatomy Ontology (SAO); Sequence Ontology (SO*); Molecular Function (GO*); Protein Ontology (PRO*)

45 Ontologies are built as orthogonal modules which form an incrementally evolving network
– scientists are motivated to commit to developing ontologies because they will need in their own work ontologies that fit into this network
– users are motivated by the assurance that the ontologies they turn to are maintained by experts

46 More benefits of orthogonality
– helps those new to ontology to find what they need
– to find models of good practice
– ensures mutual consistency of ontologies (trivially)
– and thereby ensures additivity of annotations

47 More benefits of orthogonality
– No need to reinvent the wheel for each new domain
– Can profit from storehouse of lessons learned
– Can more easily reuse what is made by others
– Can more easily reuse training
– Can more easily inspect and criticize results of others’ work
– Leads to innovations (e.g. Mireot, Ontofox) in strategies for combining ontologies

