Intelligence Ontology: A Strategy for the Future


1 Intelligence Ontology: A Strategy for the Future
Barry Smith University at Buffalo

2 Semantic Web, wikis, statistical text mining, etc.
let a million flowers bloom
how do we create broad-coverage semantic annotation systems which will enable sharing of gigantic bodies of heterogeneous data?

3 let a million flowers (weeds) bloom
how do we create broad-coverage semantic annotation systems which will enable sharing of gigantic bodies of heterogeneous data?

4 let a million microtheories bloom
what about Cyc?

5 why does Cyc not do the job?
#$Configuration
A specialization of both #$StaticSituation and #$SpatialThing-Localized. Each instance of #$Configuration is a static situation consisting of two or more #$PartiallyTangible things of certain types standing in a certain type of spatial relationship (or set of relationships). This (set of) spatial relationship(s) characterizes the #$Configuration's _type_ in the sense that any group of objects of the appropriate types standing in that relationship (or those relationships) correspond to a #$Configuration of that type; and each of these objects, in turn, is said to be configured (see #$objectConfigured) in the (individual) #$Configuration.

6 why does Cyc not do the job?
(speculations)
Cyc doesn't care about consistency between microtheories, so no progressive cumulation from an established core
too little concern for consistency with basic science (common sense should not wear the trousers)
no perspicuous policies for updating
built by outsiders

7 Unified Medical Language System (National Library of Medicine)
built by trained experts
massively useful for information retrieval and information integration
good versioning and term-ID policies
creates out of literature a semantically searchable space

8 for UMLS
local usage respected
regimentation frowned upon
mappings between 'synonyms' full of noise
is_synonymous_with is not transitive
no cross-framework consistency
no concern to establish consistency with basic science
different grades of formal rigor, different degrees of completeness, different update policies
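
The remark that is_synonymous_with is not transitive can be made concrete with a small check. The following is a minimal sketch, not UMLS code: the term pairs are invented for illustration, and the function names are hypothetical. It treats each mapping as symmetric and lists every chain a ~ b, b ~ c for which no mapping a ~ c has been asserted.

```python
from itertools import permutations

# Hypothetical 'synonym' mappings (illustrative only, not taken from UMLS).
SYNONYM_PAIRS = {
    ("cold", "common cold"),
    ("cold", "low temperature"),
}


def symmetric(pairs):
    """Treat each asserted mapping as holding in both directions."""
    return set(pairs) | {(b, a) for a, b in pairs}


def transitivity_violations(pairs):
    """Return triples (a, b, c) where a~b and b~c are asserted but a~c is not."""
    rel = symmetric(pairs)
    terms = {term for pair in rel for term in pair}
    return sorted(
        (a, b, c)
        for a, b, c in permutations(terms, 3)
        if (a, b) in rel and (b, c) in rel and (a, c) not in rel
    )


if __name__ == "__main__":
    for a, b, c in transitivity_violations(SYNONYM_PAIRS):
        print(f"{a} ~ {b} and {b} ~ {c}, but no mapping {a} ~ {c}")
```

Chaining such mappings across vocabularies would wrongly equate "common cold" with "low temperature", which is one way noise enters synonym-based integration.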

9 with UMLS-based annotations
we can know what data we have (via term searches)
we can map between data at single granularities (via 'synonyms')
how do we combine data across granularities?
how do we resolve logical conflicts?
how do we know what data we don't have?
how do we reason with data?

10 no evolutionary path towards improvement
with UMLS, Cyc, Web 2.0, ...

11 We will be able to use ontologies to help us share data
only if the ontologies
represent the world correctly
are humanly intelligible
and are computationally tractable

12 a new approach
prospective standardization based on objective measures of what works
bring together selected influential groups to agree on good terminology / annotation habits preemptively

13 requirements
for science
ensure legacy annotation efforts are not wasted
create an evolutionary path towards improvement, of the sort we find in science
must be a collaborative, community effort to ensure buy-in
ensure future-proofing

14 create a consensus core of interoperable domain ontologies
for science
starting with low-hanging fruit and working outwards from there
built and validated by trained experts
backed by persons of influence in different communities

15 for science
geospatial, transport, religion, weather, bacteria, chemicals, politics, law
use common rules drawing on best practices for creating ontologies
... and for linking ontologies

16 for science
geospatial, transport, religion, weather, bacteria, chemicals, politics, law
... exploiting the division of labor
... relying on champions in dispersed communities to spread the word

17 ontology of documents ontology of provenance ontology of names ontology of numbers (IDs) ontology of signatures ontology of identity ...

18 that people should use the core to annotate their data

19 and set up feedback mechanisms
annotators discover they need
more terms
more relations between terms
to correct existing relations
the ontology gets better as it is used

20 This process leads to improvements and extensions of the ontology
which in turn leads to better annotations: a virtuous cycle of improvement in the quality and reach of both future annotations and the ontology itself

21 The result
a growing computer-interpretable map of reality within which major databases are automatically integrated in semantically searchable form

22 This solution is already being implemented in the domain of biomedicine

23 First step (2003)
create a shared portal (low regimentation)

24

25 Second step (2004)
reform efforts initiated to link OBO ontologies together and to ensure orthogonality

26 Third step (2006)
The OBO Foundry http://obofoundry.org/
Why suddenly the switch from deep-hell black background to heavenly white? Is that an intended connotation or just a copy and paste coincidence?

27 some groups commit to working together and to following common rules

28 The OBO Foundry
a family of interoperable gold standard biomedical reference ontologies to serve the annotation of
scientific literature
model organism databases
clinical data
experimental results

29 A subset of OBO ontologies, whose developers have agreed in advance to accept a common set of principles designed to ensure
tight connection to the biomedical basic sciences
compatibility
interoperability
formal robustness
support for logic-based reasoning

30 A prospective standard
designed to guarantee interoperability of ontologies from the very start (contrast to: post hoc mapping)
established March 2006
12 initial candidate OBO ontologies – focused primarily on basic science domains
several being constructed ab initio

31 Ontology | Scope | URL | Custodians
Cell Ontology (CL) | cell types from prokaryotes to mammals | obo.sourceforge.net/cgi-bin/detail.cgi?cell | Jonathan Bard, Michael Ashburner, Oliver Hofman
Chemical Entities of Biological Interest (ChEBI) | molecular entities | ebi.ac.uk/chebi | Paula Dematos, Rafael Alcantara
Common Anatomy Reference Ontology (CARO) | anatomical structures in human and model organisms | (under development) | Melissa Haendel, Terry Hayamizu, Cornelius Rosse, David Sutherland
Foundational Model of Anatomy (FMA) | structure of the human body | fma.biostr.washington.edu | JLV Mejino Jr., Cornelius Rosse
Functional Genomics Investigation Ontology (FuGO) | design, protocol, data instrumentation, and analysis | fugo.sf.net | FuGO Working Group
Gene Ontology (GO) | cellular components, molecular functions, biological processes | | Gene Ontology Consortium
Phenotypic Quality (PaTO) | qualities of anatomical structures | obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value | Michael Ashburner, Suzanna Lewis, Georgios Gkoutos
Protein Ontology (PrO) | protein types and modifications | | Protein Ontology Consortium
Relation Ontology (RO) | relations | obo.sf.net/relationship | Barry Smith, Chris Mungall
RNA Ontology (RnaO) | three-dimensional RNA | | RNA Ontology Consortium
Sequence Ontology (SO) | properties and features of nucleic sequences | song.sf.net | Karen Eilbeck

32 Foundry communities include
Transcriptomics (MIAME Working Group)
Proteomics (Proteomics Standards Initiative)
Metabolomics (Metabolomics Standards Initiative)
Genomics and Metagenomics (Genomic Standards Consortium)
In Situ Hybridization and Immunohistochemistry (MISFISHIE Working Group)
Phylogenetics (Phylogenetics Community)
RNA Interference (RNAi Community)
Toxicogenomics (Toxicogenomics WG)
Environmental Genomics (Environmental Genomics WG)
Nutrigenomics (Nutrigenomics WG)
Flow Cytometry (Flow Cytometry Community)

33 OBO Foundry coverage (canonical ontologies)
GRANULARITY \ RELATION TO TIME | CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function | Cellular Process
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function | Molecular Process

34 CRITERIA
The ontology is in, or can be instantiated in, a common formal language.
The developers of the ontology agree in advance to collaborate with developers of other OBO Foundry ontologies where domains overlap.
The ontology should be useful (have a plurality of user communities).

35 CRITERIA
UPDATE: The developers of each ontology commit to its maintenance in light of scientific advance, and to soliciting community feedback for its improvement.
ORTHOGONALITY: They commit to working with other Foundry members to ensure that, for any particular domain, there is community convergence on a single controlled vocabulary.

36 ORTHOGONALITY
for science
when communities work together to ensure consistency, orthogonality, additivity of annotation frameworks
ADDITIVITY: if we annotate a database or body of literature with terms from one high-quality ontology, we should be able to add annotations from a second such ontology without conflicts
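
One way to picture additivity is as a merge of two annotation layers that succeeds only if the contributing ontologies do not both define the same term. The sketch below rests on stated assumptions: the identifiers (EXA:, EXB:), the labels, and the record names are all invented, and comparing labels is only a crude stand-in for the orthogonality that the Foundry aims to secure through community agreement.

```python
# Two hypothetical ontologies: term ID -> label (all IDs and labels invented).
CELL_TYPES = {"EXA:0001": "neuron", "EXA:0002": "hepatocyte"}
PROCESSES = {"EXB:0001": "synaptic transmission", "EXB:0002": "glycolysis"}


def overlapping_labels(ontology_a, ontology_b):
    """Labels defined in both ontologies, a sign that they are not orthogonal."""
    return set(ontology_a.values()) & set(ontology_b.values())


def add_annotations(existing, new):
    """Merge two annotation layers, each mapping record ID -> set of term IDs."""
    merged = {record: set(terms) for record, terms in existing.items()}
    for record, terms in new.items():
        merged.setdefault(record, set()).update(terms)
    return merged


if __name__ == "__main__":
    first_layer = {"record-1": {"EXA:0001"}}
    second_layer = {"record-1": {"EXB:0001"}, "record-2": {"EXB:0002"}}
    if not overlapping_labels(CELL_TYPES, PROCESSES):
        print(add_annotations(first_layer, second_layer))
```

When two ontologies cover the same domain with competing terms, the overlap check is no longer empty and the second layer cannot simply be added; it has to be mapped post hoc.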

37 CRITERIA
IDENTIFIERS: The ontology possesses a unique identifier space within OBO.
VERSIONING: The ontology provider has procedures for identifying distinct successive versions.
The ontology includes textual definitions for all terms.

38 CRITERIA
COMMON ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.*
* Smith et al., Genome Biology 2005, 6:R46

39 OBO Relation Ontology
Foundational: is_a, part_of
Spatial: located_in, contained_in, adjacent_to
Temporal: transformation_of, derives_from, preceded_by
Participation: has_participant, has_agent
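
Unambiguously defined relations are what make logic-based reasoning possible. As a minimal sketch, assume that is_a and part_of are treated as transitive and adjacent_to is not; the triples and the infer function below are hypothetical examples, not content of the Relation Ontology itself.

```python
# Relations this sketch treats as transitive (an assumption of the example).
TRANSITIVE = {"is_a", "part_of"}

# Hypothetical asserted triples: (subject, relation, object).
ASSERTED = [
    ("nucleolus", "part_of", "nucleus"),
    ("nucleus", "part_of", "cell"),
    ("motor neuron", "is_a", "neuron"),
    ("neuron", "is_a", "cell"),
    ("region 1", "adjacent_to", "region 2"),
    ("region 2", "adjacent_to", "region 3"),
]


def infer(triples, transitive=TRANSITIVE):
    """Close the asserted triples under transitivity, one relation at a time."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for a, rel, b in list(inferred):
            if rel not in transitive:
                continue
            for b2, rel2, c in list(inferred):
                if rel2 == rel and b2 == b and (a, rel, c) not in inferred:
                    inferred.add((a, rel, c))
                    changed = True
    return inferred


if __name__ == "__main__":
    for triple in sorted(infer(ASSERTED) - set(ASSERTED)):
        print("inferred:", triple)
```

Because adjacent_to is not in the transitive set, no link between region 1 and region 3 is inferred; with loosely defined relations a reasoner could not draw that distinction.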

40 1. Create an OIC portal: a list of those ontologies and related artifacts which already exist
2. Find out which groups of ontology developers are willing to commit to working towards interoperability

41 3. Work with these groups in open, on-line and face-to-face, discussions, the records of which are made available on the OIC portal
4. Move towards a suite of authoritative ontologies, one for each domain – stable attractors
5. Make funding depend on use of authoritative ontologies – because these have been shown to work

42 what is a question? a representation of reality with a hole in it

43 They will provide a representation of the context of reality
within which the holes can appear

44 The OIC Foundry ontologies
will be stable, maximally open resources
They will be authoritative in light of NCOR's evaluation measures
They will not reveal sensitive capabilities and interests.
They will deal with types (of mobile telephone, of spores, of soil ...), not with instances
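
The separation of types from instances can be sketched as two distinct stores: an open ontology that lists only types, and instance records, held elsewhere, that point to those types by identifier. Everything in this example is hypothetical, including the EX: identifiers, the record names, and the helper functions.

```python
from collections import Counter

# Open resource: types only (IDs invented for illustration).
ONTOLOGY = {
    "EX:0001": "mobile telephone",
    "EX:0002": "spore",
    "EX:0003": "soil",
}

# Instance data, kept separately; each record refers to a type by its ID.
INSTANCE_RECORDS = [
    {"record": "rec-17", "instance_of": "EX:0001"},
    {"record": "rec-18", "instance_of": "EX:0001"},
    {"record": "rec-19", "instance_of": "EX:0003"},
    {"record": "rec-20", "instance_of": "EX:9999"},  # refers to an unknown type
]


def unknown_types(records, ontology):
    """Records whose type ID is missing from the ontology (candidate new terms)."""
    return [r["record"] for r in records if r["instance_of"] not in ontology]


def count_by_type(records, ontology):
    """Count instance records per type label, ignoring unknown type IDs."""
    return Counter(
        ontology[r["instance_of"]] for r in records if r["instance_of"] in ontology
    )


if __name__ == "__main__":
    print(unknown_types(INSTANCE_RECORDS, ONTOLOGY))
    print(dict(count_by_type(INSTANCE_RECORDS, ONTOLOGY)))
```

Publishing the ontology exposes nothing about which records exist, while a record that points to a missing type identifies a term the ontology may still need.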

45 Some problems
(from Jen Williams, OWI)

46 Do we have an instruction manual ?

47 create a simple top level framework
Ontology: An Introduction

48

49

50 Too few knowledgeable folks, and fewer cleared.
Computer scientists are teaching people ontology tools and ...
Mohammed is_a string
Amount of money is_a integer

51 what we need
Training events (summer school ...)
to teach people to CREATE ONTOLOGY CONTENT
to teach people to USE ONTOLOGY CONTENT

52 Joining up will diminish your fiefdom
If you give everyone the keys to your kingdom, how will you justify your budget?
We can do this already, as a start, with open source resources, plus a few brave champions of good annotation habits
Later we humiliate those who do not join in

53 People use different tools
format/language of the ontology is not easy to understand
the OIC Foundry should use ontologies which are maximally format- and language-neutral

54 if we are to enable sharing of gigantic bodies of heterogeneous data
we need all the help we can get
our computers need all the help we can get

