A Standard & Prototype Starting Point for An Open Ontology Repository: The Extended Metadata Registry Project John L. McCarthy XMDR Project Lawrence Berkeley.

Presentation transcript:

A Standard & Prototype Starting Point for An Open Ontology Repository: The Extended Metadata Registry Project
John L. McCarthy
XMDR Project, Lawrence Berkeley National Laboratory
Open Ontology Repository (OOR) Panel on Rationale, Expectations & Requirements
March 27, 2008

page 2 of 15 XMDR Open Ontology Talk-v5.ppt
Shared Goals & Challenges
Open Ontology Repository Goals
– collection of useful ontologies
– help facilitate harmonization & synergy
– standard representation/characterization?
Extended Metadata Registry (XMDR) Project Goals
– extend ISO-IEC ed. 2 Metadata Registry Standard for increasingly large & complex databases & software systems, particularly for large organizations like EPA, NCI, DOD, …
– incorporate & manage evolution of concept information: codesets of valid values, terminologies, thesauri, ontologies
– using a shared metamodel for both metadata & concepts

page 3 of 15
XMDR Project Overview & Background
Set of collaborative initiatives with shared goals & funding
– EPA, NCI, DOD, LBNL, USGS, Ecoterm, UNEP, … (major users)
XMDR project at LBNL began in 2003
principals have been meeting in Berkeley since 2004
– ISO-IEC JTC1/SC32/WG2 & ANSI L8 working on ed. 3
(JTC1/SC32/WG2 = Joint Technical Committee 1, Subcommittee 32, Working Group 2)
metadata registry standards work began in 1980's re data dictionaries & codesets
Open source reference implementation & testbed system
– test implementations of proposed extensions to metamodel: add more formal semantic metadata on concepts & relationships to data
– assemble semantic metadata from diverse sources & structures: terminologies, ontologies, etc. for environment, geography, health, …
– explore emerging semantic technologies (e.g., RDF, OWL, CL, …)
– demonstrate new capabilities, e.g., ontology lifecycle management & harmonization

page 4 of 15
Challenge: Gain Common Understanding of meaning between Data Creators and Data Users
[Figure: Users, Information Systems and Data Creation linked together. Organizations shown: EEA, USGS, DoD, EPA, Others. Each holds text and data on the topics environ, agriculture, climate, human health, industry, tourism, soil, water, air; some label the same topics in another language (ambiente, agricultura, tiempo, salud humano, industria, turismo, tierra, agua, aero).]
Common interpretation of what data represents

page 5 of 15
Inference requires combination of Data, Metadata & Concept Systems

Data: table with columns ID, Date, Temp, Hg and rows for stations A, B, X

Metadata:
Name   Datatype   Definition                        Units
ID     text       Monitoring Station Identifier     not applicable
Date   date       Date                              yy-mm-dd
Temp   number     Temperature (to 0.1 degree C)     degrees Celsius
Hg     number     Mercury contamination             micrograms per liter

Concept System (multi-lingual):
Contamination
  Biological
  Chemical
    lead
    cadmium
    mercury
  Radioactive

Inference Search Query: “find water bodies downstream from Fletcher Creek where chemical contamination was over 10 micrograms per liter between December 2001 and March 2003”
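A minimal sketch of how the three layers on this slide could combine to answer the search query. It is illustrative only and not the XMDR implementation: the measurement values, the `DOWNSTREAM` lookup, and the function names are all hypothetical, and monitoring stations stand in for water bodies. Only the column names, units, and concept hierarchy come from the slide.

```python
from datetime import date

# Concept system from the slide: Contamination and its narrower concepts.
NARROWER = {
    "contamination": ["biological", "chemical", "radioactive"],
    "chemical": ["lead", "cadmium", "mercury"],
}

# Metadata from the slide: each data element is tied to a concept and a unit.
METADATA = {
    "Temp": {"concept": "temperature", "units": "degrees Celsius"},
    "Hg": {"concept": "mercury", "units": "micrograms per liter"},
}

# Hypothetical geography: stations assumed to lie downstream of a water body.
DOWNSTREAM = {"Fletcher Creek": ["A", "B"]}

# Hypothetical data rows (the slide's actual cell values are not available).
DATA = [
    {"ID": "A", "Date": date(2002, 1, 15), "Temp": 4.2, "Hg": 12.5},
    {"ID": "B", "Date": date(2002, 6, 1), "Temp": 15.0, "Hg": 3.1},
    {"ID": "X", "Date": date(2002, 3, 9), "Temp": 7.7, "Hg": 40.0},
    {"ID": "A", "Date": date(2004, 1, 1), "Temp": 3.0, "Hg": 20.0},
]


def narrower_closure(concept):
    """Return the concept plus every concept below it in the hierarchy."""
    found = {concept}
    stack = [concept]
    while stack:
        for child in NARROWER.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def find_contaminated(origin, concept, threshold, start, end):
    """Find (station, date) pairs downstream of origin exceeding threshold."""
    concepts = narrower_closure(concept)
    # Metadata step: which data columns measure one of these concepts?
    columns = [name for name, meta in METADATA.items()
               if meta["concept"] in concepts]
    stations = set(DOWNSTREAM.get(origin, []))
    hits = []
    for row in DATA:
        if row["ID"] in stations and start <= row["Date"] <= end:
            if any(row[col] > threshold for col in columns):
                hits.append((row["ID"], row["Date"]))
    return hits


result = find_contaminated("Fletcher Creek", "chemical", 10,
                           date(2001, 12, 1), date(2003, 3, 31))
print(result)
```

The query never mentions "Hg" or "mercury": the concept system expands "chemical" to mercury, and the metadata maps mercury to the Hg column, which is the point the slide makes about needing all three layers.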

page 6 of 15
XMDR Goals (continued)
Improve representation of relationships between data (e.g., data elements & value domains) and concept structures (e.g., ontologies, taxonomies, thesauri, terminologies, …)
Register & manage complex semantic metadata (i.e., concepts) in more formal, systematic ways (e.g., description logic) to facilitate machine processing of semantics in order to
– link together data elements & terms across multiple systems
– discover relationships among data elements, terms & concepts
– create and manage names, definitions, terms, etc.
– support software inference, aggregation, and agent services
Add more rigorous & formal specification for
– concepts and concept systems (including ontologies)
– relationships between metamodel components
– formal axioms for conceptual & structural relationships
Use concepts to unify different types of metadata
– evolution requires increasing granularity & details
– combine strengths of data dictionaries/registries and ontologies

page 7 of 15
Example concept system content currently loaded in XMDR Prototype
via Lexgrid (from Mayo Clinic & Harold Solbrig)
GEMET Multilingual Environmental Thesaurus
National Biological Information Infrastructure biodiversity
NCI Thesaurus_06.02d health concepts system
ISO4217_1981 currency codes
ISO3166_V-10 country codes (only 2 letter codes)
Mouse_1.32 anatomy
Defense Technology Information Center 1.0 Thesaurus
Portions of EPA controlled vocabulary
SIC and NAICS industrial classification codes
via special purpose scripts
Omega ontology

page 8 of 15
Additional candidate metadata content to test metamodel expressivity
Current Data Element Registries
caDSR (full NCI Cancer Data Standards Registry)
EDR (EPA Environmental Data Registry)
Candidate Additions to Concept Systems and Ontologies
NASA SWEET (Semantic Web Earth & Environmental Terminologies)
IETF RFC 3066 Language Codes
USGS Geographic Names Information System
Getty Thesaurus of Geographic Names
I.T.I.S. - Integrated Taxonomic Information System
Foundational Model of Anatomy
EPA Chemical Substance Registry
GO (Gene Ontology), … Agrovoc, … and possibly others
OMV Ontology Metadata Vocabulary (European NeON consortium & Stanford NCBO)

page 9 of 15
Omega Ontology illustrates challenges of loading large, complex new content
Omega is a “terminological ontology”
reorganization & synthesis of WordNet & Mikrokosmos
adds higher level ontology to organize multiple ontologies
Initial mapping and loading of Omega needs to be refined
Multiple ontology languages present an additional challenge
Entity relationships conform to Concept_System figure
Entity -> Attribute conforms to Classification_Scheme figure
Omega Attributes mapped to ISO/IEC 11179 ed3 Facets (ignoring Omega datatype field)
Required a week to process and load Omega Ontology
4 million files, so ~250,000/24 hrs

page 10 of 15
XMDR Prototype Modular Architecture: with current open source software selections
[Figure: XMDR Prototype Architecture. Components and the software selected for each:]
USERS: Web Browsers … Client Software
Human User Interface (XML pages & JavaScript)
Application Program Interface (REST)
Authentication Service
Search & Inference Queries (Jena, SPARQL)
Reasoner (Pellet)
Text Search (Lucene)
Mapping Engine
Validation (XML Schema)
Content Loading & Transformation (Lexgrid & custom)
Metadata Sources: concept systems, data elements
Registry Store (Subversion)
REST Style standard XMDR files
Full Text Index
Asserted Logic Index
Inferred Logic Index
XMDR metamodel (OWL & XML Schema)
Metamodel specs (UML & Editing) (Poseidon, Protege)
XMDR data model & exchange format: XML, RDF, OWL
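The architecture keeps an Asserted Logic Index separate from an Inferred Logic Index. The sketch below shows one way to read that split, using plain Python sets of triples. It is an assumption-laden illustration and does not use the Jena, Pellet, or SPARQL APIs: the `broader` predicate, the sample triples, and the function names are hypothetical, and the only inference rule applied is transitivity.

```python
def infer_transitive(asserted, predicate):
    """Return triples implied by transitivity of predicate but not asserted."""
    edges = {}
    for s, p, o in asserted:
        if p == predicate:
            edges.setdefault(s, set()).add(o)
    inferred = set()
    for start in edges:
        # Depth-first walk collecting everything reachable from start.
        seen = set()
        stack = list(edges[start])
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(edges.get(node, ()))
        for node in seen:
            triple = (start, predicate, node)
            if triple not in asserted:
                inferred.add(triple)
    return inferred


# Asserted index: statements loaded directly from a source (hypothetical data).
asserted_index = {
    ("mercury", "broader", "chemical"),
    ("lead", "broader", "chemical"),
    ("chemical", "broader", "contamination"),
}

# Inferred index: statements derived by a reasoning step, stored separately.
inferred_index = infer_transitive(asserted_index, "broader")


def query(subject, predicate, include_inferred=True):
    """Look up objects for (subject, predicate) in one or both indexes."""
    source = asserted_index | inferred_index if include_inferred else asserted_index
    return sorted(o for s, p, o in source if s == subject and p == predicate)


print(query("mercury", "broader", include_inferred=False))
print(query("mercury", "broader"))
```

Keeping the two sets apart lets a registry show what a source actually stated versus what a reasoner derived, and lets the inferred set be rebuilt when content or rules change.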

page 11 of 15
DRAFT – ed. 3 metamodel Consolidated Class Hierarchy
see xmdr.org wiki for more diagrams and details

page 12 of 15
XMDR Prototype Web Site has downloadable code & content

page 13 of 15
Technical Challenges and Issues for XMDR Implementation Testbed
Complexity
– representation of different types of relationships
– non-binary relationships, e.g., instrumentality (A used to do B to C)
– extensibility for unknown future complexities (e.g., Omega)?
– incorporate IKL variant of CLIF dialect of ISO Common Logic?
Scalability & performance
– currently includes tens of thousands of objects & millions of RDF triples
– maybe indexing and/or distributed registries will help?
External metadata sources, ontologies, terminologies
– cannot simply be copied because they are proprietary & evolving
Mapping (to data elements as well as between concept systems)
– wide variety of challenges (e.g., probabilistic & changing mappings)
Manage evolving metamodel, concept systems & mappings
– additions & changes in both content & structure over time, versioning
Harmonize with ODM, MMF, CL, OMV, Web Services
– need open source, standards-based approach (vs. proprietary)
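The non-binary relationship challenge can be made concrete with a small sketch. A triple store holds only binary statements, so a three-way relationship such as "A used to do B to C" is commonly reified: an intermediate node stands for the relationship and each participant hangs off it by role. The code below assumes that approach. It is not the XMDR metamodel, and the role names and example fillers are hypothetical.

```python
import itertools

# Counter used to mint identifiers for the intermediate relationship nodes.
_ids = itertools.count(1)


def reify(relation_type, **roles):
    """Encode an n-ary relationship as binary triples around a new node."""
    node = f"_:rel{next(_ids)}"
    triples = [(node, "type", relation_type)]
    triples += [(node, role, filler) for role, filler in sorted(roles.items())]
    return node, triples


def find(triples, relation_type, **constraints):
    """Return relationship nodes of the given type matching all role constraints."""
    by_node = {}
    for s, p, o in triples:
        by_node.setdefault(s, {})[p] = o
    return [
        node for node, props in by_node.items()
        if props.get("type") == relation_type
        and all(props.get(role) == value for role, value in constraints.items())
    ]


# "A used to do B to C" with hypothetical fillers for A, B and C.
node, triples = reify(
    "instrumentality",
    instrument="spectrometer",   # A
    action="measure",            # B
    target="mercury",            # C
)
for triple in triples:
    print(triple)
print(find(triples, "instrumentality", target="mercury"))
```

The cost of this encoding is visible even here: one statement becomes four triples, which compounds the scalability concern listed on the same slide.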

page 14 of 15
Conclusion: Why should OOR & XMDR projects consider closer collaboration?
Potential benefits for OOR Project…
– modular, extensible, open source code base
– initial set of ontologies & other concept systems
– major collaborators (EPA, NCI, DOD, EEA, …)
– real-world ontology applications
– ISO/IEC standards-based approach
– proven administrative metadata & procedures for managing stewardship & evolution of individual items
– extensive & extensible OOR metamodel
Potential benefits for the XMDR Project
– ontology experts, experience and ideas (e.g., Natasha re OMV)
– more ontologies to exercise expressivity & tools
– help in refining ontology representation & mapping

page 15 of 15
Thanks & Acknowledgements
Bruce Bargmeyer, principal investigator
Frank Olken, initial concepts & metamodel extensions
Kevin Keck, initial & current designer & implementor
Karlo Berket, implementation, user interface, data loading
Harold Solbrig, Lexgrid, model development, etc!
Fred Gey, concept mapping, etc.
L8 and SC 32/WG 2 Standards Committees
Major XMDR Project Sponsors and Collaborators
– National Science Foundation (Grant # )
– U.S. Environmental Protection Agency
– Department of Defense
– National Cancer Institute
– U.S. Geological Survey
– And others!