SC 32/WG 2 Tutorial: Metadata Registry Standards. July 16, 2007. Bruce Bargmeyer, University of California, Berkeley and Lawrence Berkeley National Laboratory.

Presentation transcript:

1 SC 32/WG 2 Tutorial: Metadata Registry Standards. July 16, 2007. Bruce Bargmeyer, University of California, Berkeley and Lawrence Berkeley National Laboratory. Tel:
JTC1 SC32 N1649

2 Topics
- Standards development: OMG, ISO (TC 37 & JTC 1/SC 32), W3C, OASIS
  - Align, Coordinate, Integrate: Standards, Recommendations, Specifications
- Semantics Challenges and Future Directions

Align, Coordinate, Integrate Standards
- WG 2 doing OK internally:

4 Align, Coordinate, Integrate Standards
Diagram labels: WG 1, WG 2, WG 3, WG 4, SC 32?
Clearwater meeting a step forward

5 Align, Coordinate, Integrate Standards/Recommendations/Specifications for Semantic Computing
Diagram labels:
- Users
- ISO/IEC JTC 1/SC 32: ISO/IEC Metadata Registries (Metadata Registry, Terminology, Thesaurus, Taxonomy, Data Standards, Ontology, Structured Metadata)
- ISO TC 37: Terminology (CONCEPT, Referent, Refers To, Symbolizes, Stands For; “Rose”, “ClipArt Rose”)
- W3C: Semantic Web (RDF Graph: Node, Edge; Subject, Predicate, Object)
- OMG: Object Management (MOF, ODM, CWM, IMM)

6 Standards Development: Semantics Management and Semantics Services – Semantic Computing
Diagram labels: OMG, W3C, ISO/IEC JTC 1 SC 32, ISO TC 37
Align, Co-develop, Fast Track, PAS Submission …

7 Standards Development: Semantics Management and Semantics Services – Semantic Computing
Diagram labels: OMG, W3C, ISO/IEC JTC 1 SC 32
Align, integrate, co-develop, Fast Track, PAS Submission …
Can we coordinate content?

8 A Success
Diagram labels: OMG, ISO/IEC JTC 1 SC 32; ISO/IEC, OMG ODM; ISO/IEC – Common Logic; OMG Ontology Definition Metamodel
Some text and figures are identical in the two standards.

9 Standards Development: Semantics Management and Semantics Services – Semantic Computing
Diagram labels: ISO/IEC (Edition 3), ISO/IEC JTC 1 SC 32
Ongoing effort

10 Standards Development: Semantics Management and Semantics Services – Semantic Computing
Diagram labels: E3 proposals; OMG RFP - MOF? IMM
Possible effort

11 Standards Development: Semantics Management and Semantics Services – Semantic Computing
Diagram labels: ISO/IEC (Edition 3), ISO/IEC JTC 1 SC 32; OMG IMM &
Hopeful?

12 Other Possibilities
- OASIS ebXML Registry
- W3C Semantic Web Deployment WG
- TC 37

The Ageless Information Problem
Getting the information that we need, when we need it, without afflicting the excellent minds of humans with toil and drudgery
The litany:
- Too much or too little, irrelevant, not authoritative, out of date
- Unknown quality, not trustable, lacks provenance, no certainty measures
- Difficult to find, difficult to access, difficult to use
- Meaning not clear, relationship to other information not clear
- Data creators do not have the same understanding of the data as end users
- Recorded data loses much real world meaning, context, relationships
- Much of the meaning of data is buried in the processes used to manipulate the data (e.g., in computer code)
- Need improvements in efficiency and effectiveness
Every time we solve it, we re-create it.
cf: Data, Information, Knowledge, Wisdom

New Semantics Capabilities Proposed for ISO/IEC MDR (Edition 3)
- Improve traditional data management/data administration
  - Use stronger semantics management and semantics services capabilities
- Enable something new
  - Semantic computing

Semantic Computing: The Nub of It
- Processing that takes “meaning” into account
  - Makes use of concept systems, e.g., thesauri and/or ontologies
  - Moves some of the “meaning” of data from computer code to managed semantics
- Processing that uses (e.g., reasons across) the relations between things, not just computing about the things themselves
- Processing that helps to take people out of the computation, reducing the human toil
  - Semantics “grounding” for data, data discovery, extraction, mapping, translation, formatting, validation, inferencing, …
- Delivering higher-level results that are more helpful for the user’s thought and action

In The Epic Information Struggle We Have Made Heroic Progress
Diagram labels: Files, Machine Processing, Computer Processing, Cards, Tape, Disk

In The Epic Information Struggle We Have Made Heroic Progress
In structuring data and text:
- Structured Data
  - Columns on cards & tape (possibly comma separated)
  - Hierarchical (DBMS)
  - Network
  - Table (relational DBMS)
  - Hierarchy (XML)
  - Graph (RDF)
- Semi-structured text
  - nroff, troff, LaTeX …
  - SGML
  - HTML
  - XML

In The Epic Information Struggle We Have Made Heroic Progress
In documenting data and text (e.g., semantics management):
- Data Standards
  - Code sets
- (Meta)Data Standards
  - Data element definitions, valid values, value meanings
  - Metadata registries (MDR, ISO/IEC 11179)
  - Other standards as presented at this conference
- Concept systems (or KOS)
  - Glossaries
  - Dictionaries
  - Thesauri
  - Taxonomies
  - Ontologies
  - Graphs

Semantic Management Proposals for Edition 3
- Improve data management through use of stronger semantics management
  - Databases
  - XML data
  - Other “traditional” data
- Enable new wave of semantic computing
  - Take meaning of data into account
  - Process across relations as well as properties
  - May use reasoning engines, e.g., to draw inferences

Semantics Improve Data Management/Data Administration
Enterprise Vocabulary Services (EVS) Concepts Unite NCI MDR
- Object Class: Chemopreventive Agent
- Property: NSCNumber
- Conceptual Domain: Agent
- Data Element Concept: Chemopreventive Agent NSC Number
- Data Element: Chemopreventive Agent Name
- Value Domain: NSC Code
- Context: caCORE
- Representation: Code
- Classification Schemes: caDSRTraining
- Valid Values: Cyclooxygenase Inhibitor, Doxercalciferol, Eflornithine, …, Ursodiol
Source: Denise Warzel, National Cancer Institute
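The registry structure on this slide can be sketched as a few linked records. This is a minimal illustration, not the caDSR or ISO/IEC 11179 metamodel: the class and field names are assumptions chosen to mirror the slide labels, and only the valid values actually shown on the slide are included.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DataElementConcept:
    # Hypothetical field names mirroring the slide labels.
    object_class: str
    property_name: str
    conceptual_domain: str


@dataclass(frozen=True)
class ValueDomain:
    name: str
    valid_values: Tuple[str, ...]


@dataclass(frozen=True)
class DataElement:
    name: str
    context: str
    concept: DataElementConcept
    value_domain: ValueDomain

    def accepts(self, value: str) -> bool:
        """Return True if the value is one of the registered valid values."""
        return value in self.value_domain.valid_values


agent_name = DataElement(
    name="Chemopreventive Agent Name",
    context="caCORE",
    concept=DataElementConcept(
        object_class="Chemopreventive Agent",
        property_name="NSCNumber",
        conceptual_domain="Agent",
    ),
    # Only the values visible on the slide; the slide elides the rest.
    value_domain=ValueDomain(
        name="NSC Code",
        valid_values=(
            "Cyclooxygenase Inhibitor",
            "Doxercalciferol",
            "Eflornithine",
            "Ursodiol",
        ),
    ),
)
```

With the registered semantics held as data, a check such as `agent_name.accepts("Eflornithine")` no longer needs the list of valid values to be hard-coded in application logic.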

Semantic Computing Application: Find and process non-explicit data
Concept hierarchy (diagram labels): Analgesic Agent; Non-Narcotic Analgesic; Acetaminophen; Nonsteroidal Antiinflammatory Drug; Analgesic and Antipyretic; Datril; Anacin-3; Tylenol
For example: patient data on drugs contains brand names (e.g., Tylenol, Anacin-3, Datril, …); however, we want to study patients taking analgesic agents.
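A short sketch of how a concept system could answer this kind of query. The parent-child arrangement below is an assumption reconstructed from the diagram labels, and the patient records are invented for illustration.

```python
# Broader concept -> narrower concepts. The exact arrangement is an
# assumption; the slide's diagram did not survive extraction.
NARROWER = {
    "Analgesic Agent": ["Non-Narcotic Analgesic"],
    "Non-Narcotic Analgesic": [
        "Acetaminophen",
        "Nonsteroidal Antiinflammatory Drug",
        "Analgesic and Antipyretic",
    ],
    "Acetaminophen": ["Tylenol", "Anacin-3", "Datril"],
}

# Hypothetical patient data that records brand names only.
PATIENT_DRUGS = [("p1", "Tylenol"), ("p2", "Ursodiol"), ("p3", "Datril")]


def narrower_terms(concept, narrower=NARROWER):
    """Collect every concept below the given one in the hierarchy."""
    found, stack = set(), [concept]
    while stack:
        for child in narrower.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def patients_taking(concept, records=PATIENT_DRUGS):
    """Find patients whose recorded drug falls under the concept."""
    terms = narrower_terms(concept) | {concept}
    return sorted(pid for pid, drug in records if drug in terms)
```

Here `patients_taking("Analgesic Agent")` matches the Tylenol and Datril records even though neither record mentions analgesics explicitly.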

A Semantics Application: Specify and compute across Relations, e.g., within a food web in an Arctic ecosystem
An organism is connected to another organism for which it is a source of food energy and material by an arrow representing the direction of biomass transfer.
Source: (from SPIRE)
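Computing across relations can be illustrated with a small directed graph in which each edge points in the direction of biomass transfer, from food source to consumer. The organisms and links below are hypothetical placeholders, not data from SPIRE.

```python
from collections import deque

# Edge direction follows biomass transfer: food source -> consumer.
# Organisms and links are illustrative assumptions only.
FOOD_WEB = {
    "phytoplankton": ["zooplankton"],
    "zooplankton": ["arctic cod"],
    "arctic cod": ["ringed seal"],
    "ringed seal": ["polar bear"],
}


def receives_biomass_from(source, web=FOOD_WEB):
    """List every organism reachable from the source, nearest first."""
    reached, seen, queue = [], {source}, deque([source])
    while queue:
        for consumer in web.get(queue.popleft(), []):
            if consumer not in seen:
                seen.add(consumer)
                reached.append(consumer)
                queue.append(consumer)
    return reached
```

The answer comes from following the relation itself rather than from any property stored on an individual organism.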

Semantics Application: Combine Data, Metadata & Concept Systems

Data (columns): ID, Date, Temp, Hg (station IDs shown: A, B, X)

Metadata:
Name   Datatype   Definition                      Units
ID     text       Monitoring Station Identifier   not applicable
Date   date       Date                            yy-mm-dd
Temp   number     Temperature (to 0.1 degree C)   degrees Celsius
Hg     number     Mercury contamination           micrograms per liter

Concept system:
Contamination
- Biological
- Radioactive
- Chemical
  - lead
  - cadmium
  - mercury

Inference Search Query: “find water bodies downstream from Fletcher Creek where chemical contamination was over 10 micrograms per liter between December 2001 and March 2003”
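The inference query can be sketched by joining the three layers. The concept system and the link from the Hg column to mercury come from the slide; the readings, the station-to-water-body assignments, and the downstream relation are invented for illustration, since the slide's data values did not survive.

```python
from datetime import date

# Metadata: which concept each data column measures (from the slide).
COLUMN_CONCEPT = {"Hg": "mercury"}
# Concept system (from the slide).
CONTAMINATION = {
    "chemical": {"lead", "cadmium", "mercury"},
    "biological": set(),
    "radioactive": set(),
}
# Everything below is hypothetical illustration data.
STATION_BODY = {"A": "Fletcher Creek", "B": "Lower Fork", "X": "Bear Lake"}
DOWNSTREAM = {"Fletcher Creek": ["Lower Fork"], "Lower Fork": ["Bear Lake"]}
READINGS = [
    {"ID": "A", "Date": date(2002, 1, 10), "Temp": 4.0, "Hg": 12.0},
    {"ID": "B", "Date": date(2002, 6, 1), "Temp": 9.5, "Hg": 14.5},
    {"ID": "X", "Date": date(2002, 9, 15), "Temp": 11.2, "Hg": 3.2},
    {"ID": "X", "Date": date(2004, 1, 1), "Temp": 2.1, "Hg": 20.0},
]


def downstream_of(body):
    """Follow the downstream relation transitively."""
    found, stack = set(), [body]
    while stack:
        for nxt in DOWNSTREAM.get(stack.pop(), []):
            if nxt not in found:
                found.add(nxt)
                stack.append(nxt)
    return found


def contaminated_bodies(upstream, kind, threshold, start, end):
    """Water bodies downstream of `upstream` exceeding the threshold."""
    # Concept system + metadata: which columns count as this contamination?
    columns = [c for c, concept in COLUMN_CONCEPT.items()
               if concept in CONTAMINATION[kind]]
    candidates = downstream_of(upstream)
    hits = set()
    for row in READINGS:
        body = STATION_BODY[row["ID"]]
        in_range = start <= row["Date"] <= end
        if body in candidates and in_range:
            if any(row[c] > threshold for c in columns):
                hits.add(body)
    return sorted(hits)
```

Note that the query says "chemical" while the data says "Hg"; the concept system and metadata bridge the two, which is the point of the slide.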

Semantics Application: Use data from systems that record the same facts with different terms
- Reduce the human toil of drawing information together and performing analysis.

Challenge: Use data from systems that record the same facts with different terms
Diagram labels:
- Registries holding common content: OASIS/ebXML Registries, ISO Registries, Ontological Registries, CASE Tool Repositories, UDDI Registries, Software Component Registries, Database Catalogs, Dublin Core Registries
- Forms in which the common content (Country Identifier) appears: Data Element, XML Tag, Term Hierarchy, Attribute, Business Specification, Table Column, Business Object, Coverage

Same Fact, Different Terms

Data Element Concept
Name: Country Identifiers
Context:
Definition:
Unique ID: 5769
Conceptual Domain:
Maintenance Org.:
Steward:
Classification:
Registration Authority:
Others
Conceptual domain values: Algeria, Belgium, China, Denmark, Egypt, France, ..., Zimbabwe

Data Elements
Name:
Context:
Definition:
Unique ID: 4572
Value Domain:
Maintenance Org.
Steward:
Classification:
Registration Authority:
Others

Representations (a column headed ISO Numeric Code also appears, without values):
ISO 3166 English Name   ISO 3166 French Name   ISO Alpha Code   ISO Alpha Code
Algeria                 L'Algérie              DZ               DZA
Belgium                 Belgique               BE               BEL
China                   Chine                  CN               CHN
Denmark                 Danemark               DK               DNK
Egypt                   Egypte                 EG               EGY
France                  La France              FR               FRA
...                     ...                    ...              ...
Zimbabwe                Zimbabwe               ZW               ZWE
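Once the representations are registered against one shared concept, translating between them is a lookup. The sketch below uses only the values on the slide; the field names (`english`, `french`, `alpha2`, `alpha3`) and the `translate` helper are illustrative assumptions.

```python
# Each row is one country concept with its registered representations.
# Values are taken from the slide; field names are illustrative.
COUNTRIES = [
    {"english": "Algeria", "french": "L'Algérie", "alpha2": "DZ", "alpha3": "DZA"},
    {"english": "Belgium", "french": "Belgique", "alpha2": "BE", "alpha3": "BEL"},
    {"english": "China", "french": "Chine", "alpha2": "CN", "alpha3": "CHN"},
    {"english": "Denmark", "french": "Danemark", "alpha2": "DK", "alpha3": "DNK"},
    {"english": "Egypt", "french": "Egypte", "alpha2": "EG", "alpha3": "EGY"},
    {"english": "France", "french": "La France", "alpha2": "FR", "alpha3": "FRA"},
    {"english": "Zimbabwe", "french": "Zimbabwe", "alpha2": "ZW", "alpha3": "ZWE"},
]


def translate(value, source, target, countries=COUNTRIES):
    """Map a value from one representation to another via the shared concept."""
    for country in countries:
        if country[source] == value:
            return country[target]
    raise KeyError(f"{value!r} is not a registered {source} value")
```

For example, `translate("DZ", "alpha2", "alpha3")` yields `"DZA"`, so two systems that record the same country under different codes can be joined without hand-written conversion tables in each application.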

Challenge: Draw information together from a broad range of studies, databases, reports, etc.

A semantics application: Information Extraction and Use
Diagram labels:
- Extraction Engine: Segment, Classify, Associate, Normalize, Deduplicate
- Discover patterns, Select models, Fit parameters, Inference, Report results
- Actionable Information, Decision Support
- (E3) XMDR

Extraction Engines
- Find concepts and relations between concepts in text, tables, data, audio, video, …
- Produce databases (relational tables, graph structures), and other output
- Functions:
  - Segment – find text snippets (boundaries are important)
  - Classify – determine the database field for each text segment
  - Associate – determine which text segments belong together
  - Normalize – put information into standard form
  - Deduplicate – collapse redundant information
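The five functions can be sketched as a toy pipeline over a short string. Real extraction engines typically rely on trained models; the regular expressions, the drug-and-dose record layout, the sample doses, and the small code set below are all assumptions made for illustration.

```python
import re

# Hypothetical standard code set used by the Normalize step.
STANDARD_NAMES = {"tylenol": "Tylenol", "datril": "Datril"}


def segment(text):
    """Segment: find text snippets (here, split on semicolons)."""
    return [s.strip() for s in text.split(";") if s.strip()]


def classify(snippet):
    """Classify: label pieces of a snippet with a database field."""
    pairs = []
    name = re.match(r"[A-Za-z][A-Za-z0-9-]*", snippet)
    if name:
        pairs.append(("drug", name.group(0)))
    dose = re.search(r"(\d+)\s*mg\b", snippet, re.IGNORECASE)
    if dose:
        pairs.append(("dose_mg", dose.group(1)))
    return pairs


def associate(pairs):
    """Associate: fields found in the same snippet form one record."""
    return dict(pairs)


def normalize(record):
    """Normalize: standard drug name and an integer dose."""
    drug = record["drug"]
    return {
        "drug": STANDARD_NAMES.get(drug.lower(), drug),
        "dose_mg": int(record["dose_mg"]),
    }


def deduplicate(records):
    """Deduplicate: collapse identical records, keeping first-seen order."""
    unique = []
    for record in records:
        if record not in unique:
            unique.append(record)
    return unique


def extract(text):
    records = [associate(classify(s)) for s in segment(text)]
    complete = [r for r in records if {"drug", "dose_mg"} <= r.keys()]
    return deduplicate([normalize(r) for r in complete])
```

Running `extract("Tylenol 500 mg; tylenol 500mg; Datril 250 mg")` collapses the first two snippets into one record once they are normalized, which shows why Normalize has to run before Deduplicate.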

Metadata Registries are Useful
Registered semantics
- For “training” extraction engines
- The “Normalize” function can make use of standard code sets that have mappings between representation forms.
- The “Classify” function can interact with pre-established concept systems.
Provenance
- High precision for proper nouns, less precision (e.g., 70%) for other concepts; this affects downstream processing, so precision needs to be tracked

Challenge: Gain Common Understanding of meaning between Data Creators and Data Users
Diagram labels:
- Data Creation, Information systems, Users
- Organizations: EEA, USGS, DoD, EPA, Others...
- Subject areas (text, data): environ, agriculture, climate, human health, industry, tourism, soil, water, air
- The same subject areas in Spanish (text, data): ambiente, agricultura, tiempo, salud humana, industria, turismo, tierra, agua, aire
A common interpretation of what the data represents

Practical Vocabulary Management
- Vocabulary Management is essential for use of semantic technologies
  - Define concepts and relationships
  - Harmonize terminology, resolve conflicts
  - Collaborate with stakeholders
- An approach
  - Select a domain of interest
  - Enter core concepts and relationships
  - Engage community in vocabulary review
  - Harmonize, validate and vet the vocabulary
  - Enter metadata describing enterprise data
  - Link concept system to metadata

Use eXtended MDR Capabilities
- For vocabulary repository
  - Register, harmonize, validate, and vet definitions and relations
- To register mappings between multiple vocabularies
- To register mappings of concepts to data
- To provide semantics services
- To register and manage the provenance of data
(E3) is part of the infrastructure for semantics and data management.
These capabilities are proposed for ISO/IEC Edition 3

(E3) Use
- Upside
  - Collaborative
    - Supports interaction with community of interest
    - Shared evolution and dissemination
    - Enables Review Cycle
  - Standards-based – don’t lock semantics into proprietary technology
  - Foundation for strategic data centric applications
  - Lays the foundation for Ontology-based Information Management
  - Content is reusable for many purposes
- Downside
  - Managing semantics is HARD WORK - No matter how friendly the tools
  - Needs integration with other components

Some Challenges
- Data management and metadata management must evolve to address more complex data structures (relational, object, hierarchies, graphs)
  - Query capabilities
    - More than SQL, XQuery, SPARQL
  - Discovery mechanisms
    - More than Google
  - Access, mining, extraction
We need stronger semantics management

Metadata Registry Support for
- Registering and mapping ontologies
- Ontology Evolution
- Registering Process Ontologies

37 Thank You
- Acknowledgements
  - Karlo Berket, LBNL
  - Kevin Keck, LBNL
  - John McCarthy, LBNL
  - Harold Solbrig, Apelon
This material is based upon work supported by the National Science Foundation under Grant No , USEPA and USDOD. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation, USEPA or USDOD.
Bruce Bargmeyer, Lawrence Berkeley National Laboratory & Berkeley Water Center, University of California, Berkeley. Tel: