RDA Metadata Semantics: Rich metadata semantics are needed for human AND computer understanding, but mapping metadata schemas to ontologies can be a complicated procedure.

1 RDA Metadata Semantics
Rich metadata semantics are needed for human AND computer understanding, but mapping metadata schemas to ontologies can be a complicated procedure....
Metadata and Semantics Research Conference, since 2005
Gary Berg-Cross, SOCoP, RDA US Advisory Committee

2 Outline of Topics
1. Metadata: many standards and some ambitious MD requirements
2. RDA metadata-semantic discussions & background
   1. Rich metadata semantics needed for human AND computer understanding
   2. Semantic approaches needed for MD schemas
      1. Adding formal semantics to metadata schemas for use in discovery and queries, mediation/linking, and reasoning can be a complicated procedure....
3. Illustrating 2 semantic approaches
   1. Semantic annotation
   2. Example of an ontological schema
4. Are we ready for metadata semantics to be widely used? Where are the opportunities? Can we agree on common or domain principles (like modularity or building blocks) or some formal semantic requirements?

3 Recap on (Richer) Metadata Type Structure (includes Linked Data)
From: "The potential of metadata for linked open data and its value for users and publishers" by Anneke Zuiderwijk, Keith Jeffery, Marijn Janssen
Different types or degrees of semantics may be appropriate for different tasks.
LOD needs semantics for context...
CERIF provides a “much richer metadata than the standards used commonly with LOD and so improves greatly the experience of the end user (or the advantages of providing metadata.)”

4 Metadata & Standards
Evolution from file system names/types and describing DB fields to MD schemas for exchange.
Dublin Core: attaching categorical tags and descriptions via an MD schema.
Attempt to make data more human-understandable: capture agreed-upon MD that affords understanding.
The MD effort now requires many interacting pieces, including Metadata Application Profiles and workflow-like entities.

5 Strategy of “Modular” Theory of General and Domain-Specific MD (and Ontologies)
[Diagram: domain modules (standardized Geo-specific metadata, standardized BioMed-specific metadata, standardized Earth Science-specific metadata) sit on top of trans-domain (general consensus) metadata such as ID, time.... ISO MD_Keywords: Discipline, Place, Stratum, Temporal, Theme? Independent?? “Harmonized” and packaged together.]
Modules should be easier to create, validate, understand and maintain.
They may be substituted for, and used and reused for composition.
Support interoperability.

6 There are specific “standards” in domains
[ISO 19115:2003] Geographic information -- Metadata
[ISO 19115-2:2009] Geographic information -- Metadata -- Part 2: Extensions for imagery and gridded data
In OGC’s O&M model, Earth observations generate “products” that have metadata. These are organized into a metadata profile expressed as a schema.
[Diagram labels: General MD; Other MD]
Support bridging heterogeneity to achieve interoperability. Support data integration.
OGC object types: axis, axisDirection, datum, dataType, derivedCRSType, documentType, ellipsoid, featureType, group. Meaning....

7 Some Metadata Challenges (Earth Science, from Ilya Zaslavsky, CINERGI* pipeline)
Common deficiencies in existing metadata descriptions:
1. Different metadata models and profiles
   1. Different requirements for mandatory and optional fields (Dublin Core vs ISO)
   2. Different meanings of fields and initial purpose/emphasis of data collection
   3. Different local interpretations of how these fields should be filled out (e.g. “authors” and “contacts” are often mixed up)
2. Different classifications of resource types
   1. Common resource types are: Organization, Webpage, Collection, Dataset (EPOS: Users, SW services, computing services)
3. Title may be non-descriptive
   1. Insufficiently unique (“Roads”)
   2. Meaningful, but opaque naming patterns (e.g. “AXXX34nn1”)
4. Keywords
   1. May be missing or may be too specific to a domain
   2. May lack references to a thesaurus/CV or be freeform text
5. Info missing or unusable, such as abstract; contact saying “call”; location, time without reference; wrong URL
6. Grouping: a range of metadata records from a single source may be very similar (differing in only one parameter, e.g. location); they may be better discovered as a group of records
7. Duplicates: several metadata records from different catalogs may point to the same physical dataset (or have overlapping subsets of distributions). Provenance issue?......
* Community Inventory of EarthCube Resources for Geosciences Interoperability (http://workspace.earthcube.org/cinergi)
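
Several of the deficiencies listed on this slide can be checked mechanically. The following is a minimal Python sketch, not the CINERGI pipeline itself: the record layout, the field names (`title`, `keywords`, `abstract`, `url`), the three-word title threshold, and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical records; field names are illustrative and not taken from any one schema.
records = [
    {"id": "a1", "title": "Roads", "keywords": [], "abstract": "",
     "url": "http://example.org/d/1"},
    {"id": "b7", "title": "Stream gauge readings, Potomac basin, 2010-2014",
     "keywords": ["hydrology"], "abstract": "Daily discharge.",
     "url": "http://example.org/d/2"},
    {"id": "c3", "title": "AXXX34nn1", "keywords": ["x"], "abstract": "Grid.",
     "url": "http://example.org/d/1"},
]

def audit(record):
    """Return a list of deficiency labels for one metadata record."""
    issues = []
    # Assumed heuristic: a title with fewer than three words is treated as
    # non-descriptive ("Roads") or opaque ("AXXX34nn1").
    if len(record["title"].split()) < 3:
        issues.append("non-descriptive title")
    if not record["keywords"]:
        issues.append("missing keywords")
    if not record["abstract"].strip():
        issues.append("missing abstract")
    return issues

def find_duplicates(recs):
    """Group record ids that point at the same distribution URL."""
    by_url = defaultdict(list)
    for rec in recs:
        by_url[rec["url"]].append(rec["id"])
    return {url: sorted(ids) for url, ids in by_url.items() if len(ids) > 1}

for rec in records:
    print(rec["id"], audit(rec))
print(find_duplicates(records))
```

Matching on URL only catches the simplest kind of duplicate; overlapping subsets of distributions, as the slide notes, would need provenance information that this sketch does not model.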

8 Are We Ready to Break the MD Bottleneck, Make Up for Deficiencies & Satisfy Ambitious MD Requirements?
Drawn in large part from RDA MD discussions and also the work of Anneke Zuiderwijk, Keith Jeffery, Marijn Janssen and: Duval, Erik, et al. "Metadata principles and practicalities." D-lib Magazine 8.4 (2002): 16.
Easy to add, discover, download, access & exchange MD.
Suitable representation for search, browsing & query.
Support linking of data:
   Provide the possibility to link metadata.
   Recommend/advise to link with certain other datasets.
   Warn if linking two datasets does not make sense.
   Use a good URI strategy. Use identifiers (but which?).
   Use well-accepted vocabularies. Use well-accepted thesauri. (Ontologies?)
   Warn about linking when datasets have temporal aspects. Provide advice.
   Monitor links between data and make sure that they are still up to date.
   Make sure that linking is not just spatial; link to other domains as well.
Be consistent & support interpretation of data.
Support bridging heterogeneity to achieve interoperability; support data integration. Bridge different MD models, e.g. ISO vs DC; fields may have different meanings.
Be sustainable: researchers do not see value in metadata & its management tools (e.g. relational databases, wikis, etc.), and there is a perceived cost of adding and maintaining metadata.
So how do we satisfy these and create quality MD and/or extend it?
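
Bridging two MD models is often started with a crosswalk table. The sketch below, in Python, assumes a flattened ISO-style record and maps it to Dublin Core element names. The `CROSSWALK` table, the match-quality labels, and the function name are hypothetical; the judgment that `pointOfContact` is only a loose match for `creator` is an illustration of the "fields may have different meanings" point, not an official mapping.

```python
# Hypothetical crosswalk: flattened ISO-style field -> (Dublin Core element, match quality).
CROSSWALK = {
    "title": ("title", "exact"),
    "abstract": ("description", "exact"),
    "keyword": ("subject", "close"),
    "pointOfContact": ("creator", "loose"),  # a contact is not necessarily an author
}

def to_dublin_core(iso_record):
    """Translate a flat ISO-style record and report every lossy or missing mapping."""
    dc, warnings = {}, []
    for field, value in iso_record.items():
        if field not in CROSSWALK:
            warnings.append(f"no mapping for '{field}'")
            continue
        element, quality = CROSSWALK[field]
        dc[element] = value
        if quality != "exact":
            warnings.append(f"'{field}' -> '{element}' is a {quality} match")
    return dc, warnings

dc, warnings = to_dublin_core(
    {"title": "T", "pointOfContact": "Jane", "lineage": "x"}
)
print(dc)
print(warnings)
```

Recording the match quality alongside each mapping keeps the semantic mismatch visible instead of silently flattening it, which is the gap that richer semantics are meant to close.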

9 Broad View of Metadata (Schema) Status & Argument for More Semantics
Richness issue: even when done well, simple annotations and structured metadata are not rich enough to support ad hoc use, and certainly not reasoning based on meaning.
There are many MD schemas, and a broad challenge is to link/integrate them.
“Metadata schemas are created for resources’ identification and description and - most of the times - they do not express rich semantics. Even though the meaning of the metadata information can be processed by humans and its relationship to the described resource can be understood, for machine processing the actual relationships are frequently not obvious. In contrast to metadata schemas, ontologies provide rich constructs to express the meaning of data”
Stasinopoulou, Thomais, et al. "Ontology-based metadata integration in the cultural heritage domain." Asian Digital Libraries. Looking Back 10 Years and Forging New Frontiers. Springer Berlin Heidelberg, 2007. 165-175.

10 RDA Background & Outreach on Semantics
A growing interest in the topic of semantic interoperability.
The centrality of semantic issues was, for example, noted following the 1st Plenary.
Semantic issues and technologies are already part of the discussion on the RDA Forum.
Research communities need to adopt and deploy technologies that help them get the most from their data, understand context, and infer meaning. The semantic web community has much to contribute to an enabling global infrastructure and it would be great to see greater involvement in the RDA.
Fran Berman (Professor of Computer Science, RPI, Chair of the Research Data Alliance/U.S.)
RDA should take on this issue, but how? And who will participate?

11 RDA Metadata and Semantics Intersect
Data Foundations and Terminology (WG & IG); Data in Context IG; Data Fabric IG; Geospatial IG; Marine Data Harmonization IG (ISO 19115 etc.); Broker IG; Research Data Provenance.......
Semantic Interoperability BoF at RDA P3:
   3 presentations to illustrate key concepts of SI & use of ontologies: Gary Berg-Cross & Yann Le Franc
   Discussed Ontology Design Patterns and lightweight methods
   EUON effort
   What is a quality ontology?
1st European Ontology Network (EUON) Workshop co-located at P4 http://www.eudat.eu/euon/euon-2014-workshop

12 The Need for Some Semantics is (somewhat) Understood
1. MD needs to be a first-class, processable system, like a conceptual model: easier to use and manage, and in line with efforts to make data more understandable by computers.
2. Semantics helps address what MD annotations mean
   1. What the shared meanings are
   2. What the assumptions, such as relations between MD items, are
   3. How links to other data can be included
Source: Principles and Foundations of Ontologies and Semantic Grids - Session 48. July 15th, 2009, Oscar Corcho (Universidad Politécnica de Madrid) http://www.slideshare.net/ISSGC/session-48-principles-of-semantic-metadata-management
[Diagram label: Restrictive]

13 How do we add Semantics to MD? Depends on Intended Use: Example of Semantic Annotations (HTML -> RDFa)
Start with a collection of XHTML attributes in a web page.
Embed RDF annotations in the web pages using things like the DC and FOAF vocabularies, which are easily used for most simple annotations, e.g. creator, title, contact info.
[Figure: page markup before and after annotation ("Becomes").] From Introduction to Semantic Technologies, Ontologies and the Semantic Web, Aug 2010, #39
For data description and context, the semantics added can be like a formal, conceptual model.
For search, it can be like a better annotation of keywords using RDF.
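
To make the HTML -> RDFa step concrete, here is a small Python sketch that builds an XHTML fragment carrying title, creator and contact info as RDFa attributes. It is not the markup from the slide's figure. The function name, the page URI and the sample values are hypothetical; the vocabulary namespaces are the published Dublin Core Terms and FOAF ones.

```python
from html import escape

DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"

def rdfa_fragment(page_uri, title, creator_name, creator_email):
    """Build an XHTML fragment whose attributes state title, creator and contact.

    The outer `about` names the page as subject; `rel="dcterms:creator"` is
    completed by the nested `typeof="foaf:Person"` node, which then carries
    the creator's name and mailbox.
    """
    return "\n".join([
        f'<div xmlns:dcterms="{DCTERMS}" xmlns:foaf="{FOAF}" about="{escape(page_uri)}">',
        f'  <h1 property="dcterms:title">{escape(title)}</h1>',
        '  <div rel="dcterms:creator">',
        '    <p typeof="foaf:Person">',
        f'      <span property="foaf:name">{escape(creator_name)}</span>',
        f'      <a rel="foaf:mbox" href="mailto:{escape(creator_email)}">contact</a>',
        '    </p>',
        '  </div>',
        '</div>',
    ])

# Hypothetical values, for illustration only.
print(rdfa_fragment("http://example.org/page", "Stream & gauge data",
                    "A. Author", "author@example.org"))
```

The visible page text is unchanged for a human reader; only the added attributes tell an RDFa-aware parser which string is the title and which node is the creator.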

14 Beyond Vocabularies: Good Semantics Needs Appropriate Conceptualization of Properties
Connect properties like stream flow, level, pollutants, evapotranspiration etc. in a schema.
For connecting to Chem/BioChem ontologies there might be sub-categories of Physical Features for elements: optical, hardness, color.
See Dumontier Lab ontologies to represent bio-scientific concepts and relations. http://dumontierlab.com/?page=ontologies
[Diagram nodes: Water Body; Water Density Unit; Grams/cm3; Water Density; Chesapeake Bay; Area; Area Quantity; Real Number; Sq Miles. Diagram relations: IsA, hasConstituent, hasFeature, hasDensity Unit, hasUnit, hasQuantity, hasValue, hasLayer .....]
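
The node and relation labels on this slide can be read as subject-predicate-object triples. The Python sketch below encodes one possible reading of the diagram; which node connects to which is an assumption, since the transcript preserves only the labels, and the rule that features are inherited through `isA` is also an assumption made for illustration.

```python
# One possible reading of the slide's diagram as triples (the wiring is assumed).
TRIPLES = {
    ("ChesapeakeBay", "isA", "WaterBody"),
    ("WaterBody", "hasFeature", "WaterDensity"),
    ("WaterDensity", "hasUnit", "g/cm3"),
    ("ChesapeakeBay", "hasFeature", "Area"),
    ("Area", "hasQuantity", "AreaQuantity"),
    ("AreaQuantity", "hasValue", "RealNumber"),
    ("AreaQuantity", "hasUnit", "SqMiles"),
}

def match(s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

def features_of(entity):
    """Features stated on the entity plus those reached by following isA links."""
    found = {o for _, _, o in match(entity, "hasFeature")}
    for _, _, parent in match(entity, "isA"):
        found |= features_of(parent)
    return found

print(sorted(features_of("ChesapeakeBay")))
print(match("AreaQuantity", "hasUnit"))
```

Because units and quantities are separate nodes rather than text inside a tag, a program can ask which unit a feature is measured in, which a flat keyword annotation cannot answer.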

15 Ontology Design Patterns (ODPs) of Semantic Trajectory: Hydro/Ocean Observations as Annotations
ODPs (aka microtheories) are small, modular & coherent schemas, relatively autonomous but conceivably composable with other schemas.
Environmental observations fit into this schema.
Fixes may be hydrometric feature observations at some PoI (and offset Fix) for some point or period of time, denoting important activities.
Observations, including time series sets, might be applied to something like streamflow or temperature plots, a pollution plume, or data from an ocean glider.
You may query the schema: “Show locations within Gulf of Mexico fishing area with colored dissolved organic matter”
[Diagram labels: Hydro Var & attr/data or value type of interest; Hydro Object or moving device; Hydro Obs/Device Paths & POIs; Have Geometries including Polygon Areas]
Source: A Geo-Ontology Design Pattern for Semantic Trajectories, COSIT 2013: Yingjie Hu et al.
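
The example query on this slide can be sketched in Python as a filter over trajectory fixes. This is not the pattern from the cited COSIT paper: the `Fix` structure, the `cdom` attribute name, the rectangular area and all coordinates are invented for illustration and do not describe any real fishing area or glider track.

```python
from dataclasses import dataclass, field

@dataclass
class Fix:
    """One trajectory fix: a position, a time step, and what was observed there."""
    pos: tuple
    time: int
    observed: dict = field(default_factory=dict)

def inside(point, polygon):
    """Ray-casting point-in-polygon test; polygon is a list of (x, y) vertices."""
    x, y = point
    hit = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count edges that a horizontal ray from the point towards +x crosses.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                hit = not hit
    return hit

# Invented area and glider track, in arbitrary planar units.
fishing_area = [(0, 0), (4, 0), (4, 3), (0, 3)]
track = [
    Fix((1, 1), 1, {"cdom": 0.8}),
    Fix((2, 2), 2, {"temperature": 21.5}),
    Fix((3, 1), 3, {"cdom": 0.5}),
    Fix((5, 1), 4, {"cdom": 0.9}),
]

def locations_with(variable, area, fixes):
    """Positions of fixes inside the area where the variable was observed."""
    return [f.pos for f in fixes if inside(f.pos, area) and variable in f.observed]

print(locations_with("cdom", fishing_area, track))
```

The query combines a geometry (the polygon area), a fix on the device's path, and an observed variable, which are the three parts the slide's diagram labels name.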

16 Are we ready for metadata semantics to be widely used?
How do we bring current MD practice and semantic practice together?
What is a practical vision of this enhanced MD?
Where are the opportunities? E.g. is semantic annotation the sweet spot? Do we just expand MD tags to semantic annotations, and if so, how?
What about ontology design patterns (ODPs)? Where are they useful?
Thoughts on where to add semantics and its technology to MD in the data/MD cycle? How does it affect how data/MD repositories function?
There is some/considerable confusion about how MD should be integrated into information systems.
Can we agree on common or domain principles (like modularity or building blocks), practices and tools to employ?

