Semantic Representation of Temporal Metadata in a Virtual Observatory Han Wang 1 Eric Rozell 1

Han Wang (1), Eric Rozell (1), Patrick West (1), Stephan Zednik (1), Peter Arthur Fox (1)
(1) Rensselaer Polytechnic Institute, th St., Troy, NY, United States
IN31B-1444

Glossary:
RPI – Rensselaer Polytechnic Institute
TWC – Tetherless World Constellation at Rensselaer Polytechnic Institute
VSTO – Virtual Solar-Terrestrial Observatory
CEDAR – Coupling Energetics and Dynamics of Atmospheric Regions
SeSF – Semantic eScience Framework
RESTful – Representational State Transfer

References:
1. P. West, E. Rozell, S. Zednik, P. Fox, and D. McGuinness, 2009, Semantically Enabled Temporal Reasoning in a Virtual Observatory, OWL Experiences and Directions, CEUR Workshop Proceedings, vol
2. E. Rozell, P. West, and P. Fox, 2010, Experiences Integrating Temporal Metadata in a Domain Ontology, Technical Report.

Sponsors: NSF Office of Cyberinfrastructure (OCI)

Abstract

The Virtual Solar-Terrestrial Observatory (VSTO) Portal at vsto.org provides a set of guided workflows to implement use cases designed for solar-terrestrial physics and upper atmospheric science. Semantics are used in VSTO to model abstract instrument and parameter classifications, giving users data access without requiring extensive domain-specific vocabularies. The temporal restrictions used in the workflows are currently implemented via RESTful service calls to a remote system with access to a SQL-based metadata catalog. In order to provide a greater range of temporal reasoning and search capabilities for the user, we propose an alternative architecture design for the VSTO Portal, where the temporal metadata is integrated in the domain ontology. We achieve this integration by converting temporal metadata from the headers of raw data files into RDF using the OWL-Time vocabulary. This presentation covers our work with semantic temporal metadata, including our representation using OWL-Time, issues that we have faced in persistent storage, and the performance and scalability of semantic queries. We conclude with a discussion of the significance of semantic temporal metadata in virtual observatories.

Motivations and Use Cases

Limitations of relational database representations:
- There are no mechanisms for inferring new relationships from those that already exist within the relational database, whereas with an ontology we can add inferencing to the temporal instances.
- There are no easy ways of inheriting relationships, as can be done with an ontology and knowledge base.
- The expressivity of a temporal model represented by a relational database schema is much lower than that of an ontological temporal model.

Use case #1: Retrieve data where two or more instruments have coincident measurements within a temporal interval.
Example: Retrieve any data where the Millstone Hill Fabry Perot Interferometer and the Poker Flat Fabry Perot Interferometer are collecting data simultaneously.

Use case #2: Retrieve data in a non-contiguous time interval.
Example: Retrieve data for sunspot activity during the month of March (in any year).

Research Methodology

- Design and implementation of temporal models for ontologies
  - Use OWL-Time as a starting point
  - Keep the total number of triples over time instances small
- Evaluation of performance and scalability of Semantic Web tools, in particular scalable storage and SPARQL querying
  - Load triples into a Virtuoso triple store
  - Generate SPARQL queries based on recurring tasks in the workflow

Temporal Metadata Modeling

Fig. 1. XML Schema Datatype dateTime Strings
Fig. 2. Verbose OWL-Time Instances
Fig. 3. Discrete Interval Coverage in OWL-Time
Fig. 4. OWL-Time Instances with Date Coverages

Fig. 1 shows a model that represents the start and end times for VSTO dataset records using only xsd:dateTime strings. Fig. 2 shows a model that represents the start and end times for VSTO dataset records using OWL-Time instances with a granularity of seconds. Fig. 3 shows a model that represents the individual observations of VSTO dataset records using a notion of discrete intervals created within the SeSF ontology.

None of the three models above is a feasible solution: each requires the SPARQL engine to parse at least O(10^6) time instances to answer the queries for the VSTO workflow, which rules out interactive responses (a query response in less than 10 seconds).

Fig. 4 illustrates a feasible solution for the temporal metadata modeling. It represents the start and end times for a VSTO dataset record using xsd:dateTime strings with a granularity of seconds, and it also includes the exact temporal range for that dataset to a granularity of days (as required by the use cases) using time:DateTimeInterval. This modeling solution increases the size of the data by only a factor of approximately 5. We have achieved interactive response times for all scenarios in the original VSTO workflow by answering various SPARQL queries, which are described in more detail below.

Temporal Model Evaluation

Table 1 lists 12 workflow tasks derived from the VSTO Portal. We generated SPARQL queries representing these tasks and executed them in a Virtuoso triple store loaded with CEDAR datasets, which have approximately 20 million time instances (about 80 million triples). The average execution time of each of these queries falls within 1 second, which either improves on or is comparable to the task performance using RESTful service calls. The following shows a SPARQL query for Task 1, which takes about 5 seconds with the RESTful service. Instrument 175 is an instrument class called Space Craft, and it has the highest number of days covered.
Table 1. Workflow tasks derived from the VSTO Portal

Task  Description
1     Get all years in which a given instrument has data coverage
2     Get all months in which a given instrument has data coverage for a given year
3     Get all days on which a given instrument has data coverage for a given year and month
4     Get all parameters measured by a given instrument in a given time interval
5     Get all years in which any dataset has data coverage
6     Get all months in which any dataset has data coverage for a given year
7     Get all days on which any dataset has data coverage for a given year and month
8     Get all instruments that have data coverage in a given time interval
9     Get all years in which a given parameter has data coverage
10    Get all months in which a given parameter has data coverage for a given year
11    Get all days on which a given parameter has data coverage for a given year and month
12    Get all instruments that measure a given parameter during a given time interval

SPARQL query for Task 1 (instrument 175):

    PREFIX vsto:
    PREFIX cedar:
    PREFIX time:
    PREFIX xsd:
    SELECT DISTINCT ?year
    WHERE {
      ?dataset vsto:isFromInstrument cedar:cedar_instrument_175 .
      ?dataset vsto:hasDateTimeCoverage ?interval .
      ?interval time:hasDateTimeDescription ?desc .
      ?desc time:year ?year .
    }
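
As a rough illustration of the day-granularity coverage model (Fig. 4) and of the Task 1 query pattern, the sketch below expands one dataset record into coverage triples and then answers Task 1 and use case #2 over them in plain Python. It is not the VSTO implementation. The one-interval-per-covered-day layout, the vsto:hasStartDateTime and vsto:hasEndDateTime property names, and the cedar:dataset_a and cedar:dataset_b identifiers are assumptions made for the example. Only vsto:isFromInstrument, vsto:hasDateTimeCoverage, time:hasDateTimeDescription, time:year and time:DateTimeInterval appear in the poster; time:month, time:day and time:unitType are standard OWL-Time terms.

```python
from datetime import datetime, timedelta


def day_coverage_triples(dataset, start, end):
    """Expand one record's [start, end] range into day-level coverage triples.

    Assumption: one time:DateTimeInterval per covered day. The start/end
    property names below are hypothetical, not taken from the VSTO ontology.
    """
    triples = [
        (dataset, "vsto:hasStartDateTime", start.isoformat()),
        (dataset, "vsto:hasEndDateTime", end.isoformat()),
    ]
    day = start.date()
    while day <= end.date():
        interval = f"{dataset}_cov_{day.isoformat()}"
        desc = f"{interval}_desc"
        triples += [
            (dataset, "vsto:hasDateTimeCoverage", interval),
            (interval, "rdf:type", "time:DateTimeInterval"),
            (interval, "time:hasDateTimeDescription", desc),
            (desc, "time:unitType", "time:unitDay"),
            (desc, "time:year", day.year),
            (desc, "time:month", day.month),
            (desc, "time:day", day.day),
        ]
        day += timedelta(days=1)
    return triples


def _index(triples):
    """Group objects by (subject, predicate) for simple pattern matching."""
    index = {}
    for s, p, o in triples:
        index.setdefault((s, p), []).append(o)
    return index


def _descriptions(index, dataset):
    """Yield every date-time description node reachable from a dataset."""
    for interval in index.get((dataset, "vsto:hasDateTimeCoverage"), []):
        yield from index.get((interval, "time:hasDateTimeDescription"), [])


def years_with_coverage(triples, instrument):
    """Task 1: distinct years in which the given instrument has coverage."""
    index = _index(triples)
    years = set()
    for (subject, predicate), objects in index.items():
        if predicate == "vsto:isFromInstrument" and instrument in objects:
            for desc in _descriptions(index, subject):
                years.update(index.get((desc, "time:year"), []))
    return sorted(years)


def datasets_covering_month(triples, month):
    """Use case #2: datasets with coverage in a given month of any year."""
    index = _index(triples)
    datasets = {s for (s, p) in index if p == "vsto:hasDateTimeCoverage"}
    return sorted(
        d for d in datasets
        if any(month in index.get((desc, "time:month"), [])
               for desc in _descriptions(index, d))
    )


# Hypothetical records: one spanning a year boundary, one in March.
triples = [("cedar:dataset_a", "vsto:isFromInstrument", "cedar:cedar_instrument_175")]
triples += day_coverage_triples(
    "cedar:dataset_a", datetime(2009, 12, 30, 22, 0, 0), datetime(2010, 1, 2, 3, 30, 0))
triples += day_coverage_triples(
    "cedar:dataset_b", datetime(2011, 3, 5, 0, 0, 0), datetime(2011, 3, 5, 23, 59, 59))

print(years_with_coverage(triples, "cedar:cedar_instrument_175"))  # [2009, 2010]
print(datasets_covering_month(triples, 3))  # ['cedar:dataset_b']
```

With explicit year, month and day values attached to each coverage interval, questions such as "March in any year" become equality lookups rather than range comparisons over second-granularity xsd:dateTime strings, which is one plausible reason a model of this kind suits the workflow tasks in Table 1.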