Resource Discovery for Extreme Scale Collaboration Benno Lee Patrick West 1 William Smith 2


Resource Discovery for Extreme Scale Collaboration

Benno Lee, Patrick West 1, William Smith 2, Sumit Purohit 2, Karen Schuchardt 2, Alan Chappell 2, Peter Fox 1, Jesse Weaver 2
(1 Rensselaer Polytechnic Institute, 2 Pacific Northwest National Laboratory)

The amount of data produced in the practice of science is growing rapidly. Despite the accumulation of, and demand for, scientific data, relatively little is actually made available to the broader scientific community. We surmise that the root of the problem is the perceived difficulty of electronically publishing scientific data and associated metadata in a way that makes it discoverable. We propose to exploit Semantic Web technologies and practices to make (meta)data discoverable and easy to publish. We share our experiences in curating metadata to illustrate both the flexibility of our approach and the pain of discovering data in the current research environment. We also make recommendations, by concrete example, of how data publishers can provide their (meta)data by adding some limited, additional markup to HTML pages on the Web. With little additional effort from data publishers, the difficulty of data discovery, access, and sharing can be greatly reduced and the impact of research data greatly enhanced.

RDESC Architecture

TWC/RPI S2S Faceted Browser

Facets on the left allow users to constrain their search based on data resources, GCMD Keywords, Special Measured Parameters, and lat/lon coordinates. The facets changed over time based on the metadata extracted from ingesting the various data resources.

RDESC RDF Graphs

An example description of a GCMD dataset as an RDF graph, using the initial ontology.
The current ontology. Ovals represent classes/concepts, and arrows indicate subClassOf relationships. Classes are colored so that darker classes were established in the ontology prior to lighter classes.
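The graph captions above describe two ideas that a small example may make more concrete: a dataset described as RDF triples, and an ontology whose classes are linked by subClassOf. The following is a minimal sketch in plain Python, not the RDESC implementation. The `http://example.org/rdesc#` namespace, the class names `Resource`, `Dataset`, and `GCMDDataset`, the identifier `dataset42`, and the title string are all hypothetical and chosen only for illustration; the RDF, RDFS, and Dublin Core property IRIs are the standard ones.

```python
# Standard property IRIs, written in N-Triples form.
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
SUBCLASS_OF = "<http://www.w3.org/2000/01/rdf-schema#subClassOf>"
DCT_TITLE = "<http://purl.org/dc/terms/title>"


def ex(name):
    """Build an IRI in a hypothetical example namespace (an assumption)."""
    return "<http://example.org/rdesc#" + name + ">"


def literal(text):
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


# A tiny graph: two subClassOf links and one described dataset.
TRIPLES = [
    (ex("Dataset"), SUBCLASS_OF, ex("Resource")),
    (ex("GCMDDataset"), SUBCLASS_OF, ex("Dataset")),
    (ex("dataset42"), RDF_TYPE, ex("GCMDDataset")),
    (ex("dataset42"), DCT_TITLE, literal("Example sea surface temperature dataset")),
]


def to_ntriples(triples):
    """Serialize the graph, one 'subject predicate object .' per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


def subclasses(cls, triples):
    """Return cls plus every class reachable by following subClassOf downward."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p == SUBCLASS_OF and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def instances_of(cls, triples):
    """Return subjects typed as cls or as any of its subclasses."""
    classes = subclasses(cls, triples)
    return {s for s, p, o in triples if p == RDF_TYPE and o in classes}


if __name__ == "__main__":
    print(to_ntriples(TRIPLES))
    print(sorted(instances_of(ex("Dataset"), TRIPLES)))
```

Because the query follows subClassOf links, a search for the general class `Dataset` also finds resources typed only with the more specific `GCMDDataset`, which is the kind of integration a shared ontology is meant to provide. In practice a dedicated RDF library and a SPARQL endpoint would replace these hand-written helpers.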
An example of an RDF description for an ARM data stream, and of how the ARM measured property hierarchy is used to link data streams to measured properties of interest.

Conclusion

We have emphasized the importance of data publishers providing their (meta)data in a way that makes structural and semantic integration a natural process. This is accomplished by following a shared vocabulary of terms embodied as an ontology, and by expressing metadata as RDF triples that utilize the ontology. Although this can sound daunting, we showed that doing so is actually quite easy in practice. We demonstrated the flexibility of this approach by curating existing metadata into the recommended format. Publishing (meta)data in this (or a similar) way will ameliorate (at least in part) the poor data sharing practices that currently pervade the practice of science.

No matter what dataset we have ingested, we will be able to present the metadata in search and browse interfaces, like S2S above, and provide splash pages for each dataset with the information retrieved from the external system. As you can see, the metadata retrieved from the various systems can be quite different.

Acknowledgments: Eric Rozell, Master's student at Rensselaer Polytechnic Institute, now with Microsoft

Sponsors: US Department of Energy

Glossary:
ARM – Atmospheric Radiation Measurement
OWL – Web Ontology Language
PNNL – Pacific Northwest National Laboratory
RDESC – Resource Discovery for Extreme Scale Collaboration
RDFS – Resource Description Framework Schema
RPI – Rensselaer Polytechnic Institute
SPARQL – an RDF query language
S2S – a faceted web browser
TWC – Tetherless World Constellation at Rensselaer Polytechnic Institute

Resources:
- site developed for the RDESC project
- The RDESC ontology