Persistent Identification of Agents and Objects of Global Change: Progress in the Global Change Information System


Persistent Identification of Agents and Objects of Global Change: Progress in the Global Change Information System Peter Fox, RPI Curt Tilmes, NASA Xiaogang (Marshall) Ma, RPI Anne Waple, NOAA Stephan Zednik, RPI Jin Zheng, RPI

The Global Change Research Act and USGCRP
USGCRP was mandated by Congress in the Global Change Research Act (GCRA) of 1990 (P.L. 101-606) “To provide for development and coordination of a comprehensive and integrated United States Research Program which will assist the Nation and the world to understand, assess, predict, and respond to human-induced and natural processes of global change.”

The Program: U.S. Global Change Research Program
o Coordinates Federal research to better understand and prepare the nation for global change
o Prioritizes and supports cutting edge scientific work in global change
o Assesses the state of scientific knowledge and the Nation’s readiness to respond to global change
o Communicates research findings to inform, educate, and engage the global community

Global Change Information System (GCIS)
Vision: A unified web-based source of authoritative, accessible, usable, and timely information about climate and global change for use by scientists, decision makers, and the public.

Global Change Research Act (1990), Section 106
…not less frequently than every 4 years, the Council… shall prepare… an assessment which–
o integrates, evaluates, and interprets the findings of the Program and discusses the scientific uncertainties associated with such findings;
o analyzes the effects of global change on the natural environment, agriculture, energy production and use, land and water resources, transportation, human health and welfare, human social systems, and biological diversity; and
o analyzes current trends in global change, both human-induced and natural, and projects major trends for the subsequent 25 to 100 years.

Previous National Climate Assessments
o Climate Change Impacts on the United States (2000)
o Global Climate Change Impacts in the United States (2009)
Target date for next NCA:

NCA

Prototype Use Case (UC-1)
Name: Discover and visit data center website of dataset used to generate report figure.
Goal: The NCA report reader sees a figure and wants to know where the data came from.
Summary: A reader of the NCA is browsing the content via the website. He/she sees a figure and wants to know where the data came from. A reference to the publication in which the figure originated appears in the figure caption. Selecting the link to the source publication displays a page of information about the publication including, if available, the publication DOI. The page also includes references to the datasets cited in the publication. Following a dataset reference link presents a page of information about that dataset, including links back to the agency/data center webpage that describes the dataset in more detail and makes the actual data available for order or download.
Actors: Primary actor - reader of the NCA
Preconditions: Reader is viewing the NCA online report
Post Conditions: Reader visits the data center dataset website
Normal Flow:
1) System is presenting the NCA report to the reader in a web site. Presentation includes report figure with caption that includes reference to source publication.
2) Reader selects publication reference in figure caption.
3) System displays information about publication, including DOI (if available).
4) Publication information includes publication dataset citations.
5) Reader selects a dataset cited by the publication.
6) System displays information about dataset, including links to agency / data center webpages where more information and (potentially) data download links are available.
7) Reader selects the data center link and is redirected to data center dataset webpage.

NCA links to GCIS entities

Key Message & Traceable Account

Key Message vs. “General” Message (early draft)

GCIS

GCIS
o Create an entity from the structured metadata about each thing and tag it with related concepts.
o Identify it with a persistent, controlled identifier.
o Present it with a human-readable web page and a machine interface.
o Represent all relationships between items.

GCIS and W3C PROV
For GCIS, we have agents (people, projects, agencies, data centers, publishers, etc.) who are associated with activities (measuring, deriving, modeling, analyzing, authoring, publishing, archiving, distributing, visualizing, etc.) that produce or use the entities (software, data, images, figures, papers, reports, etc.) related to global change. We assign local identifiers to each (so we can persistently resolve them) and capture and represent their relationships. If possible, we link with external authorities: agency data centers, journal publishers, ResearcherID (researcherid.com) or ORCID (orcid.org).

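W3C PROV (starting points)
[Diagram from the W3C PROV group and Ivan Herman relating three classes: ENTITY, ACTIVITY, and AGENT.]
o Entity wasDerivedFrom Entity
o Entity wasGeneratedBy Activity
o Entity wasAttributedTo Agent
o Activity used Entity
o Activity wasInformedBy Activity
o Activity wasAssociatedWith Agent
o Agent actedOnBehalfOf Agent
o Activity has startedAtTime and endedAtTime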
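
As a minimal sketch of how the starting-point relations could tie GCIS items together, the snippet below stores PROV-style statements as plain (subject, relation, object) tuples and walks them. Only the relation names come from W3C PROV; every identifier and both helper functions are hypothetical and are not taken from GCIS.

```python
# PROV-style statements as (subject, relation, object) tuples.
# All identifiers below are invented for illustration.
STATEMENTS = [
    ("figure/example-figure", "wasGeneratedBy", "activity/example-plotting"),
    ("activity/example-plotting", "used", "dataset/example-dataset"),
    ("activity/example-plotting", "wasAssociatedWith", "person/example-author"),
    ("figure/example-figure", "wasDerivedFrom", "dataset/example-dataset"),
    ("figure/example-figure", "wasAttributedTo", "person/example-author"),
    ("person/example-author", "actedOnBehalfOf", "organization/example-agency"),
]


def objects(subject, relation, statements=STATEMENTS):
    """Return every object linked to `subject` by `relation`."""
    return [o for s, r, o in statements if s == subject and r == relation]


def sources_of(entity, statements=STATEMENTS):
    """Return the entities used by the activities that generated `entity`."""
    found = []
    for activity in objects(entity, "wasGeneratedBy", statements):
        found.extend(objects(activity, "used", statements))
    return found


if __name__ == "__main__":
    print(sources_of("figure/example-figure"))
```

A production system would more likely hold these statements in an RDF store; the tuple list is used here only to keep the sketch self-contained.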

Prototype Use Case (UC-2)
Name: Find Latest Datasets by Keyword
Goal: Search for datasets associated with the keyword “snow” and list search results by recency of publication.
Summary: User story: I want to look for information concerning “snow.” I don’t know if it is a CLEAN word or a GCMD word, and I may not even know what GCMD or CLEAN is. How would I do it, and what would I see on my monitor during the process?
Assumptions: The reader is not assumed to have knowledge of the GCMD Keywords (or other) vocabulary.
Actors: Primary actor - reader of the NCA
Preconditions: TBD
Post Conditions: Reader is presented with a list of datasets associated with the keyword “snow”, sorted by dataset publication date.
Normal Flow: TBD
Notes: We are looking into two user interface options for dataset selection by keyword:
1) A free-text search where the user inputs “snow”.
2) A faceted browse interface with a vocabulary facet that presents the user with terms from a structured vocabulary. The user can manually select the term(s) which match or contain “snow”.
We intend to implement prototypes of both.
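
The free-text option and the post condition (results sorted by publication date) can be sketched as follows. This is an illustration under stated assumptions, not the prototype itself: the catalog records, their field names, and the `find_latest` helper are all invented.

```python
from datetime import date

# Hypothetical catalog records; titles, keywords, and dates are invented.
DATASETS = [
    {"title": "Example Snow Cover Extent",
     "keywords": ["snow", "cryosphere"],
     "published": date(2012, 5, 1)},
    {"title": "Example Sea Surface Temperature",
     "keywords": ["oceans"],
     "published": date(2013, 1, 15)},
    {"title": "Example Snow Water Equivalent",
     "keywords": ["Snow/Ice"],
     "published": date(2013, 3, 10)},
]


def find_latest(term, datasets=DATASETS):
    """Case-insensitive substring match on title and keywords, newest first."""
    needle = term.lower()
    hits = [
        d for d in datasets
        if needle in d["title"].lower()
        or any(needle in k.lower() for k in d["keywords"])
    ]
    return sorted(hits, key=lambda d: d["published"], reverse=True)


if __name__ == "__main__":
    for record in find_latest("snow"):
        print(record["published"], record["title"])
```

Substring matching lets the reader find “Snow/Ice” without knowing the controlled vocabulary, which is the assumption stated in the use case; the faceted option would instead present the vocabulary terms for selection.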

NCA links to GCIS entities

Traceable accounts…


Interagency Information Integration
GCIS can use relationships between all relevant information about global change across the agencies:
o From observations to datasets to research papers to models to analyses to organizations to people to synthesized reports to human impacts...
o Determine agency interdependencies: an EPA analysis uses a NOAA model dependent on observations from a NASA satellite.
o Can present unique interagency metrics: "How many papers referenced datasets from a specific satellite?"
o Direct users back to agency data centers for more detailed information and the actual content and data.
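
The interdependency example on this slide (an EPA analysis using a NOAA model that depends on NASA observations) can be sketched as a walk over a dependency graph. The item identifiers, the two dictionaries, and `upstream_agencies` are hypothetical; only the agency chain comes from the slide.

```python
from collections import deque

# Hypothetical items mirroring the slide's example chain.
DEPENDS_ON = {
    "analysis/example-analysis": ["model/example-model"],
    "model/example-model": ["observation/example-observations"],
    "observation/example-observations": [],
}

# Agency responsible for each item (chain taken from the slide).
AGENCY = {
    "analysis/example-analysis": "EPA",
    "model/example-model": "NOAA",
    "observation/example-observations": "NASA",
}


def upstream_agencies(item):
    """Agencies responsible for everything `item` depends on, nearest first."""
    seen = set()
    agencies = []
    queue = deque(DEPENDS_ON.get(item, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # guard against cycles and repeated dependencies
        seen.add(current)
        agency = AGENCY[current]
        if agency not in agencies:
            agencies.append(agency)
        queue.extend(DEPENDS_ON.get(current, []))
    return agencies


if __name__ == "__main__":
    print(upstream_agencies("analysis/example-analysis"))
```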

GCIS Data Mining
Structured information with relationships allows integrated data mining, searching, and metrics.
o What projects provided data used to produce figures that were referenced in the 2013 NCA section about coastal sea level rise impacts?
o Which data centers hold data referenced by papers related to forests in the Midwest?
o Which agencies have people working on projects related to societal impacts of extreme weather events?
o Show me the latest papers about health impacts of air quality in California. Which datasets were used in the analysis of air quality in California?

Questions and Comments For more information, visit and

GCIS Benefits

NCA web portal, GCIS prototype
o NCA content available online
o Searchable, linkable
o Complete provenance, traceability
o Links back to source information including agency sources, scenarios, technical input
o Link to associated and applicable information and tools
o Ensure authoritative and appealing design and accessibility
o Incorporates initial indicators of change, impact and response
o Access to information about NCA process (transparency)
o Facilitates collaboration across segments of the climate science and applications community
o Construct, prototype and test the initial framework
o Use constrained scope and dedicated staff to accomplish a lot in a short time
o Ensure the system design is extensible and able to grow to meet long term GCIS needs

GCIS
o A single web site can lead back to agency global change information across the program
o A friendly, accessible entry into global change information for non-scientists
o Global, persistent, reusable identifiers for each item
o Integrated data catalog provides interagency metrics, data mining, searching, etc.
o Interagency relationships allow discovery of interdependencies and increase collaboration opportunities
o Agency information mapped into a common, consistent model with a standard vocabulary
o Concept tagging and linking improves search results for agency products


URI Schema
A URI for an NCA instance consists of three parts: domain name, type of instance, and identifier.
– Domain name: data.globalchange.gov
– Types: Person, Project, Organization, Publication, etc.
– Identifiers: depending on the instance’s type, we will assign a unique id number or construct an identifier based on the instance’s unique property value.
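
A minimal sketch of composing the three parts into one URI follows. The domain name is the one given on this slide; the helper `gcis_uri`, the lower-casing of the type, and the percent-encoding of the identifier are assumptions made for illustration.

```python
from urllib.parse import quote

DOMAIN = "data.globalchange.gov"  # domain name given on the slide


def gcis_uri(instance_type, identifier):
    """Join domain, instance type, and identifier into one URI.

    Lower-casing the type and percent-encoding the identifier are
    assumptions of this sketch, not documented GCIS behaviour.
    """
    path_type = instance_type.strip().lower()
    path_id = quote(str(identifier), safe="/")
    return "http://{}/{}/{}".format(DOMAIN, path_type, path_id)


if __name__ == "__main__":
    print(gcis_uri("Project", "ACMAP"))
    print(gcis_uri("Organization", "NASA"))
```

With these assumptions the helper reproduces the project, organization, and topic URIs listed among the examples on the next slide.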

More Examples
Person: http://data.globalchange.gov/person/
Publication: http://data.globalchange.gov/publication/doi/
Project: http://data.globalchange.gov/project/ACMAP
Topic: http://data.globalchange.gov/topic/Human-health
Image: http://data.globalchange.gov/image/
Figure: http://data.globalchange.gov/report/ /figure/
Chapter: http://data.globalchange.gov/report/ /chapter/
Organization: http://data.globalchange.gov/organization/NASA
Model: http://data.globalchange.gov/model/
Dataset: http://data.globalchange.gov/dataset/doi/
Platform: http://data.globalchange.gov/platform/
Instrument: http://data.globalchange.gov/instrument/

GCIS Ontology for NCA (subset)

Provenance Modeling Example

Linked Data Principles
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards.
4. Include links to other URIs, so that they can discover more things.

Linked Open Data

Data Identifiers
A NASA Earth Science Data Systems Working Group and ESIP Federation study resulted in dataset identifier recommendations, [1] Duerr et al.
DOI – Digital Object Identifiers provide a well-defined mechanism to attach an identifier to a digital object.
Recommendation adopted by NASA for EOSDIS: ers_(DOIs)_for_EOSDIS
doi: /MEASURES/GSSTF/DATA308

Identifier Resolution
doi: /MEASURES/GSSTF/DATA308
A common, persistent, citable reference to that dataset.
We build GCIS-specific identifiers from those:
Then we can resolve it (with content negotiation) on our site and link it with identifiers for our other resources, including asserting equivalence and linking with the data center responsible for stewardship and distribution of the actual data. We can also refer and link to other repositories of information about those resources.
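
One way to picture deriving a GCIS identifier from a DOI is sketched below. The `/dataset/doi/` path follows the dataset entry in the URI examples earlier in the deck; appending the DOI directly after it is an assumption, and the DOI `10.1234/EXAMPLE` is a made-up placeholder rather than the dataset named on the slide. `https://doi.org/` is the public DOI resolver.

```python
GCIS_DATASET_BASE = "http://data.globalchange.gov/dataset/doi/"
DOI_RESOLVER = "https://doi.org/"


def normalize_doi(text):
    """Strip a leading 'doi:' label and surrounding whitespace."""
    text = text.strip()
    if text.lower().startswith("doi:"):
        text = text[4:].strip()
    return text


def dataset_links(doi_text):
    """Return the assumed GCIS URI and the resolver URI for one DOI."""
    doi = normalize_doi(doi_text)
    return {
        # Assumed layout: GCIS dataset base followed by the DOI itself.
        "gcis": GCIS_DATASET_BASE + doi,
        # Resolver link that redirects to the responsible data center.
        "resolver": DOI_RESOLVER + doi,
    }


if __name__ == "__main__":
    # Placeholder DOI, invented for this sketch.
    print(dataset_links("doi: 10.1234/EXAMPLE"))
```

Keeping both links side by side reflects the stated intent: the GCIS page describes the dataset, while the resolver link leads back to the data center that stewards it.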

Content Negotiation
The server response from the URI depends on what you ask for:
o A traditional browser will ask for HTML, and receive and render a human-readable description of the resource.
o Web services can request formal, structured XML or RDF metadata about the resource.
Our goal is to provide a curated collection of authoritative global change information, but always link back to the data center or publisher responsible for the long term stewardship of the resource.
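
The server-side choice can be sketched as picking a representation from the HTTP `Accept` header. This is a simplified illustration: the list of available media types and both functions are assumptions, and the parser handles only exact media types, `*/*`, and `q` values (not partial wildcards such as `text/*`).

```python
# Hypothetical set of representations a server might offer, in default order.
AVAILABLE = ["text/html", "application/rdf+xml", "application/xml"]


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, best first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type, q = fields[0], 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        if media_type:
            prefs.append((media_type, q))
    # sorted() is stable, so equal weights keep their header order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def choose(header, available=AVAILABLE):
    """Pick the representation to serve, or None if nothing is acceptable."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return available[0]
        if media_type in available:
            return media_type
    return None


if __name__ == "__main__":
    print(choose("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
    print(choose("application/rdf+xml"))
```

A browser-style header selects the HTML page, while a client asking only for `application/rdf+xml` receives the structured metadata for the same URI.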

CLEAN Vocabulary