
Instant Karma: Accessing Provenance Information for AMSR-E Science Data Products
AMSR-E Science Team Meeting, 29 June 2011
UAHuntsville, The University of Alabama in Huntsville

Instant Karma Overview
[Diagram labels: Data Center Operations, Provenance Research, Earth Science, Instant Karma Project]
Collaboration among:
- AMSR-E SIPS (MSFC Earth Science Office and UAHuntsville ITSC)
- Provenance researchers at Indiana University’s Data to Insight Center
- AMSR-E Sea Ice science team (GSFC)
The primary goal is to improve the collection, preservation, utility and dissemination of provenance information within the NASA Earth Science community:
- Using the Karma provenance tool
- Initial focus on Sea Ice processing
29 June 2011, AMSR-E Science Team Meeting, Asheville, NC

Team Members (names and roles listed in slide order within each organization)
NASA MSFC
- Name: Michael Goodman
- Role: Principal Investigator
NASA GSFC
- Names: Thorsten Markus, Don Cavalieri, Alvaro Ivanoff
- Roles: Co-Investigator – Science Team; Science Algorithm Developer; Science Software Developer
UAHuntsville Information Technology & Systems Center
- Names: Helen Conover, Bruce Beaumont, Anurag Bhaskar, Ajinkya Kulkarni, Michael McEniry, Kathryn Regner, Cara Stein
- Roles: Co-Investigator – Project Manager; AMSR-E SIPS Implementation; Student, Metadata Utilities; Provenance Browser; Systems Administration; AMSR-E SIPS Systems Engineer; Provenance Browser
Indiana University Data to Insight Center
- Names: Beth Plale, Mehmet Aktas, Scott Jensen, Harsh Joshi, Robert Ping, Prajakta Purohit, Yuan Luo
- Roles: Co-Investigator – Karma System; Karma Database and Services; Provenance Collection Services; Project Management; Provenance Collection Services; Karma System Monitoring
UAHuntsville Earth System Science
- Name: Dawn Conway
- Role: AMSR-E Science Metadata Consultant

Key Responsibility at Data Creation
- Archivists are traditionally responsible for curation (i.e., adding metadata and provenance) but are not present at the creation of scientific data.
- The moment of creation is when the most knowledge about a product is available.
- Effective preservation therefore requires tools that help at the earliest stages of the data’s life.
[Diagram from Berman et al., “Sustainable Economics for a Digital Planet”; labels: Create, Long Term Access, Archive, Distribute, Preservation Action]

Instant Karma Provenance Research
- The how-tos of provenance capture are becoming better understood.
- Collecting the right (and only the “right”) provenance is a harder challenge, and one we hope to make progress on in this project.
- Delivering on representation and use case support simultaneously for today’s users and next-generation uses is an ongoing challenge:
  - Sea ice research, climate, classroom, informatics

Types of Provenance Information
- Data lineage: product generation “recipe” (data inputs, software and hardware)
- Additional knowledge about science algorithms, instrument variations, etc.
- Lots of information is already available, but scattered across multiple locations:
  - Processing system configuration
  - Dataset and file level metadata
  - Processing history information
  - Quality assurance information
  - Software documentation (e.g., algorithm theoretical basis documents, release notes)
  - Data documentation (e.g., guide documents, README files)
- The Instant Karma project aims to collate and organize information from multiple sources
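
One way to picture the collated "recipe" is as a single record per data granule that merges fragments from several sources. The following is a minimal sketch under stated assumptions: the `LineageRecord` class, the `collate` function, and every field name are hypothetical and chosen only for illustration; they do not describe the Karma data model or the SIPS metadata formats.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LineageRecord:
    """Hypothetical container for one granule's generation 'recipe'."""
    granule_id: str
    inputs: List[str] = field(default_factory=list)         # input data files
    ancillary: List[str] = field(default_factory=list)      # ancillary files
    software: Dict[str, str] = field(default_factory=dict)  # component -> version
    platform: Dict[str, str] = field(default_factory=dict)  # hardware, OS, compilers
    documents: List[str] = field(default_factory=list)      # pointers to documentation


def collate(granule_id, *sources):
    """Merge partial dictionaries from several sources into one record.

    Each source is a plain dict that may carry any subset of the fields
    above; later sources extend lists and override dict entries.
    """
    record = LineageRecord(granule_id)
    for src in sources:
        record.inputs.extend(src.get("inputs", []))
        record.ancillary.extend(src.get("ancillary", []))
        record.software.update(src.get("software", {}))
        record.platform.update(src.get("platform", {}))
        record.documents.extend(src.get("documents", []))
    return record
```

A call such as `collate("granule-1", processing_config, file_metadata, doc_pointers)` would then yield one record per granule, with each argument standing in for one of the scattered sources listed above.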

SIPS-GHRC Processing Architecture
- Algorithm packages come from the Science Team and the Team Lead of the Science Computing Facility
- Processing automation is controlled by SIPS scripts:
  - Pass processing is data driven
  - L3 product generation is scheduled after nominal availability of input products
[Diagram labels, processing scripts: Pass Script (L2A Brightness Temps, L2B Ocean, L2B Land, L2B Rain); Daily Script (Ocean, Land, Snow, Sea Ice); Pentad Script (Snow); Weekly Script (Ocean); Monthly Script (Ocean, Rain, Snow)]
[Diagram labels, control script: Input Data and Metadata File(s), Ancillary File(s), Delivered Algorithm Package, Control Script, Science Data, Metadata, Processing History, Quality Assurance, Browse imagery, Custom subsets]

Sea Ice Product Generation
[Diagram labels, data: One day’s worth of Level-2A Tbs, Ice Mask, Snow Melt Mask, Sea Ice Products (6, 12, and 25 km), Metadata, Processing History, Quality Assurance, Browse imagery, Custom subsets]
[Diagram labels, software: Delivered Algorithm Package, Daily Processing Script, Sea Ice Algorithms]
[Diagram labels, languages and providers: C and FORTRAN, provided by Science Team; Perl with C, provided by SIPS; Bourne shell, provided by SCF]
[Diagram labels, environment: Hardware Platform; Operating System, Compilers; Software Libraries]

AMSR-E Standard Sea Ice Products
Grid resolutions (all on a polar stereographic grid): 25 km, 12.5 km, 6.25 km
Brightness temperatures:
- 25 km: 6, 10, 18, 23, 36, 89 GHz
- 12.5 km: 18, 23, 36, 89 GHz
- 6.25 km: 89 GHz
Sea ice concentration: NASA Team 2 (NT2) and Bootstrap – NT2, given for two grid resolutions
Snow depth on sea ice: 5-day product
Except for the snow depth product, all products are daily averages using (a) ascending orbits only, (b) descending orbits only, (c) all orbits.

Perspectives on Processing Lineage
[Figure panels: Data Processing View, Provenance View]

AMSR-E Provenance Use Cases
- Browse provenance graphs: convey rich information about final data granule details [Use case 1]
  - Spatial location, time of observation, algorithms employed, input data and ancillary files
  - Provenance bundle to include pointers to relevant documentation
- Answer the “Something isn’t right” question [Use case 1 variant]
  - E.g., did not receive data for several days, so the snow melt mask may be inaccurate.
- Compare two data granules [Use case 2]
  - Query system to get a list of provenance differences (e.g., versions of software, number and versions of input files)
- General provenance graph for a given science process, e.g., Sea Ice processing [Use case 3]
  - Current algorithms and versions, nominal number and versions of input files, pointers to relevant documentation
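
Use case 2 amounts to a field-by-field comparison of two provenance records. The sketch below illustrates the idea on plain dictionaries; the `provenance_diff` function, the field names, and the sample values are all made up for this example and are not the Karma query interface.

```python
def provenance_diff(a, b):
    """Return {field: (value_in_a, value_in_b)} for every field that differs.

    A field present in only one record is reported with None on the
    other side.
    """
    diff = {}
    for key in sorted(set(a) | set(b)):
        if a.get(key) != b.get(key):
            diff[key] = (a.get(key), b.get(key))
    return diff


# Two made-up provenance records for two data granules.
granule_a = {"algorithm_version": "v1", "input_file_count": 29, "ice_mask": "mask_a"}
granule_b = {"algorithm_version": "v2", "input_file_count": 29, "snow_melt_mask": "melt_b"}

differences = provenance_diff(granule_a, granule_b)
for name, (left, right) in differences.items():
    print(f"{name}: {left!r} vs {right!r}")
```

Fields that match in both records (here, the input file count) are left out, so the output lists only what changed between the two granules.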

Karma Provenance Repository
Provenance Collection, Storage and Access
- Instrumented the AMSR-E SIPS processing workflow for sea ice in the testbed environment:
  - Provenance information is captured in experiment run log files.
  - Log files are parsed to generate provenance notifications.
  - These notifications are then imported into the Karma database.
  - The Karma Service Query API is used to generate OPM-compatible XML graphs, each corresponding to a processing run.
[Diagram labels, processing scripts: Pass Script (L2A Brightness Temps, L2B Ocean, L2B Land, L2B Rain); Daily Script (Ocean, Land, Snow, Sea Ice); Pentad Script (Snow); Weekly Script (Ocean); Monthly Script (Ocean, Rain, Snow)]
[Diagram labels, components: Processing Testbed, Provenance Collection, Query API, Provenance Browser]
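
The log-to-graph flow described above can be sketched in a few lines. Everything specific here is an assumption made for illustration: the `PROV ... USED/GENERATED ...` log line format is invented, the file names are placeholders, and the XML element names only borrow Open Provenance Model concepts (processes, artifacts, `used`, `wasGeneratedBy`); they follow neither the official OPM XML schema nor the actual Karma notification format or Query API.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical log line format, invented for this sketch:
#   PROV <process> USED <file>
#   PROV <process> GENERATED <file>
LINE_RE = re.compile(r"^PROV (\S+) (USED|GENERATED) (\S+)$")


def parse_log(lines):
    """Turn matching log lines into notification dictionaries."""
    notifications = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            process, relation, artifact = match.groups()
            notifications.append(
                {"process": process, "relation": relation, "artifact": artifact}
            )
    return notifications


def to_graph_xml(notifications):
    """Build a small OPM-style XML graph from the notifications."""
    root = ET.Element("provenanceGraph")
    processes = ET.SubElement(root, "processes")
    artifacts = ET.SubElement(root, "artifacts")
    edges = ET.SubElement(root, "edges")
    seen_processes, seen_artifacts = set(), set()
    for note in notifications:
        if note["process"] not in seen_processes:
            seen_processes.add(note["process"])
            ET.SubElement(processes, "process", id=note["process"])
        if note["artifact"] not in seen_artifacts:
            seen_artifacts.add(note["artifact"])
            ET.SubElement(artifacts, "artifact", id=note["artifact"])
        tag = "used" if note["relation"] == "USED" else "wasGeneratedBy"
        ET.SubElement(edges, tag, process=note["process"], artifact=note["artifact"])
    return ET.tostring(root, encoding="unicode")


sample_log = [
    "starting daily run",                          # ignored: not a PROV line
    "PROV seaice_daily USED tb_input.hdf",
    "PROV seaice_daily USED ice_mask.dat",
    "PROV seaice_daily GENERATED seaice_out.hdf",
]
graph_xml = to_graph_xml(parse_log(sample_log))
```

In this sketch one graph is built per list of log lines, mirroring the slide's one graph per processing run; the database import step is skipped entirely.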

Additional Science-Relevant Provenance and Context Information
- Harvesting granule information from ECS metadata
- Also recording the processing location associated with each data granule
- Working with the AMSR-E Science Computing Facility to identify algorithm and data product information:
  - Algorithm versions and descriptions
  - Parameters and data fields
  - Ancillary files
  - Flag values and explanations
  - Pointers to full documentation
- Defining how to harvest, transmit and display this information

Provenance Browser for AMSR-E Products
Initial Prototype:
- Interactive browsing of provenance graphs and information about workflows and final data products
- Uses the Karma Service Query API to extract provenance graphs from the Karma provenance repository

Instant Karma Major Milestones

Project Schedule

Near Term Plans
- Begin routine daily Sea Ice processing with provenance collection in the Testbed
- Enhance context metadata available to the Karma system
- Refine the Provenance Browser and Karma Query API
- Develop a joint Karma project / SIPS approach for instrumenting additional processing workflows
- Work with other NASA Earth science provenance projects to develop common approaches to providing provenance information with science data files
- Begin to engage the user community:
  - Present to AMSR-E Science Team
  - Outreach to Sea Ice community at GSFC
  - Work with NSIDC DAAC to identify friendly users for beta testing

Conclusions
- The Instant Karma project is on track to incorporate provenance capture into the AMSR-E SIPS
- Ready to begin testing the AMSR-E Provenance Browser with the wider user community
- Continuing to work with NASA ESDIS, ESDSWG and ESIP Federation in defining common provenance practices across NASA data systems

Instant Karma Project Contacts
- Instant Karma Project Web Site
- AMSR-E Provenance Browser
- Contacts:
  - Michael Goodman
  - Helen Conover
  - Beth Plale
  - Thorsten Markus

Acronyms
- AMSR-E: Advanced Microwave Scanning Radiometer for EOS
- API: Application Programming Interface
- DAAC: Distributed Active Archive Center
- ECS: EOSDIS Core System
- EOS: Earth Observing System
- EOSDIS: Earth Observing System Data and Information System
- ESDSWG: Earth Science Data Systems Working Group
- ESIP: Earth Science Information Partners
- GSFC: Goddard Space Flight Center
- ITSC: Information Technology & Systems Center
- MSFC: Marshall Space Flight Center
- MODIS: Moderate Resolution Imaging Spectroradiometer
- NASA: National Aeronautics and Space Administration
- NSIDC: National Snow and Ice Data Center
- OPM: Open Provenance Model
- SCF: Science Computing Facility
- SIPS: Science Investigator-led Processing System
- UAHuntsville: University of Alabama in Huntsville
- XML: eXtensible Markup Language