Experiences Developing a User-centric Presentation of Provenance for a Web-based Science Data Analysis Tool Stephan Zednik 1, Gregory Leptoukh 2, Peter.


Experiences Developing a User-centric Presentation of Provenance for a Web-based Science Data Analysis Tool
Stephan Zednik 1, Gregory Leptoukh 2, Peter Fox 1, Chris Lynnes 2, Jianfu Pan 3
1. Tetherless World Constellation, Rensselaer Polytechnic Inst., Troy, NY, United States
2. NASA Goddard Space Flight Center, Greenbelt, MD, United States
3. Adnet Systems, Inc.
ESSI12 EGU

Giovanni Earth Science Data Visualization & Analysis Tool
Developed and hosted by NASA/Goddard Space Flight Center (GSFC)
Multi-sensor and model data analysis and visualization online tool
Supports dozens of visualization types
Generate dataset comparisons
~1500 Parameters
Used by modelers, researchers, policy makers, students, teachers, etc.

Data Usage Workflow
Data Discovery, Assessment, Access, Manipulation, Visualization, Analyze

Data Usage Workflow
Data Discovery, Assessment, Access, Manipulation, Visualization, Analyze
Integration, Reformat, Re-project, Filtering, Subset / Constrain

Data Usage Workflow
Data Discovery, Assessment, Access, Manipulation, Visualization, Analyze
Integration Planning, Precision Requirements, Quality Assessment Requirements, Intended Use
Integration, Reformat, Re-project, Filtering, Subset / Constrain

Challenge
Giovanni streamlines data processing, performing required actions on behalf of the user
– but automation amplifies the potential for users to generate and use results they do not fully understand
The assessment stage is integral for the user to understand fitness-for-use of the result
– but Giovanni does not assist in assessment
We are challenged to instrument the system to help users understand results

Anomaly Example: South Pacific Anomaly
(Figure; the anomaly is labeled "Anomaly")

…is caused by an Overpass Time Difference

Multi-Sensor Data Synergy Advisor (MDSA)
Assemble semantic knowledge base
– Giovanni Service Selections
– Data Source Provenance (external provenance - low detail)
– Giovanni Planned Operations (what service intends to do)
Analyze service plan
– Are we integrating/comparing/synthesizing? Are similar dimensions in data sources semantically comparable? (semantic diff) How comparable? (semantic distance)
– What data usage caveats exist for data sources?
Advise regarding general fitness-for-use and data-usage caveats
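The "semantic diff" step of the service-plan analysis can be illustrated with a minimal Python sketch. It is not the MDSA implementation: the `SourceDescription` class, the `semantic_diff` function, and the property names and values are all hypothetical, chosen only to show how shared dimensions of two data sources might be compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceDescription:
    """Hypothetical summary of one data source's dimensions (assumption)."""
    name: str
    properties: dict  # property name -> value


def semantic_diff(a, b):
    """Return {property: (value_a, value_b)} for shared properties that differ."""
    shared = a.properties.keys() & b.properties.keys()
    return {
        prop: (a.properties[prop], b.properties[prop])
        for prop in sorted(shared)
        if a.properties[prop] != b.properties[prop]
    }


# Illustrative values only; a real knowledge base would supply these.
terra = SourceDescription(
    "MODIS-Terra AOD",
    {"parameter": "aerosol optical depth", "wavelength_nm": 550,
     "overpass_local_time": "10:30"},
)
aqua = SourceDescription(
    "MODIS-Aqua AOD",
    {"parameter": "aerosol optical depth", "wavelength_nm": 550,
     "overpass_local_time": "13:30"},
)

print(semantic_diff(terra, aqua))
```

In this toy case only the overpass time differs, which is the kind of difference the anomaly example attributes the South Pacific artifact to. A "semantic distance" would need a weighting of such differences, which this sketch does not attempt.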

Assisting in Assessment
Data Discovery, Assessment, Access, Manipulation, Visualization, Analyze, Re-Assessment
Integration Planning, Precision Requirements, Quality Assessment Requirements, Intended Use
Integration, Reformat, Re-project, Filtering, Subset / Constrain
MDSA Advisory Report
Provenance & Lineage Visualization

Multi-Domain Knowledgebase
Provenance Domain
Earth Science Domain
Data Processing Domain

Advisor Knowledge Base
Advisor Rules test for potential anomalies, create association between service metadata and anomaly metadata in Advisor KB
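One way to picture an advisor rule is as a function that inspects the service metadata and, when a potential anomaly is detected, returns a record linking the sources involved to a caveat. The sketch below is a plain-Python stand-in under that assumption. The slides do not show the actual rule language, so `Advisory`, `overpass_rule`, `run_rules`, and the plan layout are hypothetical.

```python
from typing import NamedTuple


class Advisory(NamedTuple):
    """Hypothetical association between service metadata and an anomaly."""
    rule: str
    sources: tuple
    caveat: str


def overpass_rule(plan):
    """Flag comparisons of sources whose overpass times differ."""
    times = {s["name"]: s["overpass_local_time"] for s in plan["sources"]}
    if plan["operation"] == "compare" and len(set(times.values())) > 1:
        return Advisory(
            rule="overpass-time-difference",
            sources=tuple(sorted(times)),
            caveat="Sources observe at different local times; "
                   "daily comparisons may show artifacts.",
        )
    return None


def run_rules(plan, rules):
    """Apply every rule and collect the advisories that fired."""
    return [adv for rule in rules if (adv := rule(plan)) is not None]


# Illustrative service plan; field names are assumptions.
example_plan = {
    "operation": "compare",
    "sources": [
        {"name": "MODIS-Terra AOD", "overpass_local_time": "10:30"},
        {"name": "MODIS-Aqua AOD", "overpass_local_time": "13:30"},
    ],
}

for advisory in run_rules(example_plan, [overpass_rule]):
    print(advisory.rule, advisory.sources)
```

Keeping each rule as an independent function means new caveats can be added without touching existing ones, which loosely mirrors the idea of an expert ruleset feeding the knowledge base.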

Advisor Presentation Requirements
Present metadata that can affect fitness for use of result
In comparison or integration data sources
– Make obvious which properties are comparable
– Highlight differences (that affect comparability) where present
Present descriptive text (and if possible visuals) for any data usage caveats highlighted by expert ruleset
Presentation must be understandable by Earth Scientists

Advisory Report
Tabular representation of the semantic equivalence of comparable data source and processing properties
Advise of and describe potential data anomalies/bias
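A tabular comparison of this kind can be mocked up in a few lines of Python. This is only a plain-text illustration of the idea, not the report shown in the slides. The function name `comparison_table`, the column headings, and the sample property values are assumptions.

```python
def comparison_table(properties, left, right):
    """Render a side-by-side text table and mark properties that differ."""
    rows = [("Property", left["name"], right["name"], "Comparable?")]
    for prop in properties:
        left_value = str(left.get(prop, "n/a"))
        right_value = str(right.get(prop, "n/a"))
        verdict = "yes" if left_value == right_value else "DIFFERS"
        rows.append((prop, left_value, right_value, verdict))
    # Pad every column to the width of its longest cell.
    widths = [max(len(row[i]) for row in rows) for i in range(4)]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    )


# Illustrative property values only.
terra_props = {"name": "MODIS-Terra AOD", "parameter": "AOD at 550 nm",
               "overpass_local_time": "10:30"}
aqua_props = {"name": "MODIS-Aqua AOD", "parameter": "AOD at 550 nm",
              "overpass_local_time": "13:30"}

print(comparison_table(["parameter", "overpass_local_time"],
                       terra_props, aqua_props))
```

Marking differing rows in capitals is one simple way to meet the requirement that differences affecting comparability stand out.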

Advisory Report (Dimension Comparison Detail)

Advisory Report (Expert Advisories Detail)

Giovanni Provenance Visualization Requirements
Exercise existing provenance visualization tools to show Giovanni processing provenance
Visualization tool must support access to multi-domain metadata knowledgebase (not just provenance metadata)
– Science metadata adds domain context to entities in the provenance trace
Presentation must be understandable by Earth Scientists

Domain-integrated Provenance Visualization

Domain-integrated detail view

Conclusion
Advisory Report is not a replacement for proper analysis planning
– But provides benefit for all user types by summarizing general fitness-for-usage, integrability, and data usage caveat information
– Science user feedback has been very positive
Provenance trace dumps are difficult to read, especially for non-software engineers
– Science user feedback: “Too much information in provenance lineage, I need a simplified abstraction/view”
Transparency → Translucency
– make the important stuff stand out
Semantic Distance / Integrability Index is non-trivial
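The request for a "simplified abstraction/view" of provenance lineage can be sketched as a graph reduction: keep only the nodes a science user cares about and connect each one to its nearest kept ancestors, skipping the intermediate processing steps. The Python below is one possible approach under that assumption, not the project's method. The function `simplified_view` and all node names are hypothetical.

```python
def simplified_view(derived_from, keep):
    """Collapse a provenance trace to the nodes in `keep`.

    derived_from maps each node to the nodes it was directly derived from.
    The result maps every kept node to its nearest kept ancestors.
    """
    def nearest_kept(node, seen):
        found = set()
        for parent in derived_from.get(node, []):
            if parent in seen:
                continue
            seen.add(parent)
            if parent in keep:
                found.add(parent)
            else:
                # Hidden step: look through it to its own ancestors.
                found |= nearest_kept(parent, seen)
        return found

    return {node: sorted(nearest_kept(node, set())) for node in sorted(keep)}


# Hypothetical trace: two inputs pass through hidden steps into one map.
trace = {
    "correlation_map": ["correlate_step"],
    "correlate_step": ["regrid_terra", "regrid_aqua"],
    "regrid_terra": ["subset_terra"],
    "subset_terra": ["MODIS-Terra AOD"],
    "regrid_aqua": ["MODIS-Aqua AOD"],
}
important = {"correlation_map", "MODIS-Terra AOD", "MODIS-Aqua AOD"}

print(simplified_view(trace, important))
```

Which nodes count as important would differ by user group, so `keep` is left as an input rather than hard-coded.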

Future Work
– Advisor suggestions to correct for potential anomalies
– Views/abstractions of provenance based on specific user group requirements
– Continued iteration on visualization tools based on user requirements
– Present a comparability index / research techniques to quantify comparability

References
Leptoukh, G., Lary, D., Shen, S., & Lynnes, C. Impact of Day Definition on Daily Correlative Studies. MODIS Science Team Meeting, January
Zednik, S., Fox, P., & McGuinness, D. (2010). System Transparency, or How I Learned to Worry about Meaning and Love Provenance! 3rd International Provenance and Annotation Workshop, Troy, NY.

Links
Giovanni Earth Science Data Analysis Tool
– (Production site)
– bin/G3/gui.cgi?instance_id=MDSA-case1 (MDSA site)
MDSA
– (Project site)
PML
– (Inference Web)
– (PML Primer, 2007)

Reference: Hyer, E. J., Reid, J. S., and Zhang, J., 2011: An over-land aerosol optical depth data set for data assimilation by filtering, correction, and aggregation of MODIS Collection 5 optical depth retrievals, Atmos. Meas. Tech., 4, , doi: /amt
Title: MODIS Terra C5 AOD vs. Aeronet during Aug-Oct biomass burning in Central Brazil, South America
(General) Statement: Collection 5 MODIS AOD at 550 nm during Aug-Oct over Central South America strongly overestimates large AOD and, in the non-burning season, underestimates small AOD, as compared to Aeronet; good comparisons are found at moderate AOD.
Region & season characteristics: The central region of Brazil is a mix of forest, cerrado, and pasture and is known to have low AOD most of the year except during the biomass burning season.
(Example)
– (Title) Scatter plot of MODIS AOD and AOD at 550 nm vs. Aeronet, from ref. (Hyer et al., 2011)
– (Description caption) Shows severe over-estimation of MODIS Col 5 AOD (dark target algorithm) at large AOD at 550 nm during Aug-Oct over Brazil.
– (Constraints) Only best-quality MODIS data (Quality = 3) used. Data with large scattering angle (> 170 deg) excluded.
– (Symbol description) Red lines define regions of Expected Error (EE); green is the fitted slope.
Results: Tolerance = 62% within EE; RMSE = 0.212; r2 = 0.81; Slope = 1.00
– For low AOD (< 0.2): Slope = 0.3 (i.e., MODIS AOD = one third of Aeronet AOD)
– For high AOD (> 1.4): Slope = 1.54
(Specific explanation) because of uncertainty introduced in AOD retrieval due to hot spot effect, which is not taken into account in the MODIS retrieval algorithm.
(Dominating factors leading to aerosol estimate bias)
1. Large positive bias in AOD estimate during biomass burning season may be due to wrong assignment of aerosol absorbing characteristics (a constant Single Scattering Albedo ~ 0.91 is assigned for all seasons, while the true value is closer to ~ ) [Notes or exceptions: Biomass burning regions in Southern Africa do not show as large a positive bias as in this case; this may be due to different optical characteristics or single scattering albedo of smoke particles. Aeronet observations of SSA confirm this.]
2. Low AOD is common in the non-burning season. In low AOD cases, biases are highly dependent on lower boundary conditions. In general a negative bias is found, due to uncertainty in surface reflectance characterization, which dominates if the signal from atmospheric aerosol is low.
(Figure labels) Aeronet AOD; Central South America; sites: Mato Grosso, Santa Cruz, Alta Floresta
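The kinds of statistics quoted in the results (fraction within EE, RMSE, r2, slope) can be computed with a short Python sketch. This is an illustration only. The sample values are made up, and the envelope ±(0.05 + 0.15 × AOD), the expected-error form commonly cited for MODIS over-land retrievals, is an assumption here because the slide does not state which envelope was used. The function name `validation_stats` is hypothetical.

```python
import math


def validation_stats(reference, retrieved, ee_abs=0.05, ee_rel=0.15):
    """Compare retrieved AOD against reference (e.g. ground-based) AOD.

    The expected-error envelope ee_abs + ee_rel * reference is an assumption.
    """
    n = len(reference)
    mean_x = sum(reference) / n
    mean_y = sum(retrieved) / n
    sxx = sum((x - mean_x) ** 2 for x in reference)
    syy = sum((y - mean_y) ** 2 for y in retrieved)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(reference, retrieved))
    pairs = list(zip(reference, retrieved))
    return {
        # Ordinary least-squares slope of retrieved vs. reference.
        "slope": sxy / sxx,
        # Squared Pearson correlation coefficient.
        "r2": sxy ** 2 / (sxx * syy),
        "rmse": math.sqrt(sum((y - x) ** 2 for x, y in pairs) / n),
        # Fraction of retrievals inside the expected-error envelope.
        "within_ee": sum(abs(y - x) <= ee_abs + ee_rel * x
                         for x, y in pairs) / n,
    }


# Made-up sample: low AOD slightly underestimated, high AOD overestimated.
aeronet_aod = [0.1, 0.2, 0.5, 1.0, 2.0]
modis_aod = [0.05, 0.15, 0.5, 1.3, 3.0]

print(validation_stats(aeronet_aod, modis_aod))
```

Computing the slope separately for low-AOD and high-AOD subsets, as the results above do, would only require filtering the pairs before calling the function.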