Importance of accessible, well-documented, and scientifically relevant data: what we learned from UARS and Aura. Mark Schoeberl, STC.

Why should we be motivated to provide accessible, well-documented data?
- Encourages instrument data analysis
- Leads to scientific improvement of the data
- Leads to new science discoveries
- Leads to new observational missions
- Leads to improvements in current data sets through inter-comparison
- Increases our general knowledge of the Earth System

Experience with Two Missions
- UARS
- Aura: 2004-present

What was done with UARS
- UARS data was controlled by a strict protocol. Only the selected PIs could use the data. The protocol covered several years after launch.
- This upset and angered the community (more on next slide).
- UARS data was released very slowly and reluctantly.
- Data was archived on a UARS machine (VAX) with limited tools. The UARS machine was used to process data as well as distribute it. By today’s standards the data set was tiny.
- Data documentation was not emphasized, and descriptions were finally made available through a special JGR issue that came out many years after launch. If you had problems with the data you had to call the PI.
- The protocol was dissolved a couple of years after launch and the data was generally distributed. It is now available through the Goddard DISC.
- Validation: not much except for balloons. UARS PIs generally did not talk to aircraft or ground people.
- Archiving: when UARS was decommissioned we archived L1 and L2 data.

UARS: To release (data) or not to release, that is the question...
- Releasing the data as soon as possible
  - Improves the data
  - Publicizes the data
- But releasing the data too soon
  - Gives the team a bad reputation
  - Can produce bad science
- Deciding when to release is difficult. You can’t wait forever: you lose community and sponsor support, yet you want to release good data for your own reputation.
- My general rule: if the data is good enough to do some science, then release!

What was done with Aura
- Data was released within 6 months and some within a few weeks.
- Data was produced in a common format.
  - Data users group was formed to create common format guidelines, etc.
- Validation
  - Produced an extensive validation plan, “The Big Book of Aura”, which contained technical details on the instruments and algorithms for validators.
  - Instrument teams participated in validation activity (especially aircraft campaigns) to get familiar with the validation data and techniques and also to help the validators understand the satellite data.
  - Provided a validation center (AVDC) that could segment the “commissioning phase” data over validation sites. Validators were required to share their data before they could get access to Aura data. AVDC also archived team meeting presentations and other documents.
- Documentation
  - Insisted on extensive documentation of products, including accuracy and precision.
- Access
  - Products had to be available at the end of the commissioning phase.
  - NASA data went mostly to the Goddard DISC (except TES).
  - Access to data was controlled by the DISC and NASA policy: free access to all data sets.

Aura Organization
[Organization chart: Mission (HIRDLS, MLS, OMI, TES); Validation Working Group; Data Working Group; Aircraft and Ground Based Programs; NASA Science Programs; NASA Data Centers; AVDC. Arrows labeled $$, Needs, Validation Data, and Data.]

How well did it work?
- Pressure on teams to release as soon as possible. Some resistance to this.
- Having TES in a different archive system was stupid.
  - The Langley DAAC did not have the resources that the Goddard DISC has, and getting TES data was initially difficult.
- Development of the data documents was slow and painful for some investigators.
- The validation missions could have been better targeted to instrument measurements. Part of this was NASA politics.
- Tried to achieve a balance between campaigns and long-term measurements (SHADOZ, Ticosonde, NDACC).

Four Guidelines for Satellite Data from UARS and Aura
1) Data release and availability: instrument data is released and available, not hidden behind bureaucratic protocols, and some kind of browse system exists.
2) Data format: data must be packed in easily readable, self-describing formats.
3) Data tools exist: browse, unpack, read.
4) Data documentation: data must be well documented and data quality flags clearly explained.

(1) Data release and availability
- Release data as soon as you can...
- Data must be easily accessible. If users can’t find the data, they won’t use it.
  - Data sets should be freely available (lesson from UARS).
  - Registering users (if you insist on this) should be optional and a positive experience.
- A catalogue/browse system should help the users understand the data.
  - The GSFC Giovanni system provides an example of an excellent browse system including time series, image plots, multiple data types and download. But it is limited to atmospheric data.
  - The catalogue/browse system should link directly to data access ports.
- Ordering and staging data granules using a “market basket” approach is useful only in browse mode; this approach now seems to me archaic.
- With the advent of high-speed access, all data should be made available through anonymous ftp sites or the equivalent.

(2) Data Formats/Gridding
- The variety of data formats and the different mapping systems is somewhat bewildering to the novice.
  - While NASA uses HDF, NOAA uses GRIB/BUFR and NCAR uses NetCDF, etc. There is no agreement on formatting standards within the U.S. and Europe.
  - Aura mandated HDF5, as did AVDC. This was a good move, although some validators objected until AVDC provided a conversion tool.
  - Nonetheless, a variety of formats are used even within EOS, and this creates barriers to the data.

(3) Data Tools
- The best data centers provide generalized unpacking and reading tools to extract data for use.
  - Part of the issue is that the tools must be provided in multiple languages: IDL, Fortran, C, MATLAB, and others.*
  - The NASA DISC provides data read tools in multiple languages.
  - Nevertheless, data distributors should provide at least some tools to help users extract the data.
* GISS Panoply is a good generalized browse/unpacking tool.

(4) Data documentation
- Aura experience is that documentation of the data set is key to scientific utility.
  - Documentation should change as data products are reprocessed. It needs to be a living document that includes new products, changes in precision, etc.
  - Documentation should be available online as HTML.
  - Documentation should include recommendations to users on how to use the data, or how they might misuse the data.
  - Documentation should include references to publications on the data, who to contact with questions, etc.
  - Readme files on how to use the data need to be up to date.
  - Data quality flags or equivalent (accuracy, precision) are critical to prevent misuse.
    - Flags or screening data need to be clear.
    - An example of how to use flags would be helpful.
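As an illustration of the kind of flag-usage example a data document could include, here is a minimal sketch in Python. The field names (`precision`, `quality`, `status`), the sign convention for precision, and the 0.5 threshold are all hypothetical and are not taken from any instrument's actual screening rules; a real document would state its own.

```python
from dataclasses import dataclass


@dataclass
class Retrieval:
    """One hypothetical retrieved value with its screening fields."""
    value: float      # retrieved quantity (units are whatever the product defines)
    precision: float  # reported precision; assumed negative when the value is unusable
    quality: float    # assumed 0..1 quality score, higher is better
    status: int       # assumed bit field; 0 means no problems were reported


def screen(retrievals, min_quality=0.5):
    """Keep only retrievals that pass every (hypothetical) screening rule."""
    return [
        r for r in retrievals
        if r.precision > 0            # rule 1: precision must be positive
        and r.quality >= min_quality  # rule 2: quality at or above the threshold
        and r.status == 0             # rule 3: no status bits set
    ]


if __name__ == "__main__":
    sample = [
        Retrieval(80.0, 5.0, 0.9, 0),    # passes all three rules
        Retrieval(120.0, -1.0, 0.9, 0),  # rejected by rule 1
        Retrieval(95.0, 4.0, 0.2, 0),    # rejected by rule 2
        Retrieval(70.0, 3.0, 0.8, 1),    # rejected by rule 3
    ]
    for r in screen(sample):
        print(r)
```

Writing each rule as its own line with a comment keeps the screening logic easy to compare against the prose description of the flags.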

Let’s do some example reviews

Score sheet developed from these ideas

Instrument       MLS   MIPAS   OMI   IASI
Available
Format/Grid
Tools
Documentation
Total

0 = not available, 1 = inadequate, 2 = adequate, 3 = good but could be improved, 4 = excellent

MLS CO
- Products nicely labeled
- Data access clear
- Data tools: browse through Giovanni
- Excellent documentation

- Click on “order data”
- Click on CO Info
- Description of CO Data
- Click on “Readers”

Score Sheet

Instrument         MLS   MIPAS   GOME II   VIIRS
Available           4
Simply formatted    4
Tools               3
Documentation       4
Total              15

0 = not available, 1 = inadequate, 2 = adequate, 3 = good but could be improved, 4 = excellent

MIPAS CO
Availability of MIPAS data
- MIPAS Earthnet Online
  - Need to register with ESA and create a research project
- NERC Earth Observations
  - NERC web page has “get MIPAS data”
  - Need to log on to NEODC, no instructions how
- Multiple groups working on MIPAS leads to some confusion on the data sets (UK, IFAC, KIT)
- Data tools: BEAT and VISAN. Extensive read tools but somewhat confusing
- Browse tool :

MIPAS Browse
- Not bad but really limited

Overall Impression of MIPAS
- No clear path through multiple web information sources: excessive documentation in some places, too little in others, and it is not clear how these link
- Bureaucratic barriers to getting the data
- Some data files are in non-standard formats, which require specialized readers, with different readers from different groups
- Browse system OK but limited to the last two years
  - Varying scales of browse imagery are confusing

Score Sheet

Instrument         MLS   MIPAS   GOME II   IASI
Available           4      2
Simply formatted    4      2
Tools               3      3
Documentation       4      3
Total              15     10

0 = not available, 1 = inadequate, 2 = adequate, 3 = good but could be improved, 4 = excellent

GOME 2 NO2 (TEMIS)
- One-click download
- Reader tool
- User manual (refers to Sciamachy)

Overall Impression of GOME II TEMIS
- Good browse
- Data available via ftp
- Inconsistent formats (NO2 is in HDF, CH2O is in ASCII)
- Documentation adequate, but too brief

Score Sheet

Instrument         MLS   MIPAS   GOME II (TEMIS)   IASI
Available           4      2            4
Simply formatted    4      2            3
Tools               3      2            2
Documentation       4      3            2
Total              15      9           11

0 = not available, 1 = inadequate, 2 = adequate, 3 = good but could be improved, 4 = excellent

IASI
- Want to look at CO, compare to AIRS & TES
- Data web page
- Only Level 1c as EPS -> HDF5
- Requires NEODC registration

IASI (cont.)
- Registered with NEODC

IASI (cont.)
- Link broken. ARGHHH!!!

Score Sheet

Instrument         MLS   MIPAS   GOME II (TEMIS)   IASI
Available           4      2            4            0
Simply formatted    4      2            3            2
Tools               3      2            2            0
Documentation       4      3            2            2
Total              15      9           11            4

0 = not available, 1 = inadequate, 2 = adequate, 3 = good but could be improved, 4 = excellent
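Each total on the score sheet is the sum of the four criterion scores for that instrument. A minimal sketch in Python of that arithmetic, using the per-criterion scores as read from the final sheet; the function and variable names are illustrative only.

```python
# The four review criteria, in the order they appear on the sheet.
CRITERIA = ("Available", "Simply formatted", "Tools", "Documentation")

# Meaning of each score value, as given in the sheet's legend.
SCALE = {
    0: "not available",
    1: "inadequate",
    2: "adequate",
    3: "good but could be improved",
    4: "excellent",
}

# Per-criterion scores read from the final score sheet.
SCORES = {
    "MLS": (4, 4, 3, 4),
    "MIPAS": (2, 2, 2, 3),
    "GOME II (TEMIS)": (4, 3, 2, 2),
    "IASI": (0, 2, 0, 2),
}


def totals(scores):
    """Sum the criterion scores per instrument, rejecting malformed rows."""
    result = {}
    for instrument, marks in scores.items():
        if len(marks) != len(CRITERIA):
            raise ValueError(f"{instrument}: expected {len(CRITERIA)} scores")
        if any(mark not in SCALE for mark in marks):
            raise ValueError(f"{instrument}: scores must be integers from 0 to 4")
        result[instrument] = sum(marks)
    return result


if __name__ == "__main__":
    best = 4 * len(CRITERIA)  # highest possible total
    for instrument, total in totals(SCORES).items():
        print(f"{instrument:<16} {total:>2} / {best}")
```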

Lessons Learned and Recommendations
1) Data availability: data must be easily available for download. Get rid of registration; use an anonymous FTP site with files organized logically (product, year, month, day, swaths).
2) Data format: data must be packed in easily readable, self-describing formats. Use NetCDF or equivalent, no do-it-yourself formats.
3) Provide data tools (unpack and read tools at minimum).
4) Data must be well documented and data quality flags obvious.
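The logical organization in recommendation 1 (product, year, month, day, swaths) can be sketched as a predictable directory layout. The following Python sketch is only one possible reading of that recommendation; the root directory, product name, and swath file name are hypothetical placeholders, not the layout of any actual archive.

```python
from datetime import date
from pathlib import PurePosixPath


def granule_dir(root, product, day):
    """Directory for one product on one day: root/product/YYYY/MM/DD."""
    return (
        PurePosixPath(root)
        / product
        / f"{day.year:04d}"
        / f"{day.month:02d}"
        / f"{day.day:02d}"
    )


def swath_path(root, product, day, swath_file):
    """Full path of a single swath file inside its day directory."""
    return granule_dir(root, product, day) / swath_file


if __name__ == "__main__":
    # All names below are made up for illustration.
    print(swath_path("/pub/data", "EXAMPLE_CO", date(2008, 2, 27), "swath_0001.nc"))
```

With a layout like this, a user who knows the product and date can construct the path directly, without an ordering step.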

Another General Recommendation
- Form a data product utilization advisory group.
  - Provide the data center with advice/feedback on web presence and accessibility
  - Provide the data center with advice on priority products
  - Be a comprehensive data product review and test group
- Who should be in this group
  - Scientists familiar with products, but not this instrument
  - Scientists from adjacent fields facing similar problems
  - Other data users

Thank you