Developing a NetCDF-4 Interface to HDF5 Data Russ Rew (PI), UCAR Unidata Mike Folk (Co-PI), NCSA/UIUC Ed Hartnett, UCAR Unidata Quincey Koziol, NCSA/UIUC.

Developing a NetCDF-4 Interface to HDF5 Data
Russ Rew (PI), UCAR Unidata
Mike Folk (Co-PI), NCSA/UIUC
Ed Hartnett, UCAR Unidata
Quincey Koziol, NCSA/UIUC
John Caron, UCAR Unidata
Robert E. McGrath, NCSA/UIUC
NASA award AIST

2 Unidata: A Community Endeavor
- Community of educators and researchers at 120 universities and 30 other institutions, international in scope
- Managed by the University Corporation for Atmospheric Research
- Mission: providing data, tools, support, and community leadership for enhanced earth-system education and research
- Atmospheric science community, expanding to oceanography, hydrology, and other geosciences
- Unidata Program Center: 25 staff, 15 developers

3 Overview
- What is netCDF?
- What is HDF5?
- Why develop a netCDF interface to HDF5?
- What is the current project status?
- What still needs to be done?
- Do we have the necessary resources?
- What are the prospects for success?

4 NetCDF-3 and HDF5
- Standard data models for scientific data and data abstractions
- Standard interfaces between data providers and data users
- Standard libraries for data access from various languages
- Standard formats for portable binary data
  - Users need not know about the format
- Ad hoc standards are useful standards

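5 Data Models
- netCDF-3: Variables, Dimensions, Attributes, Coordinates, Element types
- HDF5: Datasets, Dataspaces, Attributes, Datatypes, Groups, Links, References, Property Lists
- Paired rows: Variables with Datasets, Dimensions with Dataspaces, Element types with Datatypes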
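The netCDF-3 side of this comparison can be made concrete with a small in-memory model. The following is a minimal sketch, not the netCDF library's API: the class names `Dimension` and `Variable`, the `shape` helper, and the sample variable are all hypothetical and chosen only to show how variables share named dimensions and carry attributes and an element type.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Dimension:
    """A named axis; length None marks the single unlimited dimension."""
    name: str
    length: Optional[int]


@dataclass
class Variable:
    """A named array with an element type, shared dimensions, and attributes."""
    name: str
    dtype: str
    dimensions: List[Dimension]
    attributes: Dict[str, object] = field(default_factory=dict)

    def shape(self, num_records: int = 0):
        # The unlimited dimension takes its length from the record count.
        return tuple(
            num_records if d.length is None else d.length
            for d in self.dimensions
        )


# Hypothetical example data, for illustration only.
time = Dimension("time", None)
lat = Dimension("lat", 3)
lon = Dimension("lon", 4)
temp = Variable("temperature", "float", [time, lat, lon], {"units": "K"})
demo_shape = temp.shape(num_records=2)
```

In this sketch a second variable could reuse the same `lat` and `lon` objects, which is the sense in which dimensions are shared between variables.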

6 Libraries
- netCDF-3: one interface level; serial I/O; Fortran-77, -90; Java (pure)
- HDF5: high- and low-level interfaces; serial and parallel (MPI) I/O; Fortran-90; Java (native)
- Both: C, C++
- Also listed: Perl, Python, Ruby, IDL, Matlab, ...

7 Formats
- netCDF-3: XDR; 32-bit file offsets
- HDF5: XDR and native; 64-bit file offsets; chunked access; compound structures; nested structures; compression; efficient schema changes; virtual file I/O layer
- Both: direct access; efficiently extendible
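Two of the format differences above are easy to illustrate with plain arithmetic and byte packing. The sketch below uses only Python's standard `struct` module; it assumes that file offsets are stored as signed integers, and the helper name `max_signed_offset` is hypothetical.

```python
import struct

# XDR stores integers in big-endian order regardless of the host machine.
xdr_bytes = struct.pack(">i", 1)

# "Native" here means the host's own byte order (standard 4-byte size).
native_bytes = struct.pack("=i", 1)


def max_signed_offset(bits):
    """Largest byte offset representable in a signed integer of this width."""
    return 2 ** (bits - 1) - 1


limit_32 = max_signed_offset(32)  # just under 2 GiB
limit_64 = max_signed_offset(64)
```

Under that signed-offset assumption, a 32-bit offset cannot address data beyond roughly 2 GiB into a file, which is why 64-bit offsets matter for large file support.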

8 Other Characteristics
- Availability: free (both)
- Development and maintenance: UCAR Unidata (netCDF-3); NCSA HDF Group (HDF5)
- Primary funding: NSF (netCDF-3); NASA, DOE ASCI (HDF5)
- Advantages: popular, simple, lots of tools, multiple implementations (netCDF-3); powerful, high-performance, storage efficiency, extensibility (HDF5)
- Primary uses: climate, forecast, ocean models, data archives (netCDF-3); satellite data, computational fluid dynamics, parallel computing (HDF5)

9 Goals of NetCDF/HDF Combination
- Create netCDF-4, combining desirable characteristics of netCDF-3 and HDF5, while taking advantage of their separate strengths
  - Widespread use and simplicity of netCDF-3
  - Generality and performance of HDF5
- Make netCDF more suitable for high-performance computing
- Provide simple high-level interface for HDF5
- Demonstrate benefits of combination in advanced Earth science modeling efforts

10 NetCDF-4 Features Enabled by HDF5
- Large file support
- Parallel I/O
- Multiple dynamic dimensions
- Packed data, compression
- New data types
- Dynamic schema modifications
- Other possibilities: groups, user-defined types, better coordinate support, ...
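"Packed data" is commonly understood as storing floating-point values as small integers by way of a scale and an offset. The slide does not say which scheme is intended, so the following is only a sketch of that common convention; the function names `pack` and `unpack`, the parameter names `scale_factor` and `add_offset`, and the sample values are assumptions made for illustration.

```python
def pack(values, scale_factor, add_offset):
    """Map floats to integers: packed = round((value - add_offset) / scale_factor)."""
    return [round((v - add_offset) / scale_factor) for v in values]


def unpack(packed, scale_factor, add_offset):
    """Recover approximate floats: value = packed * scale_factor + add_offset."""
    return [p * scale_factor + add_offset for p in packed]


# Hypothetical temperatures in kelvin, packed with a resolution of 0.25 K.
temperatures = [250.0, 273.15, 300.5]
packed = pack(temperatures, scale_factor=0.25, add_offset=200.0)
restored = unpack(packed, scale_factor=0.25, add_offset=200.0)
```

The packed integers fit comfortably in 16 bits, and each restored value differs from the original by at most half the scale factor, which is the precision traded away for the smaller storage size.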

11 Approach
- Implement netCDF-3 over HDF5, to demonstrate backward compatibility with
  - Programming interface
  - Format
- Design netCDF-4 interface
- Implement netCDF-4 over HDF5 to add enhancements made possible with HDF5
- Foster continued collaboration between Unidata and NCSA in design, development, testing, and support

12 NetCDF-4 Architecture
- Access to netCDF-3, netCDF-4, and HDF5 data created through netCDF-4 interface
- Diagram labels: netCDF-3 Interface, netCDF-4 Library, HDF5 Library

13 User View of NetCDF-4 NetCDF-4 library accesses either the netCDF-3 or HDF5 library to read or write data
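One way a library could decide which underlying format it is looking at is to inspect the first bytes of the file. The sketch below is an illustration of that idea, not the netCDF-4 implementation: the function name `detect_format` and its return strings are hypothetical. The byte signatures themselves are the published ones for netCDF classic files (`CDF` followed by a version byte) and for HDF5 files. For simplicity it only checks offset 0, although HDF5 also allows the signature at later offsets.

```python
NETCDF_CLASSIC_MAGIC = b"CDF\x01"        # netCDF classic format
NETCDF_64BIT_OFFSET_MAGIC = b"CDF\x02"   # netCDF 64-bit offset variant
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"    # HDF5 format signature


def detect_format(header: bytes) -> str:
    """Classify a file by its leading bytes (checks offset 0 only)."""
    if header.startswith(HDF5_SIGNATURE):
        return "hdf5"
    if header.startswith((NETCDF_CLASSIC_MAGIC, NETCDF_64BIT_OFFSET_MAGIC)):
        return "netcdf-3"
    return "unknown"
```

A caller would read the first eight bytes of a file, pass them to `detect_format`, and hand the file to the matching reader, so the user never has to say which format a file is in.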

14 Current Technical Status
- Implement netCDF-3 over HDF5, to demonstrate backward compatibility with API and format: done
- Determine needed HDF5 enhancements: done
- Prepare netCDF-3 for incorporation with netCDF-4: nearly done
- Design netCDF-4 interface to add enhancements made possible with HDF5: in progress
- Implement needed HDF5 enhancements: in progress
- Implement netCDF-4 over enhanced HDF5: not started yet

15 NetCDF-3 Interface Using HDF5
- 13,000 lines of C code
- Passes all netCDF-3 tests
- Demonstrates HDF5 practical for netCDF-4
- Identifies HDF5 enhancements needed
- Shows read/write times and file sizes satisfactory
- Validates approach to backward compatibility
  - API compatibility: only recompilation and relinking needed for existing netCDF-3 programs
  - Format compatibility: accesses all current netCDF files as well as new HDF5 files transparently

16 NetCDF-3 Enhancements for NetCDF-4
- To provide
  - stable foundation for incorporating netCDF-4
  - smooth transition for current users
- Automated multi-platform testing
- Documentation converted to maintainable form, new language-independent Users Guide
- Added large file support with backward compatibility
- Added default format interfaces
- Better Windows and .NET support

17 HDF5 Additions for Supporting NetCDF-4
- HDF5 enhancements
  - numeric type conversions
  - zero-dimensional datasets
  - overflow handling improvements
  - flexible parallel I/O
- HDF5 design specifications
  - dimension scales for coordinate systems
  - shared object proposal
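Numeric type conversion and overflow handling go together: when a value is converted to a narrower type, something has to happen if it does not fit. The sketch below shows one possible policy, assumed for illustration and not taken from HDF5 or netCDF: convert what fits, and report the positions of values that are out of range. All names (`RangeError`, `float_to_int16`, `convert_all`) are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


class RangeError(ValueError):
    """Hypothetical error for values that do not fit the target type."""


def float_to_int16(value):
    """Convert a float to a 16-bit integer range, truncating toward zero."""
    converted = int(value)
    if not INT16_MIN <= converted <= INT16_MAX:
        raise RangeError(f"{value!r} does not fit in a 16-bit integer")
    return converted


def convert_all(values):
    """Convert every value, recording the positions of those that overflow."""
    converted, out_of_range = [], []
    for position, value in enumerate(values):
        try:
            converted.append(float_to_int16(value))
        except RangeError:
            converted.append(None)
            out_of_range.append(position)
    return converted, out_of_range


result, bad_positions = convert_all([12.9, -3.7, 40000.0])
```

Reporting overflow instead of silently wrapping lets the caller decide whether to substitute a fill value, widen the type, or stop.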

18 Project Schedule
- July 2004 version: revised documentation, 64-bit file offsets, default format functions
- October 2004 version: use of autotools
- January 2005, version 3.7.1: netCDF-4 prototype included, support for multiple unlimited dimensions
- March 2005, version 4.0.0_beta: test release
- July 2005 version: first netCDF-4 production release
- Currently on schedule for a July 2005 release

19 NetCDF-4 Design Issues
- Issue: support for coordinate systems in netCDF and HDF5 data models?
  - under consideration
- Issue: addition of HDF5 Groups abstraction to netCDF data model?
  - yes, tentatively
  - subset of HDF5 Group features
  - constrained by backward compatibility with netCDF-3
  - no Group aliases, but try to support Variable aliases and Dimension scoping?
- Issue: can we just adopt Northwestern/Argonne pnetCDF interface for adding parallel I/O?
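The slide leaves "Dimension scoping" as an open question, so its exact meaning is not settled here. One plausible reading, sketched below purely as an assumption, is that a dimension defined in a group is visible in that group and in all groups nested beneath it. The `Group` class and its methods are hypothetical and are not an HDF5 or netCDF interface.

```python
class Group:
    """Hypothetical group node holding dimensions and child groups."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.dimensions = {}  # dimension name -> length
        self.children = {}    # child group name -> Group

    def add_group(self, name):
        child = Group(name, parent=self)
        self.children[name] = child
        return child

    def find_dimension(self, name):
        # Search this group first, then each enclosing group in turn.
        group = self
        while group is not None:
            if name in group.dimensions:
                return group.dimensions[name]
            group = group.parent
        raise KeyError(name)


# Illustrative hierarchy: "lat" is defined at the root, "level" in a child.
root = Group("/")
root.dimensions["lat"] = 3
run = root.add_group("model_run")
run.dimensions["level"] = 10
```

Under this reading, `run.find_dimension("lat")` succeeds by walking up to the root, while the root cannot see `level`, which keeps a flat netCDF-3 file equivalent to a hierarchy with a single root group.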

20 What remains to be done?
- Next for netCDF-4: interface additions for multiple unlimited dimensions, group interfaces, dynamic schema modification, new data types, packed data, parallel I/O, compression
- HDF5 enhancements
  - zero-length attributes
  - shared dimensions
  - creation order access for objects
- Testing in models (CCSM, WRF, ESMF, ...)
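Compression pairs naturally with chunked storage: if each chunk is compressed on its own, reading one value only requires decompressing the chunk that holds it. The sketch below illustrates the idea with Python's standard `zlib` and `struct` modules. It is not how HDF5 stores data on disk, and the function names, chunk length, and sample data are assumptions.

```python
import struct
import zlib


def write_chunks(values, chunk_len):
    """Split values into fixed-size chunks and compress each independently."""
    chunks = []
    for start in range(0, len(values), chunk_len):
        part = values[start:start + chunk_len]
        raw = struct.pack(f">{len(part)}d", *part)  # big-endian doubles
        chunks.append(zlib.compress(raw))
    return chunks


def read_value(chunks, chunk_len, index):
    """Read one value by decompressing only the chunk that contains it."""
    chunk_no, within = divmod(index, chunk_len)
    raw = zlib.decompress(chunks[chunk_no])
    return struct.unpack_from(">d", raw, within * 8)[0]


# Hypothetical, highly repetitive data so the compression effect is visible.
values = [float(i % 10) for i in range(1000)]
chunks = write_chunks(values, chunk_len=100)
sample = read_value(chunks, chunk_len=100, index=257)
compressed_size = sum(len(c) for c in chunks)
```

The chunk length is the main design choice: larger chunks usually compress better, while smaller chunks reduce the work needed to read a single value.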

21 NetCDF/HDF Budget
- Funding status:
  - Funding received to date: $349,496
  - Funding expected: $699,793
  - Variance: $350,297 (to carry us through program)
- Expenditures:
  - As of May 31: $193,393
  - Committed but not cleared: $137,715 (NCSA sub-award)
  - Total expenditures: $333,305
  - Funds remaining: $16,191

22 NetCDF/HDF Expenditures
(Chart; dates and category labels not recoverable. Surviving slice labels: 0%, 42%, 1%, 16%, 13%, 2%, with one slice marked *.)

23 Budget Notes
- Budgeted SBO rate is about $13,900 per month
- Actual SBO rate estimated at $14,700 per month
- Remaining SBO budget of $207,000 will fund us through July 2005 (without a student)
- Given late start (July 2003 at Unidata, December 2003 at NCSA), will request no-cost extension
* Equipment was purchased for this project prior to receipt of contract; applying for exception to transfer expenditure

24 Papers, Posters, Presentations
1. R. Rew, M. Folk, E. Hartnett, and R. McGrath: Plans for an Enhanced NetCDF-4 Interface to HDF5 Data. HDF/HDF-EOS Workshop VII, Silver Springs, September. Poster and presentation.
2. M. Folk, R. Rew, K. Yang, R. McGrath: NetCDF-4: Combining netCDF and HDF5 Data. AGU Fall Meeting, San Francisco, December. Poster.
3. R. Rew and E. Hartnett: Merging NetCDF and HDF5. 20th International Conference on Interactive Information Processing Systems (IIPS) for Meteorology, Oceanography, and Hydrology, Seattle, January. Paper and poster.
4. E. Hartnett: Merging the NetCDF and HDF5 Libraries to Achieve Gains in Performance and Interoperability. Earth Science Technology Conference, Palo Alto, June. Paper and presentation.

25 Excellent Prospects for Success
- More software engineering than research
- NetCDF-4 web site just announced:
- Unidata and NCSA developers collaborating via , teleconferences
- On schedule for July 2005 release: e.html
- Great interest in status of project!
- Ultimate goal to make earth science researchers more productive...

26 Questions?