Developing a NetCDF-4 Interface to HDF5 Data

1 Developing a NetCDF-4 Interface to HDF5 Data
Russ Rew (PI), UCAR Unidata
Mike Folk (Co-PI), NCSA/UIUC
Ed Hartnett, UCAR Unidata
Quincey Koziol, NCSA/UIUC
John Caron, UCAR Unidata
Robert E. McGrath, NCSA/UIUC
NASA award AIST

2 Unidata: A Community Endeavor
Community of educators and researchers at 120 universities and 30 other institutions, international in scope
Managed by the University Corporation for Atmospheric Research
Mission: providing data, tools, support, and community leadership for enhanced earth-system education and research
Atmospheric science community, expanding to oceanography, hydrology, and other geosciences
Unidata Program Center: 25 staff, 15 developers

3 Overview
What is netCDF?
What is HDF5?
Why develop a netCDF interface to HDF5?
What is the current project status?
What still needs to be done?
Do we have the necessary resources?
What are the prospects for success?

4 NetCDF-3 and HDF5
Ad hoc standards are useful standards
Standard Data Models for scientific data and data abstractions
Standard Interfaces between data providers and data users
Standard Libraries for data access from various languages
Standard Formats for portable binary data
Users need not know about the format

5 Data Models
netCDF-3         HDF5
Variables        Datasets
Dimensions       Dataspaces
Attributes       Attributes
Coordinates
Element types    Datatypes
                 Groups
                 Links
                 References
                 Property Lists
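
To make the netCDF-3 side of this comparison concrete, here is a minimal Python sketch of the classic data model: named dimensions, variables defined over those dimensions, and attributes attached to a variable or to the file as a whole. The class names (`Dimension`, `Variable`, `Dataset`) and the sample variable are hypothetical illustrations, not the netCDF API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Dimension:
    name: str
    size: Optional[int]  # None marks the unlimited (record) dimension


@dataclass
class Variable:
    name: str
    dtype: str  # element type, e.g. "float" or "short"
    dimensions: List[Dimension]
    attributes: Dict[str, object] = field(default_factory=dict)

    def shape(self, num_records: int = 0) -> tuple:
        # The unlimited dimension takes the current record count.
        return tuple(num_records if d.size is None else d.size
                     for d in self.dimensions)


@dataclass
class Dataset:
    dimensions: Dict[str, Dimension] = field(default_factory=dict)
    variables: Dict[str, Variable] = field(default_factory=dict)
    attributes: Dict[str, object] = field(default_factory=dict)


# Hypothetical example: a temperature field on a time x lat x lon grid.
time = Dimension("time", None)
lat = Dimension("lat", 3)
lon = Dimension("lon", 4)
temp = Variable("temperature", "float", [time, lat, lon], {"units": "K"})
ds = Dataset(
    dimensions={d.name: d for d in (time, lat, lon)},
    variables={temp.name: temp},
    attributes={"title": "example"},
)
print(temp.shape(num_records=2))
```

In HDF5 terms, the variable would roughly correspond to a dataset and its dimension list to a dataspace, which is the pairing the slide draws.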

6 Libraries
netCDF-3               HDF5
one interface level    high- and low-level interfaces
serial I/O             serial, parallel (MPI) I/O
C, C++                 C, C++
Fortran-77, -90        Fortran-90
Java (pure)            Java (native)
Perl                   ...
Python
Ruby
IDL
Matlab
...

7 Formats
netCDF-3                  HDF5
XDR                       XDR and native
direct access             direct access
efficiently extendible    efficiently extendible
32-bit file offsets       64-bit file offsets
                          chunked access
                          compound structures
                          nested structures
                          compression
                          efficient schema changes
                          virtual file I/O layer
Purpose: insulate users, applications, and data from machine architectures, format details, and additions to data
Unlike XML: appendable, directly accessible
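
The difference between 32-bit and 64-bit file offsets translates directly into a limit on how far into a file a stored offset can point. A small arithmetic sketch, assuming offsets are held as signed integers (an assumption of this example; the slide gives only the widths):

```python
def max_offset(bits: int) -> int:
    """Largest byte offset representable in a signed integer of `bits` bits."""
    return 2 ** (bits - 1) - 1


GIB = 2 ** 30  # bytes per gibibyte

limit_32 = max_offset(32)
limit_64 = max_offset(64)

print(f"32-bit offsets reach {limit_32} bytes ({limit_32 / GIB:.2f} GiB)")
print(f"64-bit offsets reach {limit_64} bytes ({limit_64 / GIB:.2e} GiB)")
```

Under that assumption a 32-bit offset cannot address data beyond roughly 2 GiB, which is why large file support appears among the features netCDF-4 gains from HDF5.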

8 Other Characteristics
Availability
  NetCDF-3 and HDF5: free
Development and maintenance
  NetCDF-3: UCAR Unidata
  HDF5: NCSA HDF Group
Primary funding
  NetCDF-3: NSF
  HDF5: NASA, DOE ASCI
Advantages
  NetCDF-3: popular, simple, lots of tools, multiple implementations
  HDF5: powerful, high-performance, storage efficiency, extensibility
Primary uses
  NetCDF-3: climate, forecast, ocean models, data archives
  HDF5: satellite data, computational fluid dynamics, parallel computing

9 Goals of NetCDF/HDF Combination
Create netCDF-4, combining desirable characteristics of netCDF-3 and HDF5, while taking advantage of their separate strengths
  Widespread use and simplicity of netCDF-3
  Generality and performance of HDF5
Make netCDF more suitable for high-performance computing
Provide simple high-level interface for HDF5
Demonstrate benefits of combination in advanced Earth science modeling efforts

10 NetCDF-4 Features Enabled by HDF5
Large file support
Parallel I/O
Multiple dynamic dimensions
Packed data, compression
New data types
Dynamic schema modifications
Other possibilities: groups, user-defined types, better coordinate support, …
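
Packed data usually means storing floating-point values as small integers together with a scale and an offset. The sketch below assumes the `scale_factor` and `add_offset` attribute names that are conventional in netCDF usage and a 16-bit signed storage type; the helper function names and the sample temperature range are hypothetical, and the slides do not specify how netCDF-4 will implement packing.

```python
def packing_params(vmin: float, vmax: float, bits: int = 16):
    """Pick scale_factor and add_offset so [vmin, vmax] spans a signed integer."""
    scale_factor = (vmax - vmin) / (2 ** bits - 1)
    # Shift so that vmin maps to the most negative integer value.
    add_offset = vmin + 2 ** (bits - 1) * scale_factor
    return scale_factor, add_offset


def pack(value: float, scale_factor: float, add_offset: float) -> int:
    """Convert a physical value to its packed integer representation."""
    return round((value - add_offset) / scale_factor)


def unpack(packed: int, scale_factor: float, add_offset: float) -> float:
    """Recover an approximate physical value from a packed integer."""
    return packed * scale_factor + add_offset


# Hypothetical temperature range in kelvin.
scale, offset = packing_params(250.0, 320.0)
packed = pack(287.65, scale, offset)
restored = unpack(packed, scale, offset)
print(packed, restored)
```

The round trip is lossy: the restored value can differ from the original by up to half of `scale_factor`, which is the price paid for halving or quartering the storage per value.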

11 Approach
Implement netCDF-3 over HDF5, to demonstrate backward compatibility with
  Programming interface
  Format
Design netCDF-4 interface
Implement netCDF-4 over HDF5 to add enhancements made possible with HDF5
Foster continued collaboration between Unidata and NCSA in design, development, testing, and support

12 NetCDF-4 Architecture
[Diagram labels: HDF5 Library, netCDF-4 Library, netCDF-3 Interface]
Access to netCDF-3, netCDF-4, and HDF5 data created through netCDF-4 interface

13 User View of NetCDF-4
NetCDF-4 library accesses either the netCDF-3 or HDF5 library to read or write data
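
One way a single library can route a file to the right backend is to inspect the file's first bytes. The sketch below assumes dispatch by file signature, which the slides do not confirm as the mechanism; `detect_format` is a hypothetical name. The signatures themselves are the published ones: netCDF classic files begin with `CDF` followed by a version byte (1 for classic, 2 for 64-bit offset), and HDF5 files begin with the eight bytes `\x89HDF\r\n\x1a\n`. Only offset 0 is checked here, although HDF5 allows the signature at later offsets when a user block is present.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"
NETCDF3_VERSIONS = (b"\x01", b"\x02")  # classic, 64-bit offset


def detect_format(header: bytes) -> str:
    """Classify a file by its leading bytes as 'hdf5', 'netcdf-3', or 'unknown'."""
    if header.startswith(HDF5_SIGNATURE):
        return "hdf5"
    if header[:3] == b"CDF" and header[3:4] in NETCDF3_VERSIONS:
        return "netcdf-3"
    return "unknown"


print(detect_format(b"CDF\x01" + b"\x00" * 4))
print(detect_format(HDF5_SIGNATURE + b"\x00" * 8))
print(detect_format(b"plain text"))
```

A caller would read the first eight bytes of a file, pass them to `detect_format`, and hand the file to the matching library, so that user code never needs to know which format it is reading.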

14 Current Technical Status
Implement netCDF-3 over HDF5, to demonstrate backward compatibility with API and format: done
Determine needed HDF5 enhancements
Prepare netCDF-3 for incorporation with netCDF-4: nearly done
Design netCDF-4 interface to add enhancements made possible with HDF5: in progress
Implement needed HDF5 enhancements
Implement netCDF-4 over enhanced HDF5: not started yet

15 NetCDF-3 Interface Using HDF5
13,000 lines of C code
Passes all netCDF-3 tests
Demonstrates that HDF5 is practical for netCDF-4
Identifies HDF5 enhancements needed
Shows read/write times and file sizes are satisfactory
Validates approach to backward compatibility
  API compatibility: only recompilation and relinking needed for existing netCDF-3 programs
  Format compatibility: accesses all current netCDF files as well as new HDF5 files transparently
A goal for upward compatibility is to provide some of these benefits by merely recompiling existing netCDF applications.

16 NetCDF-3 Enhancements for NetCDF-4
To provide
  stable foundation for incorporating netCDF-4
  smooth transition for current users
Automated multi-platform testing
Documentation converted to maintainable form, new language-independent Users Guide
Added large file support with backward compatibility
Added default format interfaces
Better Windows and .Net support

17 HDF5 Additions for Supporting NetCDF-4
HDF5 enhancements
  numeric type conversions
  zero-dimensional datasets
  overflow handling improvements
  flexible parallel I/O
HDF5 design specifications
  dimension scales for coordinate systems
  shared object proposal

18 Project Schedule
Currently on schedule for a July 2005 release
July 2004: version revised documentation, 64-bit file offsets, default format functions
October 2004: version use of autotools
January 2005: version 3.7.1: netCDF-4 prototype included, support for multiple unlimited dimensions
March 2005: version 4.0.0_beta - test release
July 2005: version first netCDF-4 production release

19 NetCDF-4 Design Issues
Issue: support for coordinate systems in netCDF and HDF5 data models?
  under consideration
Issue: addition of HDF5 Groups abstraction to netCDF data model?
  yes, tentatively
  subset of HDF5 Group features constrained by backward compatibility with netCDF-3
  no Group aliases
  but try to support Variable aliases and Dimension scoping?
Issue: can we just adopt Northwestern/Argonne pnetCDF interface for adding parallel I/O?

20 What remains to be done?
Next for netCDF-4: interface additions for multiple unlimited dimensions, group interfaces, dynamic schema modification, new data types, packed data, parallel I/O, compression
HDF5 enhancements
  zero-length attributes
  shared dimensions
  creation order access for objects
Testing in models (CCSM, WRF, ESMF, ...)

21 Papers, Posters, Presentations
R. Rew, M. Folk, E. Hartnett, and R. McGrath: Plans for an Enhanced NetCDF-4 Interface to HDF5 Data. HDF/HDF-EOS Workshop VII, Silver Spring, September. Poster and presentation.
M. Folk, R. Rew, K. Yang, R. McGrath: NetCDF-4: Combining netCDF and HDF5 Data. AGU Fall Meeting, San Francisco, December. Poster.
R. Rew and E. Hartnett: Merging NetCDF and HDF5. 20th International Conference on Interactive Information Processing Systems (IIPS) for Meteorology, Oceanography, and Hydrology, Seattle, January. Paper and poster.
E. Hartnett: Merging the NetCDF and HDF5 Libraries to Achieve Gains in Performance and Interoperability. Earth Science Technology Conference, Palo Alto, June. Paper and presentation.

22 Excellent Prospects for Success
More software engineering than research
NetCDF-4 web site just announced:
Unidata and NCSA developers collaborating via , teleconferences
On schedule for July 2005 release: e.html
Great interest in status of project!
Ultimate goal to make earth science researchers more productive
... also, lawyers have settled intellectual property issues

23 Questions?

