NetCDF Data Model Details
Russ Rew, UCAR Unidata
NetCDF 2009 Workshop, 2009-08-04


NetCDF and HDF5 Data Models

The netCDF classic data model: simple and flat
- Dimensions
- Variables
- Attributes

The netCDF enhanced data model added
- More primitive types
- Hierarchical groups
- User-defined datatypes
- Multiple unlimited dimensions

The HDF5 data model also has
- Hard and soft links (providing multiple names for things)
- User-defined primitive datatypes
- References (pointers to objects and data regions in a file)
- Attributes attached to user-defined types
- A few other miscellaneous features

The Enhanced NetCDF Data Model

- Additions to classic netCDF data model
- Still a subset of HDF5 data model
- Made possible by adding a few things to HDF5 so netCDF could fit within it
- Criteria for additions to classic model: handling identified classic limitations

[Figure labels: HDF5, netCDF enhanced, netCDF classic]

Classic netCDF data model

A file has variables, dimensions, and attributes. Variables also have attributes. Variables may share dimensions, indicating a common grid. One dimension may be of unlimited length. Variables and attributes use one of six primitive data types.

[UML diagram]
File: location: Filename; create( ), open( ), …
Dimension: name: String; length: int; isUnlimited( )
Attribute: name: String; type: DataType; values: 1D array
Variable: name: String; shape: Dimension[ ]; type: DataType; array: read( ), …
DataType, PrimitiveType: char, byte, short, int, float, double
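The classes in the diagram can be sketched in plain Python. This is only an illustrative model of the classic data model, not the netCDF library API: the class layout follows the diagram, while `check_classic_rules`, the use of `None` to mark the unlimited dimension, and the example names and sizes are assumptions of the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The six primitive types of the classic model, as listed on the slide.
CLASSIC_TYPES = ("char", "byte", "short", "int", "float", "double")


@dataclass
class Dimension:
    name: str
    length: Optional[int]  # assumption of this sketch: None marks the unlimited dimension

    def is_unlimited(self) -> bool:
        return self.length is None


@dataclass
class Attribute:
    name: str
    type: str
    values: list  # a 1-D array of values


@dataclass
class Variable:
    name: str
    type: str
    shape: List[Dimension]  # dimensions are shared by reference, not copied
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class File:
    location: str
    dimensions: List[Dimension] = field(default_factory=list)
    variables: List[Variable] = field(default_factory=list)
    attributes: List[Attribute] = field(default_factory=list)

    def check_classic_rules(self) -> None:
        """Raise ValueError if the file breaks a classic-model rule."""
        if sum(d.is_unlimited() for d in self.dimensions) > 1:
            raise ValueError("only one dimension may be unlimited")
        typed = list(self.attributes)
        for var in self.variables:
            typed.append(var)
            typed.extend(var.attributes)
        for item in typed:
            if item.type not in CLASSIC_TYPES:
                raise ValueError(f"{item.name}: {item.type!r} is not a classic type")


# Two variables on a common grid: they share the same Dimension objects.
time = Dimension("time", None)
lat, lon = Dimension("lat", 73), Dimension("lon", 144)
temp = Variable("temp", "float", [time, lat, lon], [Attribute("units", "char", ["K"])])
pres = Variable("pres", "float", [time, lat, lon])
classic_file = File("example.nc", [time, lat, lon], [temp, pres])
classic_file.check_classic_rules()  # passes: one unlimited dimension, classic types only
```

Sharing `Dimension` objects between `temp` and `pres` is what lets a generic application recognize that both variables lie on the same grid.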

Variables versus Attributes

Characteristics of variables:
- For data
- May be too large for memory
- May be multidimensional
- Support partial access
- Individual values may be changed
- More data may be appended
- May have associated attributes
- Shape specified with shared dimensions

Characteristics of attributes:
- Intended for metadata
- For single values, strings, or small 1-D arrays
- Accessed atomically (written or read all at once)
- Typically values don’t change after creation
- May not have attributes
- Length specified when created

Characteristics of the classic data model

Strengths
- Simple to explain
- Good for discussing data representation issues
- Efficient implementation is possible
- Writing generic applications is practical
- For gridded data, good data representations available
- Shared dimensions are useful

Weaknesses
- Multiple variable-length data structures hard to represent
- Additional conventions required for earth science, e.g. coordinate systems
- Lacks compound data structures
- Lacks nested data structures

Enhanced netCDF data model, for netCDF-4

A file has a top-level unnamed group. Each group may contain one or more named subgroups, user-defined types, variables, dimensions, and attributes. Variables also have attributes. Variables may share dimensions, indicating a common grid. One or more dimensions may be of unlimited length. Variables and attributes have one of twelve primitive data types or one of four user-defined types.

[UML diagram]
File: location: Filename; create( ), open( ), …
Group: name: String
Dimension: name: String; length: int; isUnlimited( )
Attribute: name: String; type: DataType; values: 1D array
Variable: name: String; shape: Dimension[ ]; type: DataType; array: read( ), …
DataType, PrimitiveType: char, byte, short, int, int64, float, double, unsigned byte, unsigned short, unsigned int, unsigned int64, string
DataType, UserDefinedType: typename: String; one of Compound, VariableLength, Enum, Opaque
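As a rough illustration of how groups extend the flat classic model, the following Python sketch models a group tree with user-defined types. It is not the netCDF-4 API: the `Group` and `UserDefinedType` classes, the `walk` method, the use of an empty string for the unnamed top-level group, and the example group and variable names are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple

# The twelve primitive types of the enhanced model, as listed on the slide.
PRIMITIVE_TYPES = (
    "char", "byte", "short", "int", "int64", "float", "double",
    "unsigned byte", "unsigned short", "unsigned int", "unsigned int64", "string",
)
# The four kinds of user-defined type.
USER_TYPE_KINDS = ("Compound", "VariableLength", "Enum", "Opaque")


@dataclass
class UserDefinedType:
    typename: str
    kind: str

    def __post_init__(self) -> None:
        if self.kind not in USER_TYPE_KINDS:
            raise ValueError(f"unknown user-defined type kind: {self.kind!r}")


@dataclass
class Group:
    name: str  # assumption: "" stands for the unnamed top-level group
    types: List[UserDefinedType] = field(default_factory=list)
    variables: Dict[str, str] = field(default_factory=dict)  # variable name -> type name
    unlimited_dimensions: List[str] = field(default_factory=list)  # may hold several
    subgroups: List["Group"] = field(default_factory=list)

    def walk(self, path: str = "/") -> Iterator[Tuple[str, "Group"]]:
        """Yield (path, group) for this group and every nested subgroup."""
        yield path, self
        for sub in self.subgroups:
            yield from sub.walk(path.rstrip("/") + "/" + sub.name)


wind_t = UserDefinedType("wind_vector_t", "Compound")
surface = Group(
    "surface",
    types=[wind_t],
    variables={"wind": "wind_vector_t"},
    unlimited_dimensions=["time", "station"],
)
root = Group("", subgroups=[Group("obs", subgroups=[surface])])
paths = [path for path, _ in root.walk()]
```

A classic file corresponds to the special case of a root group with no subgroups, no user-defined types, and at most one unlimited dimension.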

Characteristics of the enhanced data model

Strengths
- Simpler than HDF5, with similar representational power
- Completely contains and is backward compatible with classic model
- Efficient implementation available
- Fixes identified weaknesses of netCDF classic model
- Incremental adoption of model features possible

Potential weaknesses
- Writing generic applications more difficult
- Types must be defined and named separately from use, even if not shared
- No attributes allowed on compound members

Some details of the enhanced data model

- No attributes permitted for compound type members (because HDF5 doesn’t allow such attributes):

    compound wind_vector_type {
      float eastward;
      float northward;
    }

- Inclusion of user-defined opaque types (why not just use variable-length array of bytes?)
- Type definitions as first-class objects
- Type containment in groups, but global scope for use
- Inheritance through group hierarchy of only dimensions (why not coordinate variables or attributes?)
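The last point, that only dimensions are inherited through the group hierarchy, can be illustrated with a small Python sketch. The `Group` class and its `find_dimension` and `find_attribute` methods are hypothetical stand-ins written for this example, not library calls; the example names and values are also made up.

```python
from typing import Dict, Optional


class Group:
    """Hypothetical stand-in for a netCDF-4 group; not a library class."""

    def __init__(self, name: str, parent: Optional["Group"] = None):
        self.name = name
        self.parent = parent
        self.dimensions: Dict[str, int] = {}
        self.attributes: Dict[str, str] = {}

    def find_dimension(self, name: str) -> Optional[int]:
        """Search this group, then each ancestor in turn, for a dimension."""
        group: Optional[Group] = self
        while group is not None:
            if name in group.dimensions:
                return group.dimensions[name]
            group = group.parent
        return None

    def find_attribute(self, name: str) -> Optional[str]:
        """Attributes are not inherited: only this group is searched."""
        return self.attributes.get(name)


root = Group("")
root.dimensions["station"] = 50
root.attributes["institution"] = "example"
child = Group("surface", parent=root)

inherited_length = child.find_dimension("station")     # found in the parent group
missing_attr = child.find_attribute("institution")     # not visible from the child
```

A variable in the child group can therefore use the `station` dimension defined in the root group, while the root group's attributes stay local to the root.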

Natural convention for assigning attributes to members of a compound type

    types:
      compound wind_vector_t {
        float eastward ;
        float northward ;
      }
      compound wind_vector_units_t {
        string eastward ;
        string northward ;
      }
    variables:
      wind_vector_t wind(station) ;
      wind_vector_units_t wind:units = {"m/s", "m/s"} ;
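The idea behind this convention is that the units attribute is itself a compound value whose member names match those of the data type. A minimal Python sketch of that pairing follows; the `WindVector` and `WindVectorUnits` tuples mirror the two CDL types, and `member_units` is a hypothetical helper, not part of any netCDF library.

```python
from typing import NamedTuple


class WindVector(NamedTuple):  # mirrors compound wind_vector_t
    eastward: float
    northward: float


class WindVectorUnits(NamedTuple):  # mirrors compound wind_vector_units_t
    eastward: str
    northward: str


def member_units(value_type, units) -> dict:
    """Pair each member of the value type with the same-named units member."""
    if value_type._fields != type(units)._fields:
        raise ValueError("units type must have the same member names as the value type")
    return dict(zip(value_type._fields, units))


wind_units = WindVectorUnits("m/s", "m/s")  # wind:units = {"m/s", "m/s"}
units_by_member = member_units(WindVector, wind_units)
```

Matching on member names is what lets a reader of the file recover per-member units even though the members themselves cannot carry attributes.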

Enhancing a Data Model with Backward Compatibility

Benefits
- Data in archives don’t have to change
- Client program sources don’t have to change
- Software can access archived data without being aware of format version

Costs
- Effort required to support older interfaces and formats
- Can’t easily fix mistakes in released interfaces
- Comprehensive compatibility testing needed

Implementation
- Evolve data model incrementally
- Add or grow abstractions, instead of modifying or removing them
- Ensures previous data model is included in enhanced data model

NetCDF-4 classic model: a transitional format

netCDF-3
- Compatible with existing applications
- Simplest data model and API

netCDF-4 classic model
- Uses classic API for compatibility
- Uses netCDF-4/HDF5 storage for compression, chunking, performance
- To use, just recompile, relink

netCDF-4
- Not compatible with some existing applications
- Enhanced data model and API, more complex, powerful
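The trade-off among the three variants can be read as a small decision rule. The sketch below is one possible reading of the slide, not an official recommendation; `choose_format` and its parameter names are hypothetical, and the returned strings are simply the labels used above.

```python
def choose_format(uses_enhanced_model: bool, wants_hdf5_storage: bool) -> str:
    """Pick one of the three variants named on the slide."""
    if uses_enhanced_model:
        # Groups, user-defined types, or multiple unlimited dimensions
        # require the full enhanced data model and API.
        return "netCDF-4"
    if wants_hdf5_storage:
        # Classic API, but netCDF-4/HDF5 storage for compression and chunking.
        return "netCDF-4 classic model"
    # Simplest data model, compatible with existing applications.
    return "netCDF-3"


transitional = choose_format(uses_enhanced_model=False, wants_hdf5_storage=True)
```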

Concluding remarks

- Serious use of netCDF enhanced data model just beginning
- Future adjustments to model, if any, will be made by addition, not modification or deletion of existing features
  - Preserves previous programming interfaces
  - Supports access to previous format variants transparently