Unidata Infrastructure for Data Services Russ Rew GO-ESSP Workshop, LLNL 2006-06-19.


1 Unidata Infrastructure for Data Services Russ Rew GO-ESSP Workshop, LLNL 2006-06-19

2 Some Current Unidata Infrastructure Projects
- LDM for distributing and processing near real-time data
- Integrated Data Viewer (IDV) for testing infrastructure in platform-independent data visualization and analysis
- NetCDF C-based interfaces for data access
- CFIOlib for a CF conventions API (tomorrow)
- NetCDF Java for advanced data access infrastructure
- Common Data Model for improving interoperability
- NcML for metadata annotation and data aggregation
- THREDDS Data Server (TDS) for remote access to archives
- GALEON for serving netCDF data through OGC Web Coverage Services (WCS)

3 LDM-6 for Internet Data Distribution
- Implements a peer-to-peer system for reliable, event-driven data distribution
- Supports subscriptions to many near real-time data feeds; no data center needed
- Data product abstraction is general: model output, observations, text products, satellite data, radar, …
- Protocols use persistent connections to achieve low latency
- Highly configurable: inject, distribute, capture, filter, and process arbitrary data products
- In continuous use by over 160 universities, NOAA, USGS, NASA, internationally, THORPEX global ensembles (TIGGE), …
- Candidate for use in new WMO weather information system
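The subscription and relay idea on this slide can be sketched in a few lines. This is a toy in-process model, not LDM's protocol or API: the `Product` and `Node` classes, the feed name, and the product identifiers are all hypothetical, and the use of regular expressions for filtering is an assumption made for illustration.

```python
import re


class Product:
    """A generic data product: a feed name, an identifier, and opaque bytes."""

    def __init__(self, feed, ident, payload):
        self.feed = feed        # hypothetical feed name
        self.ident = ident      # hypothetical product identifier
        self.payload = payload  # arbitrary bytes; the node never inspects them


class Node:
    """Toy peer that pushes each inserted product to matching subscribers."""

    def __init__(self, name):
        self.name = name
        self._subscriptions = []

    def subscribe(self, feed, pattern, action):
        # Register a callback for products on `feed` whose identifier matches.
        self._subscriptions.append((feed, re.compile(pattern), action))

    def insert(self, product):
        # Event-driven delivery: subscribers are called as soon as data arrives.
        delivered = 0
        for feed, pattern, action in self._subscriptions:
            if feed == product.feed and pattern.search(product.ident):
                action(product)
                delivered += 1
        return delivered


# Two peers: the downstream node subscribes to a filtered subset of the
# upstream feed, then captures whatever it receives.
upstream = Node("upstream")
downstream = Node("downstream")
captured = []
downstream.subscribe("RADAR", r".*", captured.append)
upstream.subscribe("RADAR", r"^siteA/", downstream.insert)

upstream.insert(Product("RADAR", "siteA/scan_0001", b"\x00\x01"))
upstream.insert(Product("RADAR", "siteB/scan_0001", b"\x00\x02"))
print([p.ident for p in captured])
```

Because a subscriber's action can be another node's `insert`, the same mechanism covers both relaying and local capture, which is roughly what "no data center needed" implies.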

4 IDV (Integrated Data Viewer)
- Freely available 100% Java reference application and framework for visualization and analysis of geoscience data
- Provides integrated and time-synchronized 2-D and 3-D visualizations of model outputs, observed, and remotely sensed data, using the University of Wisconsin's VisAD
- Handles diverse formats and protocols for local and remote access: GRIB, netCDF, OPeNDAP, ADDE, HTTP, GIS, …
- Serves as end-to-end test for many Unidata technologies: THREDDS services, Java netCDF, XML bundles, plug-in architecture, interactive collaboration, …

5 NetCDF’s Niche
- Simple data model for scientific datasets
- Portable, self-describing data
- Appendable, sharable, archivable
- Direct access for efficient subsetting
- Metadata via attribute conventions such as CF
- Flexible remote access via OPeNDAP, HTTP, WCS
- Lots of applications: NCO, ncbrowse, ncview, IDV, IDL, MATLAB, ArcGIS, ...
- Language interfaces include C, Java, Fortran, C++, Perl, Python, Ruby, ...

6 NetCDF-3 Data Model

File
  location: Filename
  create( ), open( ), …

Dimension
  name: String
  length: int
  isUnlimited( )

Variable
  name: String
  shape: Dimension[ ]
  type: DataType
  array: read( ), …

Attribute
  name: String
  type: DataType
  values: 1D array

DataType
  char, byte, short, int, float, double

A file has named variables, dimensions, and attributes. Variables also have attributes. Variables may share dimensions, indicating a common grid. One dimension may be of unlimited length. Variables and attributes have one of six primitive data types.
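The data model above maps naturally onto a handful of small classes. The following is a minimal sketch of that model only, not the netCDF library API: the class name `ClassicFile`, the method names, the file name, and the convention of using `None` as the length of the unlimited dimension are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# The six primitive types listed on the slide.
PRIMITIVE_TYPES = ("char", "byte", "short", "int", "float", "double")


@dataclass
class Dimension:
    name: str
    length: Optional[int]  # sketch convention: None marks the unlimited dimension

    def is_unlimited(self) -> bool:
        return self.length is None


@dataclass
class Attribute:
    name: str
    type: str
    values: List  # 1D array of values


@dataclass
class Variable:
    name: str
    shape: Tuple[Dimension, ...]
    type: str
    attributes: Dict[str, Attribute] = field(default_factory=dict)


@dataclass
class ClassicFile:
    location: str
    dimensions: Dict[str, Dimension] = field(default_factory=dict)
    variables: Dict[str, Variable] = field(default_factory=dict)
    attributes: Dict[str, Attribute] = field(default_factory=dict)

    def add_dimension(self, name: str, length: Optional[int] = None) -> Dimension:
        # Enforce the model's rule that at most one dimension is unlimited.
        if length is None and any(d.is_unlimited() for d in self.dimensions.values()):
            raise ValueError("only one unlimited dimension is allowed")
        dim = Dimension(name, length)
        self.dimensions[name] = dim
        return dim

    def add_variable(self, name: str, type: str, dim_names: List[str]) -> Variable:
        if type not in PRIMITIVE_TYPES:
            raise ValueError("unknown primitive type: " + type)
        # Variables refer to the file's Dimension objects, so dimensions are shared.
        var = Variable(name, tuple(self.dimensions[n] for n in dim_names), type)
        self.variables[name] = var
        return var


f = ClassicFile("example.nc")
f.add_dimension("time")  # the single unlimited dimension
f.add_dimension("lat", 73)
f.add_dimension("lon", 144)
temp = f.add_variable("temperature", "float", ["time", "lat", "lon"])
pres = f.add_variable("pressure", "float", ["time", "lat", "lon"])
temp.attributes["units"] = Attribute("units", "char", ["K"])
```

Here `temperature` and `pressure` hold references to the same three `Dimension` objects, which is the sense in which shared dimensions indicate a common grid.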

7 Some NetCDF-3 Limitations
- Only one shared unlimited dimension
- No structures, just scalars and multidimensional arrays
- No strings, just arrays of characters
- Limited numeric types
- No ragged arrays or nested structures
- Only ASCII characters in names
- Changes to file schema can be expensive
- Efficient access requires reads in same order as writes
- No built-in compression
- Only serial I/O
- Flat name space limits scalability

8 NetCDF-4 Features to Address Limitations
- Multiple unlimited dimensions
- Portable structured types
- String type
- Additional numeric types
- Variable-length types for ragged arrays
- Unicode names
- Efficient dynamic schema changes
- Multidimensional tiling (chunking)
- Per variable compression
- Parallel I/O
- Nested scopes using Groups
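A little arithmetic shows why multidimensional tiling helps with the "reads in same order as writes" limitation. The sketch below compares how many separate contiguous runs a row-major (C order) layout needs for a subset against how many tiles a chunked layout touches. The array shape, chunk shape, and function names are hypothetical, and the model ignores caching and compression.

```python
from math import prod


def contiguous_runs(shape, count):
    """Separate contiguous runs needed to read `count` values per dimension
    from a row-major array of the given shape."""
    j = len(shape) - 1
    # Trailing dimensions that are read in full merge into one longer run.
    while j > 0 and count[j] == shape[j]:
        j -= 1
    return prod(count[:j])


def chunks_touched(start, count, chunk_shape):
    """Number of chunks overlapping the hyperslab that begins at `start`
    and spans `count` values along each dimension."""
    n = 1
    for s, c, k in zip(start, count, chunk_shape):
        first = s // k
        last = (s + c - 1) // k
        n *= last - first + 1
    return n


shape = (1000, 100, 100)   # hypothetical (time, lat, lon) variable
chunk = (100, 10, 10)      # hypothetical chunk shape

# Access pattern 1: a full time series at a single grid point.
series_runs = contiguous_runs(shape, (1000, 1, 1))
series_chunks = chunks_touched((0, 5, 5), (1000, 1, 1), chunk)

# Access pattern 2: one complete horizontal slice at a single time.
slice_runs = contiguous_runs(shape, (1, 100, 100))
slice_chunks = chunks_touched((0, 0, 0), (1, 100, 100), chunk)

print(series_runs, series_chunks, slice_runs, slice_chunks)
```

With these numbers the row-major layout needs 1000 separate reads for the time series but only 1 for the slice, while the chunked layout touches 10 and 100 chunks respectively. Chunking trades the best case for a much better worst case, at the cost of reading whole chunks.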

9 NetCDF-4 Data Model (Common Data Access Model)

File
  location: Filename
  create( ), open( ), …

Group
  name: String

Dimension
  name: String
  length: int
  isUnlimited( )

Variable
  name: String
  shape: Dimension[ ]
  type: DataType
  array: read( ), …

Attribute
  name: String
  type: DataType
  values: 1D array

DataType
  PrimitiveType
    char, byte, short, int, int64, float, double, unsigned byte, unsigned short, unsigned int, unsigned int64, string
  UserDefinedType
    typename: String
    Compound, VariableLength, Enum, Opaque

A file has a top-level unnamed group. Each group may contain one or more named subgroups, variables, dimensions, and attributes. Variables also have attributes. Variables may share dimensions, indicating a common grid. One or more dimensions may be of unlimited length. Variables and attributes have one of twelve primitive data types or one of four user-defined types.
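Groups act as nested scopes: a variable can use dimensions defined in its own group or in any enclosing group. A minimal sketch of that lookup follows. The `Group` class, its method names, and the group and variable names are hypothetical illustrations of the model, not the netCDF-4 API, and `None` is again used as the length of an unlimited dimension.

```python
class Group:
    """Toy model of a netCDF-4 group as a nested scope."""

    def __init__(self, name="", parent=None):
        self.name = name          # the top-level group is unnamed
        self.parent = parent
        self.groups = {}
        self.dimensions = {}      # name -> length (None means unlimited)
        self.variables = {}       # name -> (type, tuple of (owner path, dim name))

    def create_group(self, name):
        child = Group(name, parent=self)
        self.groups[name] = child
        return child

    def create_dimension(self, name, length=None):
        # Unlike the netCDF-3 model, any number of dimensions may be unlimited.
        self.dimensions[name] = length

    def find_dimension(self, name):
        """Search this group, then each enclosing group, for a dimension."""
        group = self
        while group is not None:
            if name in group.dimensions:
                return group
            group = group.parent
        raise KeyError(name)

    def create_variable(self, name, type, dim_names):
        shape = tuple((self.find_dimension(d).path(), d) for d in dim_names)
        self.variables[name] = (type, shape)
        return shape

    def path(self):
        if self.parent is None:
            return "/"
        return self.parent.path().rstrip("/") + "/" + self.name


root = Group()
root.create_dimension("time")               # unlimited, visible to all subgroups
model = root.create_group("model_a")
model.create_dimension("station")           # a second unlimited dimension
model.create_dimension("level", 10)
shape = model.create_variable("temperature", "float", ["time", "station", "level"])
print(shape)
```

The variable in `/model_a` resolves `time` in the root group and `station` and `level` locally, while a lookup of `level` from the root group fails because scopes nest downward only.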

10 NetCDF-4 Architecture
- NetCDF-4 uses HDF5 for storage, high performance
  - Parallel I/O
  - Chunking for efficient access in different orders, efficient use of compression
  - Conversion using “reader makes right” approach
- Provides simple netCDF interface to subset of HDF5
- Also supports netCDF classic and 64-bit formats

[Diagram: application layer (NetCDF Java, NetCDF-3, NetCDF-4, and HDF5 applications); library layer (netCDF Java, netCDF-3, netCDF-4, HDF5, Java VM); I/O layer (POSIX I/O, MPI I/O)]
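The “reader makes right” approach means the writer stores values in whatever representation is native to it and records what that representation is, and the reader converts only if its own representation differs. The sketch below shows the idea for byte order using a made-up one-byte tag. The tag format and function names are assumptions for illustration and do not reflect HDF5's actual on-disk layout.

```python
import struct
import sys


def write_values(values, byteorder=sys.byteorder):
    """Store 32-bit integers in the writer's byte order, prefixed by a tag.

    The writer does no conversion: it records its own byte order
    ('<' little-endian, '>' big-endian) and writes the raw values.
    """
    order = "<" if byteorder == "little" else ">"
    return order.encode("ascii") + struct.pack(order + str(len(values)) + "i", *values)


def read_values(buf):
    """Read values written by any writer, converting only as needed.

    The reader inspects the tag and lets struct interpret the payload
    in the recorded byte order, so the result is correct on any host.
    """
    order = buf[:1].decode("ascii")
    n = (len(buf) - 1) // 4
    return list(struct.unpack(order + str(n) + "i", buf[1:]))


data = [1, -2, 300]
from_big = write_values(data, "big")        # simulate a big-endian writer
from_little = write_values(data, "little")  # simulate a little-endian writer

# The stored bytes differ, but every reader recovers the same values.
print(from_big != from_little, read_values(from_big), read_values(from_little))
```

The alternative, converting everything to one canonical order on write, costs a conversion on both sides even when writer and reader share the same architecture.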

11 Status of NetCDF-4
- NetCDF-4.0-alpha14 currently available for testing
  - Files created with alpha release use unsupported artifacts
  - We’re seeking feedback on performance and functionality
- NetCDF-4.0-beta waiting for HDF5 1.8-beta
  - Will finalize file format, eliminate necessity for artifacts
  - Expected within a few weeks of HDF5 1.8-beta release, maybe by August 2006
- HDF5 1.8 currently expected by November 2006
  - Has enhancements specifically for netCDF-4: variable creation order, Unicode names, dimension scales, on-the-fly numeric conversions
- Plans for netCDF-4.1 and beyond on netCDF-4 web site

12 Summary
- Unidata’s LDM-6 implements an event-driven architecture for low-latency data distribution
- Unidata’s IDV provides a platform-independent visualization and analysis framework and reference application for integrating data from diverse sources
- Unidata’s netCDF-4 software preserves backward compatibility and eliminates many limitations of netCDF-3 with only a modest increase in complexity

