Reading HDF family of formats via NetCDF-Java / CDM


1 Reading HDF family of formats via NetCDF-Java / CDM
John Caron UCAR/Unidata

2 NetCDF-Java library
100% Java
Open Source (LGPL, MIT)
Independent implementation
Used as a component in other software (partial):
  Integrated Data Viewer, THREDDS Data Server (Unidata)
  Panoply (NASA)
  ncBrowse (EPIC/NOAA)
  Java NEXRAD Viewer (NCDC/NOAA)
  MyWorld GIS (Northwestern)
  EDC for ArcGIS, ERDDAP (SFSC/NOAA)
  Live Access Server (PMEL/NOAA)
  ncWMS (Reading)
  Matlab plug-in (USGS)

3 NetCDF-Java/CDM architecture
[Architecture diagram. Labels: Application; Scientific Feature Types; Datatype Adapter; NetcdfDataset; CoordSystem Builder; NetcdfFile; THREDDS; Catalog.xml; I/O service provider; OPeNDAP; NcML; NetCDF-3; NetCDF-4; HDF5; NIDS; GRIB; GINI; Nexrad; DMSP]
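One way to picture the I/O service provider layer: each provider looks at the first bytes of a file and claims it if it recognizes the format. The following is a minimal sketch in Python, not NetCDF-Java's actual code; `SIGNATURES` and `sniff_format` are hypothetical names, and only the leading signatures themselves come from the published format specifications.

```python
from typing import Optional

# Leading signatures ("magic numbers") for a few formats named on the slides.
# NetCDF-4 files are HDF5 files, so they share the HDF5 signature.
SIGNATURES = {
    "NetCDF-3": (b"CDF\x01", b"CDF\x02"),
    "HDF5": (b"\x89HDF\r\n\x1a\n",),
    "HDF4": (b"\x0e\x03\x13\x01",),
    "GRIB": (b"GRIB",),
}


def sniff_format(header: bytes) -> Optional[str]:
    """Return the first format whose signature starts the header, else None."""
    for name, signatures in SIGNATURES.items():
        # bytes.startswith accepts a tuple of alternatives.
        if header.startswith(signatures):
            return name
    return None
```

A real reader has to look further than this sketch does. For example, the HDF5 signature may also sit at offset 512, 1024, and so on, not only at the start of the file.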

4 Format Readers (IOSP)
General: NetCDF, HDF5, HDF4, OPeNDAP
Gridded: GRIB-1, GRIB-2, GEMPAK
Radar: NEXRAD 2&3, DORADE, CINRAD, Universal Format
Point: BUFR, ASCII
Satellite: DMSP, GINI, McIDAS AREA
Misc: GTOPO, Lightning, etc.
Others in development (partial): AVHRR, GPCP, GACP, SRB, SSMI, HIRS (NCDC)
Diversity of formats:

5 Lines of Code (est.)

6 Why all the trouble?
~20-40% C/C++ time spent on portability issues
Platform independence:
  Linux, Solaris, Windows (Sun)
  Mac OS X (Apple)
  AIX, Linux, Windows, z/OS (IBM)
  HP-UX (Hewlett-Packard)
Programmer productivity:
  Object-oriented
  Garbage collected: no memory leaks
  Rich libraries
  Open source
Faster than C for some applications

7 Independent implementation
Written entirely from reading HDF4, HDF5 file specifications
Helped debug (HDF5), validate file specs
File format spec is what will be needed in 100 years to read legacy data
OTOH, semantics not always obvious
Don’t confuse reference implementation with the file/protocol specification

8 HDF family of formats
HDF5/NetCDF-4
HDF4
HDF-EOS
Note: read-only, no parallel I/O, etc.

9 HDF5/NetCDF4
Goal is to read all HDF5
Can read all HDF5 files that we have examples of, including references, soft links
Complete coverage difficult to guarantee: combinatoric explosion
Some esoteric features we are skipping: file drivers, external files, szip compression
Working on a comprehensive test harness:
  JNI interface to NetCDF-4/HDF5 library
  Read every byte and compare
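The "read every byte and compare" idea can be sketched as follows. This is an illustration under stated assumptions, not the actual test harness: each reader is assumed to hand back a mapping from variable name to the raw bytes it read, and `compare_readers` is a hypothetical name.

```python
from typing import Dict, List


def compare_readers(reference: Dict[str, bytes],
                    candidate: Dict[str, bytes]) -> List[str]:
    """List the differences between two readers' outputs for the same file.

    Both arguments map variable name -> raw bytes read for that variable.
    Comparing bytes rather than floats avoids the NaN != NaN pitfall.
    """
    problems = []
    for name in sorted(set(reference) | set(candidate)):
        if name not in candidate:
            problems.append(f"{name}: missing from candidate")
        elif name not in reference:
            problems.append(f"{name}: missing from reference")
        elif reference[name] != candidate[name]:
            problems.append(f"{name}: bytes differ")
    return problems
```

An empty list means the two readers agree on every variable. In a harness like the one described, the reference side would be the C library reached through JNI and the candidate side the pure Java reader.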

10 HDF4 / HDF-EOS
Complete, works against all examples
Tested against 400 sample files (27 GB), thanks to Ruth Duerr (NSIDC)
Spot checked against HDFView
Need systematic test to compare reading against the HDF4 C Library

11 Geolocation Primer

12 Swath
Float lat(245, 33477);
Float lon(245, 33477);
Float time(33477);
Float data(245, 33477);
Just know that it's swath data:
  245 points cross track
  33477 along the track
  Each scan has a time coordinate

13 Swath
Float lat(33477, 245);
Float lon(33477, 245);
Float time(33477);
Float data(245, 33477);

14 Swath
Float lat(999, 999);
Float lon(999, 999);
Float time(999);
Float data(999, 999);

15 Swath
Float v1(999, 999);
Float v2(999, 999);
Float v3(999);
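The problem these swath slides build up to can be shown in a few lines. When dimensions have no names, a reader can only guess which axis of the data goes with the time variable by matching lengths. A minimal sketch, assuming shapes are given as tuples (`candidate_time_axes` is a hypothetical helper, not part of any library):

```python
from typing import List, Tuple

Shape = Tuple[int, ...]


def candidate_time_axes(data_shape: Shape, time_shape: Shape) -> List[int]:
    """Indices of data dimensions whose length equals the 1-D time length."""
    (n_time,) = time_shape  # raises ValueError if time is not 1-D
    return [i for i, n in enumerate(data_shape) if n == n_time]
```

For `data(245, 33477)` and `time(33477)` the guess is unique: axis 1. For `data(999, 999)` and `time(999)` both axes match, so size matching alone cannot tell along-track from cross-track.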

16 If you write data
Don’t rely on variable name conventions
Don’t rely on index ordering
Don’t rely on matching index sizes
Minimize “you just have to know that…”

17 Dimensions
Dimensions:
  d1=999;
  d2=999;
Variables:
  float v1(d1=999, d2=999);
  float v2(d1=999, d2=999);
  float v3(d2=999);
  float v4(d2=999, d1=999);

18 Good
Variables:
  float v1(d1=999, d2=999);
    v1:standard_name = "Latitude";
  float v2(d1=999, d2=999);
    v2:standard_name = "Longitude";
  float v3(d2=999);
    v3:standard_name = "Time";
  float v4(d2=999, d1=999);
Data_type = "Swath";
Conventions = "My unique name";
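With named dimensions and a `standard_name` on each coordinate, a reader no longer has to guess. A minimal sketch of that lookup, using the variables from this slide; the `Variable` type and `coordinates_for` function are hypothetical and do not reflect how NetCDF-Java's coordinate system builders are actually written:

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Variable(NamedTuple):
    dims: Tuple[str, ...]               # dimension names, in index order
    standard_name: Optional[str] = None


def coordinates_for(data_name: str,
                    variables: Dict[str, Variable]) -> Dict[str, str]:
    """Map standard_name -> variable name for coordinates of one data variable.

    A variable counts as a coordinate here if it carries a standard_name and
    all of its dimensions are also dimensions of the data variable.
    """
    data_dims = set(variables[data_name].dims)
    found = {}
    for name, var in variables.items():
        if name == data_name or var.standard_name is None:
            continue
        if set(var.dims) <= data_dims:
            found[var.standard_name] = name
    return found


variables = {
    "v1": Variable(("d1", "d2"), "Latitude"),
    "v2": Variable(("d1", "d2"), "Longitude"),
    "v3": Variable(("d2",), "Time"),
    "v4": Variable(("d2", "d1")),
}
coords = coordinates_for("v4", variables)
# The time axis of v4 is found by dimension name, even though d1 == d2 == 999.
time_axis = variables["v4"].dims.index(variables[coords["Time"]].dims[0])
```

Because the match is on dimension names rather than sizes, the index order of `v4` and the equal dimension lengths no longer matter.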

19 If you write data
Unique signature
Specify dimensions
Identify georeferencing coordinates
Identify data type
Units are not optional

20 HDF-EOS, HDF-EOS2
Read “structural metadata” field to obtain more semantics
Parse text in “ODL”
Data type: Swath, Grid, Point
Dimensions
Geolocation coordinate variable types: Latitude, Longitude, Time
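To give a feel for what parsing the ODL text involves, here is a minimal sketch. The sample text is made up for illustration and only loosely modeled on the GROUP/OBJECT nesting of HDF-EOS structural metadata. The parser (`parse_odl`, a hypothetical name) handles only one `key=value` pair per line and ignores multi-line values and parenthesized lists that real files may contain.

```python
from typing import Any, Dict

# Illustrative sample only; not taken from a real file.
SAMPLE = """
GROUP=SwathStructure
  GROUP=SWATH_1
    SwathName="ExampleSwath"
    GROUP=Dimension
      OBJECT=Dimension_1
        DimensionName="d1"
        Size=999
      END_OBJECT=Dimension_1
    END_GROUP=Dimension
  END_GROUP=SWATH_1
END_GROUP=SwathStructure
END
"""


def parse_odl(text: str) -> Dict[str, Any]:
    """Parse simple ODL text into nested dicts; all values stay strings."""
    root: Dict[str, Any] = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "END":
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key in ("GROUP", "OBJECT"):
            child: Dict[str, Any] = {}
            stack[-1][value] = child   # open a nested block
            stack.append(child)
        elif key in ("END_GROUP", "END_OBJECT"):
            stack.pop()                # close the current block
        else:
            stack[-1][key] = value.strip('"')
    return root


tree = parse_odl(SAMPLE)
swath = tree["SwathStructure"]["SWATH_1"]
```

After parsing, `swath["SwathName"]` holds the swath name and `swath["Dimension"]` holds one entry per declared dimension, which is the information a reader needs to name the otherwise anonymous array dimensions.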

21 HDF-EOS, HDF-EOS2
Good:
  Unique signature, identify coordinates and data type
Not so good:
  ODL
  Not using HDF4/5 constructs
Bad:
  No data units
  No time coordinate units!

22 Better EOS
Variables:
  float v1(999, 999);
    v1:standard_name = "Latitude";
    v1:dims = "d1 d2";
  float v2(999, 999);
    v2:standard_name = "Longitude";
    v2:dims = "d1 d2";
  float v3(999);
    v3:standard_name = "Time";
    v3:dims = "d2";
  float v4(999, 999);
    v4:dims = "d2 d1";
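The `dims` attribute proposed on this slide is easy for a reader to consume. A minimal sketch, assuming the attribute is a space-separated list of dimension names in index order (`named_shape` is a hypothetical helper):

```python
from typing import List, Tuple


def named_shape(shape: Tuple[int, ...], dims_attr: str) -> List[Tuple[str, int]]:
    """Pair each dimension length with its name from a 'dims' attribute."""
    names = dims_attr.split()
    if len(names) != len(shape):
        raise ValueError(
            f"dims attribute names {len(names)} dimensions, "
            f"but the variable has rank {len(shape)}"
        )
    return list(zip(names, shape))
```

For `v4` this gives `[("d2", 999), ("d1", 999)]`, so the reader knows that the first index of `v4` runs along the same dimension as `v3`, the time coordinate.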

23 NPP (i1.4.0.3_NPP_QUAL)
Good:
  XML better than ODL
Not so good:
  Not using HDF4/5 constructs
Bad:
  No data units
  No time coordinate units!
Fatal Error: please reboot
  Metadata not in the same file

24 Summary
NetCDF-Java reads entire HDFx family
Good for Java-philes
Needs more testing
Send example files, $
Dimensions are not optional
Keep structural and georeferencing metadata in the same file as the data
Can also have specialized external files

25 Contact Google “netcdf java”

26 NetCDF-4 and Common Data Model (Data Access Layer)

27 Dimension primer
Float lat(180);
Float lon(360);
Float alt(20);
Float time(1200);
Float data(1200, 20, 180, 360);

28 Unique Name!
Float lfip(lfip=180);
Float lflop(lflop=180);
Float zorg(zorg=20);
Float skdf(skdf=1200);
Float dglot(skdf=1200, zorg=20, lfip=180, lflop=180);

29
Float lfip(180);
Float lflop(180);
Float zorg(20);
Float freebish(1200);
Float dglot(1200, 20, 180, 180);

30
Float lat(180);
Float lon(180);
Float alt(20);
Float time(1200);
Float data(1200, 20, 180, 180);

