
Ensuring Long Term Access to Remotely Sensed HDF4 Data with Layout Maps. Mike Folks, The HDF Group; Ruth Duerr, NSIDC


1 Ensuring Long Term Access to Remotely Sensed HDF4 Data with Layout Maps. Mike Folks, The HDF Group; Ruth Duerr, NSIDC

2 Background and basic concept

3 HDF4 is FLEXIBLE, EXTENSIBLE, SELF-DESCRIBING. I’m Plastic Man!

4 But there’s a cost…

5 Complexity!


8 How do we save HDF users from having to deal with all of the complexity under the hood?

9 Through the HDF software libraries, either by using the HDF APIs directly or by using HDF tools that depend on the HDF libraries. But what about the future…

10 There is a risk in depending solely on the HDF libraries to access HDF-formatted data over the long term. It is possible, especially in the distant future, that the libraries may not be available.

11 Really smart people and software? Maybe future data users and their computers will be so smart that the HDF4 format will be a piece of cake.

12 Maybe not.

13 We need an “easy” button

14 “If only we could read HDF data with an independent program that does not rely on the HDF API… A possible approach [would be to] extend hdfls to print a hierarchical map of a data file, [and] write ncdump/hdp-like utilities to find, assemble and write out SDSes and vdatas.” Christopher Lynnes, “Leveraging HDF Utilities”, HDF Workshop X.

15 The project

16 HDF4 mapping
Problem
- The complex internal byte layout of HDF files requires one to use the API to access HDF data.
- This makes long-term readability of HDF data dependent on long-term allocation of resources to support HDF software.
Proposed solution
- Create a map of the layout of data objects in an HDF file, allowing a simple reader to be written to access the data.
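To make the "simple reader" idea concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that a map entry boils down to a byte offset, a byte count, and a number type; the function names, the demo file, and the choice of big-endian 16-bit integers are all hypothetical and are not taken from the project's map specification.

```python
import os
import struct
import tempfile


def read_extent(path, offset, nbytes):
    """Return nbytes raw bytes starting at byte offset in the file."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(nbytes)
    if len(data) != nbytes:
        raise ValueError("file is shorter than the map entry claims")
    return data


def decode_int16_be(raw):
    """Decode raw bytes as big-endian signed 16-bit integers (assumed type)."""
    count = len(raw) // 2
    return list(struct.unpack(">%dh" % count, raw))


def make_demo_file():
    """Write a stand-in data file: 10 filler bytes, then three int16 values."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        f.write(b"\x00" * 10)
        f.write(struct.pack(">3h", 7, -2, 300))
    return path


# Hypothetical map entry: the data object starts at byte 10 and is 6 bytes long.
demo_path = make_demo_file()
try:
    values = decode_int16_be(read_extent(demo_path, offset=10, nbytes=6))
finally:
    os.remove(demo_path)
print(values)  # [7, -2, 300]
```

The point of the sketch is that nothing in it depends on the HDF library: once the location and type of the bytes are known, ordinary file I/O is enough.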

17 HDF4 mapping project activities
1. Assess and categorize HDF4 data held by NASA
- To determine what types of objects to map.
- To get an idea of the magnitude of the project.
2. Develop prototype for proof of concept
- Develop markup-language based layout specification.
- Develop tool to produce layout for an HDF4 file.
- Develop and test two independent tools to read HDF4 data based solely on the map files.

18 Project activities (continued)
3. Assess results and plan next steps
- Present results and options for proceeding to the community.
- Assess the likely usefulness of this approach, as well as any desirable modifications.
- Evaluate the effort required for a full solution that best meets community needs.
- Submit a proposal for the work needed to provide a full solution.

19 1. Assess and categorize

20 How many NASA HDF4 products?
Data Center    HDF4 Products
ASF            0
GES-DISC       236
GHRC           54
ASDC           63
LP-DAAC        67
NSIDC          47
ORNL-DAAC      2
PO.DAAC        22
SDAC           0
MrDC           95
Total          586

21 Data characteristics: Product Characteristics Examined
Product Identification
- Product Name
- Data Level
- Archive Location
- Product Version
Whether the product was multi-file
For HDF-EOS products
- HDF-EOS version
- For point data: number of point data sets; maximum number of levels
- For swath data: number of swaths; maximum number of dimensions; organized by time, space, both, or other; whether dimension maps were used
- For gridded data: number of grids; max number of dimensions in a grid; number of projections used; whether any grids were indexed
HDF Version
For raster data
- Number of 8-bit rasters
- Number of 24-bit rasters
- Number of general rasters
- Whether any rasters had attributes
- Whether any rasters were compressed
- Whether any rasters were chunked
- Whether there were any palettes
For SDS data
- Number of SDSs
- Maximum number of dimensions
- Did any SDS have attributes
- Was any SDS annotated
- Were dimension scales used
- Was compression used and if so what kind
- Was chunking used
For Vdata
- Number of Vdata structures
- Did any Vdata have attributes
- Did any Vdata fields have attributes
- Was compression used and if so what kind
- Was chunking used

22 Other results
- Slightly more than half of the HDF4 products are in HDF-EOS 2 format.
- Grids are the most common HDF-EOS data structures in use.
- No products use a combination of grid, swath, and point data structures.

23 2. Prototype and proof of concept

24 HDF4 mapping prototype workflow
- HDF4 File “H4.hdf” → hmap (linked with HDF4 library) → HDF4 Mapping File (XML document) “H4.hdf.map.xml”
- The mapping file holds: Groups, Data Objects, Structural and Application Metadata; Locations of Object Data
- HDF4 File + HDF4 Mapping File → Reader 1 (C program), Reader 2 (Perl Script) → Object Data
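The reader side of this workflow can be sketched as follows. This is an illustration only: the XML element and attribute names (`map`, `dataset`, `block`, `offset`, `nbytes`) are invented for the example and are not the draft map specification, and an in-memory byte buffer stands in for “H4.hdf”. The sketch shows one plausible reason a map lists locations rather than a single offset: an object's bytes may sit in several separate blocks that the reader must find and assemble.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical map document; element and attribute names are assumptions.
MAP_XML = """
<map file="H4.hdf">
  <dataset name="demo">
    <block offset="4" nbytes="3"/>
    <block offset="20" nbytes="2"/>
  </dataset>
</map>
"""


def list_extents(map_text):
    """Return {dataset name: [(offset, nbytes), ...]} from the map text."""
    root = ET.fromstring(map_text)
    extents = {}
    for ds in root.iter("dataset"):
        extents[ds.get("name")] = [
            (int(b.get("offset")), int(b.get("nbytes")))
            for b in ds.iter("block")
        ]
    return extents


def read_dataset_bytes(fileobj, blocks):
    """Concatenate the byte ranges listed for one dataset, in map order."""
    parts = []
    for offset, nbytes in blocks:
        fileobj.seek(offset)
        parts.append(fileobj.read(nbytes))
    return b"".join(parts)


# Stand-in for the data file: 32 bytes whose value equals their position.
fake_file = io.BytesIO(bytes(range(32)))
extents = list_extents(MAP_XML)
raw = read_dataset_bytes(fake_file, extents["demo"])
print(list(raw))  # [4, 5, 6, 20, 21]
```

Only the standard library is used, which matches the goal of reading the data with no dependency on the HDF API.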

25 Proof-of-concept results
- The HDF Group created prototype map generation software and a draft map specification.
- Map generator was tested on a wide variety of data products.
- GES-DISC and NSIDC independently wrote software that uses maps to read data files in NSIDC’s and GES-DISC’s archives.
- Summary: the concept is feasible!

26 Example map fragment (XML markup not preserved in this transcript; surviving values: 0 255 10 100 2502 4000)

27 Next steps

28 Effort for full implementation
- Finalize map file XML specification
  - Compatibility with existing standards: NCML, XFDU, PREMIS, ESML, DFDL
- Implement production quality mapping tool and API
- Possibly do similar assessment for HDF5 maps.

29 Implementation Processes
Generate maps for existing archives
- GES-DISC approach: append the map XML to the XML files already kept for each file in their archive
- NSIDC non-ECS data implementation: add an XML file for each data file in same directory
- ROM to add capability to NASA ECS systems in process
- Other NASA systems TBD
Generate maps for new data
- Add map generation as a step in the ingest process using stand alone tool
- Request product generation systems to use new API calls that generate maps
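For the "one XML file per data file in the same directory" approach, an archive would need some way to see which data files still lack a map. A minimal sketch of that check is below. It assumes the sidecar naming shown in the workflow slide (“H4.hdf” paired with “H4.hdf.map.xml”) and a `.hdf` extension on data files; the function name and the demo directory are hypothetical, and the step that actually generates a map is deliberately left out.

```python
import tempfile
from pathlib import Path


def files_missing_maps(directory):
    """List .hdf files in directory that have no '<name>.map.xml' beside them."""
    directory = Path(directory)
    missing = []
    for data_file in sorted(directory.glob("*.hdf")):
        map_file = data_file.with_name(data_file.name + ".map.xml")
        if not map_file.exists():
            missing.append(data_file.name)
    return missing


# Demo archive directory: A.hdf already has a map, B.hdf does not.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "A.hdf").write_bytes(b"")
    (root / "A.hdf.map.xml").write_text("<map/>")
    (root / "B.hdf").write_bytes(b"")
    missing = files_missing_maps(root)
print(missing)  # ['B.hdf']
```

The same listing could feed an ingest step for new data, so that a map is produced whenever a file arrives without one.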

30 How you can help
- Consider what it might take to implement this for your archive
- Review the materials on the wiki and elsewhere, and comment heavily!
  - Wiki page added to NASA’s ESDC wiki
  - Project page at The HDF Group website: http://www.hdfgroup.org/projects/hdf4mapping/

31 Thank you. This report is based upon work supported in part by a Cooperative Agreement with the National Aeronautics and Space Administration (NASA) under NASA Award NNX06AC83A. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Aeronautics and Space Administration.

