
Presentation on theme: "JPL/Caltech proprietary. Not for public release or redistribution. This document has been reviewed for export control and it does NOT contain controlled."— Presentation transcript:

1 JPL/Caltech proprietary. Not for public release or redistribution. This document has been reviewed for export control and it does NOT contain controlled technical data. For planning and discussion purposes only.
http://smap.jpl.nasa.gov/
Copyright 2014 California Institute of Technology. Government sponsorship acknowledged.
SMAP – The Automation of ISO Metadata
ESIP – Summer 2014, Copper Mountain, Colorado
Barry Weiss, Vance Haemmerle, Albert Niessner, Hook Hua
Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA
July 10, 2014

2 SMAP – ISO Metadata Automation
SMAP Requirement for Product Metadata
SMAP Level 1 Requirement: SMAP Science Data Product formats shall conform to ISO 19115 “Geographic Information – Metadata”.
ISO metadata must conform to these standards:
–Provide metadata that conforms to the family of ISO 19115 models
–Represent metadata using ISO 19139 compliant serialization
–Ultimate ISO goal: a global standard model in a global standard format
Major Goal: Generate SMAP products that conform to the ISO requirement, while at the same time:
–Ensure that the products are easy to use
–Ensure that the products have a consistent design
–Provide metadata that are easy to locate

3 SMAP – ISO Metadata Automation
Group/Attribute Metadata Structure
Employ HDF5 groups and attributes to represent ISO metadata
Multiple sub-groups under the HDF5 Metadata group
–Groups represent major ISO classes
Attributes map directly to attributes in the ISO classes
–Reduces deeply nested layers within the HDF5 representation
  No more than four nested layers
–In some instances, the design employs modified names of HDF5 groups or attributes to ease user comprehension of the model.
The HDF5 group/attribute structure provides an alternative representation layer of the ISO metadata model
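The group/attribute idea can be sketched without any HDF5 library. In this minimal illustration, nested dictionaries stand in for HDF5 groups and plain values stand in for attributes; the group name `ProcessStep` and the helper names `walk` and `max_group_depth` are assumptions made for the example, not part of the SMAP software.

```python
# Nested dicts stand in for HDF5 groups; non-dict values stand in for attributes.
# This is an illustrative model only, not an HDF5 API.
metadata = {
    "Metadata": {
        "Lineage": {
            "L1A_Radar": {"identifier": "L1A_Radar", "version": "R04001"},
            "Ephemeris": {"version": "01"},
        },
        "ProcessStep": {"algorithmVersionID": "007"},  # hypothetical group name
    }
}


def walk(group, path=""):
    """Yield (attribute_path, value) for every attribute in the tree."""
    for name, value in group.items():
        child = f"{path}/{name}"
        if isinstance(value, dict):
            yield from walk(value, child)
        else:
            yield child, value


def max_group_depth(group):
    """Count nested group layers, including the group passed in."""
    subgroups = [v for v in group.values() if isinstance(v, dict)]
    return 1 + max((max_group_depth(g) for g in subgroups), default=0)


attributes = dict(walk(metadata))
depth = max_group_depth(metadata["Metadata"])
```

A check such as `depth <= 4` is one way a product design could enforce the "no more than four nested layers" rule stated on the slide.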

4 SMAP – ISO Metadata Automation
The Challenge
Missions need to transfer the complete set of the product specific metadata into the ISO model using the appropriate serialization
–Most missions generate a large number of products daily
–Most missions have latency requirements
Need to automate the production of ISO compliant metadata within the product generation software stream
–The metadata must provide users with relevant information about the associated products
–The metadata must ensure ingestion of data products in designated project archives

5 SMAP – ISO Metadata Automation
SMAP Metadata Handling
[Diagram. Recoverable labels:]
–Data Producer/Consumer Code
–ISO 19115 Object Representation In Code
–Read/Write in XML: ISO 19139 XML
–Read/Write in HDF5: ISO 19115 Representation In Native HDF5
–Access XML Content Inside HDF5: ISO 19139 XML Representation In HDF5
–ISO 19139 Schema and XML: interoperable metadata interchange format
–ISO 19115 in HDF5: metadata as hierarchical group/attributes, and metadata as embedded ISO 19139 XML stream

6 SMAP – ISO Metadata Automation
Two Approaches
XSLT/Saxon approach
–Generate metadata in group/attribute form, pass to HDF5 elements
–Spawn a process that generates HDF5 metadata
–Employ XSLT/Saxon to transfer metadata into 19139 compliant serialization
–Store 19139 serialized metadata in SMAP products
Data binding approach
–Generate metadata in group/attribute form, pass to HDF5 elements
–Fill ISO elements using precompiled data binding code
–Serialize the ISO elements
–Store 19139 serialized metadata in SMAP products

7 SMAP – ISO Metadata Automation
XSL SMAP Tool Chain
[Flow diagram. Recoverable labels:]
–Metadata Configuration File (SMAP Specific XML)
–Output Configuration File (SMAP Specific XML)
–SMAP Science Processing Software
–SMAP Product in HDF5 with Metadata in Group/Attribute Structure
–h5dump
–Complete Group/Attribute Structure in HDF5 XML
–saxon
–XSL transform that maps HDF5 XML to ISO 19139 XML
–Automated Metadata in ISO 19139 Compliant Serialization
–Curated Series Metadata in ISO 19139 Compliant Serialization
–merge
–SMAP Product in HDF5 with Metadata in Group/Attribute Structure and in ISO 19139 Compliant XML
Key: Discrete Files, SMAP Data Product, Executable Software

8 SMAP – ISO Metadata Automation
XSL Transform Features
Requires multiple executables to operate
–SMAP SPS software spawns stream that runs h5dump and saxon
Requires an XSL transform for each data product the mission delivers
–New structures need to be added to each XSL transform file
Defines each instantiation of each class distinctly
–ISO employs many of the same classes multiple times
  One example is Lineage/LE_Source
–This class details the product pedigree
–Use of XSL transform enables distinct definition of each instantiation of the class
Designer can provide broader definition of the approach
–Code developer can infer necessary XPaths based on design specification

9 SMAP – ISO Metadata Automation
Advantages/Disadvantages
Advantages
–Tailors a specific XSLT for each data product.
  More likely to adapt quickly to changes that impact some, but not all, products
  Advantageous for projects/missions with a few similar data products
–Does not require designer to specify granule at level of XPath
Disadvantages
–Requires the use of a complex software chain
–The need to tailor the XSLT for different products can engender “cut and paste” errors

10 SMAP – ISO Metadata Automation
XML Data Binding
Represents information in an XML document as an object in memory
Leverages the model in an XSD Schema to create classes and interfaces that adhere to the information structure defined by the schema
Enables serialization/deserialization of XML instances to/from code
Supports expected variations in the ISO model
[Figure label: Code Synthesis]
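As a rough illustration of what data binding means, the sketch below binds a tiny XML fragment to an in-memory object and back. The class `SourceRecord` and its element names are hypothetical and hand-written here; a real binding tool would generate such classes from the XSD schema, and nothing below reflects the SMAP library itself.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class SourceRecord:
    """Hand-written stand-in for a schema-generated binding class (hypothetical)."""
    identifier: str
    description: str

    def to_xml(self) -> str:
        # Serialization: object in memory -> XML text.
        root = ET.Element("source")
        ET.SubElement(root, "identifier").text = self.identifier
        ET.SubElement(root, "description").text = self.description
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text: str) -> "SourceRecord":
        # Deserialization: XML text -> object in memory.
        root = ET.fromstring(text)
        return cls(root.findtext("identifier"), root.findtext("description"))


record = SourceRecord("L1A_Radar", "Parsed and reformatted SMAP radar telemetry.")
xml_text = record.to_xml()
round_trip = SourceRecord.from_xml(xml_text)
```

The round trip returns an object equal to the original, which is the property that lets product code work with typed objects while the XML form is produced mechanically.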

11 SMAP – ISO Metadata Automation
SMAP Data Binding Flow
[Flow diagram. Recoverable labels:]
–Metadata Configuration File (SMAP Specific XML)
–Output Configuration File (SMAP Specific XML)
–SMAP HDF5 Group/Attribute Metadata Generation Code
–SMAP Product in HDF5 with Metadata in Group/Attribute Structure
–XPath, Metadata Value, Object
–Precompiled Data Binding Methods with ISO XPaths
–Fill Method
–Library of Software Objects
–Serialization Method
–Automated Metadata in ISO 19139 Compliant Serialization
–Curated Series Metadata in ISO 19139 Compliant Serialization
–merge
–SMAP Product in HDF5 with Metadata in Group/Attribute Structure and in ISO 19139 Compliant XML
Key: Discrete Files, SMAP Data Product, Software Modules

12 SMAP – ISO Metadata Automation
Data Binding Features
Requires a build of a very large library in advance of implementation
–Library contains wrappers that translate XPaths into the ISO serialization.
–Library must encompass the entire model in use
–SMAP library for data binding alone is well over 1 GByte
Runs entirely within one executable
Employs a single model that applies to all executables
Approach needs to distinguish among multiple instantiations of the same class
–SMAP implementation incorporates indices in the XPath to differentiate between multiple instantiations of the same class
–Logic must exercise care to pass the right data value to the appropriate instantiation of a given class
  Need to pass ephemeris based information to the LE_Source instantiation that represents the ephemeris file
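The role of indices in the XPath can be sketched with a small fill routine. This is an assumption-laden illustration, not the SMAP wrapper library: the function `fill`, the simplified tag names, and the omission of XML namespaces are all choices made for the example. Each path step may carry a 1-based index that selects which instantiation of a repeated class receives the value.

```python
import re
import xml.etree.ElementTree as ET

# One path step: a tag name with an optional 1-based index, e.g. "source[2]".
STEP = re.compile(r"^(?P<tag>[^\[\]]+)(?:\[(?P<index>\d+)\])?$")


def fill(root, path, value):
    """Set the text at an indexed path, creating missing elements on the way."""
    node = root
    for step in path.split("/"):
        match = STEP.match(step)
        if match is None:
            raise ValueError(f"unsupported path step: {step!r}")
        tag = match["tag"]
        index = int(match["index"] or 1)  # XPath positions are 1-based
        if index < 1:
            raise ValueError(f"index must be >= 1 in step: {step!r}")
        siblings = node.findall(tag)
        while len(siblings) < index:  # create instances up to the index
            siblings.append(ET.SubElement(node, tag))
        node = siblings[index - 1]
    node.text = value


root = ET.Element("LI_Lineage")
fill(root, "source[1]/LE_Source/description", "Level 1A radar product")
fill(root, "source[2]/LE_Source/description", "Spacecraft ephemeris")
fill(root, "source[2]/LE_Source/identifier", "Ephemeris")
sources = root.findall("source")
```

Passing `source[2]` rather than `source[1]` is what keeps the ephemeris values out of the Level 1A instantiation, which is the kind of care the slide asks of the calling logic.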

13 SMAP – ISO Metadata Automation
Advantages/Disadvantages
Advantages
–More generalized approach
  More adaptable to larger number of data products
–Avoids cut/paste issues
–Automated approach can apply to larger number of products
Disadvantages
–Requires more design preparation
  Developer needs to provide complete XPath for every variable
–Incorporation of new or modified metadata elements requires new wrapper software
–Requires careful correlation among multiple instantiations of the same class
–Users need to be aware of the size of the library
  Use appropriate hardware resources to run

14 SMAP – ISO Metadata Automation
Recommendations
Larger missions with many products, particularly if the product designs are highly varied, should consider the data binding approach
–Initial implementation is more difficult to design and build
–Once constructed, can be more flexibly applied to all products
–Develop this method early to better locate omissions and errors in the implementation
Smaller missions with a few products, or products that are highly similar, should consider the XSLT approach
–Easier to implement for a single product
–Need not deal with repetitive tailoring of the XSL transform for multiple products

15 SMAP – ISO Metadata Automation
Backup

16 SMAP – ISO Metadata Automation
Metadata Coverage
Product metadata – applies to the entire content of a data granule
–Mission specific information
–Spatial and time boundary information
–Data version information – algorithm, Science Processing Software (SPS), Science Data System (SDS) release, HDF5 version
–Granule lineage or pedigree
  Lists of the inputs that were used to generate a data granule
–Technical parameters that apply to the entire data granule
  Orbit mechanical data
  Instrument specific information
  Small tables of calibration and/or algorithmic coefficients
  Algorithmic parameters and options
–Data quality and completeness
–References to related documentation
Local metadata – applies to particular arrays in the product.
–Maxima, minima, units, dimension definitions, identification of statistical methods

17 SMAP – ISO Metadata Automation
ISO 19139 Serialization
SMAP divides the ISO serialized metadata into two discrete packages:
–Dataset metadata XML is auto-generated with each executable instance
  SMAP software inserts auto-generated metadata into a single attribute in the HDF5 /Metadata Group named “iso_19139_dataset_xml”
  SMAP SDS delivers auto-generated metadata along with each product in a separate file to the Data Center
–Series metadata XML is curated
–Update the XML with each delivery
  SMAP software inserts curated series metadata into a single attribute in the HDF5 /Metadata Group named “iso_19139_series_xml”
  SMAP SDS delivers curated series metadata to the Data Center in a separate file before the SDS begins product delivery with each new release

18 SMAP – ISO Metadata Automation
XML Schema Definition (XSD)
ISO 19139 codifies XML representation of ISO 19115
XSD provides:
–A full description of the XML structure
–Specification of permissible values in an XML document
Enables validation
–Can be used to validate the structure and the value of an XML metadata instance
Can leverage existing UML and XSD documents of ISO 19115

19 SMAP – ISO Metadata Automation MI_Metadata Class

20 SMAP – ISO Metadata Automation DQ_Quality Class

21 SMAP – ISO Metadata Automation SMAP LI_Lineage

22 SMAP – ISO Metadata Automation CI_Citation Class and Subclasses

23 SMAP – ISO Metadata Automation
Multiple Instantiations – Lineage in SMAP Products
SMAP products employ a large number of input data sets.
The Level 1C Radar Product employs the following input data sets:
–SMAP Level 1A Radar Product
–Spacecraft Ephemeris
–Spacecraft Attitude
–Spacecraft Antenna Azimuth
–Spacecraft Clock to UTC Correlation
–Short Term Calibration Data
–Long Term Calibration Data
–Total Electron Content in the Ionosphere
–Digital Elevation Map
–Antenna Pattern
–Block Floating Point Quantization Decoder
Each source requires an instantiation of the LI_Lineage/LE_Source class
–In the Group/Attribute structure, these elements fall in the /Metadata/Lineage HDF5 Group
–Subgroup names reflect the input product described in the group

24 SMAP – ISO Metadata Automation
Lineage Example
Radar Level 1C metadata contain 11 instances of LE_Source, one for each input data set
The identifier that specifies the Lineage element for each instantiation is in: LE_Source/sourceCitation/CI_Citation/identifier/MD_Identifier/code
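A minimal sketch of how a reader might use that identifier path to pick one LE_Source out of several. The sample document is built in the code and is deliberately stripped down (it is not schema-valid ISO 19139 and is not taken from a SMAP product); the helper names `source_xml` and `find_source` are hypothetical. The `gmd`, `gmi` and `gco` namespace URIs are the ones used by the 2005 ISO 19139 schemas.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gmi": "http://www.isotc211.org/2005/gmi",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Path from an LE_Source element down to its identifier code.
CODE_PATH = ("gmd:sourceCitation/gmd:CI_Citation/gmd:identifier/"
             "gmd:MD_Identifier/gmd:code/gco:CharacterString")


def source_xml(code, description):
    """Build one simplified LE_Source entry (illustrative, not schema-valid)."""
    return (
        "<gmd:source><gmi:LE_Source>"
        f"<gmd:description><gco:CharacterString>{description}"
        "</gco:CharacterString></gmd:description>"
        "<gmd:sourceCitation><gmd:CI_Citation><gmd:identifier><gmd:MD_Identifier>"
        f"<gmd:code><gco:CharacterString>{code}</gco:CharacterString></gmd:code>"
        "</gmd:MD_Identifier></gmd:identifier></gmd:CI_Citation></gmd:sourceCitation>"
        "</gmi:LE_Source></gmd:source>"
    )


def find_source(lineage, code):
    """Return the LE_Source whose identifier code matches, or None."""
    for source in lineage.iterfind("gmd:source/gmi:LE_Source", NS):
        if source.findtext(CODE_PATH, namespaces=NS) == code:
            return source
    return None


xmlns = " ".join(f'xmlns:{prefix}="{uri}"' for prefix, uri in NS.items())
document = (f"<gmd:LI_Lineage {xmlns}>"
            + source_xml("L1A_Radar", "Level 1A radar product")
            + source_xml("Ephemeris", "Spacecraft trajectory")
            + "</gmd:LI_Lineage>")
root = ET.fromstring(document)

ephemeris = find_source(root, "Ephemeris")
description = ephemeris.findtext("gmd:description/gco:CharacterString", namespaces=NS)
```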

25 SMAP – ISO Metadata Automation
ISO Metadata Structure Example
ISO 19115 Group/Attribute Model for Lineage in the SMAP L1C Radar Product
/Metadata/Lineage/
  L1A_Radar
    DOI = http://dx.doi.org/10.5067/smap/radar/data100
    creationDate = 2015-05-30
    description = Parsed and reformatted SMAP radar telemetry. The Level 1A Product contains both synthetic aperture radar data and real aperture radar data. The product also includes loopback data as well as health and status data.
    fileName = SMAP_L1A_Radar_00016_A_20150530T160100_R04001_001.h5
    identifier = L1A_Radar
    version = R04001
  Ephemeris
    creationDate = 2015-05-29
    description = One or more data products that list the spacecraft trajectory over the same time period as the input Level 1A radar data.
    fileName = traj_SPK_1505291400_1512291400_1505311200_sci_OD0945_v01.bsp
    version = 01
  AntennaAzimuth
    creationDate = 2015-05-30
    description = One or more data products that specify the azimuth angle of the antenna on the SMAP spacecraft over the same time period as the input Level 1A radar data.
    fileName = smap_ar_150530153500_150530172515_v01.bc
    version = 01
  Attitude
    ……….

26 SMAP – ISO Metadata Automation Model Complexity – Locating Algorithm Parameters

27 SMAP – ISO Metadata Automation
Locating Algorithm Parameters
ISO 19115 Group/Attribute Model for Process Step in the SMAP L1C Radar Product
Process Step
  RFI_Threshold = 2.0
  FaradayRotationThreshold = 1.4 degrees
  waterBodyThreshold = 30 percent
  timeVariableEpoch = J2000
  epochJulianDate = 2451545.00
  epochUTCDateTime = 2000-01-01T11:58:55.816Z
  parameterVersionID = 004
  algorithmTitle = Soil Moisture Active Passive Synthetic Aperture Radar processing algorithm
  algorithmVersionID = 007
  algorithmDate = 2015-05-31
  ……….
Provides Direct Access to Critical Metadata Elements within the HDF5 Structure
Items in Red are Additional Attributes. Represented in XML as Record/Record Types
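The three epoch attributes on this slide are mutually consistent, which a short calculation can confirm. The sketch assumes the standard definition of J2000 (Julian date 2451545.0 in Terrestrial Time), the fixed offset TT − TAI = 32.184 s, and TAI − UTC = 32 s on 2000-01-01; the variable names are chosen for this example only.

```python
from datetime import datetime, timedelta

# J2000 is defined as noon on 2000-01-01 in Terrestrial Time (TT).
j2000_tt = datetime(2000, 1, 1, 12, 0, 0)

# Julian date of that instant: proleptic Gregorian ordinal plus a fixed offset.
julian_date = j2000_tt.toordinal() + 1721424.5 + j2000_tt.hour / 24

TT_MINUS_TAI = 32.184   # seconds, fixed by definition
TAI_MINUS_UTC = 32.0    # seconds, leap-second total in effect on 2000-01-01

# Convert the TT epoch to UTC by removing both offsets.
epoch_utc = j2000_tt - timedelta(seconds=TT_MINUS_TAI + TAI_MINUS_UTC)
stamp = epoch_utc.isoformat(timespec="milliseconds") + "Z"
```

Here `julian_date` reproduces epochJulianDate and `stamp` reproduces epochUTCDateTime, so the odd-looking 11:58:55.816 is simply noon TT expressed in UTC.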

28 SMAP – ISO Metadata Automation
Global Metadata – ISO 19115
“Geographic Information - Metadata” from the International Organization for Standardization
Provides a standardized means to describe Earth data
Provides a means to make products “self descriptive and independently understandable”
Incorporates all of the major categories required for a complete set of global metadata for each product granule
Incorporates all of the major categories required to generate a complete set of collection metadata.
Enables fulfillment of the requirement “to correlate, interoperate and integrate SMAP data products with those generated by disparate sources”.
Uses standardized XML serialization to ease portability to the wider user community. Standard specified in ISO 19139.

29 SMAP – ISO Metadata Automation
CF Convention – Local Metadata
The Climate and Forecast (CF) convention is a highly descriptive metadata convention with a widespread science user community
–CF is designed specifically to fit within attributes in netCDF files.
–CF is based upon the Cooperative Ocean/Atmosphere Research Data Service (COARDS) standard
The CF convention includes:
–A standard to provide descriptive names for each variable in the product
–Standards for the specification of data units for each variable in the product
  UDUNITS provides a list of supported unit names
–Standards for fill values for each variable in the product
–Standards to express the range of data for each variable in the product
–Standards to express bit flag definitions and define flag values
–Standards to specify relationships between spatial and time coordinates for each variable in the product
  Indicates which particular spatial or temporal coordinates correspond with which dimension axes and indices of a data variable.
–Standards to specify statistical methods that were used to calculate each variable in the product
  Clarifies temporal or spatial intervals that were used to provide statistical results.
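The CF items listed above correspond to a small set of per-variable attributes. The sketch below uses CF attribute names (`long_name`, `units`, `_FillValue`, `valid_min`, `valid_max`, `coordinates`, `cell_methods`, `flag_masks`, `flag_meanings`), but the variables, values, and helper functions are invented for illustration and do not come from a SMAP product.

```python
# Hypothetical local metadata for one data array, using CF attribute names.
height_attrs = {
    "long_name": "Surface height above the reference ellipsoid",
    "units": "m",
    "_FillValue": -9999.0,
    "valid_min": -500.0,
    "valid_max": 9000.0,
    "coordinates": "latitude longitude",
    "cell_methods": "time: mean",
}

# Hypothetical bit-flag variable: each mask pairs with one meaning, in order.
quality_attrs = {
    "flag_masks": [1, 2, 4],
    "flag_meanings": "first_condition second_condition third_condition",
}


def usable(value, attrs):
    """True when the value is not fill and lies within the valid range."""
    if value == attrs["_FillValue"]:
        return False
    return attrs["valid_min"] <= value <= attrs["valid_max"]


def decode_flags(bits, attrs):
    """List the meanings whose mask bit is set in the flag word."""
    meanings = attrs["flag_meanings"].split()
    return [name for mask, name in zip(attrs["flag_masks"], meanings) if bits & mask]


good = usable(120.0, height_attrs)
filled = usable(-9999.0, height_attrs)
flags = decode_flags(5, quality_attrs)
```

Because the attributes travel with each array, generic code like `usable` and `decode_flags` can interpret a variable without product-specific knowledge.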

30 SMAP – ISO Metadata Automation
Dataset Metadata
Developed an XSLT that maps the HDF5 group/attribute metadata in each data product granule into a representation that complies with ISO 19139 XML encoding
–Near the completion of each executable run, the SMAP software:
  Dumps the group/attribute metadata into HDF5 XML.
  Executes the open source Saxon XSLT engine to convert HDF5 XML to ISO 19139 XML.
  Incorporates the ISO 19139 compliant dataset metadata into an HDF5 attribute in the output data product granule
  Incorporates the curated ISO 19139 series metadata into a separate HDF5 attribute
–The SMAP mission delivers the ISO 19139 compliant dataset metadata to the Data Centers in two forms
  Embedded in the data product metadata for the user community
  In a collocated file for Data Center ingestion
–The separate file does not travel with the product
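The dump-then-transform steps might be driven along the following lines. This is a sketch under assumptions, not the SMAP software: the function names, the intermediate file names, and the jar and stylesheet paths are hypothetical. It relies only on the `--xml` option of h5dump and the `-s:`, `-xsl:` and `-o:` options of the Saxon command line. For simplicity it dumps the whole file rather than only the metadata groups, and it omits the final step of writing the result into an HDF5 attribute.

```python
import subprocess
from pathlib import Path


def dump_command(product):
    """Command that writes an XML description of an HDF5 file to stdout."""
    return ["h5dump", "--xml", str(product)]


def saxon_command(saxon_jar, source_xml, stylesheet, output_xml):
    """Command that applies an XSL transform with the Saxon command line."""
    return ["java", "-jar", str(saxon_jar),
            f"-s:{source_xml}", f"-xsl:{stylesheet}", f"-o:{output_xml}"]


def run_chain(product, saxon_jar, stylesheet, workdir="."):
    """Dump HDF5 XML, then transform it; returns the path of the ISO XML file."""
    hdf5_xml = Path(workdir) / "metadata_hdf5.xml"      # hypothetical file name
    iso_xml = Path(workdir) / "metadata_iso19139.xml"   # hypothetical file name
    with open(hdf5_xml, "w") as out:
        subprocess.run(dump_command(product), stdout=out, check=True)
    subprocess.run(saxon_command(saxon_jar, hdf5_xml, stylesheet, iso_xml), check=True)
    return iso_xml


# Building the commands does not require either tool to be installed.
example_dump = dump_command("product.h5")
example_transform = saxon_command("saxon.jar", "in.xml", "smap_to_iso.xsl", "out.xml")
```

Calling `run_chain` requires h5dump, Java and a Saxon jar on the host; `check=True` makes a failure in either step raise instead of silently producing an incomplete product.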

31 SMAP – ISO Metadata Automation
Curated Series Metadata
Systems Engineers curate the series metadata for each data product
–Model is ISO 19115 compliant with a few SMAP extensions
–Encoding is ISO 19139 compliant
–One file represents a specific SMAP data product for each build
The SMAP SDS delivers the curated series metadata to ESDIS with each build.
–This delivery enables ingestion of data products at the Data Centers
SMAP software automatically incorporates the entire series metadata into a single HDF5 attribute in each data product granule

