MOIMS Internet Packaging and Registries WG XML Formatted Data Units (XFDU) XML Packaging of Binary and Text Data Lou Reich NASA/CSC MOIMS Plenary May 10, 2004


1 MOIMS Internet Packaging and Registries WG XML Formatted Data Units (XFDU) XML Packaging of Binary and Text Data Lou Reich NASA/CSC MOIMS Plenary May 10, 2004

2 XML Packaging Standard Rationale
Physical media → Electronic Transfer
No standard language for metadata → XML
Homogeneous Remote Procedure Call → CORBA, SOAP
Little understanding of long-term preservation → OAIS RM
Record formats → Self describing data formats
New Requirements:
  describe multiple encodings of a data object
  better describe the relationships among a set of data objects

3 Functionality by Release
Version 1 should include:
  Support for XML document and ZIP/JAR type files
  The capabilities of current SFDU packaging and CCSDS Control Authority Concepts
  Support for data descriptions, MIME types, self describing formats, and detached data descriptions
  Flexible Metadata Model (supports producer view of Metadata types)
  Support for the OAIS RM Information Model Concepts and Types
  Flexible linkage to Metadata
  The ability to encapsulate related files/resources into a single file/container
  The ability to reference both content and metadata resources contained in the same container or at a known URL
  The ability to allow/reverse multiple transformations on files
  Behaviors: Web Service Interfaces, Portable Code

4 Functionality by Release
Version 2 Functionality:
  Enabled Behavior
    Automatic execution
    Scripting (output of one behavior as input to another behavior)
  Software Updates
  Process Definition
  Relationship Definition
  Support for SOAP with attachments with no mandated packaging of files into a single object

5 Environment View of XFDU

6 Logical View (diagram)
An xfdu contains a metadataSec (metadata objects, each a metadataObject), a dataObjSec (data objects, each a dataObject), an ipMapSec (structure map of ContentUnit elements) and a behaviorSec (behavior metadata).
Metadata category pointers and their classes:
  REP: DED, SYNTAX, OTHER
  PDI: CONTEXT, PROVENANCE, REFERENCE, FIXITY, OTHER
  DMD: DESCRIPTION, OTHER
  OTHER: class ANY

7 DIAGRAM of XFDU XML SCHEMA

8 Expressions of Interest
Interest in participation in 2004 Prototype integration:
  Planetary Data System (JPL)
  National Space Science Data Center (NSSDC)
  Deep Space MS Packaging Prototype (JPL)
  GSFC Library (GSFC)
  ESA: Data Distribution System
  CNES: Archives (e.g., SIPAD)
General Interest But No Current Commitment:
  GMSEC NASA/GSFC Code 581
  HEASARC and Virtual Observatory
  EOSDIS Metadata Clearinghouse (ECHO)
  International Virtual Observatories

9 Status of XML Formatted Data Unit Structure and Construction Rules
Interoperability Profile developed at the RAL Workshop.
The Workshop noted that agreed resources must be committed.
Working Group editor and Toolkit prototype lead funding was discontinued from 11/2003 to 2/2004.
No progress was made in the IPR WG during that timeframe.
A new draft of the XFDU Proposed Recommendation should be approved for TSG Review at this Workshop.
Only prototype and testing activities will be able to improve the current solution.

10 Review of IPR Charter

11 Required Resources
Lead agency: NASA or CNES editor.
Staffing needed:
  WG lead (NASA 25%)
  WG deputy (NASA 15%)
  Recommendations Editors (CNES 30%, NASA 30%)
  WG Contributors: 10% per WG member
  Testing Coordinator: 20%
  Prototype developers: 50% (NASA 1, CNES 1, ESA 0.5, BNSC 0.x), starting ASAP
  Integrators: 25% for 3 months, then 15% continuing, at least 1 per environment (NASA 3+, CNES 2+, ESA 2+)

12 Risks
Resources, Resources, Resources
  Regain momentum from the Working Group shutdown
  Without resources we cannot progress with multi-agency testing efforts
Programmatic Risk Management
  The Packaging Recommendation functionality has been split between two planned releases of the XFDU Packaging Recommendation to allow early prototyping of required capabilities.
  A wide variety of use cases and testing environments, including but not limited to:
    NASA PDS
    NASA/EOSDIS
    Libraries
    NASA SLE implementations
    CNES SLE implementations
    CNES Archive Ingest SIP development
    ESA Data Distribution System
    ESA CAOS

13 Registries Packaging partners (PDS, GSFC Library, ESA-DDS, various SLE implementations etc) should give us a good feel for a number of repositories. NASA wants to make XML descriptions of all its data available from a single logical repository. Work in other areas suggests that the ebXML registry will be a good fit to all CCSDS repository requirements. An Open Source implementation is available which some say is sufficiently mature for operational use. NASA/CSC is installing the ebXML s/w and will report back on its experience with this. At the Fall 2004 meeting a joint meeting with the Information Architecture BOF/WG will be essential to avoid duplication of work.

14 Backup Slides

15 MOIMS CCSDS ORGANIZATIONAL VIEW

16 Conceptual View of Information Package (diagram labels: File system, Manifest, Package Interchange File, External Packages)

17 Logical View of XFDU Package (diagram label: Information Package Map)

18 XML SPY DIAGRAM of XFDU XML SCHEMA

19

20 XML Schema for Metadata Linkage

21 XML Schema for Information Object

22 Data/Metadata Linkages Requirements
Data Objects that are contained in the manifest are to be encoded in base64 or XML.
Data Objects that are included by reference from the manifest are to exist as files in the XFDU package or as files with known URIs, either in a repository or in a location accessible via URL.
Metadata objects that are contained in the manifest are to be encoded in base64 or XML.
Metadata objects that are included by reference from the manifest are to exist as files in the XFDU package or as files with known URIs, either in a repository or in a location accessible via URL.
Information Objects can reference applicable Metadata objects by ID, where the name of the referencing attribute is used to classify the Metadata and the schema enables identification of the source of the metadata.
Allow metadata objects to be treated as data objects to enable direct mapping to the OAIS representation net, where each metadata object is an information object containing both data object and representation information.

23 XML Schema for Metadata Linkage

24 XML Schema for Digital Object

25 Development Approach
Develop Draft Concept Paper and XML Schemas for internal review
Use automated tool (JAXB) to develop Java classes from XML schema
Modify XML Schema based on internal review comments and on issues found in the Java class implementations
Develop draft CCSDS White Book for Working Group Review
Begin staged implementation of API layer and crude GUI of a packaging toolkit
  Toolkit should provide useful functionality at a very early stage for demonstration to interested parties
Present to Working Group for review and prototype commitments
Develop specialization of schema that all international prototyping efforts agree to support

26 Technical Drivers
Use of XML based technologies
Designed to be extensible to include new XML technologies as they emerge
Linkage of data and software
Direct mapping to OAIS Information Models
Support both media and network exchange
Support for multiple encoding/compression on individual objects or on entire package
Mapping to current SFDU Packaging and Data Description Metadata where possible
Maximal use of existing standards and tools from similar efforts

27 Packaging Mechanisms: Single XML Document
Simplest case
All binary content must be encoded (base64 or hex)
Can be parsed and validated with standard XML parsers and shipped via standard WWW protocols
Impractical with large binary files
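
The single-document case can be illustrated with a minimal Python sketch. It is not taken from the XFDU schema: the `encoding` attribute and the use of `dataObject` as a direct child of `xfdu` are assumptions made only to show binary content travelling inside one XML document as base64 text.

```python
import base64
import xml.etree.ElementTree as ET


def build_single_document(object_id: str, payload: bytes) -> str:
    """Wrap binary content in one XML document as base64 text."""
    root = ET.Element("xfdu")  # simplified layout, assumed for illustration
    obj = ET.SubElement(root, "dataObject", {"ID": object_id, "encoding": "base64"})
    obj.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def extract_payload(document: str, object_id: str) -> bytes:
    """Parse the document with a standard XML parser and decode one object."""
    root = ET.fromstring(document)
    for obj in root.iter("dataObject"):
        if obj.get("ID") == object_id:
            return base64.b64decode(obj.text or "")
    raise KeyError(object_id)


if __name__ == "__main__":
    raw = bytes(range(256))
    doc = build_single_document("image1", raw)
    assert extract_payload(doc, "image1") == raw
    # base64 emits 4 characters per 3 input bytes, which is one reason
    # this form becomes impractical for large binary files.
    print(len(raw), "bytes ->", len(base64.b64encode(raw)), "characters")
```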

28 Multi-file Packaging Approaches
Archive Approach
  Encapsulate entire directory structure and all contained files into a single "file archive" using a commonly available technique such as ZIP
  Other "archive formats", such as JAR, show how the inclusion of a well-known file can include related metadata
Message Approach
  Combines SOAP (RPC for the web) and MIME types
  Uses multi-part MIME/related as a packaging mechanism for messages that transfer multiple files
  Allows use of appropriate compression/encoding techniques for contained files
Use of a common "manifest" or "table of contents" object makes these two approaches symmetric
Design Decision: XFDU version 1 must support the ZIP and single document forms. The SOAP/MIME/DIME forms should be prototyped, but the underlying protocols may not be stable in the version 1 timeframe.
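
As a rough sketch of the archive approach, the Python below writes a ZIP file that carries a manifest alongside the packaged files, then reads the manifest back as a table of contents. The manifest path is the one named on the interoperability profile slide; the `fileLocation` element and `href` attribute are hypothetical names chosen for this example, not confirmed schema elements.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

MANIFEST_NAME = "MANIFEST/ccsdsxfdu.xml"  # name given on the profile slide


def build_package(files: dict[str, bytes]) -> bytes:
    """Return ZIP bytes holding the given files plus a manifest listing them."""
    manifest = ET.Element("xfdu")
    section = ET.SubElement(manifest, "dataObjSec")
    for index, path in enumerate(sorted(files)):
        obj = ET.SubElement(section, "dataObject", {"ID": f"obj{index}"})
        # Hypothetical element: records where the file sits inside the archive.
        ET.SubElement(obj, "fileLocation", {"href": path})
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(MANIFEST_NAME, ET.tostring(manifest, encoding="utf-8"))
        for path, content in files.items():
            archive.writestr(path, content)
    return buffer.getvalue()


def list_referenced_files(package: bytes) -> list[str]:
    """Read the manifest back and list the files it points to."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        manifest = ET.fromstring(archive.read(MANIFEST_NAME))
    return [loc.get("href") for loc in manifest.iter("fileLocation")]


if __name__ == "__main__":
    package = build_package({"data/image.bin": b"\x00\x01", "data/notes.txt": b"hello"})
    print(list_referenced_files(package))
```

The same manifest could in principle be carried as the root part of a multi-part MIME message, which is the symmetry the slide points to.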

29 High Level Entities XFDU Schema (1 of 2)
Package Header (packHeader): Administrative metadata for the whole XFDU, such as version, operating system, hardware, author, etc., and metadata about transformations and behaviours that must be understood.
Metadata Section (MetadataSec): This section contains or references all of the metadata for all items in the XFDU package. Multiple metadata objects are allowed so that metadata can be recorded for each separate item within the XFDU object. The metadata schema allows the package designer to define any metadata model by providing attributes for both metadata categories and a classification scheme for finer definition within categories. The model also provides predefined metadata categories and classes, via enumerated attributes, that follow the OAIS information model as follows:
  Descriptive information, intended for the use of Finding Aids such as Catalogs or Search Engines
  The Representation Section and its subsections: syntax information (syntaxMd), static semantics (dedMd), and unclassified metadata (otherMd)
  The classification of the PDI Section: reference, context, provenance, and fixity
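
The category and class scheme can be sketched as a small lookup table. The values below are the ones shown on the Logical View slide (REP, PDI, DMD, OTHER and their classes); the function name and the rule that category OTHER accepts any class are this sketch's reading of that slide, not a confirmed part of the schema.

```python
# Predefined categories and their classes, as listed on the Logical View slide.
CATEGORY_CLASSES = {
    "REP": {"DED", "SYNTAX", "OTHER"},
    "PDI": {"CONTEXT", "PROVENANCE", "REFERENCE", "FIXITY", "OTHER"},
    "DMD": {"DESCRIPTION", "OTHER"},
}


def is_valid_classification(category: str, md_class: str) -> bool:
    """Check a metadata object's (category, class) pair against the table."""
    if category == "OTHER":
        # Assumed: a producer-defined category may carry any class value.
        return True
    return md_class in CATEGORY_CLASSES.get(category, set())


if __name__ == "__main__":
    for pair in [("REP", "SYNTAX"), ("PDI", "FIXITY"), ("DMD", "FIXITY")]:
        print(pair, is_valid_classification(*pair))
```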

30 High Level Entities XFDU Schema (2 of 2)
Information Package Map Section (ipMapSec): Outlines a hierarchical structure for the original object being encoded, by a series of nested contentUnit elements. Content units contain pointers to the data objects and to the metadata associated with those objects.
Data Object Section (dataObjectSec): Contains a number of dataObjEntry elements. A Data Object Entry contains some file content and any data required to allow the information consumer to reverse any transformations that have been performed on the object and restore it to the byte stream intended for the original designated community and described by the Representation metadata in the Content Unit.
Behavior Section (behaviorSec): Can be used to associate executable behaviors with content in the XFDU object. A behavior section has an interface definition element that represents an abstract definition of the set of behaviors represented by a particular behavior section. A behavior section also has a behavior mechanism, which is a module of executable code that implements and runs the behaviors defined abstractly by the interface definition.
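
To make the pointer structure concrete, here is a small Python sketch that walks nested contentUnit elements and checks that each pointer resolves to an object declared elsewhere in the manifest. The sample manifest is invented for illustration: expressing pointers as `dataObjPtr` and `mdPtr` attributes is a simplification assumed here, and the real schema may model them differently.

```python
import xml.etree.ElementTree as ET

# Invented sample manifest; attribute-style pointers are an assumption.
MANIFEST = """
<xfdu>
  <ipMapSec>
    <contentUnit ID="cu1">
      <contentUnit ID="cu2" dataObjPtr="obj1" mdPtr="md1"/>
      <contentUnit ID="cu3" dataObjPtr="obj2"/>
    </contentUnit>
  </ipMapSec>
  <metadataSec>
    <metadataObject ID="md1" category="REP" class="SYNTAX"/>
  </metadataSec>
  <dataObjSec>
    <dataObject ID="obj1"/>
    <dataObject ID="obj2"/>
  </dataObjSec>
</xfdu>
"""


def walk(unit, index, depth=0):
    """Yield (depth, unit ID, data object ID, metadata ID) for each content unit."""
    data_id = unit.get("dataObjPtr")
    md_id = unit.get("mdPtr")
    for ref in (data_id, md_id):
        if ref is not None and ref not in index:
            raise ValueError(f"dangling pointer: {ref}")
    yield depth, unit.get("ID"), data_id, md_id
    for child in unit.findall("contentUnit"):
        yield from walk(child, index, depth + 1)


def outline(manifest_text):
    """Return the content unit hierarchy as a flat list of rows."""
    root = ET.fromstring(manifest_text)
    # Index every element that declares an ID so pointers can be resolved.
    index = {el.get("ID"): el for el in root.iter() if el.get("ID")}
    rows = []
    for top in root.find("ipMapSec").findall("contentUnit"):
        rows.extend(walk(top, index))
    return rows


if __name__ == "__main__":
    for depth, unit_id, data_id, md_id in outline(MANIFEST):
        print("  " * depth + f"{unit_id}: data={data_id} metadata={md_id}")
```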

31 Interoperability profile
The Profile will indicate that ALL content for both metadata and data files will be referred to using dataObjPtr
Transfer mechanism for XFDU:
  we do not support processing before all the data has come down the wire; assume the XFDU file is on the local file system before it is opened
  via HTTP
  in SOAP with attachment, where the XFDU zip file is an attachment
Identifier (uniqueness issues):
  package instance identifier could perhaps use UUID
  registry for XML Schema could be a simple FTP server, with front-end index file
Unique name for manifest file: MANIFEST/ccsdsxfdu.xml
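
Two of these profile points lend themselves to a brief Python sketch: generating a UUID-based package instance identifier, and refusing to process a package unless it is a complete local file containing the manifest under its agreed name. The function names are hypothetical, and the UUID URN form is only one possible reading of "could perhaps use UUID".

```python
import uuid
import zipfile
from pathlib import Path

MANIFEST_NAME = "MANIFEST/ccsdsxfdu.xml"  # unique manifest name from the profile


def new_package_id() -> str:
    """Return a fresh package instance identifier as a UUID URN."""
    return uuid.uuid4().urn


def read_local_manifest(path: Path) -> bytes:
    """Read the manifest from a package that is already fully on local disk."""
    if not path.is_file():
        raise FileNotFoundError(path)
    with zipfile.ZipFile(path) as archive:
        if MANIFEST_NAME not in archive.namelist():
            raise ValueError(f"{path} has no {MANIFEST_NAME}")
        return archive.read(MANIFEST_NAME)


if __name__ == "__main__":
    print(new_package_id())
```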

