10041267M-1 INGEST OVERVIEW Don Sawyer National Space Science Data Center NASA/GSFC October 13, 1999.

1 10041267M-1 INGEST OVERVIEW Don Sawyer National Space Science Data Center NASA/GSFC October 13, 1999

2 10041267M-2 OAIS Functional Entities
  SIP = Submission Information Package
  AIP = Archival Information Package
  DIP = Dissemination Information Package
  [Figure: OAIS functional entities diagram. Entities: Ingest, Archival Storage, Data Management, Access, Administration. External parties: PRODUCER, CONSUMER, MANAGEMENT. Labeled flows: SIP, AIP, DIP, Descriptive Info., queries, orders, result sets.]
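The three package types on this slide can be sketched as plain data structures. This is only an illustrative sketch, not part of the presentation: the class fields, the `ingest` and `access` function names, and the use of a SHA-256 digest as fixity information are all assumptions chosen for the example.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Tuple


@dataclass
class SIP:
    """Submission Information Package, as delivered by a producer."""
    producer: str
    content: bytes
    descriptive_info: dict = field(default_factory=dict)


@dataclass
class AIP:
    """Archival Information Package, as held in Archival Storage."""
    content: bytes
    representation_info: str  # how to interpret the bits
    fixity_sha256: str        # assumed fixity value (hypothetical field)
    provenance: list = field(default_factory=list)


@dataclass
class DIP:
    """Dissemination Information Package, as returned to a consumer."""
    content: bytes
    representation_info: str


def ingest(sip: SIP, representation_info: str) -> Tuple[AIP, dict]:
    """Turn a SIP into an AIP plus Descriptive Info for Data Management."""
    aip = AIP(
        content=sip.content,
        representation_info=representation_info,
        fixity_sha256=hashlib.sha256(sip.content).hexdigest(),
        provenance=[f"submitted by {sip.producer}"],
    )
    return aip, dict(sip.descriptive_info)


def access(aip: AIP) -> DIP:
    """Build a DIP from a stored AIP in response to an order."""
    return DIP(content=aip.content, representation_info=aip.representation_info)
```

In this sketch the Descriptive Info is returned separately from the AIP, mirroring the two outputs of Ingest shown on the slide.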

3 10041267M-3 OAIS Ingest Functions

4 10041267M-4 Possible Ingest Methodology/Standards
  - Set of interactions all archives might expect to engage in with their data producers
    - From a data producer view
    - From an archive view
  - Set of processes all archives use to prepare information for Archival Storage
  - Recommended standard Submission Information Package
  - Recommended standards to ensure information is readily migratable forward in time
  - Recommended standards to ensure adequate Representation Information is obtainable and uniquely identified
  - Recommended standards to ensure adequate Preservation Description Information is obtained

5 10041267M-5 Ingest Papers
  - “The Archive Ingest Process”, by Mike Martin
  - “Ingest Standards (and others) in the OAIS Model”, by David Holdsworth (unable to attend)
  - “Persistent Archives for Data Collections”, by Reagan Moore
  - “Archive Issues with the Evolution of Data and Information”, by Parmesh Dwivedi and William Callicott

6 10041267M-6 “The Archive Ingest Process”
  - Methodology for archive ingest process
    - Identifies key OAIS ingest functions
    - Primarily a ‘data producer’ view
  - Six steps for data producer interaction with archive
    - Orientation - finding out what archive will expect
    - Archive Planning - deciding what to archive, when and generally how
    - Design - determining what needs to be included in the Submission Information Package
    - Data Set Assembly and Validation - creating the data products and validating them
    - Review - the final archive quality check
    - Delivery - transferring the Submission Information Package to the archive
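The six producer steps form a strict sequence, which can be sketched as an ordered enumeration with a check that steps are completed in order. This is a minimal sketch under stated assumptions: the name of the fourth step is taken from the "Middle 2 Steps" slide, and the `IngestStep` and `Submission` names are hypothetical, not from the paper.

```python
from enum import IntEnum


class IngestStep(IntEnum):
    """The six producer steps, numbered in the order they are carried out."""
    ORIENTATION = 1
    ARCHIVE_PLANNING = 2
    DESIGN = 3
    ASSEMBLY_AND_VALIDATION = 4
    REVIEW = 5
    DELIVERY = 6


class Submission:
    """Tracks which steps a producer has completed for one submission."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.completed: list = []

    def next_step(self):
        """Return the next step due, or None once all six are done."""
        done = len(self.completed)
        return IngestStep(done + 1) if done < len(IngestStep) else None

    def complete(self, step: IngestStep) -> None:
        """Mark a step complete, refusing any step taken out of order."""
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"expected {expected!r}, got {step!r}")
        self.completed.append(step)
```

For example, calling `complete(IngestStep.REVIEW)` on a new `Submission` raises `ValueError`, because Orientation has not yet been completed.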

7 10041267M-7 First 2 Steps - Examples
  - Orientation
    - Establishes contact with archive
    - Provides general information to archive
    - Obtains archive orientation materials
    - Establishes technical contacts
  - Archive Planning
    - Prepare a Producer Data Management Plan (PDMP)
    - Prepare a Submission Agreement (SA)
    - Plan for updates to PDMP and SA
    - Keep archive data engineer informed
    - Participate in planning meetings
    - Review/sign off archive interface plan

8 10041267M-8 Middle 2 Steps - Examples
  - Archive Design
    - Review archive standards
    - Design data products and representation information
    - Design the data set or collection
    - Design volumes and volume sets
    - Design data production process
    - Plan data validation process
    - Prepare Preservation Description Information
  - Data set assembly and validation
    - Create data products
    - Prepare Volume Components
    - Execute data validation procedures
    - Transfer data to final medium

9 10041267M-9 Final 2 Steps - Examples
  - Review
    - Establish a review committee
    - Prepare for the review
    - Conduct review
    - Correct/document review liens
  - Delivery
    - Coordinate generation of duplicate copies (if needed)
    - Data classification (restriction) procedures
    - Transfer volumes to archive
    - Update data sets with corrections or enhancements

10 10041267M-10 “Ingest Standards…in OAIS Model”
  - Focus on sufficiency of Representation Information
    - Needed to free information from underlying media
    - Preserve against technology obsolescence
    - “Bit-stream can be preserved indefinitely”
  - Two-step process of ingest
    - Separation of data from the medium
    - Map to a bit-stream (i.e., make the data object part of the Archival Information Package)
  - Follow this with preservation of the bit-stream in an archival store
  - Form of data between the 2 steps is called Underlying Abstract Form

11 10041267M-11 Underlying Abstract Form
  - Information has existence and content separate from medium on which it resides
  - Contains all significant properties of the data
  - Representation information is to enable access to preserved digital object in meaningful way
    - For complex objects, emulation is likely to be needed
    - Enables reversal of ingest process to deliver copy of original (assuming appropriate hardware available)
    - Ingest must have quality assurance procedures to ensure this can happen
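The idea that ingest should be reversible, and that quality assurance should confirm it, can be illustrated with a small round-trip check. This is a sketch under loud assumptions, not Holdsworth's method: the "medium" is modeled as a list of fixed-size blocks (only the last may be short), the Representation Information is reduced to the original block size, and all function and field names are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ArchivedObject:
    bitstream: bytes
    representation_info: dict  # assumed here to hold only the block size
    sha256: str                # digest recorded at ingest time


def read_from_medium(blocks):
    """Step 1: separate the data from the medium (here, a list of blocks)."""
    return b"".join(blocks)


def to_bitstream(data: bytes, block_size: int) -> ArchivedObject:
    """Step 2: map the data to a bit-stream plus Representation Information."""
    return ArchivedObject(
        bitstream=data,
        representation_info={"original_block_size": block_size},
        sha256=hashlib.sha256(data).hexdigest(),
    )


def reverse_ingest(obj: ArchivedObject):
    """Rebuild the original block layout from the preserved bit-stream."""
    size = obj.representation_info["original_block_size"]
    stream = obj.bitstream
    return [stream[i:i + size] for i in range(0, len(stream), size)]


def quality_check(original_blocks, obj: ArchivedObject) -> bool:
    """Pass only if the digest still matches and the ingest can be reversed."""
    fixity_ok = hashlib.sha256(obj.bitstream).hexdigest() == obj.sha256
    return fixity_ok and reverse_ingest(obj) == original_blocks
```

A real archive would need far richer Representation Information than a block size; the point of the sketch is only that the check compares a reconstructed copy against the original before the source medium is retired.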

12 10041267M-12 “Persistent Archives for Data Collections”
  - Paper will be presented by author (Reagan Moore) in plenary
  - Focus is on data and information models
    - Needed to manage and federate collections, and
    - Migrate collections forward in time
  - Persistent archive strategy
    - Use information model for describing data
    - Distinguish context needed for data set, for collections, and for user interfaces to collection
    - Support interoperability across heterogeneous software/hardware systems
  - Decouple collections from access mechanisms
  - Ingest methodology/standards can be based on emerging digital library standards (example - XML DTDs)
    - Proprietary formats must be transformed to migratable standards during ingestion
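Transforming a proprietary format into a standard one during ingestion might look like the following sketch. Everything specific here is an assumption made for illustration: the 20-byte binary record layout, the `observation` element and its children, and the function name. No DTD is applied, since Python's standard-library XML parser does not validate against DTDs.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical proprietary layout (an assumption for this example):
# little-endian uint32 id, float64 value, 8-byte NUL-padded ASCII label.
RECORD = struct.Struct("<Id8s")


def record_to_xml(raw: bytes) -> str:
    """Convert one binary record into a self-describing XML element."""
    obs_id, value, label = RECORD.unpack(raw)
    elem = ET.Element("observation", id=str(obs_id))
    ET.SubElement(elem, "value").text = repr(value)
    ET.SubElement(elem, "label").text = label.rstrip(b"\x00").decode("ascii")
    return ET.tostring(elem, encoding="unicode")
```

The XML output carries its field names with it, so it can be read without the original software, which is the property the slide asks of a migratable format.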

13 10041267M-13 “Archive Issues with the Evolution of Data and Information”
  - Focus is on historical evolution of the ‘archive’
  - Digital explosion is forcing radical changes in methods of archiving
  - Access is a major driver - not preservation
  - Data will be lost
  - We may become buried in a forest of data and information
  - Are we approaching a technology singularity?

14 10041267M-14 Ingest Summary
  - Mike’s paper addresses possible steps a data producer would follow to properly prepare information for submission to an archive
  - David’s paper and Reagan’s paper, as regards ingest, address information modeling and the role of representation information in ensuring persistent/migratable content
  - Parmesh and William’s paper questions whether all ‘our’ efforts will be in vain!

