Download presentation
Presentation is loading. Please wait.
Published byJacob Kennedy Modified over 10 years ago
1
IBM Haifa Research Lab © 2008 IBM Corporation Contacts: Simona Cohen, Michael Factor, Dalit Naor http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml Preservation DataStores (PDS): A demonstration of Preservation Aware Storage
2
IBM Haifa Research Lab © 2008 IBM Corporation 2http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml PDS Overview OAIS–based preservation-aware storage, media-agnostic, and generic storage to support logical preservation Manage preservation specific metadata –Compute fixity –Update technical provenance –Manage PDI RepInfo –Ensure referential integrity Storlet container –Module container that can execute restricted modules with predefined interfaces for data intensive functions, e.g., transformations, fixity calculations –Optimal scheduling –Update PDS modules, e.g., fixity algorithm, packaging format Managing availability/ data loss –Physically co-locate data and metadata –Cluster related AIPs on the same media unit, based upon their relative importance AIP identifier generation – globally unique identifiers Content Data Object Representation Information Reference ContextFixity Provenance Content Information Preservation Descriptive Information
3
IBM Haifa Research Lab © 2008 IBM Corporation 3http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml PDS Architecture Layered approach is based on open standards: OAIS, XAM, OSD In CASPAR, all layers are utilized. In other embodiments, only some layers can be used Utilizes XAM to provide logical abstraction of containers (XSets) Offloads preservation functionality to storage ingestAIP, accessAIP, transformAIP, getPreservationPolicy
4
IBM Haifa Research Lab © 2008 IBM Corporation 4http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml AIP Representation in the Preservation Engine The basic building-block of an AIP is the Information Object –A single instance of data is a byte stream –Zero or more instances of RepInfo to interpret the data RepInfo is also an AIP –Shared resource with unique ID –RepInfo recursive network –Keep the design simple
5
IBM Haifa Research Lab © 2008 IBM Corporation 5http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml Preservation of an AIP Ingest AIP –Storing an AIP in PDS Bit and logical preservation of an AIP –Bit migrations –Format migrations - data transformation Transformation modules are packed as AIPs and preserved Transformation result is a new version for the original AIP –Migrations are documented as Provenance records –During migrations PDS performs operations on AIP Update PDI (e.g., fixity calculation, additional provenance events) Execute previously loaded storlets Access AIP –Retrieval of an AIP By retrieval time, the original AIP may have several versions and copies
6
IBM Haifa Research Lab © 2008 IBM Corporation Demo
7
IBM Haifa Research Lab © 2008 IBM Corporation 7http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml MST Data Natural Environment Research Council (NERC) Mesosphere-Stratosphere-Troposphere (MST) Radar at Aberystwyth Provides measurements of vertical and horizontal components of the wind Highly complex data Preserved for the atmospheric scientific community Needs long-term preservation for research purposes Community is highly dependent on interoperability
8
IBM Haifa Research Lab © 2008 IBM Corporation 8http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml British Atmospheric Data Centre and Self-Describing Data Formats BADC strongly advocate the use of self-describing files to capture provenance and semantic information Two self-describing file formats compete in the Atmospheric Science Community –NASA AMES (ASCII) –NetCDF (binary) The current user community finds NetCDF more suitable
9
IBM Haifa Research Lab © 2008 IBM Corporation 9http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml 10 years later 20 years later Ingest AIP Load and apply transformation Access and view transformed data View AIP Demo Flow
10
IBM Haifa Research Lab © 2008 IBM Corporation 10http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml pds_view Video:
11
IBM Haifa Research Lab © 2008 IBM Corporation 11http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml Ingest AIP flow in PDS Preservation Engine PDS web service XAM XAM to OSD (VIM) OSD ingest (AIP) XUID AIP ID create XSet write fields (AIP data) commit create OSD objects write (XSet data) Unpack AIP Generate AIP ID Compute fixity Add ingest provenance event store (OID, XUID) mapping persistently store (XUID, AIP ID) mapping persistently
12
IBM Haifa Research Lab © 2008 IBM Corporation 12http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml 10 years later 20 years later Ingest AIP Load and apply transformation Access and view transformed data View AIP Demo Flow
13
IBM Haifa Research Lab © 2008 IBM Corporation 13http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml pds_ingest Video:
14
IBM Haifa Research Lab © 2008 IBM Corporation 14http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml
15
IBM Haifa Research Lab © 2008 IBM Corporation 15http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml Format Change Converting NetCDF to NASA-AMES –The MST NetCDF file version is no longer supported by the NetCDF foundations libraries, and files are not backwards compatible with this NetCDF version –The skills base of the atmospheric scientist in the future is heavily biased towards the use of ASCII-based file formats –The NASA-AMES format has become the preferred data format of the atmospheric scientists community Converting NetCDF to PNG –Provide additional quick image view of the data for users with limited access permissions
16
IBM Haifa Research Lab © 2008 IBM Corporation 16http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml Load transformation flow in PDS Preservation Engine PDS web service loadTransformation (AIP) AIP ID Register transformation Ingest transformation AIP
17
IBM Haifa Research Lab © 2008 IBM Corporation 17http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml Transform AIP flow in PDS PDS web service transformAip (targetAIPID, transformationAIPID) AIP ID Access transformation AIP Access target AIP Execute transformation Ingest result as new AIP, a version of the target AIP Preservation Engine
18
IBM Haifa Research Lab © 2008 IBM Corporation 18http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml 10 years later 20 years later Ingest AIP Load and apply transformation Access and view transformed data View AIP Demo Flow
19
IBM Haifa Research Lab © 2008 IBM Corporation 19http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml pds_xform Video:
20
IBM Haifa Research Lab © 2008 IBM Corporation 20http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml
21
IBM Haifa Research Lab © 2008 IBM Corporation 21http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml Access AIP flow in PDS Preservation Engine PDS web service XAM XAM to OSD (VIM) OSD access (AIP ID) read OSD objects AIP data AIP XSet data open XSet (XUID) read fields Lookup XUID by AIP ID Lookup OSD OID by XUID Validate AIP ID Validate fixity Add access provenance event Package AIP
22
IBM Haifa Research Lab © 2008 IBM Corporation 22http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml Demo Flow 10 years later 20 years later Ingest AIP Load and apply transformation Access and view transformed data View AIP
23
IBM Haifa Research Lab © 2008 IBM Corporation 23http://www.haifa.il.ibm.com/projects/storage/ltdp/index.shtml pds_access Video:
24
IBM Haifa Research Lab © 2008 IBM Corporation Thank you! Haifa preservation team: Simona Cohen, Michael Factor, Aner Hamama, Ealan Henis, Dalit Naor, Petra Reshef, Shahar Ronen
Similar presentations
© 2024 SlidePlayer.com Inc.
All rights reserved.