Prepared by: Lou Reich (CSC) and Don Sawyer (NASA)


1 Prepared by: Lou Reich (CSC) and Don Sawyer (NASA)
ISO “Reference Model For an Open Archival Information System (OAIS)”
Tutorial Presentation for DADs Workshop
Prepared by: Lou Reich (CSC) and Don Sawyer (NASA)
June 22, 1998

2 Overview of Tutorial
• Background Information
• OAIS Purpose, Scope and Applicability
• OAIS Concepts
• OAIS Responsibilities
• OAIS Detailed Reference Model
  – Archival Information Model
  – Functional Model
• Analysis of Archive Issues Using OAIS RM
  – Archive Associations
  – Migration
• Summary

3 Background Information

4 Genesis of the Effort (1)
The international context:
• Multiplication of data holding sites
• Rapidly growing needs for the long term preservation of digital data
• Incomplete understanding of the digital preservation issues
• Constant technological changes and lack of standards in the area of digital archiving
Initial framework: ISO Technical Committee (TC) 20, Aircraft and Space Vehicles, and its Sub-Committee (SC) 13: Space Data and Information Transfer Systems
Proposal:
• Define an Archive reference model and its Services categories
• Address data used in conjunction with space missions
• Address intermediate and indefinite long term storage of digital data
• First step before developing specific standards needed to support archive services
In connection with the rapid decrease of electronic storage cost and the rapid increase in network connectivity and bandwidth.

5 Genesis of the Effort (2)
Proposal made to Consultative Committee for Space Data Systems (CCSDS) and ISO TC 20/SC 13 Develop a ‘Reference Model’ to establish common terms and concepts Ensure broad participation, including traditional archives (Not restricted to space communities; all participation is welcome!) Focus on data in electronic forms, but recognize that other forms exist in most archives Follow up with additional archive standards efforts as appropriate Impact of both CCSDS and ISO procedures The CCSDS reference model will become an ISO standard

6 Status of the Effort (Updated 1998-10-10)
Reference Model will be submitted as a draft international standard in December, 1998 Current version is White Book 4.0, available under Reference Materials heading, at: Widest possible exposure is desirable now Participation is still welcome Full CCSDS and ISO standard by June, 1999

7 Open Archival Information System (OAIS)
Open
• Reference Model standard(s) are developed using a public process and are freely available
Information
• Any type of knowledge that can be exchanged
• Independent of the forms (i.e., physical or digital) used to represent the information
• Data are the representation forms of information
Archival Information System
• Hardware, software, and people who are responsible for the acquisition, preservation and dissemination of the information
• Additional OAIS responsibilities are identified later and are more fully defined in the Reference Model document
This reference model is called: Reference Model for an OAIS (Open Archival Information System). These terms have to be explained: in the framework of the archiving reference model elaboration, a glossary has been defined. This is a crucial point for the common understanding of archival concepts.

8 Document Organization
Introduction
• Purpose and Scope, Applicability, Rationale, Road Map for Future Work, Document Structure, and Definitions of Terms
OAIS Concepts
• High level view of OAIS functionality and information models
• OAIS external environment
• Minimum responsibilities to become an “OAIS”
Detailed Models
• Functional model descriptions and information model perspectives
Migration perspectives
• Media migration, compression, and format conversions
Archive Cooperation
• Criteria to distinguish types of cooperation among archives
Annexes
• Scenarios of existing archives, compatibility with other standards
The current version of the Reference Model document is organised as follows. The parts in blue will not be detailed today.

9 Purpose, Scope and Applicability

10 Purpose
• Framework for understanding and applying concepts needed for long-term digital information preservation
• Long-term is long enough to be concerned about changing technologies
• Starting point for model addressing non-digital information
• Provides set of minimal responsibilities to distinguish an OAIS from other uses of ‘archive’
• Framework for comparing architectures and operations of existing and future archives
• Basis for comparing data models of digital information preserved by archives, including their changes over time
• Expands consensus on elements and processes needed for long-term digital information preservation
• Guides the identification of future OAIS standards
• Does NOT specify any implementation

11 Scope
• Addresses minimal responsibilities of an OAIS
• Addresses a full range of archival functions including Ingest, Archival Storage, Data Management, Access and Dissemination
• Internal and external interfaces
• High level services at the interfaces
• Addresses the data models used to represent information
• Addresses migration of digital information to new media and new forms
• Addresses interoperability among OAIS archives
• Provides illustrative examples for context

12 Applicability Organizations with responsibility to preserve information over the long-term Produce information that may need long-term preservation Consumers of information from long-term archives May be applicable to organizations maintaining temporary archives Rapid technology changes may drive same preservation issues Information held may need long-term preservation Functional and information modeling concepts may be useful The access models are applicable to all entities Standards developers as basis for future standards

13 OAIS Concepts

14 Model View of an OAIS’s Environment
The environment surrounding the OAIS is given by this simple model. Outside the OAIS are:
• Producer is the role played by those persons, or client systems, who provide the information to be preserved
• Management is the role played by those who set overall OAIS policy as one component in a broader policy domain
• Consumer is the role played by those persons, or client systems, who interact with OAIS services to find and acquire preserved information of interest
[Figure: the OAIS (archive) surrounded by Producer, Management and Consumer]

15 OAIS Information Definition
Information is defined as any type of knowledge that can be exchanged, and this information is always expressed (i.e., represented) by some type of data
In general, it can be said that “Data interpreted using its Representation Information yields Information”
In order for this Information Object to be successfully preserved, it is critical for an archive to clearly identify and understand the Data Object and its associated Representation Information
[Figure: Data Object, interpreted using its Representation Information, yields Information Object]
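As a rough illustration of “Data interpreted using its Representation Information yields Information”, the Python sketch below decodes a bit sequence using a small description of its layout. The names `RepresentationInfo` and `interpret`, and the sample record layout, are assumptions made for this example and are not defined by the Reference Model.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class RepresentationInfo:
    """Hypothetical stand-in for Representation Information."""
    struct_format: str   # structural part: how the bits map to data types
    field_names: tuple   # semantic part: what each decoded value means


def interpret(data_object: bytes, rep_info: RepresentationInfo) -> dict:
    """Data Object + Representation Information -> Information Object."""
    values = struct.unpack(rep_info.struct_format, data_object)
    return dict(zip(rep_info.field_names, values))


# Assumed layout: big-endian unsigned 16-bit id followed by a 32-bit float.
rep_info = RepresentationInfo(">Hf", ("sensor_id", "temperature_kelvin"))
data_object = struct.pack(">Hf", 7, 273.5)
information = interpret(data_object, rep_info)
```

Without `rep_info`, the six bytes in `data_object` are only a bit sequence; the archive has to keep both to preserve the information.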

16 Information Package Definition
An Information Package is a conceptual container of two types of information called Content Information and Preservation Description Information (PDI)
[Figure: Information Package containing Content Information and Preservation Description Information]

17 Information Package Variants
Submission Information Package Negotiated between Producer and OAIS Sent to OAIS by a Producer Archival Information Package Information Package used for preservation Includes complete set of Preservation Description Information for the Content Information Dissemination Information Package Includes part or all of one or more Archival Information Packages Sent to a Consumer by the OAIS
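The three package variants can be sketched as simple Python data types. This is only an illustration under assumed names (`SIP`, `AIP`, `DIP`, `disseminate`); the Reference Model does not specify any implementation.

```python
from dataclasses import dataclass, field


@dataclass
class SIP:
    """Submission Information Package: sent to the OAIS by a Producer."""
    content: bytes
    pdi: dict = field(default_factory=dict)  # may still be incomplete


@dataclass
class AIP:
    """Archival Information Package: the form used for preservation."""
    content: bytes
    pdi: dict  # complete Preservation Description Information


@dataclass
class DIP:
    """Dissemination Information Package: sent to a Consumer."""
    contents: list  # part or all of one or more AIPs


def disseminate(aips, selector) -> DIP:
    """Build a DIP from the content of every AIP accepted by `selector`."""
    return DIP(contents=[aip.content for aip in aips if selector(aip)])


holdings = [AIP(b"image", {"reference": "A"}), AIP(b"text", {"reference": "B"})]
dip = disseminate(holdings, lambda aip: aip.pdi["reference"] == "B")
```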

18 External Data Flow Diagram
[Figure: external data flow. The Producer sends Submission Information Packages to the OAIS, which holds Archival Information Packages. The Consumer sends queries and orders to the OAIS and receives query responses and Dissemination Information Packages. Legend: Entity, Information Package Data Object, Data Flow.]
This diagram concentrates on the flow of information among Producers, Consumers and the OAIS, and does not include flows that involve Management

19 OAIS Responsibilities

20 OAIS Responsibilities
• Accepts Information Packages from information producers
• Assumes sufficient control to ensure long-term preservation
• Determines which communities need to be able to understand the preserved information
• Ensures the information to be preserved is independently understandable to the designated communities
• Follows documented policies and procedures which ensure the information is preserved against all reasonable contingencies
• Makes the preserved information available to the Designated Communities in forms understandable to those communities

21 Detailed Models Overview

22 Overview of Detailed Models
It was decided to do both a functional and an information model of the OAIS Both models were tasked to: Use the models to better communicate OAIS Concepts Use a well established, formal modeling technique Stay as implementation independent as possible Avoid detailed designs

23 Detailed Models Information Model

24 General Principles
• Define classes of “information objects” that illustrate information necessary to enable long-term storage and access to Archives
• The class definition should be implementation independent
• Use a variant of Object Modeling Technique (OMT) as a notation

25 OMT Notation Overview
[Figure: OMT notation legend]
• Class: a box containing the Class Name
• Multiplicity of Associations: exactly one; many (zero or more); optional (zero or one); one or more (1+)
• Aggregation: an Assembly Class composed of Part-1 Class and Part-2 Class
• Specialization: a Parent Class with Child-1 Class and Child-2 Class
• Association: a named association (Association Name) between Class-1 and Class-2

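26 Information Objects
[Figure: OMT diagram with the classes Information Object, Data Object, Representation Information, Physical Object, Digital Object and Bit Sequence; the Data Object is "interpreted using" one or more (1+) Representation Information objects]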

27 Representation Information
The Representation Information accompanying a physical object like a moon rock may give additional meaning, as a result of some analysis, to the physically observable attributes of the rock The Representation Information accompanying a digital object, or sequence of bits, is used to provide additional meaning. It typically maps the bits into commonly recognized data types such as character, integer, and real and into groups of these data types. It associates these with higher level meanings which can have complex inter-relationships that are also described

28 Classes of Representation Information
Structural Information: applied to turn bit sequences into common computer data types, aggregations of these data types, and mapping rules which map from the underlying data-types to the higher level structures needed to understand the Digital Object Semantic Information: includes meanings associated with all the elements of the structural information, operations that may be performed on each data-type, and their inter-relationships.

29 Recursive Nature of Representation Information
• Preexisting standards that define primitive data-types
• Mapping rules that map those primitive data-types into the more complex data-type concepts used by the Data Object
• Other semantic information that aids in the understanding of the Data, such as a Data Dictionary
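Because Representation Information may itself need further Representation Information to be understood, the full set forms a network. The sketch below walks such a network to list everything needed to understand one item. The network contents and the function name `required_representation_info` are hypothetical, chosen only to illustrate the recursion.

```python
# Hypothetical representation net: each entry names the further
# Representation Information needed in order to understand it.
REPRESENTATION_NET = {
    "data_dictionary": ["record_layout"],
    "record_layout": ["ieee_754", "ascii"],
    "ieee_754": [],  # assumed to be already understood by the community
    "ascii": [],
}


def required_representation_info(start, net):
    """Return every item reachable from `start`, in first-visit order."""
    seen = []
    stack = [start]
    while stack:
        item = stack.pop()
        if item in seen:
            continue
        seen.append(item)
        # reversed() keeps the left-to-right order of each dependency list
        stack.extend(reversed(net[item]))
    return seen


needed = required_representation_info("data_dictionary", REPRESENTATION_NET)
```

The recursion stops at items with no further dependencies, which in this sketch stand for preexisting standards the community is assumed to know.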

30 Sample Representation Net

31 Types of Information Used in OAIS

32 Content Information The information which is the primary object of preservation An instance of Content Information is the information that an archive is tasked to preserve. Deciding what is the Content Information may not be obvious and may need to be negotiated with the Producer The Data Object in the Content Information may be either a Digital Object or a Physical Object (e.g., a physical sample, microfilm)

33 Preservation Description Information
Provenance Information Describes the source of Content Information, who has had custody of it, what is its history Context Information Describes how the Content Information relates to other information outside the Information Package Reference Information Provides one or more identifiers, or systems of identifiers, by which the Content Information may be uniquely identified Fixity Information Protects the Content Information from undocumented alteration
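The four PDI categories can be pictured as a small record kept alongside the Content Information. In this sketch the class `PDI`, the helper names, the sample values, and the choice of SHA-256 as the fixity mechanism are all assumptions for illustration; the slides themselves mention mechanisms such as CRCs, checksums and digital signatures.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PDI:
    reference: str   # identifier for the Content Information
    provenance: str  # source and custody history
    context: str     # relation to information outside the package
    fixity: str      # here: hex SHA-256 digest of the content (assumed)


def make_pdi(content: bytes, reference: str, provenance: str, context: str) -> PDI:
    """Build PDI for `content`, computing the fixity value from its bits."""
    return PDI(reference, provenance, context, hashlib.sha256(content).hexdigest())


def verify_fixity(content: bytes, pdi: PDI) -> bool:
    """Return True if `content` still matches the recorded fixity value."""
    return hashlib.sha256(content).hexdigest() == pdi.fixity


content = b"calibrated radiance values"
pdi = make_pdi(content, "urn:example:dataset-1",
               "hypothetical processing history", "part of a hypothetical mission")
```

Any undocumented change to the content bits makes `verify_fixity` return `False`.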

34 Example of Preservation Description Information
Content Information Type: Space Science Data
  Reference: Object Identifier; Journal Reference; Mission, instrument, and title attribute set
  Provenance: Instrument Description; Processing History; Sensor Description; Instrument; Instrument mode; Decommutation map; Software Interface Specifications
  Context: Calibration history; Related data sets; Mission; Funding history
  Fixity: CRC; Checksum; Reed-Solomon coding

Content Information Type: Bibliographic Information
  Reference: ISBN; Title; Author
  Provenance: Printing history; Copyright; Position in series; Manuscripts; References
  Context: Related References; Dewey Decimal System; Publishing Data; Publisher
  Fixity: Author; Digital signature; Cover

Content Information Type: Software Package
  Reference: Name; Version number; Serial Number
  Provenance: Revision History; License holder; Registration
  Context: Help file; User Guide; Related Software; Language
  Fixity: Certificate; Encryption

35 Descriptive Information
Contains the data that serves as the input to documents or applications called Access Aids. Access Aids can be used by a consumer to locate, analyze, retrieve, or order information from the OAIS.

36 Packaging Information
Information which, either actually or logically, binds and relates the components of the package into an identifiable entity on specific media Examples of Packaging Information include tape marks, directory structures and filenames

37 OAIS Archival Information Package
[Figure: the Archival Information Package (AIP) contains Content Information, further described by Preservation Description Information (PDI). The AIP is delimited by Packaging Information (e.g., how to find Content Information and PDI on some medium), and a Package Descriptor is derived from it (e.g., information supporting customer searches for the AIP).]
Content Information, e.g.,
• Hardcopy document
• Document as an electronic file together with its format description
• Scientific data set consisting of images and text in three electronic files together with format descriptions
PDI, e.g.,
• How the Content Information came into being, who has held it, how it relates to other information, and how its integrity is assured
The AIP is a basic concept in the OAIS. The AIP contains a piece of information that the OAIS has to preserve. The AIP is defined to be composed of two types of information objects: one is called the Content Information (CI) and the other is called the Preservation Description Information (PDI) (terminology under discussion). The purpose in making this distinction is to promote a clear difference between the information (the CI) that is the primary focus of archival preservation and the information that is needed to support the long-term preservation of the primary information. The AIP is a generic term; it will be specialised into AIC (Archiving Information Collection) and AIU (Archiving Information Unit).

38 AIP Types
Based on the difference in Content Object complexity
• AIUs contain a single Data Object as the Content Object
• AICs contain multiple AIPs in their Content Objects
• Each member of an AIC is an AIP containing Content Information and PDI
• The AIC contains unique PDI on the collection process
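Since each member of an AIC is itself an AIP, collections can nest. A minimal Python sketch of that structure, with assumed class and function names (`AIU`, `AIC`, `count_units`), might look like this:

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class AIU:
    """Archival Information Unit: a single Data Object as its Content Object."""
    data_object: bytes
    pdi: dict = field(default_factory=dict)


@dataclass
class AIC:
    """Archival Information Collection: its content is other AIPs."""
    members: List[Union["AIU", "AIC"]]
    pdi: dict = field(default_factory=dict)  # PDI on the collection process


def count_units(aip) -> int:
    """Count the AIUs contained in an AIP, descending into nested AICs."""
    if isinstance(aip, AIU):
        return 1
    return sum(count_units(member) for member in aip.members)


collection = AIC(members=[AIU(b"image"),
                          AIC(members=[AIU(b"text"), AIU(b"table")])])
```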

39 Package Descriptors and Access Aids
Package Descriptors are needed by an OAIS to provide visibility and access to the OAIS holdings
Package Descriptors contain one or more Associated Descriptions which describe the AIP Content Information from the point of view of a single Access Aid
Some examples of Access Aids include:
• Finding Aids: assist the consumer in locating information of interest
• Ordering Aids: allow the consumer to discover the cost of and order AIUs of interest
• Retrieval Aids: enable authorized users to retrieve the AIU described by the Unit Descriptor from Archival Storage

40 Model of All Data Objects Stored in Data Management

41 Information Model Summary
Presented a model of information objects as containing data objects and representation objects Classified information required for Long-term archiving into 4 classes: Content Information, PDI, Packaging Information and Descriptive Information Described how these classes would be aggregated and related in an AIP to fully describe an instance of Content Information Presented information needed for Access, in addition to that needed for Long-term Preservation Put the Access oriented structures in the context of the other data needed to operate an OAIS

42 Detailed Models Functional View

43 General Principles Highlight the major functional areas important to digital archiving Use functional decomposition to clarify the range of functionality that might be encountered Don't decompose beyond two levels to avoid becoming too implementation dependent Provide a useful set of terms and concepts Do not imply that all archives need to implement all the sub-functions Identify some common services which are likely to be needed, and are assumed to be available, as underlying support

44 Common Services Modern, distributed computing applications assume a number of supporting services Examples of Common Services include: inter-process communication name services temporary storage allocation exception handling security file and directory services

45 OAIS Functional Entities
[Figure: OAIS functional entities. Ingest, Archival Storage, Data Management, Access and Administration are shown between the PRODUCER, the CONSUMER and MANAGEMENT. Labeled flows: SIP, AIP, DI, DIP, requests and other information.]
SIP = Submission Information Package
AIP = Archival Information Package
DIP = Dissemination Information Package
DI = Descriptive Information
We can see, in this view, that the AIP is a specialized object created from a more general object called ‘Information Package’

46 Functional Entities In An OAIS
Ingest: This entity provides the services and functions to accept Submission Information Packages (SIPs) from Producers and prepare the contents for storage and management within the archive Archival Storage: This entity provides the services and functions for the storage, maintenance and retrieval of Archival Information Packages Data Management: This entity provides the services and functions for populating, maintaining, and accessing both descriptive information which identifies and documents archive holdings and internal archive administrative data. Administration: This entity manages the overall operation of the archive system Access: This entity supports consumers in determining the existence, description, location and availability of information stored in the OAIS and allowing consumers to request and receive information products

47 Ingest Functions
• Schedule Submission Delivery: negotiates a data submission schedule with the producer
• Receive Submission: provides the appropriate storage capability or devices to receive a SIP from the producer. The Receive SIP function may represent a legal transfer of custody for the CI in the SIP, and may require that special access controls be placed on the contents
• Generate Archival Information Package: transforms one or more SIPs into one or more AIPs that conform to the internal data model of the archive
• Generate Descriptive Information: extracts Descriptive Information from the AIPs to populate the data management system
• Coordinate Updates: responsible for transferring the AIPs to Archival Storage and the Descriptive Information to Data Management
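The last three Ingest functions can be sketched as a short pipeline. Everything here is a simplification under stated assumptions: the record fields, the function names, and the use of plain dictionaries to stand in for Archival Storage and Data Management are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SIP:
    identifier: str
    title: str
    content: bytes


@dataclass
class AIP:
    identifier: str
    title: str
    content: bytes
    pdi: dict


def generate_aip(sip: SIP) -> AIP:
    """Generate Archival Information Package: SIP -> AIP (one-to-one here)."""
    pdi = {"reference": sip.identifier, "provenance": "received from producer"}
    return AIP(sip.identifier, sip.title, sip.content, pdi)


def generate_descriptive_information(aip: AIP) -> dict:
    """Generate Descriptive Information: extracted from the AIP."""
    return {"identifier": aip.identifier, "title": aip.title,
            "size_bytes": len(aip.content)}


def ingest(sip: SIP, archival_storage: dict, data_management: dict) -> AIP:
    aip = generate_aip(sip)
    descriptive = generate_descriptive_information(aip)
    # Coordinate Updates: AIP to Archival Storage, DI to Data Management.
    archival_storage[aip.identifier] = aip
    data_management[aip.identifier] = descriptive
    return aip


storage, catalog = {}, {}
ingest(SIP("obj-1", "Sample data set", b"12345"), storage, catalog)
```

A real archive may transform several SIPs into one AIP or the reverse; the one-to-one mapping is only the simplest case.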

48 Archival Storage Functions
Receive Data: receives a transfer request and an Archival Information Package from the staging area and moves the data to permanent storage within the archive Manage Storage Hierarchy: positions the contents of the Archival Information Packages (AIPs) on the appropriate media based on directions from ingest (transfer request), administrative policies or usage statistics Refresh Media: provides the capability to reproduce the Archive Holdings over time Error Checking: provides statistically acceptable assurance that no components of the archive information package are corrupted during any internal archive data transfer or transformation Disaster Recovery: provides a mechanism for producing duplicate copies of AIPs (AIUs and AICs) in the archive collection Provide Data: provides copies of stored AIPs to Access

49 Data Management Functions
• Administer Database: responsible for maintaining the integrity of the Data Management database
• Perform Queries: receives a query request from Access and Dissemination and executes the query to generate a result set that is transmitted to the requester
• Generate Report: receives a report request and executes any queries or other processes necessary to generate the report, then supplies the report to the requester
• Receive Database Update: adds, modifies or deletes information in the Data Management persistent storage
• Activate Request: maintains a record of subscription requests and periodically compares it to the contents of the archive to determine if all needed data is available. If needed data is available, this function generates a Dissemination Request which is sent to Access. This function can also generate Dissemination Requests on a periodic basis
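The comparison performed by Activate Request can be sketched as a set check. The function name `activate_requests` and the shapes of the subscription record and the generated request are assumptions for this example only.

```python
def activate_requests(subscriptions, holdings):
    """Generate a Dissemination Request for each satisfied subscription.

    subscriptions: dict mapping a subscriber name to the set of item
                   identifiers that subscriber needs (hypothetical shape).
    holdings:      set of item identifiers currently in the archive.
    """
    requests = []
    for subscriber, needed in sorted(subscriptions.items()):
        if needed <= holdings:  # all needed data is available
            requests.append({"subscriber": subscriber, "items": sorted(needed)})
    return requests


subscriptions = {
    "alice": {"obs-1", "obs-2"},
    "bob": {"obs-3"},
}
requests = activate_requests(subscriptions, holdings={"obs-1", "obs-2"})
```

Running the same check on a schedule would cover the periodic case mentioned on the slide.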

50 Access Functions (1 of 2) Prepare Finding Aids: provides tools and products which provide an overview of information products available in the archive system Receive Requests: provides a single user interface to the information holdings of the archive. This interface will normally be via computer network or dial-up link to an on-line service, but might also be implemented in the form of a walk-in facility, printed catalog ordering service, or fax-back type service Coordinate Request: function determines the resources needed to fulfill archive requests and forwards them to those entities for execution. Query and report requests will be fulfilled by data management while dissemination requests will generally require information from data management and archival storage

51 Access Functions (2 of 2) Generate DIP: accepts a dissemination request, retrieves the data from Archival Storage and moves a copy of the data to a staging area for further processing. This function also transmits a report request to Data Management to generate descriptive information. If special processing is required this function accesses data objects in staging storage and applies the requested processes. This function places the completed DIP package in the staging area and notifies both Coordinate Request and the Deliver DIP function that the package is ready for delivery. Deliver DIPs: handles both on-line and off-line deliveries of DIPs to consumers Provide Access Controls: provides a hierarchy of security controls depending on the needs of the archive system. These include restricting access to certain information due to security classifications or copyright restrictions.

52 Administration Functions (1 of 2)
Negotiate Submission Agreement: solicits desirable archival information for inclusion into the OAIS and negotiates submission agreements with data producers Manage System Configuration: provides system engineering for the archive system to systematically control changes to the configuration. This function maintains integrity and tractability of the configuration during all phases of the system life cycle. It also audits system operations, system performance, and system usage and plans for system evolution Physical Access Control: provides mechanisms to restrict or allow physical access (doors, locks, guards) to elements of the archive as determined by archive policies

53 Administration Functions (2 of 2)
Develop Standards and Policies: is responsible for developing and maintaining the archive system data standards. These standards include format standards, documentation standards and the procedures to be followed during the ingestion process. It will also develop policies for Archival Storage hierarchy management and migration policies Audit AIPs: is carried out by the archive data engineers and may also involve an outside committee (e.g., science and technical review). The audit process must verify that the quality of the data meets the requirements of the archive and the review committee Interact with Management: receives and carries out Management policies. These policies include such things as the OAIS charter, scope, resource utilization guidelines, and pricing policies. It also provides OAIS performance information to Management

54 Analysis of Archive Issues Using OAIS RM
Archive Associations

55 Archive Cooperation Users of multiple OAIS archives have reasons to wish for some uniformity or cooperation among the OAISs. Consumers Common finding aids to aid in locating information over several OAIS archives Common Package Descriptor schema for access Common DIP schema for dissemination, or a single global access site. Producers common SIP schema for submission to different archives a single depository for all their products. Managers Cost reduction through sharing of expensive hardware increasing the uniformity and quality of user interactions with the OAIS

56 Categories of Archive Interactions
• Independent: no knowledge by one OAIS of standards implemented at another
• Cooperating: Potentially common submission standards, and common dissemination standards, but no common access. One archive may make subscription requests for key data at the cooperating archive
• Federated: Access to all federated OAISs is provided through a common set of access aids that provide visibility into all participating OAISs. Global dissemination and Ingest are options
• Shared resources: An OAIS in which Management has entered into agreements with other OAISs to share resources to reduce cost. This requires various standards internal to the archive (such as ingest-storage and access-storage interface standards), but does not alter the community’s view of the archive

57 Cooperating Archives
[Figure: two sets of cooperating OAIS archives, labeled Method A and Method B. Each archive shows its Ingest (Ing), Access (Acc) and Administration (Adm) entities; a Producer and a Consumer are also shown.]
The first set of cooperating OAISs merely have an agreement to share at least one common SIP and DIP format to enable the transfer of holdings
The second set of cooperating OAISs have standardized their DIP and SIP formats for use by producers and consumers

58 Federated Archives

59 Levels of Autonomy in Associated Archives
No interactions and therefore no association Associations that maintain your autonomy. You have to do certain things to participate, but you can leave the association without notice or impact to you. Associations that bind you by contract. To change the nature of this association you will have to re-negotiate the contract. The amount of autonomy retained depends on how difficult it is to negotiate the changes.

60 Analysis of Archive Issues Using OAIS RM
Migration

61 Digital Migration Digital Migration is defined to be the transfer of digital information, while intending to preserve it, within the OAIS. Focus on the preservation of the full information content Internal OAIS perspective. Three major motivators are seen to drive Digital Migrations of Archival Information Packages within an OAIS: Media Decay Increased Cost Effectiveness New Consumer Service Requirements

62 Digital Migration Approaches
Two basic types of migration in response to motivators: Repackaging Transformation Each of these comes in two basic flavors, ordered by increasing risk of information loss: Physical Repackaging (Replication) Media replacement with no bit changes Digital Repackaging Some bit changes in Packaging Information Reversible Transformations Bit changes in Content Information are reversible by an algorithm Non-reversible Transformations Bit changes in Content Information are not reversible by an algorithm
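The four migration flavors, ordered by increasing risk of information loss, can be captured in a small classifier. The enum, the function `classify`, and the use of lossless `zlib` compression as an example of a reversible transformation are assumptions made for illustration.

```python
import zlib
from enum import IntEnum


class MigrationType(IntEnum):
    """Ordered by increasing risk of information loss."""
    PHYSICAL_REPACKAGING = 1           # media replacement, no bit changes
    DIGITAL_REPACKAGING = 2            # bit changes in Packaging Information
    REVERSIBLE_TRANSFORMATION = 3      # content bits change, reversible
    NON_REVERSIBLE_TRANSFORMATION = 4  # content bits change, not reversible


def classify(packaging_bits_changed: bool, content_bits_changed: bool,
             reversible: bool) -> MigrationType:
    if content_bits_changed:
        if reversible:
            return MigrationType.REVERSIBLE_TRANSFORMATION
        return MigrationType.NON_REVERSIBLE_TRANSFORMATION
    if packaging_bits_changed:
        return MigrationType.DIGITAL_REPACKAGING
    return MigrationType.PHYSICAL_REPACKAGING


# Lossless compression changes the content bits, but an algorithm
# (decompression) recovers the original bits exactly.
original = b"archived content " * 4
restored = zlib.decompress(zlib.compress(original))
```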

63 Some Migration Strategies
Use media types with long-lived support Enhances chances that Replication will be useful Minimize use of media format attributes for holding Content Information Allows digital repackaging without having to also do transformations when migrating to new media types Maintain originals if non-reversible transformations are required

64 AIP Versions and Editions
Version
• An AIP that undergoes a transformation during migration becomes a new version.
Edition
• An AIP that is revised to improve its information content is termed a new edition. This is not a migration.
Derivation
• An AIP that is the result of being derived from one or more other AIPs is termed a derived AIP. This is not a migration.

65 Summary and Request

66 Summary Reference model is to be applicable to all digital archives, and their Producers and Consumers Identifies a minimum set of responsibilities for an archive to claim it is an OAIS Establishes common terms and concepts for comparing implementations, but does not specify an implementation Provides detailed models of both archival functions and archival information Discusses OAIS information migration and interoperability among OAISs

67 Request for Participation
(Updated ) Ultimate success of this effort depends on obtaining adequate review and comment Reference Model Red Book/ISO Draft International Standard (DIS) expected December 1998 Current version is White Book 4.0, available under Reference Materials heading, at: Comments are being actively solicited Send to

