OAIS Rathachai Chawuthai Information Management CSIM / AIT Issued document 1.0.

Presentation transcript:

OAIS Rathachai Chawuthai Information Management CSIM / AIT Issued document 1.0

Outline: Preface, Overview, Data Model, Function Model, Architecture Model, OAIS in use 2

3

I want to build my own restaurant. What should I do? 4

What you should know 5

What you should plan 6

How you should run 7

I will not give you a blueprint or a concrete model for running a restaurant. Instead, I will guide you through WHAT and HOW you have to consider when you plan to run a restaurant business. 8

I want to build an archival information system. What should I do? 9

Understand the OAIS reference model 10

11

Open Archival Information System. In 2000, the Research Libraries Group (RLG) and the Online Computer Library Center (OCLC) discussed how both organizations could build an infrastructure for archiving digital objects. OAIS guides you in building an archival information system. OCLC.org 12

Purpose
– Model a system for archival information, represented in digital format, for long-term preservation
Scope
– Framework for long-term preservation and access
– Terminology
– Architectures and operation
– Preservation strategies and techniques
– Data model 13

Primary functions
– To preserve digital resources over an extended period of time
– To provide users with access to the information in the archive 14

Roles
– Producer: a data provider
– Management: an administrator
– Consumer: a data retriever
Important functions
– Ingest: submit data to the system
– Store: preserve data in the system
– Access: retrieve data from the system 15

Producer – Persons or client systems that provide the information to be preserved.
Management – Persons who set the overall policy of the OAIS. Management is separate from the administrative functions.
Consumer – Persons or client systems that interact with the OAIS system and services.
Roles of the OAIS (archive). OCLC.org 16

Important functions and workflow: Ingest, Store, Access. OCLC.org 17

Producer – Ingests digital resources into the system.
Management – Monitors and verifies digital resources, does preservation planning, migrates digital resources, etc.
Consumer – Searches for and accesses digital resources in the repository.
Roles and responsibilities 18

19

Preserved data in the system must be formed into a package. Corresponding to the 3 important functions of OAIS (Ingest, Store, and Access), a package of preserved data takes 3 forms: SIP, AIP, and DIP.
– SIP (Submission Information Package): enters the system
– AIP (Archival Information Package): is preserved in the system
– DIP (Dissemination Information Package): is distributed from the system
Each package type is based on the same concept, which is described hereafter. 20

[Figure: 3 important functions (Ingest, Store, Access), 3 package types (SIP, AIP, DIP), and 3 roles (Producer, Administrator, Consumer), with Query and Disseminate flows] 21

SIP – A form of package that is suitable for the producer to ingest into the system. A SIP mainly contains Content Information and PDI. Multiple SIPs may be associated with the same PDI.
AIP – A form of package that is suitable for storage in the system. One or more SIPs are transformed into an AIP, which has a complete set of PDI associated with the Content Information. An AIP may be a collection of AIPs.
DIP – A form of package that is suitable for dissemination to the consumer. An AIP is transformed into a DIP for sharing purposes. A DIP may contain one or more AIPs, and it may not have a complete set of PDI.
OCLC.org 22
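The relationship between the three package types can be sketched in Python. This is only an illustration under stated assumptions, not part of the OAIS standard or of any repository software: the class names, the helper functions `sips_to_aip` and `aip_to_dip`, and the `REQUIRED_PDI` list are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SIP:
    """Submission Information Package: what a producer hands to the archive."""
    content: bytes
    pdi: Dict[str, str] = field(default_factory=dict)  # may be incomplete


@dataclass
class AIP:
    """Archival Information Package: what the archive stores."""
    contents: List[bytes]
    pdi: Dict[str, str]  # expected to be complete


@dataclass
class DIP:
    """Dissemination Information Package: what a consumer receives."""
    contents: List[bytes]
    pdi: Dict[str, str]  # may be only a subset of the AIP's PDI


# Hypothetical list of PDI fields that this toy archive insists on.
REQUIRED_PDI = ("reference", "provenance", "context", "fixity")


def sips_to_aip(sips: List[SIP]) -> AIP:
    """Merge one or more SIPs into a single AIP with a complete PDI set."""
    merged: Dict[str, str] = {}
    for sip in sips:
        merged.update(sip.pdi)
    missing = [key for key in REQUIRED_PDI if key not in merged]
    if missing:
        raise ValueError(f"incomplete PDI, missing: {missing}")
    return AIP(contents=[sip.content for sip in sips], pdi=merged)


def aip_to_dip(aip: AIP, wanted_pdi: List[str]) -> DIP:
    """Derive a DIP that carries only the PDI fields the consumer asked for."""
    subset = {key: aip.pdi[key] for key in wanted_pdi if key in aip.pdi}
    return DIP(contents=list(aip.contents), pdi=subset)
```

In this sketch the completeness check sits at the SIP-to-AIP step, mirroring the statement that only the AIP needs a complete set of PDI.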

Big picture of the information model of a package. OCLC.org 23

4 simple information concepts [Figure: Package 1, showing Content Information, PDI (Preservation Description Information), Archive Packaging Information, and Descriptive Information about Package 1] 24

1. Content Information – The digital resource that needs to be preserved, e.g. text, image, video, sound, …
2. Preservation Description Information (PDI) – Preservation metadata that tells humans or machines what they must take into account when they access, render, or otherwise act on the digital resource.
3. Archive Packaging Information – A package that wraps both Content Information (1) and PDI (2) so that they are stored as one object.
4. Descriptive Information (information about the archive package) – Acts as metadata for the Archive Packaging Information (3). It lets a search engine query for Content Information or PDI without the costly step of unpacking the package.
4 simple information concepts 25

Content Information:
– The original object targeted for preservation.
– A physical or digital object and its Representation Information.
[Figure: Package 1 diagram] OCLC.org 26

Content Information
– A basic information concept that contains data and its Representation Information.
– For example, it can be a “Thailand Map”.
OCLC.org 27

Data Object
– The object that needs to be preserved.
– It can be either a physical thing in the real world or a digital object whose content is a bit string.
– In this case, it can be the file content ( …..) of an image file of the Thailand Map.
– In fact, it is just a string of bits that has no meaning unless someone can understand it. 28

Representation Information
– A bit string ( …) may be useless if no one knows its meaning. The Representation Information states what the structure of “ …” is and how to interpret it.
– There may be Representation Information about Representation Information if the data object content has a complex structure or is encoded at many levels.
[Figure: raw bit string → format in byte form → construct JPEG format structure → interpret as pixel colors to form a picture] 29
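A small Python sketch can make the role of Representation Information concrete. It assumes a made-up, uncompressed pixel layout rather than JPEG, and the `RepresentationInfo` class and `interpret` function are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RepresentationInfo:
    """Hypothetical, minimal description of how to read a raw byte string."""
    width: int            # pixels per row
    height: int           # number of rows
    bytes_per_pixel: int  # 1 = grayscale, 3 = RGB


def interpret(data: bytes, rep: RepresentationInfo) -> List[List[Tuple[int, ...]]]:
    """Turn raw bytes into rows of pixel tuples, as directed by `rep`."""
    expected = rep.width * rep.height * rep.bytes_per_pixel
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(data)}")
    row_len = rep.width * rep.bytes_per_pixel
    rows = []
    for r in range(rep.height):
        row = data[r * row_len:(r + 1) * row_len]
        rows.append([
            tuple(row[i:i + rep.bytes_per_pixel])
            for i in range(0, row_len, rep.bytes_per_pixel)
        ])
    return rows


raw = bytes([255, 0, 0, 0, 255, 0, 0, 0, 255, 255, 255, 255])

# The same 12 bytes yield different pictures under different Representation Information.
as_rgb = interpret(raw, RepresentationInfo(width=2, height=2, bytes_per_pixel=3))
as_gray = interpret(raw, RepresentationInfo(width=4, height=3, bytes_per_pixel=1))
```

Here `as_rgb` is a 2×2 color image and `as_gray` is a 4×3 grayscale image, although the stored bits are identical. Without the Representation Information there is no way to tell which reading is intended.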

Preservation Description Information (PDI): what is needed to preserve the Content Information
Provenance – For reliability; the source of the content and its history
Context – The environment needed to render the content
Reference – Refers to things outside the system, e.g. an ISBN
Fixity – Checksums, MD5, …
[Figure: Package 1 diagram] OCLC.org 30

PDI contains
– Reference Information: an identifier that links to something outside the system or to a real-world resource, such as an ISBN.
– Context Information: records why, where, and how the digital resource was created, including the software and environment that created it.
– Provenance Information: states how reliable the digital resource is, its origin or source, its history of changes, and its migration process.
– Fixity Information: provides the information needed to access and verify the digital resource, e.g. keyword, checksum, MD5, etc.
OCLC.org 31
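Fixity is the PDI component that is easiest to show in code. The sketch below is a minimal illustration, assuming SHA-256 from Python's standard `hashlib` in place of the MD5 mentioned on the slide. The `PDI` record and the functions `make_fixity` and `verify_fixity` are hypothetical names, and the other three fields are kept as free text.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PDI:
    """Toy PDI record; the field contents are illustrative assumptions."""
    reference: str   # e.g. an ISBN or another external identifier
    provenance: str  # free-text note on source and history
    context: str     # free-text note on the surrounding environment
    fixity: str      # hex digest of the content


def make_fixity(content: bytes) -> str:
    """Return a SHA-256 hex digest to record as Fixity Information."""
    return hashlib.sha256(content).hexdigest()


def verify_fixity(content: bytes, pdi: PDI) -> bool:
    """True if the content still matches the digest stored in the PDI."""
    return make_fixity(content) == pdi.fixity
```

An archive would call something like `verify_fixity` whenever content is read back or migrated. A single changed byte produces a different digest, so the check fails.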

Example of PDI components OCLC.org 32

Example of PDI components OCLC.org 33

Example of PDI components OCLC.org 34

Archive Packaging Information:
– Collects the Content Information and PDI together to store them in the system.
– The package has a name, for example “Package 1”.
[Figure: Package 1 diagram] OCLC.org 35

Descriptive Information:
– Because searching inside packages directly takes time, the system needs metadata about each package in order to search.
– It is the information used to discover which package has the Content Information of interest.
– It is the full set of attributes that are searchable in the catalog service.
– Indexing this information may improve search performance.
[Figure: Package 1 diagram] OCLC.org 36
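The point about indexing can be illustrated with a tiny inverted index in Python. This is a sketch under stated assumptions: the functions `build_index` and `search`, and the attribute names in the usage example, are hypothetical and not taken from any catalog service.

```python
from collections import defaultdict
from typing import Dict, Set


def build_index(descriptions: Dict[str, Dict[str, str]]) -> Dict[str, Set[str]]:
    """Map each lower-cased word in the descriptive metadata to package names."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for package_name, attributes in descriptions.items():
        for value in attributes.values():
            for word in value.lower().split():
                index[word].add(package_name)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return names of packages whose description contains every query word."""
    words = query.lower().split()
    if not words:
        return set()
    hits = [index.get(word, set()) for word in words]
    return set.intersection(*hits)


descriptions = {
    "Package 1": {"title": "Thailand Map", "type": "image"},
    "Package 2": {"title": "Thailand census report", "type": "text"},
}
index = build_index(descriptions)
matches = search(index, "thailand map")
```

The search touches only the small descriptive records, so the packages themselves never have to be opened to answer a query.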

[Figure: Descriptive Information about Package 1] OCLC.org 37

38

Big picture of all functions and flow of packages OCLC.org 39

Big picture of all functions and flow of packages OCLC.org, CORNELL.edu 40

– Accept SIPs from producers
– Verify the SIPs that users submit
– Generate AIPs for archival storage
Overview (Ingest) 41

OCLC.org 42

Receive Submission – Accepts the SIP uploaded by the producer through electronic transfer such as FTP.
Quality Assurance – Validates the SIP for transmission errors (e.g. with a checksum) and logs the result.
Generate AIP – Transforms the SIP into an AIP and reports the result.
Generate Descriptive Info – Produces metadata that supports searching and retrieving AIPs (answering who, what, when, where, why) and browsing, such as a thumbnail.
Coordinate Updates – Provides a single point of access (add, modify, remove, get) to the storage area.
Description (Ingest) 43
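The steps above can be sketched as a small Python pipeline. It is an illustration only: packages are plain dictionaries, the archive's storage and catalog are in-memory dictionaries, SHA-256 stands in for whatever checksum the producer and archive agree on, and every function and key name is hypothetical.

```python
import hashlib
from typing import Dict, List


def quality_assurance(payload: bytes, declared_sha256: str, log: List[str]) -> bool:
    """Compare the received bytes with the checksum the producer declared."""
    ok = hashlib.sha256(payload).hexdigest() == declared_sha256
    log.append("QA passed" if ok else "QA failed: checksum mismatch")
    return ok


def generate_aip(sip: Dict) -> Dict:
    """Wrap the SIP content and PDI, adding the fixity value the archive keeps."""
    pdi = dict(sip["pdi"], fixity=sip["sha256"])
    return {"content": sip["payload"], "pdi": pdi}


def generate_descriptive_info(sip: Dict) -> Dict[str, str]:
    """Keep only the searchable attributes (who, what, when)."""
    return {key: sip["pdi"][key] for key in ("who", "what", "when") if key in sip["pdi"]}


def ingest(sip: Dict, storage: Dict, catalog: Dict, log: List[str]) -> bool:
    """Run the steps in order; nothing is stored unless QA passes."""
    log.append(f"received {sip['name']}")
    if not quality_assurance(sip["payload"], sip["sha256"], log):
        return False
    # Coordinated update: the AIP and its descriptive info are written together.
    storage[sip["name"]] = generate_aip(sip)
    catalog[sip["name"]] = generate_descriptive_info(sip)
    log.append(f"stored {sip['name']}")
    return True
```

Rejecting the package before anything is written keeps a corrupted transfer out of both the storage area and the catalog.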

The main task of Archival Storage is to store data. It also maintains the data and guarantees that preserved data remains accessible despite the constraints of media and security. Furthermore, it provides disaster recovery capabilities. Overview (Archival Storage) 44

OCLC.org 45

Receive Data – Receives AIPs from Ingest and places them in permanent storage.
Manage Storage Hierarchy – Provides administration functions for the storage media.
Replace Media – Supports migration from one medium to another.
Error Checking – Checks for errors in the data in the storage area and gives notification of them.
Disaster Recovery – Provides a mechanism for replicating digital content to a safe place.
Provide Data – Copies data from the storage area to Access in order to serve consumer queries.
Description (Archival Storage) 46
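Error checking and disaster recovery work together, which a short Python sketch can show. It assumes an in-memory store in which a second dictionary stands in for the replica kept in a safe place. The `ArchivalStorage` class and its method names are hypothetical.

```python
import hashlib
from typing import Dict, List


class ArchivalStorage:
    """Toy store: a primary copy, a replica, and recorded checksums."""

    def __init__(self) -> None:
        self.primary: Dict[str, bytes] = {}
        self.replica: Dict[str, bytes] = {}   # stands in for an off-site copy
        self.checksums: Dict[str, str] = {}

    def receive_data(self, name: str, aip: bytes) -> None:
        """Store an AIP, replicate it, and record its checksum."""
        self.primary[name] = aip
        self.replica[name] = aip
        self.checksums[name] = hashlib.sha256(aip).hexdigest()

    def error_check(self) -> List[str]:
        """Return names of AIPs whose primary copy no longer matches its checksum."""
        return [
            name for name, data in self.primary.items()
            if hashlib.sha256(data).hexdigest() != self.checksums[name]
        ]

    def recover(self, name: str) -> bool:
        """Restore a damaged primary copy from the replica if the replica is intact."""
        copy = self.replica.get(name)
        if copy is None or hashlib.sha256(copy).hexdigest() != self.checksums[name]:
            return False
        self.primary[name] = copy
        return True

    def provide_data(self, name: str) -> bytes:
        """Hand the stored bytes to Access."""
        return self.primary[name]
```

A periodic job could call `error_check()` and then `recover()` for each reported name. The replica is verified against the recorded checksum before it is trusted.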

Data Management mainly provides functions related to the database:
– Manage the database configuration
– Maintain the database schema
– Define integrity constraints
– Perform database updates
– Perform query management
Overview (Data Management) 47

OCLC.org 48

Administer Database – Focuses mainly on database administration functions, e.g. defining the schema, configuring the database, defining integrity constraints, etc.
Perform Queries – Receives query requests from the consumer, queries the database, and finally generates a result set.
Generate Report – Receives reports from Ingest and Access and summarizes them.
Receive Database Updates – Performs database operations such as insert, update, and delete.
Description (Data Management) 49
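As a rough illustration of these functions, the Python sketch below uses the standard `sqlite3` module with an in-memory database. The table name `descriptive_info`, its columns, and the functions `receive_update` and `perform_query` are assumptions made for this example, not a schema prescribed by OAIS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Administer Database: define the schema and its integrity constraints.
conn.execute(
    """
    CREATE TABLE descriptive_info (
        package_name TEXT PRIMARY KEY,
        title        TEXT NOT NULL,
        media_type   TEXT NOT NULL
                     CHECK (media_type IN ('text', 'image', 'video', 'sound'))
    )
    """
)


def receive_update(name: str, title: str, media_type: str) -> None:
    """Receive Database Updates: insert or replace one package description."""
    with conn:  # commits on success, rolls back if a constraint is violated
        conn.execute(
            "INSERT OR REPLACE INTO descriptive_info VALUES (?, ?, ?)",
            (name, title, media_type),
        )


def perform_query(title_fragment: str) -> list:
    """Perform Queries: return package names whose title contains the fragment."""
    rows = conn.execute(
        "SELECT package_name FROM descriptive_info "
        "WHERE title LIKE ? ORDER BY package_name",
        (f"%{title_fragment}%",),
    )
    return [row[0] for row in rows]
```

The `CHECK` constraint plays the part of an integrity constraint: an update with an unknown media type raises `sqlite3.IntegrityError` and leaves the table unchanged.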

– Solicit and negotiate submission agreements with producers
– Audit submissions to ensure that they meet the standards
– Maintain configuration management of system hardware and software
– Day-to-day governance of the other OAIS functional entities
Overview (Administration) 50

OCLC.org 51

Negotiate Submission Agreement – Negotiates the submission agreement with the producer.
Manage System Configuration – Configures the archival system and controls changes that affect its system engineering.
Archival Information Update – Receives changes from the producer and informs Access so that DIPs reflect the changes made to the AIPs.
Physical Access Control – Authorizes consumers' access to resources.
Description (Administration) 52

Establish Standards and Policies – Manages standards and policies in order to approve migration and replication processes.
Audit Submission – Verifies that AIPs and SIPs follow the specification and the agreement, and ensures that the PDI is understandable for the digital resource.
Activate Requests – Checks that a consumer's request is correct, then submits the request to Access.
Customer Service – Provides functions to manage user accounts.
Description (Administration, cont.) 53

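– Monitor the environment of the OAIS
– Provide recommendations on questions such as: Is the information still accessible? Will it stay accessible in the long term? What if the original computing environment becomes obsolete?
Overview (Preservation Planning) 54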

OCLC.org 55

Monitor Designated Community – Allows consumers and producers to track changes in available technologies.
Monitor Technology – Reports changes in the software and hardware that contribute to the preservation process.
Develop Preservation Strategies and Standards – Develops and recommends strategies and standards for future changes in technology.
Develop Packaging Designs and Migration Plans – Customizes SIP and AIP templates for migration goals.
Description (Preservation Planning) 56
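Technology monitoring and migration planning can be pictured as a lookup against a watch list. The Python sketch below is deliberately simple: the `AT_RISK_FORMATS` table and the function `migration_plan` are hypothetical, and the format pairs are placeholders for illustration, not a statement about the actual preservation risk of those formats.

```python
from typing import Dict, List, Tuple

# Hypothetical watch list: format considered at risk -> proposed target format.
AT_RISK_FORMATS: Dict[str, str] = {"bmp": "png", "doc": "pdf"}


def migration_plan(holdings: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """List (package, current_format, target_format) for every at-risk holding."""
    return sorted(
        (name, fmt, AT_RISK_FORMATS[fmt])
        for name, fmt in holdings.items()
        if fmt in AT_RISK_FORMATS
    )


plan = migration_plan({"Package 1": "bmp", "Package 2": "png", "Package 3": "doc"})
```

The resulting plan is only a recommendation. Under the model described here, Administration would still have to approve the migration before Archival Storage carries it out.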

– Determine the existence, description, location, and availability of information in the OAIS
– Allow consumers to request and retrieve information products
Overview (Access) 57

OCLC.org 58

Coordinate Access Activities – Provides a single user interface for browsing, searching, and accessing.
Generate DIP – Generates a DIP from an AIP.
Deliver Response – Handles responses from query and access requests and delivers them to the consumer; reports access activities to the administrator.
Description (Access) 59
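A consumer request passing through these three functions might look like the Python sketch below. It assumes that the catalog and the storage area are plain dictionaries keyed by package name. The function `access` and the dictionary keys are hypothetical.

```python
from typing import Dict, List


def access(query: str,
           catalog: Dict[str, Dict[str, str]],
           storage: Dict[str, Dict],
           activity_log: List[str]) -> List[Dict]:
    """Search the catalog, fetch matching AIPs, and return them as DIPs."""
    needle = query.lower()
    # Coordinate Access Activities: one entry point for search and retrieval.
    matches = sorted(
        name for name, info in catalog.items()
        if any(needle in value.lower() for value in info.values())
    )
    dips = []
    for name in matches:
        aip = storage[name]
        # Generate DIP: expose the content plus a reduced, consumer-facing PDI.
        dips.append({
            "name": name,
            "content": aip["content"],
            "pdi": {"reference": aip["pdi"].get("reference", "")},
        })
    # Deliver Response: also report the activity for the administrator.
    activity_log.append(f"query={query!r} delivered={len(dips)}")
    return dips
```

The consumer never touches the storage area directly. Only the catalog is searched, and the DIP carries just the part of the PDI that this sketch chooses to expose.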

OCLC.org 60

Producer
– Ingests a package into the system. The system stores the AIP in Archival Storage and the descriptive metadata in Data Management.
Consumer
– Queries data via Access, which queries the descriptive metadata in Data Management.
– Retrieves data via Access, which gets the data from Archival Storage.
Management
– Manages and monitors every flow in the system. 61

62

Independent: an OAIS that works alone and provides basic functionality to end users. OCLC.org 63

Cooperating: several OAIS archives can exchange information packages with one another. Thus, a system needs to specify its DIP so that it can serve as the SIP of another system. The systems must offer standard functions to end users, so that users communicate with many systems in the same way. OCLC.org 64

Federated: several OAIS archives provide a single access point that connects to each system. The end user knows only the common catalog that he or she faces; the set of systems is hidden from the user's view. OCLC.org 65

Shared resources: several OAIS archives share a storage area and data management with one another.
– They should agree on common standards for the archival storage and data management.
– Other functions, such as ingest and access, are owned by each system.
OCLC.org 66

67

What you should know OCLC.org 68

What you should plan OCLC.org 69

How you should run OCLC.org 70

Let’s see example implementations of OAIS 71

DSpace: software for building a digital repository for academic purposes. It preserves all kinds of digital content (e.g. text, images, and movies) and enables easy open access to it.
DSpace as OAIS
– The software uses OAIS concepts to build a system in terms of functionality, data model, and data flow.
– End clients can access the repository's functions from a web-based application.
dspace.org 72

dspace.org 73

Fedora: a system that serves as a digital content repository for a wide variety of uses, e.g. institutional repositories, digital archives, content management systems, scholarly publishing enterprises, and digital libraries.
Fedora as OAIS
– It is built on OAIS in terms of the data model, function model, and architecture models.
– End clients can access the repository's functions via web services.
fedora-commons.org 74

fedora-commons.org 75

76