Producer-Archive Workflow Network (PAWN)

Goals
- Consistent with the Open Archival Information System (OAIS) model
- Use of web/grid technologies and platform independence
- Ease of integration with current pilot system based on data grids
- XML representation of metadata and bitstreams
- Accountability of transfer and guarantee of data integrity

Project Members
Joseph JaJa, Mike Smorul, Yang Wang, Mike McGann, Fritz McCall, Chris Wambler, Gary Jackson, Tim Norris

Producer Components
- Database to track registered objects
- Certificate Authority management
- Management server supplies web service interfaces to ingestion clients and management operations
- Clients are designed to be standalone, with security certificates issued by the producer

Archive Components
- Receiving servers validate connecting clients and validate SIPs
- Validation services are simple web service calls
- Abstract I/O layer into the digital archive

All components are scalable using standard load balancing techniques.

Secure Distributed Ingestion
- Distributed security management through multiple Certificate Authorities (CAs)
- Compatible with existing producer CAs
- SSL encrypted and authenticated connections
- Automatic Certificate Revocation List (CRL) checking
- Scalable using standard load balancing technology

Ingestion Workflow
1. Negotiate Submission Agreement.
   - Create XML document regarding expected file formats, metadata, and layout of submission
2. Workflow Initialization and Submission Information Packet (SIP) creation.
   - Trust relationship between Archive and Producer is established
   - Clients are issued and register data
3. Transfer of SIPs to archive.
   - A Submission Information Packet is created on a client
   - Client contacts archive and transfers SIP
4. Validation of SIP transfer.
   - Metadata and bitstreams are checked for integrity against checksums
   - All items are also checked against the requirements document
   - Bitstreams are validated against tests specified in the requirements document
5. Organization of data and transfer into persistent archive.
   - Metadata may be transformed into an optimal object format depending on digital archive requirements

Defining an Information Packet

PAWN uses the Metadata Encoding and Transmission Standard (METS) schema to describe the contents and metadata of a Submission Information Packet (SIP). Each client generates a SIP containing a METS XML document and bitstreams to transfer to an archive. PAWN uses a template document based on METS combined with a set of rules that allow PAWN to enforce restrictions on how a SIP should arrive at an archive. These restrictions allow for the following types of control:

- Structural: Limitations on the hierarchical ordering of a document can be enforced.
- Format: Formats can be defined in a few ways, including required validation tests as defined by an archive, or simpler MIME types.
- Metadata: Metadata can be restricted by schema to certain structural areas.

PAWN Client

Multiple PAWN clients run at each producer, and each client can independently register and transfer holdings to an archive. Clients perform two functions: registering their holdings with a producer management server, and later transferring those holdings to an archive. During registration, a client notifies a management server about the holdings that it wants to transfer to an archive, along with metadata that is locally harvested. After registration, the client creates a SIP and transfers it to an archive. The two-step transfer process allows oversight at the producer: between registration and submission of data, context at a producer-wide level may be attached to holdings.
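
Example sketch: SIP manifest and checksum validation

The SIP described under Defining an Information Packet (a METS XML document plus bitstreams) and the integrity check in step 4 of the ingestion workflow can be illustrated roughly as follows. This is a minimal sketch, not PAWN's implementation: the function names build_sip_manifest and validate_sip, the use of SHA-256, and the allowed MIME type set standing in for the requirements document are all assumptions made for illustration. The element and attribute names (fileSec, fileGrp, file, FLocat, CHECKSUM, CHECKSUMTYPE, MIMETYPE) come from the METS schema itself; PAWN's actual template and rule format are not shown in the source.

```python
import hashlib
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_sip_manifest(bitstreams):
    """Client side (hypothetical helper): describe bitstreams in a METS fileSec.

    bitstreams maps a relative path to (mime_type, content_bytes).
    A complete METS document also needs a structMap; it is omitted here.
    """
    root = ET.Element(f"{{{METS_NS}}}mets")
    file_sec = ET.SubElement(root, f"{{{METS_NS}}}fileSec")
    file_grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp")
    for index, (path, (mime_type, content)) in enumerate(sorted(bitstreams.items())):
        file_el = ET.SubElement(file_grp, f"{{{METS_NS}}}file", {
            "ID": f"FILE{index}",
            "MIMETYPE": mime_type,
            "SIZE": str(len(content)),
            "CHECKSUM": hashlib.sha256(content).hexdigest(),
            "CHECKSUMTYPE": "SHA-256",  # assumed algorithm for this sketch
        })
        ET.SubElement(file_el, f"{{{METS_NS}}}FLocat", {
            "LOCTYPE": "URL",
            f"{{{XLINK_NS}}}href": path,
        })
    return ET.tostring(root, encoding="utf-8")


def validate_sip(manifest_xml, received, allowed_mime_types):
    """Archive side (hypothetical helper): return a list of problems found.

    received maps a relative path to the bytes that actually arrived.
    allowed_mime_types stands in for the format rules of a requirements document.
    An empty list means every listed bitstream arrived intact and is allowed.
    """
    problems = []
    root = ET.fromstring(manifest_xml)
    for file_el in root.iter(f"{{{METS_NS}}}file"):
        flocat = file_el.find(f"{{{METS_NS}}}FLocat")
        path = flocat.get(f"{{{XLINK_NS}}}href")
        content = received.get(path)
        if content is None:
            problems.append(f"{path}: missing bitstream")
            continue
        if hashlib.sha256(content).hexdigest() != file_el.get("CHECKSUM"):
            problems.append(f"{path}: checksum mismatch")
        if file_el.get("MIMETYPE") not in allowed_mime_types:
            problems.append(f"{path}: format not allowed")
    return problems
```

In this sketch a client would call build_sip_manifest on its holdings before transfer, and a receiving server would call validate_sip on what arrived. Returning a list of problems rather than raising on the first failure lets the archive report every rejected item in a SIP at once.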