
1 Digital Archiving: A FEDORA-Based Infrastructure for Preserving Electronic Journals
LACUNY Institute 2005
Scholarly Publishing and Open Access: Payers and Players
May 20, 2005
Ronald C. Jantz, Rutgers University Libraries
LACUNY 2005, R. Jantz - 5/20/2005

2 Some Questions to Think About
What is the oldest digital object you know of?
How can you tell if object A and object B are the same?
How do you know if a digital object has been changed?
What is the nature of the change?

3 Should We Be Concerned? (About Our Ability to do Digital Preservation)
The Clinton Administration produced approximately 90 million messages.
During the Iran-Contra scandal, John Poindexter and Oliver North erased 5,000 messages.
Chronicle of Higher Education, Jan. 30, 2004.
“The patent office, home to nearly 6.5 million patents dating to 1790, is converting to an electronic database and discarding a significant portion of its paper files after they have been scanned and digitized.”
Mitchell, A. (2001). Ingenuity’s Blueprints, Into History’s Dustbin. NY Times, December 30, 2001, p. A1.
The Nazis destroyed 100 million books in the years from 1933 to 1945.

4 Digital Library Repository Initiative
(Rutgers University Libraries)
Objectives:
  To provide seamless, perpetual access to digital collections: our resources and the resources of others.
  To develop a flexible framework of “core” capabilities providing the enabling infrastructure, interoperability, and sustainability.

5 Digital Preservation and Archiving Institutional Requirements
Institutional clarity about what to preserve
Very large mass storage systems, scaling to millions of objects
Flexibility to handle many digital formats (digital object architecture)
Integration of key technologies
Well-defined preservation metadata and processes
Sustainability: content, technology, financial

6 Digital Preservation (A definition from Research Libraries Group)
Digital preservation is defined as the managed activities necessary for ensuring:
1. The long-term maintenance of a byte stream (including metadata) sufficient to reproduce a suitable facsimile of the document, and
2. Continued accessibility of the contents through time and changing technology.

7 Why Would You Digitally Preserve?
Preserve material that exists in electronic form only
Protect original artifact by using a surrogate
Provide surrogate if original artifact is destroyed

8 Digital Preservation Involves Both Process and Technology
[Figure: flowchart. Creation of the digital object; decision to digitally preserve (yes/no); ingest, store, access; life cycle management; labels D1.0, D2.0, D3.0.]
Migration (transferring digital materials from one medium or format to another) is the only workable life cycle approach.

9 Digital Library Concepts
Digital Library Repository (DLR): The repository is designed and managed to contain and provide access to digital resources created by an institution. Repositories can provide both access and preservation.
Digital Object: The digital object is the basic unit of management and digital preservation, consisting of a persistent identifier, metadata, and associated byte streams. An object can represent a book, map, e-journal article, photograph, numeric data, etc.

10 The Fedora* Infrastructure
The Infrastructure (from Fedora):
  An extensible digital object model
  APIs for developing new applications
  Scalable, persistent storage for content and metadata
  Content versioning and audit trails
  Metadata harvesting
Development and Integration (by RUL):
  Design of the digital object architecture
  Integration of key technologies and standards
  Development of applications
*Flexible Extensible Digital Object Repository Architecture

11 RUL Digital Repository Architecture
[Figure: architecture diagram. Labeled components: external applications (browse, search, export); “native” applications (browse, search, admin functions); Internet; server (server functions, DB access); Digital Object Repository (Fedora); METS-XML ingest and export; export (OAI, MARC, etc.); local database; objects.]

12 Digital Projects at Rutgers University Libraries
External (to Fedora) Applications:
  Electronic Journals (journals published by RUL)
  The Eagleton Poll Archive
  The NJ Environmental Digital Library
  CETH projects (Roman coins, 18th century journals, classic texts)
Native (Fedora) Projects:
  The NJ Digital Highway – Jazz Oral Histories (digital sound)

13 E-Journals at RUL
Why are we undertaking this new role?
  To support new, open models for the dissemination of scholarship.
  Journal publishing complements the Libraries' key role in supporting scholarship within the academy.
  Libraries have a traditional role in the preservation of scholarly materials.
The E-Journal Platform at RUL:
  Based on Open Journal Systems (OJS) from the Public Knowledge Project.
  Digital preservation based on the integration of OJS, Fedora, and special processes and technologies.
  All journals are freely accessible.

14 Available at: http://pcsp.libraries.rutgers.edu

15 Available at: http://ejbe.libraries.rutgers.edu

16 Available at: http://rulj.libraries.rutgers.edu

17 Digital Object Example (An E-journal Article)
[Figure: article object. Repository ID; metadata (descriptive, technical, source, rights, digital provenance, administrative); disseminators; datastreams: SMAP1 - structure map, DS1 - article (DjVu), DS2 - article (PDF), ARCH1 - manuscript as submitted.]
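
The object structure above can be sketched as a small data model. This is only an illustration in Python, not Fedora's actual object model or API: the class names, field names, MIME types, and the identifier `example:article-1` are hypothetical, and only the datastream labels (SMAP1, DS1, DS2, ARCH1) come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Datastream:
    """One byte stream attached to a digital object (hypothetical model)."""
    ds_id: str        # label taken from the slide, e.g. "DS2"
    label: str        # human-readable description
    mime_type: str    # assumed MIME type; not stated in the slides
    content: bytes = b""


@dataclass
class DigitalObject:
    """Persistent identifier + metadata + byte streams."""
    pid: str
    metadata: dict = field(default_factory=dict)
    datastreams: dict = field(default_factory=dict)

    def add(self, ds: Datastream) -> None:
        # Datastreams are keyed by their ID so each can be fetched by name.
        self.datastreams[ds.ds_id] = ds


article = DigitalObject(
    pid="example:article-1",  # hypothetical identifier
    metadata={"descriptive": {"title": "Sample article"}, "technical": {}, "rights": {}},
)
article.add(Datastream("SMAP1", "Structure map", "text/xml"))
article.add(Datastream("DS1", "Article (DjVu)", "image/vnd.djvu"))
article.add(Datastream("DS2", "Article (PDF)", "application/pdf"))
article.add(Datastream("ARCH1", "Manuscript as submitted", "application/octet-stream"))

print(sorted(article.datastreams))
```

Keeping the presentation copies (DS1, DS2) and the archival copy (ARCH1) inside one object means they share a single identifier and a single set of metadata.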

18 Important Technologies, Processes, and Standards
Persistent identifiers
Digital signatures (based on SHA-1)
Audit trails
Versioning
Digital certificates
Pipelines (to automate sequential processes)
Preservation metadata (based on the National Library of Australia approach)
METS (Metadata Encoding and Transmission Standard)
OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting)
Open source: Linux, Apache, Fedora, Amberfish (search engine)

19 Persistent Identifier (PID)
Why is the PID important?
  An essential technology to preserve “referential integrity”.
  Approximately 41% of the URLs referenced in Computer and CACM journals in the period studied were inaccessible in 2002 (Spinellis, 2003).
What is it?
  An identifier that is technology and protocol independent and is mapped to a URL.
  The handle for a PCSP issue is /pcsp1.1.47
  URL access:
CNRI Handle System:
  For assigning, managing and resolving persistent identifiers
  Managed by the Corporation for National Research Initiatives
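
The idea of a protocol-independent identifier mapped to a URL can be illustrated with a toy resolver. This sketch does not use the CNRI Handle System or its API; the `PidResolver` class, the handle `12345/example.1`, and the `example.org` URLs are all hypothetical. What it shows is that citations hold the PID, so when content moves, only the mapping changes.

```python
class PidResolver:
    """Toy in-memory resolver: persistent identifier -> current URL."""

    def __init__(self) -> None:
        self._table = {}

    def register(self, pid: str, url: str) -> None:
        # Registering an existing PID again simply updates its location.
        self._table[pid] = url

    def resolve(self, pid: str) -> str:
        try:
            return self._table[pid]
        except KeyError:
            raise LookupError(f"unknown PID: {pid}") from None


resolver = PidResolver()
resolver.register("12345/example.1", "https://old.example.org/issue/1")

# The journal moves to a new server; the PID printed in citations is unchanged.
resolver.register("12345/example.1", "https://new.example.org/journal/issue/1")

print(resolver.resolve("12345/example.1"))
```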

20 Digital Signatures
Objective: to detect and report unauthorized changes in an object
Signature Process:
  SHA-1 signatures for both object and archival master
  Created automatically and inserted into metadata
  Verified periodically
  Failures reported through Alerting Services
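
The signature process can be sketched with Python's standard `hashlib` module. This is a minimal illustration, not the RUL implementation: the function names, the sample bytes, and the `metadata` dictionary are assumptions. It computes a SHA-1 digest, stores it alongside the object, and later recomputes it to detect a change. The same comparison also answers the earlier question of whether two objects are the same: identical byte streams give identical digests.

```python
import hashlib


def compute_signature(data: bytes, algorithm: str = "sha1") -> str:
    """Return the hex digest of a byte stream (SHA-1 by default, as in the slides)."""
    return hashlib.new(algorithm, data).hexdigest()


def verify(data: bytes, recorded: str, algorithm: str = "sha1") -> bool:
    """Recompute the digest and compare it with the one stored in metadata."""
    return compute_signature(data, algorithm) == recorded


# Ingest: compute the digest once and store it with the object's metadata.
master = b"%PDF-1.4 sample archival master"
metadata = {"signature": compute_signature(master)}

# Periodic verification: an unchanged byte stream passes.
print(verify(master, metadata["signature"]))    # True

# Even a one-byte change produces a different digest.
tampered = master + b" "
print(verify(tampered, metadata["signature"]))  # False
```

The `algorithm` parameter is there so the digest function can be swapped without changing the calling code.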

21 The E-Journal Preservation Process
All articles in digital object form are exported to the Digital Repository (Fedora)
Signatures and PIDs computed automatically
Signatures verified automatically; failures reported via Repository alerting services
External application (website) periodically captured and exported automatically to the Repository
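
The steps above lend themselves to a pipeline of small sequential stages. The following is a sketch only, not the RUL or Fedora implementation: the stage names, the `example:` PID scheme, and the in-memory `alerts` list standing in for the repository alerting service are all assumptions.

```python
import hashlib
from itertools import count

_pid_counter = count(1)
alerts = []  # stand-in for the repository alerting service (assumption)


def assign_pid(obj: dict) -> dict:
    """Stage 1: attach a (hypothetical) persistent identifier."""
    return {**obj, "pid": f"example:{next(_pid_counter)}"}


def sign(obj: dict) -> dict:
    """Stage 2: compute a SHA-1 digest of the content and record it."""
    return {**obj, "signature": hashlib.sha1(obj["content"]).hexdigest()}


def run_pipeline(obj: dict, stages) -> dict:
    """Apply each stage in order, feeding the result to the next stage."""
    for stage in stages:
        obj = stage(obj)
    return obj


def verify_all(repository) -> None:
    """Periodic check: recompute each digest and record an alert on mismatch."""
    for obj in repository:
        if hashlib.sha1(obj["content"]).hexdigest() != obj["signature"]:
            alerts.append(f"signature mismatch for {obj['pid']}")


repository = [
    run_pipeline({"content": text}, [assign_pid, sign])
    for text in (b"article one", b"article two")
]

verify_all(repository)                              # nothing has changed yet
repository[1]["content"] = b"article two (altered)"  # simulate an unauthorized change
verify_all(repository)

print(alerts)
```

Because each stage takes an object and returns a new one, further steps (for example, capturing the journal website) could be added to the stage list without touching the existing ones.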

22 Issues and Questions
We need “persistent” organizations
The service model for e-journals within the Library
The cost/benefit model
Research on earlier questions
Sustainability: content, technology, financial
There are many skeptics, e.g. Cullen (2000) asks rhetorically, “How confident can we be when an object whose authentication is crucial depends on electricity for its existence?”

