Preserving Electronic Mailing Lists as Scholarly Resources: The H-Net Archives Lisa M. Schmidt


Preserving Electronic Mailing Lists as Scholarly Resources: The H-Net Archives
Lisa M. Schmidt
MATRIX: The Center for Humane Arts, Letters & Social Sciences Online
Michigan State University
August 26, 2008

H-Net: Humanities and Social Sciences Online
International consortium of scholars and teachers
Oldest collection of born-digital and content-moderated arts, humanities, and social science material on the Internet
Valuable scholarly resource
  – More than 180 networks, or lists
  – More than 230 “private” lists
More than 1 million messages
Hosted by MATRIX

NHPRC Grant
Conduct assessment of existing H-Net preservation policies and practices
Apply NARA/OCLC TRAC checklist
Develop and implement an improved long-term preservation plan
Useful to those managing large collections of electronic records
Research semantic clustering search techniques

Preserving Lists as Scholarly Resources
How H-Net Works
Current Preservation Practices
Preservation Improvement Plan

How H-Net Works: Backup & Security
2.7 TB of data, including H-Net
Server rack kept in climate-controlled, physically secured room
Daily incremental backups, weekly full
  – Tapes cycle through system every 6 weeks
  – Swapped tapes stored in secure location
  – Tapes replaced as needed
Monthly full, permanent tape backups
  – Tapes kept in minimally secure cabinet
  – Plans to keep log and move to offsite storage

How H-Net Works: Posting Messages
H-Net runs on LISTSERV software
Users must be list subscribers to post
Messages written in plain text
No attachments allowed on public lists
Editors approve and post messages
Editors can overwrite creation metadata

How H-Net Works: Archiving of Lists
Messages post from a few seconds up to several days after approval
Messages kept in flat text files called “notebooks”
Notebook includes messages posted during a weekly time period

How H-Net Works: Archiving of Lists

Time Period    Day of Month
a              1-7
b              8-14
c              15-21
d              22-28
e              29-31

Ex. “h-africa.log0802a”
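The naming scheme in the table can be illustrated with a minimal sketch. It assumes the four digits in the example filename are a two-digit year followed by a two-digit month; the slide does not state this. The function names `week_letter` and `notebook_name` are hypothetical and are not part of H-Net or LISTSERV.

```python
from datetime import date


def week_letter(day: int) -> str:
    """Map a day of the month to the notebook period letter from the table."""
    if not 1 <= day <= 31:
        raise ValueError(f"day out of range: {day}")
    # Days 1-7 -> a, 8-14 -> b, 15-21 -> c, 22-28 -> d, 29-31 -> e.
    return "abcde"[(day - 1) // 7]


def notebook_name(list_name: str, posted: date) -> str:
    """Build a notebook filename such as 'h-africa.log0802a'.

    Assumption: the digits are YYMM (two-digit year, then month).
    """
    return f"{list_name}.log{posted:%y%m}{week_letter(posted.day)}"


if __name__ == "__main__":
    print(notebook_name("h-africa", date(2008, 2, 3)))
```

Under that assumption, a message posted to h-africa on any of the first seven days of February 2008 would land in the notebook named in the slide's example.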

How H-Net Works: Archiving of Lists
Log browse cache application extracts key metadata, creates MD5 hashes
Cache builder script writes metadata to MySQL database cache
  – Notebook filename
  – Offset (byte position) of message
  – Author name and address
  – Subject
  – Date in two formats
  – Messageid (MD5 hash)
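The metadata extraction described above might look roughly like the following sketch. It is not H-Net's cache builder. It assumes that messages in a notebook are delimited by a line of 73 "=" characters, that the MD5 hash is taken over the raw message bytes and rendered as hex, and it returns plain dictionaries instead of writing to MySQL. The names `index_notebook` and `_record` are hypothetical.

```python
import hashlib
from email import message_from_bytes

# Assumption: each message in a notebook is preceded by this delimiter line.
SEPARATOR = b"=" * 73


def _record(filename: str, offset: int, raw: bytes) -> dict:
    """Collect the metadata fields listed on the slide for one message."""
    msg = message_from_bytes(raw)
    return {
        "notebook": filename,
        "offset": offset,  # byte position of the message in the notebook
        "author": msg.get("From", ""),
        "subject": msg.get("Subject", ""),
        "date": msg.get("Date", ""),
        # Assumption: the hash input is the raw message bytes.
        "messageid": hashlib.md5(raw).hexdigest(),
    }


def index_notebook(filename: str, data: bytes) -> list[dict]:
    """Return one metadata record per message found in the notebook bytes."""
    records = []
    offset = 0    # byte position of the line being examined
    start = None  # byte position where the current message begins
    for line in data.splitlines(keepends=True):
        if line.rstrip(b"\r\n") == SEPARATOR:
            if start is not None and offset > start:
                records.append(_record(filename, start, data[start:offset]))
            start = offset + len(line)
        offset += len(line)
    if start is not None and start < len(data):
        records.append(_record(filename, start, data[start:]))
    return records
```

Storing the byte offset alongside the notebook filename lets a retrieval script seek directly to a message rather than rescanning the whole file.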

How H-Net Works: Message Retrieval
&month=0808&week=b&msg=w8utW6nKNO1FuY19vSK2mo &user=&pw=

Current Preservation Practices
Message Ingest, Storage, and Retrieval Processes

Current Preservation Practices
Backup and storage
Significant property: message/notebook content, stored in plain text formats
Authenticity
  – Informal check by author and/or editor on posting
  – Broken URL on message retrieval attempt
Notebook filename partially fulfills the OAIS Preservation Description Information (PDI) recommendation
  – Reference, Context, Provenance Information
  – (ex., h-albion.log0808b)
  – No Fixity Information

Preservation Improvement Plan: Backup & Storage
Media refreshment schedule
More than one set of permanent backup tapes, or a server mirror
Secure storage systems
Backup log
Participation in distributed storage system

Preservation Improvement Plan: Authenticity
Fixity: Individual Messages (SIPs)
  – Shorten time window for generation of MD5 hashes
  – Create database of MD5 hashes for fixity checks
  – Validate message hashes on notebook completion
Fixity: Notebook Files (AIPs)
  – Create SHA-2 message digests on completion of notebooks
  – Calculate SHA-2 message digests for existing notebooks
  – Create database of SHA-2 message digests for fixity checks
  – Validate notebook hashes on weekly basis
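The notebook-level fixity steps can be sketched as follows. This is an illustration under stated assumptions, not the planned H-Net implementation: SHA-256 stands in for the unspecified SHA-2 variant, an in-memory dictionary stands in for the digest database, and the `*.log*` filename pattern and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_digests(notebook_dir: Path, pattern: str = "*.log*") -> dict[str, str]:
    """Build a filename -> digest manifest for completed notebooks."""
    return {
        p.name: sha256_of(p)
        for p in sorted(notebook_dir.glob(pattern))
        if p.is_file()
    }


def verify_digests(notebook_dir: Path, manifest: dict[str, str]) -> dict[str, str]:
    """Return notebooks that are missing or whose digest no longer matches."""
    problems = {}
    for name, expected in manifest.items():
        path = notebook_dir / name
        if not path.is_file():
            problems[name] = "missing"
        elif sha256_of(path) != expected:
            problems[name] = "changed"
    return problems
```

A weekly scheduled job could call `verify_digests` against the stored manifest and report any non-empty result; because a completed notebook should no longer change, any mismatch is worth investigating.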

Preservation Improvement Plan: Authenticity
Accurate Message Creation Metadata
  – Build list editing web interface for editors
  – Will only help with new messages
Restriction of Editors’ Administration Capabilities
  – Eliminate editors’ ability to retrieve and change notebooks
  – Restrict notebook modification rights to MATRIX postmasters
H-Net Tampering Risk?
  – Low: staff with root system account privileges are trusted employees
  – No action required

Preservation Improvement Plan: Attachments
Browser Access for Private Lists
  – Provide constructed URLs, as with public lists
  – Provide download links to attachments
Migration Strategy
  – Conduct inventory of attachments on H-Net-related lists
  – Provide conversion on demand
      Option 1: Keep conversion tools in reserve
      Option 2: Automate conversion
  – Establish or leverage technology watch

Preservation Improvement Plan: Other Technical Improvements
Preservation of Links to Original Content
  – Redirect URLs within messages to archived websites
Shorter Persistent URLs
  – Develop naming scheme for shorter URLs
  – Map shorter URLs to actual URLs

Preservation Improvement Plan: From TRAC Checklist
Succession plan
Periodic review or trigger event definition
Document, document, document!
  – Technology history
  – Change management system
  – Staff roles, responsibilities, and authorizations
  – Written recovery plan

References
H-Net Archives, Documentation
H-Net: Humanities and Social Sciences Online
InterPARES
MATRIX: The Center for Humane Arts, Letters, and Social Sciences Online
OAIS Reference Model
Trustworthy Repositories Audit & Certification: Criteria and Checklist