ITHAKA Preservation Metadata 2.0: Revising the Event Model A last-minute presentation on work currently in progress Evan Owens VP, Content Management ITHAKA.

Presentation transcript:

ITHAKA Preservation Metadata 2.0: Revising the Event Model A last-minute presentation on work currently in progress Evan Owens VP, Content Management ITHAKA (JSTOR / Portico)

Background
Portico Preservation Metadata designed & implemented in
–Inspired by PREMIS working group participation
–Operational before PREMIS was completed!
Portico Archive as of October 2009
–>14 Million E-Journal Articles plus other content
–~150 Million Files
–~1 Billion Events
–Only 1K manual events; % system generated
–Over 1 TB of Preservation Metadata
Portico / JSTOR / Ithaka merger in 2009

2.0 PMD Revision Project
Begun in 2008; Implementation now underway
Design Goals for Revision to Events:
–Consistent editorial/coding practices (capitalization, verb tenses, etc.)
–Clarify what event goes with which object and why
–Eliminate redundant information where possible
–Make explicit all data constraints not currently expressed in our schemas
–Synchronize event metadata with the high-level preservation metadata so that the events properly document changes in the core metadata
–Establish a clean base line for future expansion of events metadata

PMD 2.0 Design Choices
Use our own data model / information architecture
–Optimized for Java, Oracle, and XML instantiations
–XML designed to reduce future versioning:
  XSD schema for frame (syntax) only
  All business rules (semantics) expressed in Schematron
–Not METS, not DIDL, not PREMIS XML
–PREMIS compliant
Optimized for size and speed
–Fully relationally normalized
–Inheritable attributes / metadata
–Events attached to objects
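
The split between a frame (syntax) layer and a business-rule (semantics) layer can be illustrated with a small sketch. This is not the Portico implementation, which uses XSD and Schematron; it only imitates the two-layer idea in plain Python. The element names (`event`, `type`, `dateTime`, `objectId`, `outcome`) and both rules are assumptions made for illustration and do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: the real PMD 2.0 element names are not given in the slides.
FRAME_CHILDREN = {"type", "dateTime", "objectId", "outcome"}
# Event verbs taken from the "Event Types" slide.
ALLOWED_TYPES = {"Check", "Characterize", "Generate", "Edit", "Set",
                 "Ingest", "Add", "Create", "Remove"}


def check_frame(event):
    """Syntax layer: only checks that the element has the expected shape."""
    errors = []
    if event.tag != "event":
        errors.append("root element must be <event>")
    for child in event:
        if child.tag not in FRAME_CHILDREN:
            errors.append(f"unexpected element <{child.tag}>")
    return errors


def check_rules(event):
    """Semantics layer: business rules that can change without touching the frame."""
    errors = []
    event_type = event.findtext("type")
    if event_type not in ALLOWED_TYPES:
        errors.append(f"unknown event type: {event_type!r}")
    if not event.findtext("objectId"):
        errors.append("every event must be attached to an object")
    return errors


def validate(xml_text):
    """Run the frame check first, then the business rules, and collect all errors."""
    event = ET.fromstring(xml_text)
    return check_frame(event) + check_rules(event)


GOOD = ("<event><type>Check</type><dateTime>2009-10-07T12:00:00Z</dateTime>"
        "<objectId>obj-1</objectId></event>")
BAD = "<event><type>Polish</type><dateTime>2009-10-07T12:00:00Z</dateTime></event>"

if __name__ == "__main__":
    print(validate(GOOD))
    print(validate(BAD))
```

Keeping the rules in their own layer means a new event type or constraint changes only `check_rules`, which mirrors the stated aim of reducing future schema versioning.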

Processing Record
“master” for each processing pass
Bring together information common to all the events from a given processing pass; e.g., initial ingest, future migration, etc.
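
A minimal sketch of the processing-record idea, assuming a simple in-memory model: information shared by a whole pass is stored once, and each event carries only a reference to it. The class and field names (`ProcessingRecord`, `purpose`, `agent`, `started`) are hypothetical and are not taken from the PMD 2.0 schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProcessingRecord:
    """Information shared by every event in one processing pass (hypothetical fields)."""
    record_id: str
    purpose: str   # e.g. "initial ingest" or "migration"
    agent: str     # hypothetical: the system that ran the pass
    started: str   # ISO 8601 timestamp


@dataclass(frozen=True)
class Event:
    """An event stores only what is specific to it, plus a pointer to its pass."""
    event_type: str
    object_id: str
    processing_record_id: str


@dataclass
class Archive:
    records: dict = field(default_factory=dict)
    events: list = field(default_factory=list)

    def add_record(self, record):
        self.records[record.record_id] = record

    def add_event(self, event):
        # Refuse events that point at a processing record we do not hold.
        if event.processing_record_id not in self.records:
            raise KeyError(f"unknown processing record: {event.processing_record_id}")
        self.events.append(event)

    def describe(self, event):
        """Resolve the shared context for an event by following its reference."""
        record = self.records[event.processing_record_id]
        return (f"{event.event_type} on {event.object_id} "
                f"during {record.purpose} by {record.agent}")


archive = Archive()
archive.add_record(
    ProcessingRecord("pr-1", "initial ingest", "ingest-pipeline", "2009-10-07T12:00:00Z")
)
archive.add_event(Event("Check", "file-1", "pr-1"))
archive.add_event(Event("Characterize", "file-1", "pr-1"))
```

With roughly a billion events, storing the pass-level details once per pass rather than once per event is one plausible way to serve the "eliminate redundant information" goal.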

Not a real event! Example XML serialization showing all possible child elements to illustrate the information model

Event Types
Check: Virus, Fixity, …
Characterize: File, …
Generate: Desc. MD, Tech. MD, Fixity, …
Edit: Desc. MD, …
Set: Status, Format, Preservation Level, …
Ingest: into Archive
Add, Create, Remove File
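
The verb-plus-subject pattern above lends itself to a controlled vocabulary. The sketch below is one possible encoding, not the actual PMD 2.0 one: it lists only the combinations named on the slide, reads "Add, Create, Remove File" as three verbs applied to File (an assumption), and uses a hypothetical helper `event_label` to produce consistently formatted labels.

```python
# Verb -> subjects named on the slide. The slide's lists end in "…", so these
# sets are known to be incomplete; nothing beyond the slide is filled in.
EVENT_TYPES = {
    "Check": {"Virus", "Fixity"},
    "Characterize": {"File"},
    "Generate": {"Desc. MD", "Tech. MD", "Fixity"},
    "Edit": {"Desc. MD"},
    "Set": {"Status", "Format", "Preservation Level"},
    "Ingest": {"into Archive"},
    # Assumption: "Add, Create, Remove File" means each verb applies to File.
    "Add": {"File"},
    "Create": {"File"},
    "Remove": {"File"},
}


def event_label(verb, subject):
    """Build a consistently formatted label, rejecting unknown combinations."""
    if verb not in EVENT_TYPES:
        raise ValueError(f"unknown event verb: {verb}")
    if subject not in EVENT_TYPES[verb]:
        raise ValueError(f"{verb} is not defined for subject: {subject}")
    return f"{verb} {subject}"
```

For example, `event_label("Check", "Fixity")` returns `"Check Fixity"`, while `event_label("Edit", "Virus")` raises `ValueError`. Generating labels from one table is a simple way to keep capitalization and verb forms consistent across system-generated events.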

Mapping PMD 2.0 to PREMIS

Observations
Large-scale automated events feel very different from human events
ITHAKA archive will quadruple in 2010
–Likely 3-5 billion events...
Every bit of metadata has to be justified by need
Events have proved their value
–An entire talk on that subject alone
Nothing is easy in quantities of billions
We still have to work on full lifecycle events
THIS IS STILL A WORK IN PROGRESS!