HathiTrust: A Shared Digital Repository. Digital Repositories for Preservation and Access. Digital Directions 2013. Jeremy York, July 22, 2013.

Presentation transcript:

HathiTrust: A Shared Digital Repository
  Digital Repositories for Preservation and Access
  Digital Directions 2013
  Jeremy York, July 22, 2013
  Unless otherwise noted, these slides and their contents are licensed under a Creative Commons Attribution Unported License.

Digital repositories
  Primary mission: to preserve content
  Perform actions to this end

Reasons to preserve content
  For access
  Guard against threats to content
    – Digitization is an accepted method of preservation reformatting
    – Digital content deteriorates, is fragile

Reasons to provide access
  Meet needs of designated community
  Check on integrity of content
  Content that is accessible is more likely to be valued and preserved in the future

Reasons access might not be offered
  Copyright
  Privacy
  Licensing
  Needs of user community
    – Content available elsewhere
  Technical limitations
    – Networking and storage requirements

A number of models
  Full user access to preserved digital objects
  No end-user access to digital objects
  Delayed or triggered user access to digital objects
  Partial access to digital objects

Requirements to preserve content
  OAIS
    – “An OAIS is an Archive, consisting of an organization...of people and systems that has accepted the responsibility to preserve information and make it available for a Designated Community.” [does not imply unrestricted access]

OAIS
  Support information model
    – Define target of preservation (content data and representation information)
    – Define metadata needed to preserve, identify, contextualize information (PDI)
  Fulfill responsibilities
    – Accept information from Producers
    – Obtain control sufficient to preserve
    – Ensure understandable to designated community
    – Ensure preservation
    – Make available to designated community with information supporting authenticity
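The information model on this slide can be pictured as a small data structure. The following is only a sketch, assuming the PDI categories named later in the deck (Reference, Provenance, Context, Fixity, Access Rights); the class names, field names, and the `missing_pdi` helper are hypothetical and are not taken from OAIS or from HathiTrust.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PreservationDescriptionInformation:
    """PDI categories; values are free text in this sketch."""
    reference: str = ""      # identifiers for the content
    provenance: str = ""     # history of custody and changes
    context: str = ""        # relationship to other information
    fixity: str = ""         # e.g. a checksum guarding against alteration
    access_rights: str = ""  # terms under which the content may be used


@dataclass
class InformationPackage:
    """The target of preservation plus the metadata needed to preserve it."""
    content_data: bytes
    representation_information: str  # what is needed to interpret the bytes
    pdi: PreservationDescriptionInformation = field(
        default_factory=PreservationDescriptionInformation
    )

    def missing_pdi(self) -> List[str]:
        """Return the names of PDI categories that are still empty."""
        return [name for name, value in vars(self.pdi).items() if not value]


# Hypothetical package with only two PDI categories filled in.
package = InformationPackage(
    content_data=b"\x00",
    representation_information="ITU G4 TIFF",
    pdi=PreservationDescriptionInformation(
        reference="hypothetical-id", fixity="md5:abc"
    ),
)
print(package.missing_pdi())
```

A check like `missing_pdi` is one way a repository could flag packages that are not yet fully described before accepting them from a Producer.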

Ensure preservation
  Some strategies:
    – Transformation
    – Validation
    – Checks on integrity
    – Replication
    – Choice of formats
    – Migration
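Two of these strategies, checks on integrity and replication, are commonly combined by comparing a stored checksum against every replica. The sketch below assumes SHA-256 and local file paths; the function names are hypothetical and do not describe how HathiTrust implements its checks.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_checksum(path: Path, algorithm: str = "sha256") -> str:
    """Hash a file in 1 MiB chunks so large files need not fit in memory."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path: Path, expected: str, algorithm: str = "sha256") -> bool:
    """Return True when the file still matches the recorded checksum."""
    return file_checksum(path, algorithm) == expected.lower()


def verify_replicas(paths: List[Path], expected: str) -> Dict[str, bool]:
    """Check every replica of one object against the same recorded checksum."""
    return {str(path): verify_fixity(path, expected) for path in paths}
```

A replica that fails the check can then be repaired from one that passes, which is the practical reason for pairing the two strategies.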

TRAC
  Starts with “a mission to provide reliable, long-term access to managed digital resources to its designated community, now and into the future”
  Encompasses
    – Organizational Infrastructure
    – Digital Object Management
    – Technical Infrastructure

TRAC (2)
  Borrows vocabulary from OAIS
  Adapts ideas for applying criteria from nestor and Digital Curation Centre
    – Documentation (evidence)
    – Transparency
    – Adequacy
    – Measurability

Preserve Content
  [Diagram of OAIS and TRAC concepts: Transparency, Documentation, Adequacy, Measurability, Provenance, Context, Reference, Fixity, Access Rights, Designated Community, Mission, Organizational Infrastructure, Digital Object Management, Technical Infrastructure, Representation Information, Content Data, Preservation Actions, Authenticity, Reliability, Integrity]

Where does access come in?
  Some level of access is necessary
    – Management, integrity
  What is preserved may not be what is most useful to the end user
  Implications across the repository

Content formats
  Can the content you are preserving be delivered over the Web?
    – Will you be storing derivative files?
    – Is some kind of transformation needed?
    – Do the files offer consistent functionality?
  Implications for scale of repository, access systems, changes to services
  In HathiTrust:
    – Limited to 3 formats, largely uniform in technical characteristics
        ITU G4 TIFF
        JPEG2000
        Unicode (with and without coordinates)

Storage of information about content
  Is information about object adequately available for both preservation and access?
    – Structural information
    – Preservation information with implications for interface
  HathiTrust uses METS as a wrapper
    – Available for preservation and access
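To illustrate how one METS wrapper can serve both purposes, the sketch below reads a file inventory with checksums (preservation) and a page sequence (access) from the same document. The sample METS is invented for illustration and is not an actual HathiTrust METS file; only the METS and XLink namespaces and the standard element names (`fileSec`, `fileGrp`, `file`, `FLocat`, `structMap`, `div`, `fptr`) are real, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Invented sample document, not taken from the repository.
SAMPLE_METS = """
<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="image">
      <mets:file ID="IMG00000001" MIMETYPE="image/tiff"
                 CHECKSUM="0123456789abcdef0123456789abcdef" CHECKSUMTYPE="MD5">
        <mets:FLocat LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"
                     xlink:href="00000001.tif"/>
      </mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="ocr">
      <mets:file ID="OCR00000001" MIMETYPE="text/plain"
                 CHECKSUM="fedcba9876543210fedcba9876543210" CHECKSUMTYPE="MD5">
        <mets:FLocat LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"
                     xlink:href="00000001.txt"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap TYPE="physical">
    <mets:div TYPE="volume">
      <mets:div TYPE="page" ORDER="1">
        <mets:fptr FILEID="IMG00000001"/>
        <mets:fptr FILEID="OCR00000001"/>
      </mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>
"""


def list_files(root):
    """Preservation view: every file with its checksum and location."""
    files = []
    for group in root.findall("mets:fileSec/mets:fileGrp", NS):
        for entry in group.findall("mets:file", NS):
            locator = entry.find("mets:FLocat", NS)
            files.append({
                "use": group.get("USE"),
                "id": entry.get("ID"),
                "mimetype": entry.get("MIMETYPE"),
                "checksum": entry.get("CHECKSUM"),
                "href": locator.get(XLINK_HREF) if locator is not None else None,
            })
    return files


def page_order(root):
    """Access view: page sequence with the file IDs shown for each page."""
    pages = root.findall(".//mets:div[@TYPE='page']", NS)
    return [
        (int(page.get("ORDER")),
         [pointer.get("FILEID") for pointer in page.findall("mets:fptr", NS)])
        for page in pages
    ]


root = ET.fromstring(SAMPLE_METS)
print(list_files(root))
print(page_order(root))
```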

Content Package
  [Diagram labels: images, text, Source METS, HT METS, Zip]

Architecture
  [Diagram labels: images, text, Source METS, HT METS, bib data]
  Path: ../uc1/pairtree_root/b3/54/34/86/b… holding b… zip and b… mets.xml
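The path on this slide follows the Pairtree convention, in which an identifier is split into two-character directory names beneath `pairtree_root`. The sketch below is a simplified version under stated assumptions: it applies only Pairtree's single-character substitutions (`/` to `=`, `:` to `+`, `.` to `,`) and omits the specification's hex-escaping step. The namespace `ns1`, the identifier `x1234567`, and the `.zip` and `.mets.xml` file names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Tuple

# Single-character substitutions from the Pairtree convention.
# The hex-escaping of other special characters is omitted in this sketch.
_CHAR_MAP = str.maketrans({"/": "=", ":": "+", ".": ","})


def pairtree_encode(identifier: str) -> str:
    """Make an identifier safe to use as a directory name (simplified)."""
    return identifier.translate(_CHAR_MAP)


def pairtree_path(namespace: str, identifier: str, root: str = "..") -> PurePosixPath:
    """Map an identifier to its object directory under pairtree_root."""
    encoded = pairtree_encode(identifier)
    # Split into two-character segments; the last one may be one character.
    pairs = [encoded[i:i + 2] for i in range(0, len(encoded), 2)]
    return PurePosixPath(root, namespace, "pairtree_root", *pairs, encoded)


def object_files(namespace: str, identifier: str,
                 root: str = "..") -> Tuple[PurePosixPath, PurePosixPath]:
    """Assumed layout: one zip and one METS file inside the object directory."""
    directory = pairtree_path(namespace, identifier, root)
    encoded = directory.name
    return directory / f"{encoded}.zip", directory / f"{encoded}.mets.xml"


print(pairtree_path("ns1", "x1234567"))
```

Because the path is computed from the identifier alone, no lookup table is needed to locate an object on disk.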

Storage
  Does the storage system support needs for ingest and access?
  In HathiTrust:
    – Need to have fast access to repository systems to support services

Security
  Data integrity
    – Checksum validation, digital object provenance
  Physical security
    – Biometric door systems, locked racks
  Network security
    – Firewalling, vulnerability scanning
  Application security
    – Developer best practices, input validation
  Access control…

Differential access to content
  Rights database
    – Ensures appropriate access
  Holdings database
    – Facilitates lawful uses of materials
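One way to picture how a rights database and a holdings database could work together is a single access decision that consults both. Everything in the sketch below is hypothetical: the rights labels, the sample records, and the policy itself are illustrative assumptions, not HathiTrust's actual rules.

```python
from enum import Enum
from typing import Dict, Optional, Set


class Access(Enum):
    FULL_VIEW = "full view"
    SEARCH_ONLY = "search only"


# Hypothetical rights database: item identifier -> rights status.
RIGHTS: Dict[str, str] = {
    "item-1": "public_domain",
    "item-2": "in_copyright",
}

# Hypothetical holdings database: item identifier -> institutions
# that hold a print copy of the item.
HOLDINGS: Dict[str, Set[str]] = {
    "item-2": {"library-a"},
}


def decide_access(
    item_id: str,
    institution: Optional[str] = None,
    lawful_use: bool = False,
    rights: Dict[str, str] = RIGHTS,
    holdings: Dict[str, Set[str]] = HOLDINGS,
) -> Access:
    """Illustrative policy only; unknown items fall back to search only."""
    status = rights.get(item_id)
    if status == "public_domain":
        return Access.FULL_VIEW
    if status == "in_copyright":
        held = institution is not None and institution in holdings.get(item_id, set())
        # Assumed rule: a qualifying lawful use by a holding institution.
        if held and lawful_use:
            return Access.FULL_VIEW
    return Access.SEARCH_ONLY


print(decide_access("item-1"))
print(decide_access("item-2", "library-a", lawful_use=True))
```

Defaulting to the most restrictive level when an item has no rights record keeps a missing entry from widening access by accident.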

Authentication/Authorization
  Mechanisms to enable differential access, ensure security and appropriate use

User services
  Bibliographic and full-text search indexes
  Collection-building capabilities
  User interfaces

APIs and Datasets
  Data API
  Bibliographic API
  OAI
  “Hathifiles”
  Datasets
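Assuming "OAI" here refers to OAI-PMH, a harvesting request is an HTTP GET with a `verb` parameter plus verb-specific arguments. The sketch below only builds such a URL and sends nothing; the base URL is a placeholder rather than a real endpoint, and `oai_request_url` is a hypothetical helper.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

# The six verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def oai_request_url(base_url: str, verb: str,
                    arguments: Optional[Dict[str, str]] = None) -> str:
    """Build an OAI-PMH request URL; no network request is made."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    params = {"verb": verb}
    params.update(arguments or {})
    return f"{base_url}?{urlencode(params)}"


# Placeholder base URL; the arguments are passed as a dict because
# "from" is a reserved word in Python and cannot be a keyword argument.
print(oai_request_url(
    "https://repository.example.org/oai",
    "ListRecords",
    {"metadataPrefix": "oai_dc", "from": "2013-07-01"},
))
```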

More
  Quality
  User Support
  Correction

Provide Access
  [Diagram of access concepts: Content Package, Content Formats, Architecture, Storage, Authentication, Security, Authorization, Differential Access, Services / User Interfaces, Lawful Uses, APIs and Datasets, Copyright/Agreements, User Support, Indexes, Correction, Information Quality]

Preservation and Access
  [Diagram, access side: Content Package, Content Formats, Architecture, Storage, Authentication, Security, Authorization, Differential Access, Services / User Interfaces, Lawful Uses, APIs and Datasets, Copyright/Agreements, User Support, Indexes, Correction, Information Quality]
  [Diagram, preservation side (OAIS, TRAC): Transparency, Documentation, Adequacy, Measurability, Provenance, Context, Reference, Fixity, Access Rights, Designated Community, Mission, Organizational Infrastructure, Digital Object Management, Technical Infrastructure, Representation Information, Content Data, Preservation Actions, Authenticity, Reliability, Integrity]

Thank you!

How to find out more
  About:
  Twitter:
  Facebook:
  Monthly newsletter:
    – RSS
  Contact us:
  Blogs:
    – Large-scale Search
    – Perspectives from HathiTrust