Archiving the Evolving Scholarly Record: A Perspective
Herbert Van de Sompel, Los Alamos National Laboratory
OCLC ESR, Evanston, IL, March 23 2015

Presentation transcript:

Archiving the Evolving Scholarly Record: A Perspective
Herbert Van de Sompel, Los Alamos National Laboratory
Acknowledgments: Andrew ANDS

In This Talk
1. Functions of scholarly communication
2. Characterizing the future
3. Archiving the future

Functions of Scholarly Communication
Registration: Allows claims of precedence for a scholarly finding
Certification: Establishes validity of the claim
Awareness: Allows actors in the system to remain aware of new claims
Archiving: Preserves the scholarly record over time
Roosendaal, H, Geurts, C. (1997) Forces and functions in scientific communication

System of Journals, Paper Version
Registration: Manuscript submission
Certification: Peer review
Awareness: Alerts, library shelf surfing
Archiving: Journals in library stacks

System of Journals, Digital Version
Registration: Manuscript submission
Certification: Peer review
Awareness: Various web discovery services
Archiving: Special purpose archives (e.g. Portico), publishers

The Future – Core Observations
The research process, not just its outcome, is becoming visible … on the web
Massive extension of the scholarly record with an enormous variety of novel objects
The objects are heterogeneous, dynamic, compound, inter-related and distributed across the web
The objects are often hosted on common web platforms that are not dedicated to scholarship

Characterizing the Future – Scholarly Communication

Characterizing the Future – Communicated Objects

The Future – Core Observations
The research process, not just its outcome, is becoming visible … on the web
Massive extension of the scholarly record with an enormous variety of novel objects
The objects are heterogeneous, dynamic, compound, inter-related and distributed across the web
The objects are often hosted on common web platforms that are not dedicated to scholarship
The capture/archival paradigm must take these characteristics into account

Considerations about Archiving
On the right track?
Capturing paradigms
Pockets of persistence
Recording versus Archiving
A perspective on scholarly infrastructure

Web-Based Journal System – Links to Articles
Special-purpose archival solutions for articles
Rosenthal finds that what is archived is too few, too healthy, too easy
Attempts with the Keepers Registry to map out what is archived
Based on [ISSN, volume, issue], not on DOI, HTTP URI
David Rosenthal (2013) Patio Perspectives at ANADP II: Preserving the Other Half

Web-Based Journal System – Links to Articles
Peter Burnhill (2014) Ensuring access to digital back copy

Web-Based Journal System – Links to Web at Large Resources
Web archives contain snapshots, the result of incidental archiving
The Hiberlink project finds that for the large majority of these “Web at Large” resources, no temporally appropriate archived versions exist
Memento infrastructure allows auditing what is globally archived based on HTTP URI
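As a rough illustration of auditing by HTTP URI: the Memento protocol (RFC 7089) negotiates in the datetime dimension with an `Accept-Datetime` request header, and an archived copy reports its capture time in a `Memento-Datetime` response header. The sketch below only builds and parses those headers, with no network access. The helper names `accept_datetime_header` and `memento_datetime` are hypothetical and not part of any Memento client library.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def accept_datetime_header(when):
    """Build the Accept-Datetime request header for a TimeGate request.

    `when` must be a timezone-aware datetime; it is converted to UTC and
    rendered in the HTTP-date format that RFC 7089 requires.
    """
    utc_when = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(utc_when, usegmt=True)}


def memento_datetime(response_headers):
    """Return the capture datetime from a Memento-Datetime header, or None.

    `response_headers` is assumed to be a plain dict of header names to
    values, as most HTTP clients can provide.
    """
    value = response_headers.get("Memento-Datetime")
    return parsedate_to_datetime(value) if value else None


if __name__ == "__main__":
    wanted = datetime(2015, 3, 23, 12, 0, tzinfo=timezone.utc)
    print(accept_datetime_header(wanted))
```

A client would send these headers to a TimeGate for the original URI and inspect the response for a `Memento-Datetime` value. Which TimeGate or aggregator to query is left open here.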

Links Abstracted to Top Level Domain Targets
Martin Klein, Herbert Van de Sompel et al. (2014) Scholarly context not found. In: PLOS ONE

Loss of Current Context – Link Rot
Martin Klein, Herbert Van de Sompel et al. (2014) Scholarly context not found. In: PLOS ONE

Loss of Past Context – Archival Status (14 day window)
Martin Klein, Herbert Van de Sompel et al. (2014) Scholarly context not found. In: PLOS ONE
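The 14 day window can be read as a simple test: does any archived snapshot of a referenced resource fall close enough in time to the moment it was referenced? The sketch below assumes a symmetric window around a single reference date. That is an interpretation of the slide title, not necessarily the exact criterion used in the cited study. The function name and parameters are hypothetical.

```python
from datetime import date, timedelta


def has_temporally_appropriate_snapshot(reference_date, snapshot_dates, window_days=14):
    """Return True if any snapshot lies within +/- window_days of reference_date.

    Assumption: the window is symmetric around the date on which the
    resource was referenced (for example, the article publication date).
    """
    window = timedelta(days=window_days)
    return any(abs(snapshot - reference_date) <= window for snapshot in snapshot_dates)


if __name__ == "__main__":
    published = date(2012, 6, 1)
    snapshots = [date(2011, 1, 15), date(2012, 6, 10), date(2014, 3, 2)]
    print(has_temporally_appropriate_snapshot(published, snapshots))
```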

Perspective on “Repository” Capture Paradigm
Atomic object
Finalized object
Removal of context
Perspective on object: file in a file system
Capture request by owner of object
Capture time decided by owner of object

Perspective on “Web” Capture Paradigm
Compound object (context essential)
Constituents of compound object in flux
Perspective on constituents: resources with URIs on the web
Capture request by user of the constituents, owned by self, owned by 3rd parties
Capture time decided by user of the constituents

Creating Pockets of Persistence
How to achieve the ability to persistently, precisely, and seamlessly revisit the Scholarly Web of the Past and of the Now at some point in the Future?
This challenge exists for the entire web, but some communities actually care about addressing it: scholarly communication, legal publications, journalism, Wikipedia, …

Pro-Active Capture for a Seed Collection
Seed Collection - Starting point for capture is a seed collection of interest to communities that care, e.g.
o Scholarly literature
o Legal documents
o On-Line journalism
o Wikipedia articles
Lifecycle Events – Intervene at critical moments in the lifecycle of items in these collections to pro-actively capture
o Collection items – some solutions in place
o Web resources referenced in collection items

Pro-Active Capture for a Seed Collection
Request by agent (human, machine) interacting with A to capture A, B, C, D, E
Request for capture may result in
o In-situ or remote capture
o Creation of snapshot or creation of trace
o Archival URI, capture datetime
Interoperability for on-demand capture
Orchestration of capture process
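One way to picture the outcome of such a capture request is as a small record per captured resource, plus a loop that orchestrates capture of an item and the resources it references. Everything in this sketch is hypothetical: the `CaptureResult` fields, the `capture_all` orchestrator, and the `fake_remote_capture` stand-in (which invents an `archive.example` URI instead of calling a real archive) are illustrative names, not an existing API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class CaptureResult:
    original_uri: str           # URI of the resource as referenced (A, B, C, ...)
    archival_uri: str           # URI of the archived copy or trace
    capture_datetime: datetime  # when the capture was made
    kind: str                   # "snapshot" or "trace"
    location: str               # "in-situ" or "remote"


def capture_all(uris: Iterable[str],
                capture_one: Callable[[str], CaptureResult]) -> List[CaptureResult]:
    """Capture each distinct URI once, preserving the order of first appearance."""
    results = []
    seen = set()
    for uri in uris:
        if uri in seen:
            continue
        seen.add(uri)
        results.append(capture_one(uri))
    return results


def fake_remote_capture(uri: str) -> CaptureResult:
    """Stand-in for a real archive call; fabricates an archival URI for illustration."""
    now = datetime.now(timezone.utc)
    archival_uri = "https://archive.example/" + now.strftime("%Y%m%d%H%M%S") + "/" + uri
    return CaptureResult(uri, archival_uri, now, "snapshot", "remote")


if __name__ == "__main__":
    for result in capture_all(["http://example.org/A", "http://example.org/B"],
                              fake_remote_capture):
        print(result.original_uri, "->", result.archival_uri)
```

Passing the capture function as a parameter keeps the orchestration independent of any particular archive, which is one reading of the interoperability point on the slide.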

Pro-Active Capture for Seed Collection
What those crucial lifecycle events are may depend on the seed collection type
Scholarly Literature

Scholarly Literature: Experimental Zotero Extension
Richard Wincewicz (2014) Prototype Hiberlink plugin for Zotero

Scholarly Literature: Experimental HiberActive Service
Martin Klein et al. (2014) HiberActive: Pro-Active Archiving of web references. In: Open Repositories

Web Platforms for Scholarship
Increasingly, common web platforms are used for scholarship
o GitHub, Wikis, WordPress, etc.
Many of these platforms have desirable characteristics
o Versioning
o Time stamping
o Social embedding
But, these platforms record rather than archive

Recording is not Archiving
“GitHub reserves the right at any time and from time to time to modify or discontinue, temporarily or permanently, the Service (or any part thereof) with or without notice.”
“GitHub does not warrant that (i) the service will meet your specific requirements, (ii) the service will be uninterrupted, timely, secure, or error-free, (iii) the results that may be obtained from the use of the service will be accurate or reliable, (iv) the quality of any products, services, information, or other material purchased or obtained by you through the service will meet your expectations, and (v) any errors in the Service will be corrected.”
GitHub Terms of Service

Recording versus Archiving
Recording                 | Archiving
Short-term                | Longer-term
No guarantees provided    | Attempt to provide guarantees
Write many/read many      | Write once/read many
Scholarly process         | Scholarly record

Infrastructure Considerations
Various incentives to move objects from Private to Recording: share with self, team, comply with funder requirements
Objects in Recording are network accessible and in global (HTTP) namespace
Within reach of web-scale processes aimed at selectively moving them from Recording to Archiving
Core aspects of these processes include
o Ability to snapshot the state of interlinked objects at specific moments in their lifecycle
o Transfer of snapshots from Recording platforms to appropriate, distributed Archive platforms (interoperability)
o Decisions regarding which objects should be captured

Capture Considerations
What are the criteria involved in deciding (which states of) which objects get captured/archived?
What triggers transition from Recording to Archiving?
o On-demand in lifecycle, social status of the object, reference made to object, deliberate randomness for serendipity, …
What to capture/archive?
o Snapshot of object or trace of object (metadata, provenance, …)?
What is the Scholarly Record that requires archiving?
o Outcome? Process and Outcome?

In This Talk
1. Functions of scholarly communication
2. Pointers to the future
3. Characterizing the future
4. Archiving the future

Registration - GitHub

Registration - Neurolex

Registration – Research Objects

Registration - Observations
Registration of a wide variety of objects: dynamic, compound, inter-related, distributed across the web
Decoupling registration from certification
Time stamping, versioning

Certification – The Open Journal

Certification – SlideShare

Certification - Observations
Certification decoupled from registration
Certification of various types of objects
Social interactions validating
Machines validating

Awareness – Twitter

Awareness – eLabNoteBook RSS Feeds

Awareness - Observations
Awareness for various types of objects including objects involved in the research process
Real time awareness
Awareness through social media

Archiving – DANS Easy

Archiving – Australian Antarctic Data Centre

Archiving – perma.cc

Archiving - Observations
Archiving/Archives for various types of objects
Distributed archives
Archival consortia
Audit for trustworthiness