Institutional digital repositories: What role do they have in curation? Steve Hitchcock, JISC KeepIt Project ECS, University of Southampton ICE Forum,




Presentation transcript:

Institutional digital repositories: What role do they have in curation? Steve Hitchcock, JISC KeepIt Project ECS, University of Southampton ICE Forum, London, 29 June 2011

How much digital data?
9.57ZB of data processed by 27M computers in 2008
1.2ZB of data in digital universe by year end 2010
196.5TB/year Twitter
41391TB data generated by 6 MIT case studies
TB data generated by 1 MIT physics case study
3.5TB documents in 298 European repositories
2000TB Internet Archive Wayback Machine
394TB Hathi Trust, 8.793M volumes
74TB LoC, 15.3 million digital items online
Mega (MB), Giga (GB), Tera (TB), Peta (PB), Exa (EB), Zetta (ZB), Yotta (YB)

Data generation layer - worldwide
Moving data, data consumed: 27M computers processed 9.57ZB in 2008; Americans consumed 3.6ZB in 2008. Bohn, Short, How Much Information? 2010 Report on Enterprise Server Information
Static data, original sources: est. 1.2ZB of data in digital universe by year end 2010. IDC/EMC (2010)
User-generated data: Twitter 35MB/s, 155M tweets/day (ReadWriteWeb, May 25, 2011) = 196.5TB/year

The Rapid Growth in Unstructured Data, via

Repository layer
DRIVER search (1 June 2011): documents in 298 repositories from 38 countries. Est. 1MB/doc = 3.5TB
Weibel (blog), March 2009: Are data repositories new IRs? institutional-repositories.html
Madnick, Smith, How much Info? July 2009 UCSD Webinar: MIT, 6 case studies, 16 faculty workers. Total data generated 41391TB (Physics TB). 5-10x more data than 5 years ago, expect similar growth rates in future
Chronopolis: data grid for replication, multiple copies of valued data collections; cf. LOCKSS (Lots Of Copies Keep Stuff Safe)

Archive layer
Internet Archive Wayback Machine contains c.2000TB, currently growing at a rate of 20TB/month
Hathi Trust (beginning of June 2011, 8.793M volumes), 394TB. 264/unlocking_hathitrust_inside_the_librarians.html.csp
Library of Congress: 15.3 million digital items online, 74TB; nearly 142M items in the Library's physical collections. Matt Raymond, February 11, 2009
by LoC (start 2011): 147M items: 33M books + other print, 3M recordings, 12.5M photos, 5.4M maps, 6M sheet music, 64.5M manuscripts

[Figure: Visualising data ratios (larger scale). Layers: data generation, repository layer, archival layer. Labelled series: moving data (Bohn, Short, 2008); static data (IDC 2010).]

[Figure: Visualising data ratios (smaller scale). Data generation: Twitter/y; moving data (Bohn, Short, 2008), × 10^7; static data (IDC 2010), × 10^7. Repository layer: European IRs (DRIVER); MIT data case studies (2009); MIT physics case study (2009). Archival layer: Internet Archive Wayback Machine; Hathi Trust (June 2011); LoC digital items (2009).]
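The ratios behind the two charts can be recomputed from the figures quoted on the layer slides. The following is a minimal sketch rather than the method used to draw the original charts: it assumes decimal (SI) prefixes, and the names `TB_PER_UNIT`, `FIGURES`, `to_tb` and `ratios_to` are illustrative, not taken from the presentation. The MIT physics case study is left out because its figure is not given.

```python
# Decimal (SI) prefixes expressed in terabytes (an assumption; the slides do
# not say whether decimal or binary units were used).
TB_PER_UNIT = {
    "MB": 1e-6, "GB": 1e-3, "TB": 1.0, "PB": 1e3,
    "EB": 1e6, "ZB": 1e9, "YB": 1e12,
}

# Figures as quoted on the data generation, repository and archive slides.
FIGURES = {
    "Moving data (Bohn, Short, 2008)": (9.57, "ZB"),
    "Static data (IDC 2010)": (1.2, "ZB"),
    "Twitter per year": (196.5, "TB"),
    "MIT data case studies (2009)": (41391, "TB"),
    "European IRs (DRIVER)": (3.5, "TB"),
    "Internet Archive Wayback Machine": (2000, "TB"),
    "Hathi Trust (June 2011)": (394, "TB"),
    "LoC digital items (2009)": (74, "TB"),
}


def to_tb(value, unit):
    """Convert a (value, unit) pair to terabytes."""
    return value * TB_PER_UNIT[unit]


def ratios_to(baseline, figures=FIGURES):
    """Return each figure as a multiple of the named baseline figure."""
    base_tb = to_tb(*figures[baseline])
    return {name: to_tb(*fig) / base_tb for name, fig in figures.items()}


if __name__ == "__main__":
    # Print every figure relative to the European repositories estimate.
    ratios = ratios_to("European IRs (DRIVER)")
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
        print(f"{name:35s} {ratio:18,.1f} x")
```

Choosing the European repositories estimate as the baseline makes the gap between layers explicit: the archive figures come out tens to hundreds of times larger, and the data generation figures around a billion times larger.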

Digital repositories diversifying: institution-wide outputs
Science, Teaching, Research, Arts
KeepIt exemplar preservation repositories

Summary of implications of the KeepIt project findings
Digital preservation starts with detailed knowledge and awareness of your own content
The issues raised by preservation are the same as those raised by content management
Data curation is likely to be a natural progression for a preservation-focussed repository
Provenance of data should be a key role for research institutions
Preservation tools are delivering specialist expertise directly to the user
JISC should promote its role in the development of digital preservation tools more loudly
Creating a sense of capability will assist those new to preservation practice
Converged multi-data type repositories are likely to increase complexity for preservation
Preservation should not be prioritized prematurely, especially among relatively new content repositories
Digital institutional repositories will not instantly become preservation repositories, and repository managers are not archivists, but they both have a role in preservation

Digital institutional repositories will not quickly become preservation repositories, and repository managers are not archivists, but they both have a role in preservation
As there are vastly more digital content repositories than 'preservation repositories', if we are to have preservation-ready content repositories then many more need to be allowed to navigate the path towards digital preservation without having all the requirements of specialists imposed on them. Should we view target content repositories as first-stage curators rather than archivists, i.e. as a process that informs and selects for preservation?
argues digital archival programs will be recreated by academies with trusted repository and OSS-that's KeepIt Thu May

Digital preservation starts with detailed knowledge and awareness of your own content
Shorter summary of DP: know what you have and value, assess risk, take action to avoid risk, repeat. Problem: people don't do it Thu Jan
All the needs and requirements of preservation stem from this knowledge, which enables a repository manager, for example, to select appropriate preservation tools and services. In essence, this is the problem that KeepIt set out to help the managers of different types of institutional repository to resolve.

Data curation is likely to be a natural progression for a preservation-focussed repository
The work of NECTAR at the University of Northampton indicates the growing prevalence of the idea that repositories could be used for data curation, even if content (e.g. open access) repositories and data repositories remain separate within institutions to serve different metadata, interoperability and author requirements. If repositories are the new wave of scholarly communication, then data repositories in the cloud could be the next new wave.

Preservation tools are delivering specialist expertise directly to the user
Widely and freely available tools can support a full preservation programme for repositories, from policy-making to costings, technical content management, and risk analysis. Analysis showed that around 70% of these tools had been developed in JISC projects.

Creating a sense of capability will assist those new to preservation practice
Porter: 'create a sense of urgency'. No, create a sense of capability. That's what many JISC DP projects have done #brtf Fri May
At a recent JISC end-of-programme event one keynote speaker questioned the impact of digital preservation on digital repositories. Once again, the situation was presented as urgent. Without reference to the range of tools now available for digital preservation, urgency unnecessarily detracts from creating a sense of capability.

What did the KeepIt exemplars do about preservation?
All see preservation as an ongoing practical commitment, providing it can be managed within the scope of existing work and resources. We can expect to see progress where it fits with repository development and emerging requirements. We cannot expect to see all repositories take the same path towards preservation at the same speed. Progress will depend on type of repository content, but also on other factors including institutional issues, scale and growth of repository content.

Find out more about KeepIt
Web:
Blog: Diary of a Repository Preservation Project
Papers and presentations, Repository:
Presentations, Slideshare:
Wiki: Training resources and bibliography
Twitter:
Final report (June 2011)