HathiTrust: A Big Idea with Bold Plans

Presentation transcript:

HathiTrust: A Big Idea with Bold Plans
Brenda Johnson, Dean of University Libraries
Gary Charbonneau, Systems Librarian
Julie Bobay, Associate Dean for Collection Development and Scholarly Communication
Statewide IT Conference, Indiana University, Sept. 27, 2010

HathiTrust - Outline
- A Big Idea
- Mission and Goals; Partners; Governance
- Content and Use
- Relationship to Google Books and Internet Archive
- Size, characteristics of content
- A few words about technology
- Bold Plans

Importance of A Name
- Hathi (pronounced hah-tee): Hindi word for elephant, an animal highly regarded for its memory, wisdom, and strength
- Trust: A core value of research libraries and one of their greatest assets
- In combination, the words convey the key benefits researchers can expect from a first-of-its-kind shared digital repository
There’s an elephant in the library.

What is HathiTrust?
- Started in 2008 as a partnership among research libraries, HathiTrust is an open web resource that aggregates, preserves, and provides access to the collections of member libraries.
- Its initial purpose was to provide a trusted shared repository for books and journals digitized by and available through Google Books and Internet Archive.

Google Books/Internet Archive
- In 2004, Google began digitizing the books and journals from many major research libraries in the U.S., including, starting in 2008, IU’s
- Some libraries, including the University of California, had similar digitization projects with the Internet Archive
- Books and journals digitized from these projects were deposited in HathiTrust

Current HathiTrust Partners: 29 and Counting
- Columbia University
- Dartmouth College
- University of California system (11 libraries)
- CIC (Committee on Institutional Cooperation) (12 libraries)
  - University of Chicago
  - University of Minnesota
  - University of Illinois
  - Northwestern University
  - Indiana University
  - Ohio State University
  - University of Iowa
  - Pennsylvania State University
  - University of Michigan
  - Purdue University
  - Michigan State University
  - University of Wisconsin, Madison
- New York Public Library
- Princeton University
- University of Virginia
- Yale University

If Google and Internet Archive have these books, why do we need HathiTrust?
HathiTrust’s mission is much broader than simply to replicate Google Books:
Contribute to the common good by collecting, organizing, preserving, communicating, and sharing the record of human knowledge.

Why do we need HathiTrust? (1)
Preservation…For The Long Term
- Better entrusted to research libraries than to a private corporation, even a benevolent one
- Not just preserving bits
- Full preservation program, including active curation, metadata, migration, management plans, etc.
- Seeking TRAC certification (Trustworthy Repositories Audit and Certification)

Why do we need HathiTrust? (2)
Expanded access and discoverability
- Full-text access to pre-1923 books and journals, plus those which have had rights cleared
- Beyond full-text keyword search: enhanced discoverability options

Why do we need HathiTrust? (3)
Focus on scholarly values and needs
- Develop content, access, and functionality that meet the needs of researchers
- Share the expertise and cost of preserving and providing access to the scholarly record among institutions that share this fundamental mission

HathiTrust: Getting Started
- Initial development responsibility: University of Michigan, with mirror site at IUPUI, administered by UITS Enterprise Infrastructure
- Much future development will be distributed among partner institutions under direction of HathiTrust Executive Committee

A Unique Partnership
- HathiTrust is library work at scale; an early example of an “above-campus” service
- A new experiment in collaboration
- Not a separate entity; not a 501(c)(3) like Sakai, Kuali, DuraSpace or many open source software projects
- Instead, a jointly funded, jointly governed, jointly developed partnership
Together, we are HathiTrust.

Sustainability: HathiTrust Governance 2008-2012
- Executive Committee: budget, finances, decision making
- Strategic Advisory Board: guidance on policy and planning
- HathiTrust staff
- Working groups and committees

Current Working Groups
- Discovery Interface
- Collections
- Quality
- Communication
- Usability
- Storage
- Development Environment
- Research Center

HathiTrust Functional Framework
- Governance: Budget, Finances; Decision-making; Policy; Planning
- Enterprise Management: Communication and coordination with partner institutions; Project management
- Repository Administration: Hardware configuration and maintenance; Web and application server configuration and maintenance; Security; Permissions; Logging; Data management (content storage, backup, integrity checks, deletion); Hardware selection and replacement; Content and metadata specifications; Disaster recovery; Processes for ensuring content integrity
- Rights Management: Copyright determination; Copyright review; Copyright information management (database); Rightsholder permissions
- Bibliographic Data Management: Entity description (record-level); Object identification (item-level); Data availability
- Collection Development
  - Digital: Expansion beyond books and journals (born-digital, images and maps, audio); Selection of content (for non-Google volume ingest and pilot projects)
  - Print: Cloud Library (effect of digital on print)
- e-Commerce: Print on Demand
- Content Ingest: Transformation; Validation
- Content Access: PageTurner; Collection Builder; Large-scale Search; Bibliographic Catalog; Research Center; APIs
- Quality Assurance: Quality Review; Content Certification
- User Services: Usability; User support (helpdesk)
- Outreach: Project website; Monthly newsletter; Papers and presentations; Communication with potential partners; Surveys, general inquiries
- Repository evaluation and audit (e.g., DRAMBORA, TRAC)
- Legal: Risk management (use of materials); Partner agreements; Advocacy
- Financial contributions of partners

Next steps in governance
- 5-year agreements, reviewed in the third year of every term
- First Constitutional Convention will be in 2012
- Partners will determine governance structures and partnership models, effective 2013

Focus On Users: Preservation…with access
Benefits to IU researchers and their colleagues around the world:
- Ensure long-term preservation and access
- Increase discoverability
- Create scholarly tools
- Expand content beyond Google and Internet Archive

HathiTrust – constantly changing
- Rapid growth and development; fluid environment
- The next few slides describe HathiTrust as it currently stands
- A discussion of future plans follows

HathiTrust - Content
- The vast majority of what is currently in HathiTrust consists of files received from Google, from volumes Google digitized for Google Book Search.
- Almost all of the remainder consists of files received from Internet Archive. Much of the content from the University of California comes by way of Internet Archive.

HathiTrust Content (2)
Since not all of Google’s “library partners” are members of HathiTrust, and none of Google’s publisher partners are, HathiTrust is still (mostly) a subset of what is in Google Book Search. However….

HathiTrust Content (3)
- Because of HathiTrust’s copyright clearance project, there are some things available in full text in HathiTrust that are only available in “snippet view” in Google.
- Because of Internet Archive, there are probably some things in HathiTrust that are not available in Google at all.

HathiTrust - focus on collections
HathiTrust is about collections, not simply Google digitization. For example:
- access for persons with print disabilities
- opening access for public domain volumes
- collection building tool
- high-quality bibliographic data necessary for scholarly work

Content Growth

Content Distribution

Language Distribution (1)

Language Distribution (2)

Dates

Originating Institution

Content Over Time

HathiTrust DataGrid Using Isilon Clustered Storage System
- Similar principles to a datagrid using WAFS (OneFS)
- Wide Area File System (2.3 PB per file system)
- Automated data replication among nodes
- Currently two nodes
  - Ann Arbor: University of Michigan
  - Indianapolis: Indiana University NOC
- Connected via I-Light and Michigan Lambda Rail

HathiTrust Grid
- Nodes: Indianapolis, Ann Arbor
- Isilon OneFS currently supports up to 2.3 PB between two nodes

More on HathiTrust Technology
http://www.hathitrust.org/technology

A Use Case
- An IUB scholar needed quick access to a definitive 52-volume set of Voltaire’s work published in the late 1800s; deadline approaching
- The set had been transferred to the Auxiliary Library Facility
- Available in HathiTrust and Google Books
- Google Books not usable for this scholarly purpose
- Able to do the work much more efficiently and quickly in HathiTrust

HathiTrust’s Bold Plans
- We believe the HathiTrust of tomorrow will look very different from the HathiTrust of today
- Google and Internet Archive digitized volumes are just the beginning
- The sky’s the limit (or, more accurately, the combined will and resources of the partnership are the limit)

Vision for the future: More Content
- Current and backlist scholarly monographs
- Born-digital materials
- Some locally digitized collections
- Some non-book/non-journal resources
- …anything that is appropriate for a research library collection AND IS A SHARED PRIORITY FOR PARTNERS

Vision for the future: More Content (2)
More full-text:
- Google Book Settlement, if approved:
  - could receive all Google-digitized files to preserve
  - could make much more full-text available
- Rights-clearing project: open access to public domain materials

Vision for the Future: More Functionality
- Research tools
  - Computational research
  - Advanced collection builders
  - Advanced discovery
- Expanded quality processes
- Rigorous preservation guarantees
- Defining paths for fair uses
- Tools for shared print collection management

Vision for the Future: Enhanced Discoverability
- Not just keyword searching of full-text
- Highly functional bibliographic access
  - HathiTrust catalog
  - Integration into other discovery tools: IUCAT, WorldCat, Discovery Services

HathiTrust and local digital library initiatives
- HathiTrust is a solution for large-scale, shared high-priority needs of partners; currently optimized for digitized monographs and journals
- Partners will identify priorities for content and functionality development
- HathiTrust will not supplant all institutionally based digital library initiatives
- Local digital library collections and services will still be needed

How Can HathiTrust Make a Difference?
Future not yet known precisely, but…
- For the first time in history, HathiTrust has:
  - defined a large-scale partnership to achieve a large-scale goal
  - built the first version of a very large, high-quality shared repository
- Building blocks for ensuring that research collections, print and digital:
  - are preserved, curated, highly discoverable and accessible
  - retain their research value in a digital platform

Some lessons learned so far
- HathiTrust can serve as a shared repository for mass-digitized library collections
- HathiTrust can provide organizational structure for other collaborations
  - Shared print collection management
  - Bibliographic integration
- The research library community is able to collaborate deeply to attain shared goals

HathiTrust Mission - redux
Contribute to the common good by collecting, organizing, preserving, communicating, and sharing the record of human knowledge.

Credits
Our thanks to colleagues who generously granted us permission to use their slides for this presentation:
- John Wilkin, HathiTrust Executive Director
- Jeremy York, HathiTrust Project Librarian
- Heather Christenson, Mass Digitization Project Manager, California Digital Library
Also, many of the ideas for this presentation are based on:
Courant, Paul N. and John Wilkin. “Building ‘Above Campus’ Library Services.” Educause Review, July/August 2010, 74-75.