HathiTrust Digital Library Interface and Services

HathiTrust Unless otherwise noted, these slides and their contents are licensed under a Creative Commons Attribution Unported License.

HathiTrust Digital Library Interface and Services
Angelina Zaytsev, Collection Services Librarian
azaytsev@hathitrust.org

Agenda
- Collection overview
- Interface overview
- Other services
- If time permits:
  - Programs
  - Working groups & committees
  - Governance & partnership

Collection Overview

HathiTrust Collections: Oct 2016
- 14.7 million total items
- 7.4 million book titles
- 405,000 serial titles
- 767,000 US federal government documents
- 5.7 million items open (public domain and CC licenses)
These volumes have been contributed by over 40 institutions, most of them located in North America.
6 April 2016

For more information... You can click through to see the results for all of these categories! https://www.hathitrust.org/statistics_visualizations

What kind of content formats?
- Scanned from book-like materials
  - Image formats: TIFFs and JPEG2000s
  - Plain-text OCR
  - PDFs are generated on the fly and delivered to users (NOT stored in the repository)
- Some:
  - Born-digital PDFs (and maybe EPUBs soon!)
  - Photos

Where does content come from? Digitization source
(Type, followed by characteristics)

Google
- 94.8% of the collection
- Download restrictions
- Primarily scanned in black and white, with some color pages
- Large-scale mass digitization, so quality can vary

Internet Archive
- 3.7% of the collection
- No download restrictions
- Scanned in full color (as a result, file sizes are 2.5 times larger than Google content)

Locally digitized & vendor services
- 1.48% of the collection
- Various restrictions may apply
- Typically small-scale, "boutique" digitization, so quality is high (with some exceptions)

Where does content come from? Top 10 contributing libraries
(Institution: volumes)
- University of Michigan: 4,714,231
- University of California: 3,835,563
- Harvard University: 841,969
- Cornell University: 585,190
- University of Wisconsin - Madison: 561,945
- Indiana University: 530,763
- University of Illinois at Urbana-Champaign: 528,545
- University of Minnesota: 503,057
- The University of Texas: 460,139
- Pennsylvania State University: 390,345

Special Collections
- Universidad Complutense de Madrid: Latin, Spanish and French documents from the 1500s to the 1800s
- Keio University: 92,000+ Japanese and some Chinese language materials
- Islamic Manuscripts from University of Michigan: 8th-20th century CE manuscripts, 1,795 titles in Arabic, Persian, and Ottoman Turkish; collaborative cataloging project
- Benson Latin American Collection, University of Texas at Austin: 460,000 volumes related to Latin American culture and history
- Minnesota Digital Library & Minnesota Historical Society: 60,000 photos related to Minnesota history
- US Fed Gov Docs: 766,000+ documents and growing!
Notes: UCM is one of the oldest universities in the world, around since 1293. Keio is the oldest university in Japan.

Access is determined by several factors:
- Copyright status of the item, derived from:
  - Bibliographic metadata (including for US fed gov docs)
  - Manual copyright review
  - Permissions agreements
- Geographic location of the user
  - In the United States vs. outside the United States
- Member affiliation
  - Yes/no?
- Digitization source and/or contributing institution
  - Any restrictions imposed by these entities?

Access by type of work
Columns: Type of work; Search (bibliographic and full-text); Text and Data Mining; Viewable*; Full-PDF download; Print disabilities*; Preservation uses (Section 108)*

Public domain worldwide: Worldwide. Partners only if 3rd-party restrictions. If not, worldwide. N/A

Public domain (US): Non-US works published between 1873 and 1923. Available within the United States. When accessed from within the United States. Partners in the US if 3rd-party restrictions. If not, anyone in the US. Partners in the US; partners worldwide where laws permit

Works that rights holders have opened access to in HathiTrust: Worldwide unless license forbids it. Worldwide (if digitized by Google, full-PDF only available if opened with CC license)

Works that are in-copyright or of undetermined status: Forthcoming. Not available

* Note: Access to in-copyright works is subject to conditions listed in HathiTrust's policies on Access and Use.
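The viewing rules above can be sketched as a small decision function. This is only one reading of the slide's table, which lost its cell boundaries in transcription, and it is not HathiTrust's actual access logic. The names `Rights`, `Request`, and `is_viewable` are hypothetical, and the sketch ignores member affiliation, third-party download restrictions, print-disability access, and Section 108 uses.

```python
from dataclasses import dataclass
from enum import Enum


class Rights(Enum):
    """Hypothetical labels for the four 'type of work' rows on the slide."""
    PD_WORLD = "public domain worldwide"
    PD_US = "public domain (US)"
    OPENED = "opened by rights holder"
    IN_COPYRIGHT = "in copyright or undetermined"


@dataclass
class Request:
    rights: Rights
    user_in_us: bool
    license_forbids: bool = False  # only meaningful for Rights.OPENED


def is_viewable(req: Request) -> bool:
    """Return True if full view is allowed under this simplified reading."""
    if req.rights is Rights.PD_WORLD:
        return True  # viewable worldwide
    if req.rights is Rights.PD_US:
        return req.user_in_us  # viewable only from within the United States
    if req.rights is Rights.OPENED:
        return not req.license_forbids  # worldwide unless the license forbids it
    return False  # in-copyright or undetermined: not available for viewing
```

For example, `is_viewable(Request(Rights.PD_US, user_in_us=False))` evaluates to `False`, matching the "available within the United States" rule for works that are public domain only in the US.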

Interface Overview

- Full-text search
- Catalog search
- Pageturner
- Collection Builder

Other services
- Get bib records in bulk: https://www.hathitrust.org/data
- Get datasets: https://www.hathitrust.org/datasets
- Get high resolution image files: https://www.hathitrust.org/data_api
- Get bib records for known identifiers: https://www.hathitrust.org/bib_api
- Get some data about all HT content: https://www.hathitrust.org/hathifiles
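As an illustration of the "bib records for known identifiers" service, here is a minimal sketch that only builds a request URL and makes no network call. The URL pattern, the `brief`/`full` detail levels, and the list of identifier types are assumptions that are not stated in these slides; check them against https://www.hathitrust.org/bib_api before relying on them. The function name `bib_api_url` is hypothetical.

```python
from urllib.parse import quote

# Assumed base URL and identifier types; verify against the Bib API documentation.
BIB_API_BASE = "https://catalog.hathitrust.org/api/volumes"
ID_TYPES = {"oclc", "lccn", "issn", "isbn", "htid", "recordnumber"}


def bib_api_url(id_type: str, value: str, detail: str = "brief") -> str:
    """Build a Bib API URL for a single known identifier (hypothetical helper)."""
    if id_type not in ID_TYPES:
        raise ValueError(f"unsupported identifier type: {id_type!r}")
    if detail not in {"brief", "full"}:
        raise ValueError(f"unsupported detail level: {detail!r}")
    # Percent-encode the identifier so characters such as '/' or '$' are safe in a path.
    return f"{BIB_API_BASE}/{detail}/{id_type}/{quote(str(value), safe='')}.json"
```

The resulting string can be passed to any HTTP client; the response is expected to be JSON describing the matching records and items.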

For help
- See https://www.hathitrust.org/help
- Contact feedback@issues.hathitrust.org

Questions?

BUT... HathiTrust is more than just a library!

HathiTrust Research Center
- Goal: to build a secure environment and provide services to support data mining and text analysis
- Portal: https://analytics.hathitrust.org/
- Soon: analysis against the copyrighted content in HT
- Advanced Collaborative Support (ACS): mini-grants where awardees get staff time, not $
- HathiTrust+Bookworm: visualize word trends
- Extracted Features dataset: bits of data about the content

US Federal Documents Program
- Build a Registry of all known US fed gov docs
- Collect a complete corpus of all known US fed gov docs
- https://www.hathitrust.org/usgovdocs

Shared Print Program
- Build a shared print monograph program across the membership in order to reduce the collective costs of maintaining print collections
- Goal: secure retention commitments for all monographs in HathiTrust
- https://www.hathitrust.org/shared_print_program

Copyright Review Management System
- Volunteers from HT member libraries undertake manual copyright review of certain categories of materials
- To date, review has focused on the following categories:
  - Monographs published in Australia, the United Kingdom and Canada from 1876 to 1945
  - Monographs published in the United States from 1923 to 1977
- https://www.hathitrust.org/copyright-review

Members participate in other groups and committees
- User Support Working Group: https://www.hathitrust.org/wg_user-support_charge
- Collections Committee: https://www.hathitrust.org/collections-committee-charge
- Metadata Policy, Strategy, Use and Sharing Advisory Group (MUSAG): https://www.hathitrust.org/wg_musag_charge
- HathiTrust Quality Assurance and Standards Working Group: https://www.hathitrust.org/qaswg_charge