HathiTrust Digital Library Interface and Services

Presentation on theme: "HathiTrust Digital Library Interface and Services"— Presentation transcript:

1 HathiTrust Digital Library Interface and Services
Angelina Zaytsev, Collection Services Librarian

2 Agenda
Collection overview
Interface overview
Other services
If time permits:
  Programs
  Working groups & committees
  Governance & partnership

3 Collection Overview

4 HathiTrust Collections: Oct 2016
14.7 million total items
7.4 million book titles
405,000 serial titles
767,000 US federal government documents
5.7 million items open (public domain & CC licenses)
These volumes have been contributed by over 40 different institutions and come primarily from institutions located in North America.
6 April 2016

5

6

7

8 For more information... You can click through to see the results for all of these categories!

9 What kind of content formats?
Scanned from book-like materials
  Image formats: TIFFs and JPEG2000s
  Plain-text OCR
  PDFs are generated on the fly and delivered to users (NOT stored in the repository)
Some:
  Born-digital PDFs (and maybe EPUBs soon!)
  Photos

10 Where does content come from? - Digitization source
Type | Characteristics
Google | 94.8% of the collection; download restrictions; primarily scanned in black and white with some color pages; large-scale mass digitization = quality can vary
Internet Archive | 3.7% of the collection; no download restrictions; scanned in full color (as a result, file sizes are 2.5 times larger than Google content)
Locally digitized & vendor services | 1.48% of the collection; various restrictions may apply; typically small scale, "boutique" digitization = high quality (with some exceptions)

11 Where does content come from? - Top 10 contributing libraries
Institution | Volumes
University of Michigan | 4,714,231
University of California | 3,835,563
Harvard University | 841,969
Cornell University | 585,190
University of Wisconsin - Madison | 561,945
Indiana University | 530,763
University of Illinois at Urbana-Champaign | 528,545
University of Minnesota | 503,057
The University of Texas | 460,139
Pennsylvania State University | 390,345

12 Special Collections
Universidad Complutense de Madrid: Latin, Spanish and French documents from the s
Keio University: 92,000+ Japanese and some Chinese language materials
Islamic Manuscripts from University of Michigan: 8th-20th century CE mss., 1,795 titles in Arabic, Persian, and Ottoman Turkish languages, collaborative cataloging project
Benson Latin American Collection, University of Texas at Austin: 460,000 vols related to Latin American culture and history
Minnesota Digital Library & Minnesota Historical Society: 60,000 photos related to Minnesota history
US Fed Gov Docs: 766,000+ documents and growing!
UCM is one of the oldest universities in the world: it has been around since 1293.
Keio is the oldest university in Japan.

13 Access is determined by several factors:
Copyright status of the item
  Derived from:
    Bibliographic metadata (inc. for US fed gov docs)
    Manual copyright review
    Permissions agreements
Geographic location of the user
  In the United States vs. outside the United States
Member affiliation
  Yes/no?
Digitization source and/or contributing institution
  Any restrictions imposed by these entities?

14
Type of work | Search (bibliographic and full-text) | Text and Data Mining | Viewable* | Full-PDF download | Print disabilities* | Preservation uses (Section 108)*
Public domain worldwide | Worldwide | Partners only if 3rd-party restrictions. If not, worldwide. | N/A
Public domain (US) – Non-US works published between 1873 and 1923 | Available within the United States | When accessed from within the United States | Partners in the US if 3rd-party restrictions. If not, anyone in the US | Partners in the US; partners worldwide where laws permit
Works that rights holders have opened access to in HathiTrust | Worldwide unless license forbids it | Worldwide (if digitized by Google, full-PDF only available if opened with CC license)
Works that are in-copyright or of undetermined status | Forthcoming | Not available
* Note: Access to in-copyright works is subject to conditions listed in HathiTrust's policies on Access and Use.
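The viewing rules on this slide can be read as a small decision function. The sketch below is only an illustration under loud assumptions: it covers just the "Viewable" column as it is commonly read (public domain worldwide is viewable anywhere, US public domain only from within the United States, in-copyright not viewable), it ignores license terms, third-party restrictions, and the print-disability and Section 108 exceptions, and the names `Rights` and `is_viewable` are hypothetical. It is not HathiTrust's actual access logic.

```python
from enum import Enum


class Rights(Enum):
    """Hypothetical labels for the four rows of the access table."""
    PD_WORLDWIDE = "public domain worldwide"
    PD_US = "public domain (US)"
    OPENED_BY_RIGHTS_HOLDER = "opened by rights holder"
    IN_COPYRIGHT_OR_UNDETERMINED = "in copyright or undetermined"


def is_viewable(rights: Rights, user_in_us: bool) -> bool:
    """Simplified reading of the 'Viewable' column only.

    Assumption: license terms and third-party restrictions are ignored,
    as are the asterisked exceptions on the slide.
    """
    if rights in (Rights.PD_WORLDWIDE, Rights.OPENED_BY_RIGHTS_HOLDER):
        return True
    if rights is Rights.PD_US:
        # US public domain works depend on the user's geographic location.
        return user_in_us
    # In-copyright or undetermined works are searchable but not viewable.
    return False


if __name__ == "__main__":
    for rights in Rights:
        for in_us in (True, False):
            print(f"{rights.value!r:35} in_us={in_us!s:5} -> {is_viewable(rights, in_us)}")
```

Running the script prints one line per combination of rights category and user location, which makes the role of geography for US public domain works easy to see.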

15 Interface Overview

16
Full-text search
Catalog search
Pageturner
Collection Builder

17 Other services
Get bib records in bulk:
Get datasets:
Get high resolution image files:
Get bib records for known identifiers:
Get some data about all HT content:
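For "bib records for known identifiers", a request URL could be assembled roughly as follows. This is a minimal sketch, not taken from the slide: the base URL and the `brief`/`full` path layout are assumptions based on HathiTrust's publicly documented Bibliographic API, the helper name `bib_api_url` is hypothetical, the identifier types are limited to a few simple ones, and the OCLC number in the example is only a placeholder. Check the current API documentation before relying on it.

```python
from urllib.parse import quote

# Assumption: base URL and path layout follow the public Bibliographic API
# documentation; neither appears on the slide.
BIB_API_BASE = "https://catalog.hathitrust.org/api/volumes"

# Assumption: a subset of identifier types, kept to simple numeric-style IDs.
ID_TYPES = {"oclc", "lccn", "isbn", "issn"}


def bib_api_url(id_type: str, identifier: str, detail: str = "brief") -> str:
    """Build a (hypothetical) lookup URL for one known identifier."""
    if detail not in {"brief", "full"}:
        raise ValueError(f"detail must be 'brief' or 'full', got {detail!r}")
    if id_type not in ID_TYPES:
        raise ValueError(f"unsupported identifier type: {id_type!r}")
    # Percent-encode the identifier so stray characters cannot alter the path.
    return f"{BIB_API_BASE}/{detail}/{id_type}/{quote(identifier, safe='')}.json"


if __name__ == "__main__":
    # Placeholder OCLC number, used purely for illustration.
    print(bib_api_url("oclc", "424023"))
```

The function only builds the URL; fetching it (for example with `urllib.request.urlopen`) and parsing the JSON response is left out so the sketch runs without network access.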

18 For help See https://www.hathitrust.org/help
Contact

19 Questions?

20 BUT... HathiTrust is more than just a library!

21 HathiTrust Research Center
Goal: to build a secure environment and provide services to support data mining and text analysis
Portal:
Soon: analysis against the copyrighted content in HT
Advanced Collaborative Support (ACS): mini-grants where awardees get staff time, not money
HathiTrust+Bookworm: visualize word trends
Extracted Features dataset: bits of data about the content

22 US Federal Documents Program
Build a Registry of all known US fed gov docs
Collect a complete corpus of all known US fed gov docs

23 Shared Print Program
Build a shared print monograph program across the membership in order to reduce collective costs of maintaining print collections
Goal: secure retention commitments for all monographs in HathiTrust

24 Copyright Review Management System
Volunteers from HT member libraries undertake manual copyright review of certain categories of materials
To date, has focused on the following categories:
  Monographs published in Australia, the United Kingdom and Canada from
  Monographs published in the United States from

25 Members participate in other groups and committees
User Support Working Group
Collections Committee
Metadata Policy, Strategy, Use and Sharing Advisory Group (MUSAG)
HathiTrust Quality Assurance and Standards Working Group

