TECHNOLOGY SUPPORT FOR ESSSS Progress, Issues, and Challenges Marshall Breeding Director for Innovative Technology and Research Vanderbilt University Library.

Presentation transcript:

TECHNOLOGY SUPPORT FOR ESSSS Progress, Issues, and Challenges Marshall Breeding Director for Innovative Technology and Research Vanderbilt University Library Founder and Publisher, Library Technology Guides ESSSS Digital Archive Workshop February 4, 2012

Turning Pages on Paper to Digital Images  Digitizing in the field involves many compromises compared to what can be done in more controlled settings  Access to archives may be of limited duration  Arbitrary and political  Materials deteriorating rapidly  Practices related to physical preservation tend to be minimal  Must be light, fast, and inexpensive

Achieve best results possible  Maximize quality and consistency  Handheld digital cameras  Rapid advancement in capabilities  Early images done at lower resolutions compared with what is possible today  Fixed camera stands  Consistency in orientation and framing  Organization of Images (folders / image names)

Image Standards  TIFF: Currently regarded as best image format for archiving images  RAW: Native proprietary format of a camera  JPEG: Compressed images for display on the Web  Data lost during compression: non-reversible  VU system creates multiple sizes of JPEG images  JPEG2000  Offers a lossless compression method  Not well supported on the Web
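The slide notes that the VU system creates multiple sizes of JPEG images from each page. The sizing step could be sketched as follows. This is a minimal illustration only: the tier names and pixel limits in `DERIVATIVE_EDGES` are hypothetical and are not taken from the Vanderbilt system, and the actual resampling and JPEG encoding are left out.

```python
# Hypothetical derivative tiers: longest edge in pixels (assumed values).
DERIVATIVE_EDGES = {"thumbnail": 150, "screen": 1024, "large": 2048}


def derivative_size(width, height, max_edge):
    """Return (width, height) scaled so the longer edge is at most max_edge.

    The aspect ratio is preserved and images are never upscaled.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    longest = max(width, height)
    if longest <= max_edge:
        return width, height
    new_width = max(1, width * max_edge // longest)
    new_height = max(1, height * max_edge // longest)
    return new_width, new_height


def plan_derivatives(width, height):
    """Map each tier name to the pixel size of the JPEG that would be produced."""
    return {
        name: derivative_size(width, height, edge)
        for name, edge in DERIVATIVE_EDGES.items()
    }
```

Calling `plan_derivatives(4000, 3000)` gives one target size per tier; the archival TIFF or RAW master would stay untouched, and only the JPEG copies would be resized.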

Bringing Images to the Web  Take advantage of infrastructure developed by the Vanderbilt University Library to manage images  Digital Library framework:  Presentation and functionality created in Perl-based interface  Data and Metadata stored in MySQL relational tables  ODBC connectivity between presentation layer and MySQL  Microsoft Windows Server/IIS for Web server  Images reside on digital storage provided by the Vanderbilt University Library

Digital Preservation  Disaster Recovery  Ability to restore files in the case of any hardware, software, or human error  Digital Preservation  Commitment and processes in place to preserve digital information for the very long term  Multiple replications  Migration of data into future formats as current standards become obsolete
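Keeping multiple replications is only useful if the copies can be checked against each other. One common way to do that is a checksum manifest, sketched below. This is an assumption about how such a check could work, not a description of the process used for ESSSS; the function names are hypothetical, and SHA-256 is chosen here simply because it is available in the Python standard library.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_replica(master_manifest, replica_root):
    """Return (missing, corrupted) file lists for a replica checked against the master."""
    replica = build_manifest(replica_root)
    missing = sorted(name for name in master_manifest if name not in replica)
    corrupted = sorted(
        name
        for name in master_manifest
        if name in replica and replica[name] != master_manifest[name]
    )
    return missing, corrupted
```

Running `compare_replica(build_manifest(master_dir), replica_dir)` on a schedule would flag files that have gone missing or changed silently, so they can be restored from another copy.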

Building structure through Metadata  Metadata structure based on Dublin Core  Volume-level descriptive metadata  Courtney Campbell designed metadata structure and is analyzing volumes to populate metadata for each volume  EXIF Data extracted from images into the individual records for each page  Page-level structure  Supports ability to select volumes and browse page images
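The two-level structure described above (volume-level Dublin Core description, page-level records carrying EXIF data) could be modeled along these lines. This is a sketch under stated assumptions: the table and column names are hypothetical, only a few Dublin Core elements and EXIF fields are shown, and Python's built-in `sqlite3` stands in for the MySQL tables used in the actual framework so that the example runs on its own.

```python
import sqlite3

# Hypothetical schema: one row per volume, one row per page image.
SCHEMA = """
CREATE TABLE volume (
    volume_id INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,   -- Dublin Core: title
    creator   TEXT,            -- Dublin Core: creator
    date      TEXT,            -- Dublin Core: date
    language  TEXT,            -- Dublin Core: language
    coverage  TEXT             -- Dublin Core: coverage
);
CREATE TABLE page (
    page_id           INTEGER PRIMARY KEY,
    volume_id         INTEGER NOT NULL REFERENCES volume(volume_id),
    sequence          INTEGER NOT NULL,  -- position of the page within the volume
    image_file        TEXT NOT NULL,
    exif_datetime     TEXT,              -- extracted from the image's EXIF data
    exif_camera_model TEXT,
    UNIQUE (volume_id, sequence)
);
"""


def open_catalog(path=":memory:"):
    """Create a catalog database with the sketch schema and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def pages_of_volume(conn, volume_id):
    """Return (sequence, image_file) pairs for one volume in page order."""
    cursor = conn.execute(
        "SELECT sequence, image_file FROM page WHERE volume_id = ? ORDER BY sequence",
        (volume_id,),
    )
    return cursor.fetchall()
```

Selecting a volume and then calling `pages_of_volume` mirrors the "select volumes and browse page images" behavior: the volume row supplies the description, and the ordered page rows supply the images to display.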

Demonstration  Image management environment  Interface  Metadata  Page Images

Turning Pages into Data  The contents of the page images contain valuable data  Page images can be read by humans but do not support essential features: search, computer analysis, etc.  Full value of these collections can be realized through transcription

Challenges in transcription  Page characteristics  Handwritten by many different hands  Many names and numbers  Spanish language  Varying contrast  Many defects: water damage, insects, etc.

Human transcription  Scholars who work with pages of interest can create transcriptions manually  Optical character recognition?  Highly accurate for typescript  Not effective for handwritten manuscripts

Crowdsourcing  Find ways to have large numbers of persons create transcript snippets  Google uses crowdsourcing to improve transcripts for the Google Books project.

Google ReCAPTCHA:  “Digitizing books one word at a time”  Each transaction transcribes one or two words  Each word is transcribed many times  Results compared to determine correct version
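The idea of transcribing each word many times and comparing the results can be illustrated with a simple majority vote. This is not ReCAPTCHA's actual algorithm, which is not described in these slides; the function name and the `min_votes` and `min_share` thresholds are hypothetical choices made for the sketch.

```python
from collections import Counter


def consensus(readings, min_votes=3, min_share=0.6):
    """Return the agreed transcription of one word, or None if there is no agreement.

    readings  : transcriptions of the same word image from different people
    min_votes : minimum number of non-empty readings before deciding (assumed value)
    min_share : fraction of readings that must agree (assumed value)
    """
    # Normalize whitespace and case so trivial differences do not split the vote.
    # Lowercasing loses capitalization, which may matter for names.
    normalized = [reading.strip().lower() for reading in readings if reading.strip()]
    if len(normalized) < min_votes:
        return None
    word, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) >= min_share:
        return word
    return None
```

Words that return `None` would be shown to more transcribers or routed to a scholar, which keeps expert time for the hard cases such as damaged or ambiguous handwriting.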

Google ReCAPTCHA

Crowdsourcing to Transcribe ESSSS  Scholars contribute any transcriptions created as they work with any given set of pages  Students assigned to create transcriptions  Language, history, LIS  Collaboration with some organization with ReCAPTCHA-like infrastructure