Technology Bootcamp January 18, 2014 Large-Scale Digital Libraries Digitization Process Krystyna K. Matusiak, Ph.D. Assistant Professor Library & Information.

Presentation transcript:

Technology Bootcamp January 18, 2014 Large-Scale Digital Libraries Digitization Process Krystyna K. Matusiak, Ph.D. Assistant Professor Library & Information Science Program

Overview
- Large-scale digital libraries (DLs)
  - The National Science Digital Library (NSDL)
  - HathiTrust
  - Europeana
  - The Digital Public Library of America (DPLA)
- Digitization as a conversion process
- Fundamental questions
  - What?
  - Why?
  - How?
- Digitization as a multi-step process
- Digitization standards and guidelines
  - The notion of archival master files and derivatives
  - Image capture: technical factors
- Digitization technology

Overview of Digitization

LARGE-SCALE DIGITAL LIBRARIES

Large-Scale Digital Libraries
- Massive aggregations of scientific and cultural heritage content with millions of digital objects
  - Offer a new centralized approach to providing access to scientific and cultural materials
  - Aggregate content (or metadata) from smaller individual DLs and provide portals for global searching and retrieval
  - Address the limitations of resource discovery in the DL environment
  - Build upon over two decades of extensive digitization efforts
- Types of content
  - Born-digital
  - Digitized

Large-Scale Digital Libraries
- Sources of content
  - Local digitization: individual DLs created by academic and public libraries, archives, historical societies, and other cultural heritage and research organizations
  - Mass digitization: Google Book Project; Open Content Alliance
- Information ecosystem: multilayered trusted networks
- Models
  - Distributed (DPLA, Europeana, NSDL)
  - Centralized (HathiTrust)
- Coverage
- Goals
  - Expanding access
  - Supporting digital preservation

The National Science Digital Library (NSDL)

HathiTrust

Europeana

The Digital Public Library of America (DPLA)

DIGITIZATION PROCESS
How have we created this critical mass of digitized content?

Digitization is the process of converting analog information into a digital format through scanning or digital photography. It is a multi-step process that involves selection, image capture, creation of descriptive and technical metadata, and digital preservation of the objects created by the conversion.

Basic Digitization Workflow: Digitization Is More than Scanning
1. Selection
2. Image capture
3. Digital processing
4. Indexing and metadata
5. Ingesting
6. Preservation and maintenance

What? Manuscripts, books, journals, maps

What? Archival Materials

What? Cultural Heritage Materials on Tape and Film

Why?
- Expand access (24/7)
- Provide access to unique primary sources held in local archives
- Extend search capabilities of digital text
- Improve resource discovery
- Provide access to high-resolution images
- Integrate resources in multiple modes of representation
- Bring together dispersed collections
- Assist preservation and conservation efforts

How? General Guidelines
- Digitize at the highest resolution appropriate to the nature of the source material
  - Avoid rescanning and handling of the originals in the future
- Create digital objects that are accessible and interoperable across platforms and devices
  - High-quality
  - Consistent
  - Authentic
- Produce digital objects that support the intended current and future use
  - Build a repository of digital master files to facilitate reprocessing and maintaining digital collections over time
  - Provide derivative access files for current use
- Create backup copies of all files on servers and have an off-site backup strategy

Digital Master Files
- Created as a direct result of the image capture process, either through scanning or photographing with a digital camera
- Should represent the visual information of the original material
- Serve as a long-term archival file and a source for derivative images
  - Digital masters are not used for online delivery or print output
- General recommendations for digital master file creation include:
  - Scanning at the highest quality affordable
  - No compression or lossless compression
  - Non-proprietary archival formats
    - TIFF: text or still images
    - WAV: audio
    - AVI, Motion JPEG 2000, or MXF: moving images *

* Unlike text, still images, or audio, there is no archival file format that has been definitively established for moving images.

Digital Masters: Examples
- Photographic print, 5 x 7 in., scanned in RGB mode at 600 ppi → 35 MB TIFF file, e.g. kw tif
- Large map, 63 x 56 cm (24.8 x 22 in.), scanned in RGB mode at 300 ppi → 185 MB TIFF file, e.g. am tif
- Monograph page, 23 cm (approx. 9 in.), scanned in RGB mode at 400 ppi → 25 MB TIFF file, e.g. 001_Front cover.tif
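The file sizes in these examples follow from simple arithmetic: pixel width times pixel height times bytes per pixel. The sketch below is a rough estimate only. It assumes uncompressed 24-bit RGB data and ignores TIFF headers, margins, and metadata, so it will not match a real scan exactly. The function names are illustrative, and the 6 in. page width is an assumption, since the slide gives only the page height.

```python
def uncompressed_size_bytes(width_in, height_in, ppi, bits_per_pixel=24):
    """Estimate raw pixel data size for a scan, ignoring file headers."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * bits_per_pixel // 8


def to_mib(size_bytes):
    """Convert bytes to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return size_bytes / (1024 * 1024)


# 5 x 7 in. photographic print at 600 ppi, 24-bit RGB
photo = uncompressed_size_bytes(5, 7, 600)
print(f"Photo: {to_mib(photo):.1f} MiB")

# Monograph page at 400 ppi; the 6 in. width is an assumed value
page = uncompressed_size_bytes(6, 9, 400)
print(f"Page:  {to_mib(page):.1f} MiB")
```

Both estimates land close to the sizes quoted on the slide (35 MB and 25 MB), which suggests the slide's figures describe essentially uncompressed masters.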

Derivative Files
- Created from digital master files for specific uses, including
  - Access images for digital collections or other types of Web delivery
  - User requests
  - High-resolution prints
- General recommendations for derivative files:
  - Reduce the resolution depending on the intended use
    - 72 dpi or 96 dpi for Web access
    - 300 dpi for print output or for high-resolution viewers
  - Compress files to reduce their size
  - Select appropriate access formats
    - PDF: text
    - JPEG or JPEG 2000: still images
    - MP3: audio
    - MPEG-4 (MP4), QuickTime, or Real Video: moving images
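Reducing resolution for a derivative amounts to scaling the master's pixel dimensions by the ratio of target to master resolution, keeping the physical size constant. The sketch below shows only that arithmetic. The function name is illustrative, and the 3000 x 4200 pixel master (a 5 x 7 in. print at 600 ppi) is an assumed example. Actual resampling would be done in imaging software.

```python
def derivative_dimensions(width_px, height_px, master_ppi, target_ppi):
    """Pixel dimensions of a derivative that keeps the same physical size."""
    new_width = round(width_px * target_ppi / master_ppi)
    new_height = round(height_px * target_ppi / master_ppi)
    return new_width, new_height


# Assumed master: 5 x 7 in. print scanned at 600 ppi -> 3000 x 4200 pixels
master = (3000, 4200, 600)

for use, ppi in [("Web (72)", 72), ("Web (96)", 96), ("Print (300)", 300)]:
    w, h = derivative_dimensions(*master, ppi)
    print(f"{use:12s} {w} x {h} pixels")
```

The Web derivatives carry only a small fraction of the master's pixels, which is why they are much smaller files even before compression.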

Image Capture: Technical Factors
- Mode of capture
  - Bitonal: one bit per pixel representing black and white
  - Grayscale: multiple bits per pixel representing shades of gray
  - RGB (red-green-blue): multiple bits per pixel representing color
- File formats
  - TIFF
  - JPEG
  - JPEG 2000
  - RAW and DNG
- No compression
- Compression
  - Lossless
  - Lossy
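The difference between lossless and lossy compression can be shown with a small round trip. In this sketch, Python's zlib (DEFLATE) stands in for whatever lossless codec an image format uses. The "lossy" step is only a crude simulation that discards the low four bits of each sample, not a real JPEG encoder. The pixel data is made up for illustration.

```python
import zlib

# Hypothetical 8-bit grayscale samples: a 0-255 gradient repeated 64 times
pixels = bytes(range(256)) * 64

# Lossless: decompressing returns exactly the original bytes
packed = zlib.compress(pixels, 9)
restored = zlib.decompress(packed)
print("lossless round trip identical:", restored == pixels)
print("compressed size:", len(packed), "of", len(pixels), "bytes")

# Lossy (simulated): dropping the low 4 bits of each sample cannot be undone
quantized = bytes(p & 0xF0 for p in pixels)
print("lossy result identical:", quantized == pixels)
```

This is why master files use no compression or lossless compression: the original sample values must remain recoverable, while access derivatives can trade fidelity for size.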

Image Capture: Technical Factors
- Resolution (ppi: pixels per inch; dpi: dots per inch)
  - An image 1500 x 2100 pixels displayed at 100 ppi = ? in.
  - The same image 1500 x 2100 pixels displayed at 300 ppi = ? in.
- Bit depth
  - The number of bits used to represent each pixel determines how many colors can appear in a digital image

Source: BCR's CDP Digital Imaging Best Practices.
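The two resolution questions on this slide can be worked out by dividing pixel dimensions by the output resolution, and the bit depth point follows from the fact that n bits can encode 2^n distinct values. A minimal sketch, with illustrative function names:

```python
def display_size_inches(width_px, height_px, ppi):
    """Physical size of an image when rendered at a given resolution."""
    return width_px / ppi, height_px / ppi


def color_count(bit_depth):
    """Number of distinct values a pixel can take at a given bit depth."""
    return 2 ** bit_depth


print(display_size_inches(1500, 2100, 100))  # (15.0, 21.0)
print(display_size_inches(1500, 2100, 300))  # (5.0, 7.0)

for depth in (1, 8, 24):
    print(f"{depth}-bit: {color_count(depth):,} values")
```

The same 1500 x 2100 pixel image therefore fills 15 x 21 in. at 100 ppi but only 5 x 7 in. at 300 ppi. The pixel count is fixed, and resolution only determines how densely those pixels are laid out.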

Digital Masters: Photographs and Text Scanning Specifications
Source: Wisconsin Heritage Online Digital Imaging Guidelines (2009). Version 2.0.

Original material         Scanning resolution  Bit depth                        Approximate scanned dimensions  Approx. size of preview image
Photographs
16” x 20” +               200 ppi              24-bit color                     6400 x 8000 pixels              146 MB
8 ½” x 11” to 16” x 20”   300 ppi              24-bit color                     3200 x 4000 pixels              36 MB
8” x 10”                  400 ppi              24-bit color                     3200 x 4000 pixels              36 MB
5” x 7”                   625 ppi              24-bit color                     3200 x 4000 pixels              36 MB
4” x 5”                   800 ppi              24-bit color                     3200 x 4000 pixels              36 MB
4” x 2 ½”                 1200 ppi             24-bit color                     3200 x 4000 pixels              36 MB
Text
Print, no images          600 ppi              1-bit bitonal                    Varies
Print, with images        300 ppi              8-bit grayscale or 24-bit color  Varies
Manuscript                400 ppi              8-bit grayscale or 24-bit color  Varies
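The photograph rows appear to aim for roughly 4000 pixels on the long edge regardless of the size of the original, which is why smaller originals are scanned at higher resolutions. That target is an inference from the table, not a rule stated in it. Under that assumption, the resolution needed for a given original can be sketched as follows (the function name and default target are illustrative):

```python
def required_ppi(long_edge_in, target_long_edge_px=4000):
    """Resolution needed so the long edge reaches the target pixel count.

    The 4000-pixel default is an assumption inferred from the table.
    """
    return target_long_edge_px / long_edge_in


for label, long_edge in [("8 x 10 in.", 10), ("5 x 7 in.", 7), ("4 x 5 in.", 5)]:
    print(f"{label}: {required_ppi(long_edge):.0f} ppi")
```

For 8 x 10 in. and 4 x 5 in. originals this reproduces the table's 400 and 800 ppi exactly. For 5 x 7 in. it gives a lower figure than the table's 625 ppi, so the published values should be treated as the authority and this calculation only as a way to see the pattern.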

Scanners
Source materials in a variety of formats require versatile scanning equipment
- Photographs (reflective and transparent materials)
  - Photographic prints → flatbed scanners
  - Film negatives and slides → film scanners, flatbed scanners with transparency adapters
- Text (reflective materials)
  - Single-leaf documents → flatbed scanners, sheet-fed scanners
  - Bound materials → overhead scanners or digital cameras
- Oversize materials (reflective materials)
  - Maps, charts, etc. → large format scanners or digital cameras
- Microfilm (transparent)
  - Newspapers → microfilm scanners

- Book scanner for books, oversized prints, and maps
- DSLR for oversized prints, maps, scrolls, and 3D objects
- Film and slide scanner
- Film and slide scanner with auto-feeder
- Flatbed scanner for prints, glass, and transparent objects
- Video conversion
- Audio conversion
- Large format scanner for maps and oversized materials

Resources
General Digitization Guides, Standards, and Best Practices
- Association for Library Collections & Technical Services (ALCTS). (2013). Minimum Digitization Capture Recommendations.
- BCR's CDP Digital Imaging Best Practices. (2008). [Updated version of Western States Digital Imaging Best Practices.]
- Besser, Howard. (2003). Introduction to Imaging, Revised Edition. The J. Paul Getty Trust. This book is free as a downloadable PDF.
- A Framework of Guidance for Building Good Digital Collections, 3rd Edition. (2007). NISO Framework Advisory Group.
- Handbook for Digital Projects: A Management Tool for Preservation and Access. (2000). Northeast Document Conservation Center.
- Moving Theory into Practice: Digital Imaging Tutorial. (2000). Cornell University Library/Research Department.
- The NINCH Guide to Good Practice in the Digital Representation and Management of Cultural Heritage Materials. (2002). The National Initiative for a Networked Cultural Heritage (NINCH).