“Workshop on Technical Skills and Standards for the World Digital Library” Doha, Qatar February 22-23, 2012
World Digital Library www.wdl.org
Agenda (Day One)
- Introduction: Overview of the World Digital Library and WDL in the Arab Peninsula Region (Jason)
- Participants Introduction and Discussion (All)
- WDL Standards: Purpose and Practice (Sandy / Chris)
- Overview of the WDL Production Process (Jason)
- Workflow for Content and Metadata Preparation (Sandy / Chris / Ted)
Agenda (Day Two)
- Image Files: Digitization and Preparation (Sandy / Chris)
- Expert Descriptions: Nominal Process for Review and Evaluating Quality (Jason / Ted)
- Discussion on “Expert Descriptions based on Arab Heritage Resources” (Mr. Mohammed Hammam Fikri)
- Image Files: Transfer (Sandy / Chris)
- Discussion: Increasing the Usage of WDL in the Arab World (All)
- Summary (Jason)
  - Review of the WDL Production Process
  - New Developments and Opportunities for Technical Staff at Partner Institutions
  - Improving Communication and Collaboration
- Feedback Questionnaire, Certificates and Photographs Distribution
WDL Trainers
- Sandy Bostian, WDL Content Manager
- Chris Masciangelo, WDL Digital Conversion Specialist
- Ted Waddelow, U.S. Fulbright Scholar at University of Bahrain
- Jason Yasner, WDL Operations Manager
Introduction: Overview of the World Digital Library and WDL in the Arab Peninsula Region
Jason Yasner, WDL Operations Manager
World Digital Library: Mission and Objectives
- Mission: Digitize and make freely available over the Internet, in multilingual format, primary source materials that tell the stories and highlight the achievements of all countries
- Objectives:
  - Promote international and intercultural understanding and awareness
  - Expand multilingual and culturally diverse content on the Internet
  - Provide resources to educators and contribute to scholarly research
  - Build knowledge and capacity in the developing world; help narrow the digital divide
Key Features of the WDL Website
- Multilingualism
  - Interface in seven (7) languages
  - Content in more than seventy-five (75) languages
- High quality content of cultural and historical importance
- Consistent, high-quality metadata to allow searching and browsing across cultures and time periods
- Item-level descriptions, curator videos to enhance user understanding of the content
- Speed and performance
- Web 2.0 features
Partners and User Statistics (As of February 1, 2012)
- 138 partners from 72 countries
- www.wdl.org launched on April 21, 2009
- Usage since launch:
  - More than 19 million visitors
  - Top countries by number of visitors: Spain, United States, Mexico, Brazil, Argentina, China, France, Russian Federation, Colombia, Portugal, Germany, UK
  - Links from other sites to WDL: 3.5 million
  - Visitors from the Arab World: 454,390
The WDL Governance Structure
- WDL launched as a Library of Congress-UNESCO partnership
- WDL Charter provides for:
  - Annual partner meeting
  - Executive Council
  - Standing Committees for:
    - Technical Architecture
    - Content Selection
    - Translation and Language
  - Regional and Subject Sub-committees
    - Arabic Scientific Manuscripts
    - Chinese Language Content
    - Meso-American Codices
    - Arab Peninsula Regional Group
- Library of Congress serves as Project Manager (2010-2015)
WDL in the Arab Peninsula Region (and neighboring parts of Africa)
- Abu Dhabi Authority for Culture and Heritage (ADACH)
- Al-Ahgaf Library for Manuscripts, Yemen
- Central Library, Qatar Foundation, Qatar
- King Abdulaziz University Library, Saudi Arabia
- King Abdullah University of Science and Technology (KAUST), Main Library, Saudi Arabia
- King Hamad Library, Bahrain
- National Library of Jordan
- National Library of Sudan
- Omani Digital Library (Kawab Al-Marifa), Sultanate of Oman
- Sultan Qaboos University Libraries, Sultanate of Oman
Partner Introduction and Discussion
- Name, Position, Institution
- What are you digitizing, e.g., books, manuscripts, maps, photos, etc.?
- What content do you plan to contribute to WDL?
- Do you have a preservation policy in place?
- How are you storing your digital items?
- What equipment are you using?
- Do you have dedicated funding for digital production?
- Do you have a dedicated physical space (scanning facility)?
WDL Standards: Purpose and Practice
project.wdl.org
Sandy Bostian, WDL Content Manager
WDL Standards: Why?
- Your images are what people see.
  - Deep Zoom shows flaws.
  - “Put your best foot forward.”
- Your metadata runs the display.
WDL Standards: What is a digital object?
- Digital Content
- Metadata
- Relationships
- Behavior
[Diagram labels: Partner, WDL]
WDL Standards: One vs. Many
- Books, Manuscripts
  - Single Volume – 1 object
  - Multi-Volume – 1 object
- Newspapers
  - Each Issue – 1 object
- Journal
- Photographs
  - Single Photograph – 1 object
  - Photo Album – ???
- Maps
  - Single Map – 1 object
  - Atlas – ???
WDL Standards: Digital Object
- 1 Record = 1 Object
- Make Sure the Numbers Match Before Sending
- Put the Digital Identifier in the Record
WDL Standards: Real Life Example
- 806 Records
- 316 Files + 55 Directories = 371 Objects
- No Digital Identifiers in Records
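The "1 Record = 1 Object" check can be run before sending a delivery. The following is a minimal sketch, not WDL tooling: it assumes each top-level file or directory in a delivery folder is one object, and that the digital identifier in each record equals the file name without its extension (or the directory name). The function names `object_ids` and `reconcile` are hypothetical.

```python
from pathlib import Path


def object_ids(delivery_dir):
    """Return one identifier per top-level entry in the delivery folder.

    Assumption: a directory is one multi-file object named by the directory,
    and a loose file is one object named by its stem (name minus extension).
    """
    ids = []
    for entry in sorted(Path(delivery_dir).iterdir()):
        ids.append(entry.name if entry.is_dir() else entry.stem)
    return ids


def reconcile(record_ids, delivered_ids):
    """Compare digital identifiers in the records with the delivered objects."""
    records, delivered = set(record_ids), set(delivered_ids)
    return {
        "record_count": len(records),
        "object_count": len(delivered),
        "records_without_objects": sorted(records - delivered),
        "objects_without_records": sorted(delivered - records),
    }


# Illustrative identifiers only; real values would come from the metadata
# records and from object_ids("path/to/delivery").
report = reconcile(["0001", "0002", "0003"], ["0002", "0003", "0004"])
print(report)
```

A delivery is ready when both "without" lists are empty and the two counts agree. Without a digital identifier in each record, as in the example above, this comparison cannot be made at all.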
WDL Standards: Filenaming
- ASCII characters, preferably numbers
- Case matters (my_file vs. My_file vs. my_File)
- Three-letter file extensions (.tif NOT .tiff)
- Do not use these characters: < > : " / \ | ? * ( ) or a space
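The filenaming rules above can be checked automatically before transfer. This is a rough sketch under stated assumptions: the helper names `filename_problems` and `case_collisions` are hypothetical, and the extension check accepts any three ASCII letters or digits (so `.jp2` passes), which is an interpretation of "three-letter" rather than a documented WDL rule.

```python
# Characters the slide says not to use, including the space.
FORBIDDEN = set('<>:"/\\|?* ()')


def filename_problems(name):
    """Return a list of rule violations for one file name (empty if none)."""
    problems = []
    if not name.isascii():
        problems.append("non-ASCII character")
    bad = sorted(set(name) & FORBIDDEN)
    if bad:
        problems.append("forbidden character(s): " + " ".join(repr(c) for c in bad))
    _stem, dot, ext = name.rpartition(".")
    if not dot or len(ext) != 3 or not ext.isalnum():
        problems.append("extension is not three characters")
    return problems


def case_collisions(names):
    """Group names that differ only by letter case (e.g. my_file vs. My_file)."""
    groups = {}
    for name in names:
        groups.setdefault(name.lower(), []).append(name)
    return [sorted(group) for group in groups.values() if len(set(group)) > 1]


print(filename_problems("0001.tif"))
print(filename_problems("my file (1).tiff"))
print(case_collisions(["my_file.tif", "My_file.tif", "0002.tif"]))
```

Reporting case collisions separately is a deliberate choice: such names are distinct on case-sensitive file systems but can overwrite each other when copied to a case-insensitive one.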
Overview of WDL Production Process
- Content Selection
- Content Transfer
- Content Processing
- Cataloging (consistent metadata, descriptions)
- Translation
- Publishing
Jason Yasner, WDL Operations Manager
WDL 2 Workflow
Content Workflow: Transfer
- Transfer to “landing zone” server
- Malware & virus scans
- “Bag & tag”
- Inventory
- Transfer to tape backup
- Transfer to “work zone” server
Content Workflow: Triage
- Reconcile records & media
- Cursory vetting for obvious problems
  - Missing media files
  - Missing data files/records
  - Corruption issues
- Cataloging/Translation derivatives
- Starter kit prep - item level tracking
- Load to catalog or pre-translation
Content Workflow: Object Building
- Quality review
- Media adjustments as needed (cropping, etc.)
- Color management/sRGB conversion
- Create master reference image
- Object building
  - WDL directory structure
  - Add image header information
  - Handle registration
- Reference image derivatives
- PDFs
Content Workflow: Validation
- Run script that checks for missing parts & server permissions issues
- Final media validation (JHOVE based tool)
- Link items to dev team server area
- Update inventory system
Description Process
- WDL Description team:
  - Professors, scholars, researchers, and other field experts
  - Editors
  - Description Coordinator
  - Partner Reviewers
- Function:
  - To produce a summary that explains each item and highlights its significance
  - To illuminate the material in a way that is accurate, succinct, and engaging, working with information provided by partners, subject specialists, and other authoritative resources
- Process:
  - Pre-description: determine and execute the appropriate course of action for the writing and editing processes, i.e.:
    - how to improve existing descriptions provided by partners
    - pinpoint the appropriate expert to write descriptions, if they are not provided
  - Writing: supplement pre-existing descriptions or produce original ones
  - Editing: substantive edit and copy-edit
  - Partner Review: evaluation of WDL-edited descriptions by partner institution
  - Finalizing: final approval before description is sent to Metadata team
Metadata Process
- Original metadata creation
  - Edit English title, read and check the description
  - Find or verify names in VIAF
  - Verify publication date and language used in the material
  - Verify place names in TGN
  - Assign appropriate topics and additional subjects
  - Clean “Notes” and “Physical Description” fields
  - Add collection title if needed
- Metadata review
  - Review each field to make sure there are no errors
  - Verify the names, DDC, subjects, etc.
- Batch metadata creation
  - Analyze metadata to develop strategy and workflow
  - Create metadata mapping and decide output format
  - Create and/or verify generic information, names, places, DDC, subjects, etc.
  - Develop and create macros and scripts
  - Process metadata in batch using various tools
  - Final review of the metadata in Metadata Management App
Translation and Language Management
- Linguistic team
  - Three translation vendors who share our six languages (FR, ES, PT, AR, RU, ZH)
  - One reviewer per language
- Translation process
  - Pre-translation when content is submitted to us in a language other than English
  - Translation into six languages, using translation management systems such as CAT tools
  - In-context review and quality control on our staging website
- Related work
  - Terminology and style management (glossaries and style guides)
  - Query management (direct access to content writers and subject matter experts)
  - Feedback loop management between subject matter experts, translators and reviewers
Development Process
- Ongoing supportive role of entire production process
- Ongoing maintenance and scaling of website
- Building tools for content and metadata management
- Troubleshooting
Workflow for Content and Metadata Preparation
Your metadata runs the display!
Sandy Bostian, WDL Content Manager
Metadata Preparation: Identifiers (IDs)
- WDL Identifier
- Record Identifier
- Digital Identifier
Metadata Preparation: Title Information
- Original Title
- Original Title Language
- Title vs. Work
- English Title (if available)
Metadata Preparation: Contributors
- Contributor Name
- Birth/Death Dates (if known)
- Role
  - Author
  - Scribe/Copyist
  - Calligrapher
  - Photographer
  - Editor/Compiler
  - Architect
Metadata Preparation: Publication Info
- Date Created
  - Start
  - End
  - Created vs. Copied
  - CE and/or Hijri
- Publisher
- Place of Publication
Metadata Preparation: Subjects
- Place
  - Hierarchical
  - Controlled Vocabularies
- Time
  - Publication vs. Subject Time
  - CE and/or Hijri
- Dewey Decimal Classification (DDC)
- Additional Subjects (keywords)
Metadata Preparation: Type of Item
- Controlled Vocabulary
  - Books
  - Journals
  - Manuscripts
  - Maps
  - Motion Pictures
  - Newspapers
  - Prints, Photographs
  - Sound Recordings
Metadata Preparation: Descriptions
- Description/Abstract
- Physical Description
- Notes
- Language
Metadata Preparation: Additional Info
- Collection
- Series
- Institution
- URL
Image Files: Digitization and Preparation
Chris Masciangelo, WDL Digital Conversion Specialist
Agenda
- Technology
  - Hardware
  - Software
- Pre-capture Set Up
- Information Capture / Scanning
- File Formats
- Storage
- Quality Review / Post Processing
- Problems
- Audio and Video
- Other Considerations
Technology: Hardware
- Flatbed scanner
- Overhead/Planetary scanner
- V-shaped book scanner
- Large Format scanning system
- Monitor
- Computer
Hardware: Flatbed Scanners
Hardware: Overhead/Planetary Scanners
Hardware: V-shaped Book Scanners
- Manual [Atiz BookDrive Pro]
- Automatic [Kirtas KABIS III]
Hardware: Large Format Scanning System
- Camera
- Digital Back
- Lighting
- Platform/Easel
Hardware: Monitor
- CRT
- LCD
- Calibration – colorimeter / spectrophotometer
Software: Computer/Operating System
- Windows
- Mac
Software
- Scanning software/driver
- Image Editing software
  - Adobe Photoshop Creative Suite
  - Adobe Bridge
  - Adobe Photoshop Elements
  - Capture One
- Image Validation software [JHOVE]
- Image Metadata Editing [ExifTool]
Pre-capture Set Up
- Examine materials to be scanned for issues/problems
- Identify balance between adequate informational capture and quality, physical item size, file format and compression, resolution and bit depth, and master file size
- Ensure equipment is properly focused, has capture profile set (including no auto-sharpening), and that the monitor is calibrated
- Ensure proper and equal distribution of lighting
- Good practice to shoot with a commercial color bar or grayscale
Pre-capture Set Up: Risks during imaging
- Physical damage
- Exposure to light
- Exposure to heat
- Disassociation
Information Capture / Scanning: Targets, color bars, and grayscales
Information Capture / Scanning: Bit Depth [informational and artifactual value]
Information Capture / Scanning: Color Modes
- RGB [red, green, blue]
- CMYK [cyan, magenta, yellow, key (black)]
Information Capture / Scanning: Color Space and Profile
- sRGB
  sRGB is the only appropriate choice for images uploaded to the web since most web browsers don't support any color management.
- sRGB IEC61966-2.1
- Adobe RGB (1998)
  Adobe RGB images that are uploaded to websites without conversion to sRGB will generally appear dark and muted.
- Grayscale
- Gray Gamma 2.2
Information Capture / Scanning: Resolution [PPI/DPI]
- 2400+ ppi (slides)
- 300-600 ppi (archival)
- 72/96 ppi (on screen/download)
Rule of thumb: Longest side of the image should be 3000 pixels
4 inch x 5 inch photo @ 600 ppi = 2400 x 3000 pixels
File size = (width pixels x height pixels x bit depth)/8
(2400 x 3000 x 24)/8 = 21,600,000 bytes or 20.6 MB
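The file-size formula above can be worked through in a few lines. This is an illustrative sketch (the function name `uncompressed_size_bytes` is hypothetical); it covers uncompressed pixel data only and ignores file headers, embedded metadata, and compression.

```python
def uncompressed_size_bytes(width_in, height_in, ppi, bit_depth):
    """Pixel data size: (width px x height px x bit depth) / 8."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * bit_depth // 8


# The slide's example: a 4 x 5 inch photo at 600 ppi in 24-bit color.
size = uncompressed_size_bytes(4, 5, 600, 24)
print(size)                    # 21600000
print(round(size / 2**20, 1))  # 20.6
```

The 20.6 MB figure corresponds to megabytes of 1,048,576 bytes; in decimal megabytes the same file is 21.6 MB.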
File Formats
- Master [archival]
  - RAW
  - TIF
  - JPEG2000
- Access [derivatives]
  - JPG
  - PNG
  - PDF
- Compression
  - LZW
  - ZIP
Storage
- Media
  - CD
  - DVD
  - Hard Drive
  - Tape
  - Cloud
- Backups
- Issues
  - Capacity
  - Transfer errors
  - Bit rot
  - Sustainability, obsolescence, data migration
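Transfer errors and bit rot are commonly detected by recording a checksum for each file and re-checking it later. The sketch below shows one way to do that with Python's standard `hashlib`; the function names and the choice of SHA-256 are assumptions for illustration, not a description of the WDL inventory system.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large master images do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Report files that are missing, changed, or not listed in the manifest."""
    current = build_manifest(root)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "changed": sorted(
            k for k in manifest if k in current and current[k] != manifest[k]
        ),
        "unexpected": sorted(set(current) - set(manifest)),
    }
```

A typical use is to call `build_manifest` before a transfer or backup, store the result alongside the files, and call `verify_manifest` on the copy afterwards and at regular intervals.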
Quality Review / Post-Processing
- Identify digital issues
- Image validation
- Color correction
- Sharpening
Problems
- Noise
- Artifacting
- Vignetting
- Chromatic aberration
- Over-sharpening
- Depth of field
- Color reproduction
- Lens distortion
- Clipping - Overexposure / Underexposure
[Image slides, one per problem: Noise; Artifacting; Vignetting; Chromatic aberration; Over-sharpening; Depth of Field; Color Reproduction; Lens Distortion; Clipping - Overexposure / Underexposure]
Audio and Video
- The format should be publicly and openly documented.
- The format is not proprietary.
- The format is in widespread use.
- The format can be opened, read, and accessed using readily-available tools.
- WAV or AIFF files [uncompressed]
- MP4 or MOV
Other Considerations
- Copyright
- Funding
- Staffing
Expert Descriptions: Nominal Process for Review and Evaluating Quality
Jason Yasner, WDL Operations Manager
Question: What is an Expert Description on the WDL?
Answer:
- It is a summary that explains each item and highlights its significance.
- It illuminates the material in a way that is accurate, succinct, and engaging.
- It is developed by working with information provided by partners, subject specialists, and other authoritative resources.
Expert Descriptions: Nominal Process
1. Do we have adequate information from the partner, both metadata and what the partner has provided as descriptions, to produce the descriptions we need for the WDL?
- Does the description answer the following questions: “What is this object and why does it matter?”
2. If the initial description (and metadata) are inadequate to create a usable WDL description, is there additional information on the Partner Institution’s Web site directly linked to this item that could be used to do so?
- This may not be initially known because there may be different people involved in creating the metadata than were involved in creating these sites.
3. If the answers to 1 and 2 both turn out to be “no,” i.e., the Partner Institution does not have a description that can be used, is there an existing, authoritative source (online or in hard copy) that contains a usable description for the item or items in question, or the raw material to create such a description?
- Description writers need to remember that they are dealing with digitized versions of real objects, many rare and unique.
- Description writers are trying to highlight the significance of these objects and provide context.
- They shouldn’t write just about the objects (as in the bibliographic literature) or just about the context (as in general sources such as Wikipedia), but must do a bit of both, relating one to the other.
Expert Descriptions: Nominal Process
4. If the answers to 1, 2, and 3 are all “no,” we have no alternative but to produce a new description from scratch.
- This can be done either by going back to the Partner Institution and asking them to produce a new description or, if for some reason the partner cannot produce one (it may not have available staff or language expertise, for example), by engaging WDL staff and expert contractors to write the description and then running it by the partner for comment/approval.
- Examples of research strategies used by WDL.
Discussion on “Expert Descriptions based on Arab Heritage Resources”
Mr. Mohammed Hammam Fikri, Senior Heritage Specialist, Cultural Advisor Office, Qatar Foundation
Image Files: Transfer
Live Demo
Sandy Bostian, WDL Content Manager
Discussion: Increasing the Usage of WDL in the Arab World
- How do we promote WDL to users in the Arab World?
- What can your institution do to support this?
- How do we recruit more institutions to join WDL?
- Other ideas?
SUMMARY
Jason Yasner, WDL Operations Manager
Overview of WDL Production Process
- Content Selection
- Content Transfer
- Content Processing
- Cataloging (consistent metadata, descriptions)
- Translation
- Publishing
New Developments
- January 2011: 1,350 items online
  - Comprising 98,278 images
- November 14, 2011 (Partner Meeting): 4,049 items
  - Comprising approx. 212,000 images
  - 200% increase since January 1, 2011
- End of 2011: 4,550 items
  - 237% increase since January 1, 2011
- Early 2012: Expected approx. 6,000 items
  - Satisfy medium-term goal in WDL Business Plan
  - 344% increase since January 1, 2011
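The growth percentages above follow from the item counts, taking 1,350 items in January 2011 as the baseline. A short check, with a hypothetical helper name:

```python
def percent_increase(baseline, current):
    """Increase over the baseline, as a percentage of the baseline."""
    return (current - baseline) / baseline * 100


baseline = 1350  # items online in January 2011
for label, items in [("Nov 14, 2011", 4049), ("End of 2011", 4550), ("Early 2012", 6000)]:
    print(label, round(percent_increase(baseline, items)))
```

The rounded results are 200, 237, and 344, matching the slide.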
Opportunities for Technical Staff
- New Metadata Management Application
  - Code-named “Cupcake”
  - Improved metadata creation and maintenance
  - Currently, also supports Translation and Quality Review processes
- WDL 2
  - Overhaul of code for better management and optimization
  - Into the cloud…
  - Better user experience worldwide
- Continuous Process Improvement
  - Project Management, standards and best practices
  - Efficiency, reliability, and timeliness
Improve Communication and Collaboration
- Increase Partner Collaboration
  - Enable more Self-Service via online tools to:
    - Increase online content transfers
    - Track transfers with inventory integration
    - Improve content processing and validation
  - Allow Partner access and review of descriptions, metadata, and translations
  - Get more people in the process where needed
- Improve Detailed Standards and Guidelines
  - http://project.wdl.org
  - Content guidelines (digitization, file-naming, object structure)
  - Metadata guidelines (mapping from MARC, MODS, Dublin Core)
- Improve Integration with Authority Files
  - Metadata controlled vocabularies, authority files
  - Translation controlled vocabularies, authority files
- Improve User Interface (UI) and User Experience (UX)
  - WDL 2 is a starting point
  - Richer user interface and experience
  - Full-Text Search (FTS)
  - Enhanced Web 2.0 Sharing features
- Capacity Building
  - Continue support of digitization centers in Egypt, Uganda, and Iraq
  - Plan meetings and training workshops
  - Use this workshop in Doha, Qatar, as a prototype
Please follow @WDLorg
Contact Information
- Sandy Bostian, WDL Content Manager: sbos@loc.gov
- Chris Masciangelo, WDL Digital Conversion Specialist: cmas@loc.gov
- Ted Waddelow, U.S. Fulbright Scholar, Univ. of Bahrain: theodore.waddelow@gmail.com
- Jason Yasner, WDL Operations Manager: jyas@loc.gov
Thank you!