Overwhelmed by Large-scale Digitization Projects

1 Overwhelmed by Large-scale Digitization Projects
Xiaocan (Lucy) Wang, Digital Repository Librarian; Eric Holt, University Archivist; Cunningham Memorial Library, Indiana State University

2 Agenda: Project background; Implementation (equipment, software choices, process, ingestion); Workflow; Outcome; Lessons learned; Conclusion

3 Project Background Indiana State University

4 Project Background ETD (electronic theses and dissertations)
ETD Digital Initiative 2010 and onward Access

5 Project Background (cont.)
RTD (retrospective theses and dissertations). Number: 3,802. Where: Archives + Library basement. Condition: most in usable condition, but… Access

6 Project Background (cont.)
Purposes: centralize ETD & RTD; improve access, search and retrieval; support teaching, learning and research; improve preservation

7 Project Background (cont.)
Considerations: format, copyright, privacy

8 Equipment Bookdrive DIY

9 Disclosure: Not currently or previously an employee of the corporations whose products I discuss. I am not compensated for my comments or opinions. Older software version being used.

10 Capture New Book window

11 Capture in action

12 Batch entry

13 Irfanview

14 GIMP: Open source equivalent to Photoshop
Batch processing requires additional plugin; supervisor unfamiliarity

15 Photoshop: Can record action to perform batch processing
Graphical interface while setting up recorded action



18 Changing DPI






24 Color Grayscale B/W
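The three output modes named on this slide can be illustrated with a minimal pure-Python sketch. It is not the presenters' method: the slides do not say how the conversions were configured in their image software. The luma weights (ITU-R BT.601) and the threshold of 128 are assumptions chosen for illustration.

```python
def to_grayscale(rgb):
    """Collapse an (R, G, B) pixel to one 0-255 gray value (BT.601 luma weights)."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def to_bw(gray, threshold=128):
    """Reduce a gray value to pure black (0) or white (255); threshold is an assumption."""
    return 255 if gray >= threshold else 0


# Three sample pixels: white paper, dark ink, and a red mark such as a stamp.
page = [(255, 255, 255), (20, 20, 20), (200, 30, 30)]
gray = [to_grayscale(p) for p in page]
bw = [to_bw(g) for g in gray]

print(gray)  # [255, 20, 81]
print(bw)    # [255, 0, 0]
```

In this sample the red mark becomes indistinguishable from ink once the page is reduced to B/W, which is the kind of loss a review for retained color elements is meant to catch.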

25 PDF Compression: All items being converted are compressed.
Some formats compress better than others. Compression artifacts can also become visible.


27 Original image of page is visible
Searchable text layer is hidden


29 First Review: All pages present? All text legible?
No shadows covering text? Page in focus? Essential color elements retained?

30 PDF/A: Copy saved to Archives server; only accessible to staff



33 Final Review and Cleanup
Review metadata; correct if necessary; approve and publish; remove original camera images, processed images, and extra copies of PDF
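The cleanup step could be scripted along the following lines. This is a sketch under stated assumptions, not the presenters' actual procedure: the folder names in `WORKING_DIRS`, the one-folder-per-volume layout, and the `published` flag are all hypothetical. It defaults to a dry run so nothing is deleted until the list of targets has been checked.

```python
import shutil
from pathlib import Path

# Hypothetical per-volume working folders; the slides do not name the real ones.
WORKING_DIRS = ("camera_raw", "processed", "pdf_drafts")


def clean_volume(volume_dir, published, dry_run=True):
    """Return the working folders of one volume, deleting them only if dry_run is False.

    Nothing is touched unless the volume has been approved and published.
    """
    if not published:
        return []
    targets = [Path(volume_dir) / name for name in WORKING_DIRS]
    targets = [t for t in targets if t.is_dir()]
    if not dry_run:
        for target in targets:
            shutil.rmtree(target)
    return targets
```

Calling `clean_volume(path, published=True)` first and inspecting the returned paths before repeating the call with `dry_run=False` keeps a human check in the loop.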

34 Workflow Imaging original theses or dissertations

35 Workflow (cont.) Processing image files

36 Workflow (cont.) Converting to PDF/A

37 Workflow (cont.) Publishing on ISU IR

38 Outcomes: Volumes finished: 848; Average volume size: 96 pages
Average student time: 1.3 hours; Average supervisor time: 5-10 minutes; Average file size: 5.5 MB; Total disk space: 4.6 GB; Approximate cost: $15-18
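The averages on this slide imply project totals that are not stated directly. The sketch below derives them from the reported figures only. The function name is illustrative, and the disk-space conversion assumes 1 GB = 1024 MB, under which the result agrees with the reported 4.6 GB.

```python
# Figures reported on the Outcomes slide.
VOLUMES = 848
AVG_PAGES = 96
AVG_STUDENT_HOURS = 1.3
AVG_FILE_MB = 5.5


def project_totals(volumes=VOLUMES, pages=AVG_PAGES,
                   hours=AVG_STUDENT_HOURS, file_mb=AVG_FILE_MB):
    """Multiply the per-volume averages out to project-wide totals."""
    return {
        "pages": volumes * pages,
        "student_hours": round(volumes * hours, 1),
        "disk_gb": round(volumes * file_mb / 1024, 1),  # assumes 1 GB = 1024 MB
    }


totals = project_totals()
print(totals)  # {'pages': 81408, 'student_hours': 1102.4, 'disk_gb': 4.6}
```

Roughly 81,000 page images and about 1,100 student hours for 848 volumes give a sense of scale for the remaining volumes out of the 3,802 identified.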

39 Worth It? Centralize; improve access via digital repository,
search engines, digital repository registries, WorldCat

40 Worth It? (cont.) Support teaching, learning and research
Improve preservation strategies: multiple digital copies; backup; bitstream preservation; distributed preservation network via MetaArchive Cooperative

41 Lessons learned: Control quality: monochrome and grayscale
Supervise students; add MARC 856 field; secure continued funds
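For readers unfamiliar with the MARC 856 field (Electronic Location and Access), the sketch below formats one in the common display notation. The slides do not show the records actually used, so everything here is an assumption: the URL is a placeholder, the note text is invented, and the indicator choice (first indicator 4 for HTTP, second indicator 1 for a version of the resource, as when a print catalog record links to its digitized copy) is only one plausible option.

```python
def marc_856(url, note=None, ind1="4", ind2="1"):
    """Format a MARC 856 field as display text.

    $u carries the URI and $z an optional public note.
    The indicator defaults are assumptions, not the presenters' settings.
    """
    subfields = [("u", url)]
    if note:
        subfields.append(("z", note))
    body = "".join(f"${code}{value}" for code, value in subfields)
    return f"856 {ind1}{ind2} {body}"


# Placeholder URL; the real repository address is not given in the slides.
field = marc_856("https://repository.example.edu/etd/1234",
                 note="Full text available online")
print(field)
# 856 41 $uhttps://repository.example.edu/etd/1234$zFull text available online
```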

42 Conclusion: Complex. Various issues: funding; technical standards; quality control; format selection; in-house vs. outsourcing; metadata; delivery; preservation; rights management; workflow development


44 Contact info: Xiaocan (Lucy) Wang; Eric Holt
