One Document at a Time: Small-scale Digitization Projects Peter Brueggeman, Scripps Inst. of Oceanography Janet Webster, Hatfield Marine Science Center,

1 One Document at a Time: Small-scale Digitization Projects. Peter Brueggeman, Scripps Inst. of Oceanography; Janet Webster, Hatfield Marine Science Center, OSU; Barbara Butler, Oregon Inst. of Marine Biology, UO. 33rd Annual IAMSLIC Conference, Sarasota, Florida

2 Legacy Publication Digitization @ Scripps Peter Brueggeman

3 Past endeavors. Vendor-produced PDFs from encoded text: smallest file size; costly; time spent on vendor interaction / proofing / revisions. Utilizing ILL staff, other staff: lower-resolution scanning with routine ILL; quality issues. Do It Yourself: better results; least effort

4 Current Equipment Setup. Hewlett Packard ScanJet 7800 document scanner: dedicated sheet-feeding scanner; double-sided scanning. Plustek OpticBook 3600 Corporate flatbed book scanner: six millimeters between scan and edge; good for books with tight bindings. Adobe Acrobat: PDF optimization; OCR

5 Scan Specification. Scan from disbound, trimmed originals. If no disbound original is available, scan from a photocopy so the pages can be sheet-fed. 600 ppi black/white (1-bit) scanning for text pages. B/w scans give small file size and better text appearance (not for photos). 600 ppi scan time is OK with sheet feeding

6 300 ppi grayscale vs 600 ppi b/w @ 200%

7 300 ppi vs 600 ppi b/w @ 200%. 4 pages: 157K vs 328K

8 Scan Specification. 300 ppi grayscale scanning for halftone black-and-white photographs. 300 ppi color scanning for color photographs. PDF file size grows quickly for pages scanned in grayscale or color: the same 4-page PDF is 1,150K @ 300 ppi grayscale, versus 157K @ 300 ppi b/w or 328K @ 600 ppi b/w
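
The scan settings given on slides 5 and 8 amount to a small lookup by page type. A minimal Python sketch follows; the page-type labels, the `ScanSettings` class, and the function name are hypothetical and were chosen only for illustration, while the ppi and mode values come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanSettings:
    ppi: int
    mode: str  # "bw", "grayscale", or "color"


# Values taken from slides 5 and 8; the dictionary keys are hypothetical labels.
SETTINGS_BY_PAGE_TYPE = {
    "text": ScanSettings(600, "bw"),
    "halftone_photo": ScanSettings(300, "grayscale"),
    "color_photo": ScanSettings(300, "color"),
}


def choose_scan_settings(page_type: str) -> ScanSettings:
    """Return the scan settings the slides recommend for a page type."""
    try:
        return SETTINGS_BY_PAGE_TYPE[page_type]
    except KeyError:
        raise ValueError(f"unknown page type: {page_type!r}") from None
```

For example, `choose_scan_settings("text")` returns the 600 ppi black/white setting used for text pages.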

9 Scan Specification. For pages that are partly photograph, you may wish to paste photos scanned in grayscale / color onto black/white scanned text pages in order to save some file size while ensuring photo quality

10 600 ppi black/white scan; 300 ppi grayscale scan

11 Scan Specification. One page with a photo on part of the page: 600 ppi black/white PDF with unacceptable photo = 170K; 300 ppi grayscale PDF with less than acceptable text = 760K; 600 ppi black/white text & 300 ppi grayscale photo PDF = 1,275K; 600 ppi grayscale PDF = 1,436K
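
To make the trade-off on this slide easier to compare, the four file sizes can be expressed as multiples of the smallest one. The sketch below only does arithmetic on the figures quoted above; the dictionary labels and the function name are hypothetical.

```python
# File sizes in K for the single page with a partial-page photo (slide 11).
SIZES_KB = {
    "600 ppi b/w": 170,
    "300 ppi grayscale": 760,
    "600 ppi b/w text + 300 ppi grayscale photo": 1275,
    "600 ppi grayscale": 1436,
}


def size_ratios(sizes_kb: dict) -> dict:
    """Express each size as a multiple of the smallest, rounded to one decimal."""
    smallest = min(sizes_kb.values())
    return {name: round(kb / smallest, 1) for name, kb in sizes_kb.items()}


for name, ratio in size_ratios(SIZES_KB).items():
    print(f"{name}: {ratio}x")
```

By the same figures, the combined page is 161K smaller than the 600 ppi grayscale page (1,436K minus 1,275K).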

12 Document Production. For a yellowed/browned original, adjust the lightening setting in the scanning software to get white pages. Adobe Acrobat RECOGNIZE TEXT USING OCR is not highly accurate. Save the final PDF, then save it again via FILE > SAVE AS to reduce “document overhead”. Page through and proof the PDF

13 Document Production. Compress via PDF Optimizer if desired. Try different settings to judge results. My target upper file size is 20 megabytes. Save the original uncompressed version of the PDF
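
The 20-megabyte target can be checked with a few lines of Python before deciding whether to run the optimizer. This is only a sketch: both function names are hypothetical, and treating a megabyte as 1024 × 1024 bytes is an assumption, since the slide does not say which definition is meant.

```python
import os

# Assumption: 1 megabyte = 1024 * 1024 bytes; the slide only says "20 megabytes".
TARGET_MAX_BYTES = 20 * 1024 * 1024


def over_target(size_bytes: int, limit_bytes: int = TARGET_MAX_BYTES) -> bool:
    """Return True when a file size exceeds the target upper size."""
    return size_bytes > limit_bytes


def pdf_over_target(pdf_path: str) -> bool:
    """Hypothetical helper: check a PDF on disk against the target."""
    return over_target(os.path.getsize(pdf_path))
```

A PDF for which `pdf_over_target` returns `True` would be a candidate for compression, with the uncompressed original kept as the slide advises.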

14 Digitization Initiatives at Oregon State University Janet Webster

15 A cog in OSU digitization process. Librarian is one player: identify candidates; investigate copyright; send to the Digital Production Unit (DPU). DPU is the main dealer: sliced if possible; scanned & OCRed; rebound, tied or dumped; entered into appropriate digital collection/space. All projects/items fit into bigger collection scheme

16 How it works

17 Another twist on how it works.

18 Oregon Birds. Journal donated by a retired faculty member. Posted to the Cyamus list and was prompted to think about digitizing. Contacted the Oregon Field Ornithologists (OFO), who were interested. Generated a budget with help from my Technical Services Department chair. Now negotiating with OFO.

19 Considerations. I have access to a good digitization unit. I use it. I promote it and thank those involved. I work with others. I couldn’t do it on my own at the branch.

20 Digitization Initiatives at University of Oregon (OIMB) Barb Butler

21 The OIMB Approach. Add to Scholars’ Bank OR Oregon Explorer. Shared collection development with OSU. Long-term goal: full-text Coos Bay Bibliography (Oregon South Coast), geo-spatially referenced (Yaquina Bay Bibliography model). Primary targets in initial phase: student reports and theses; documents already in digital format

22 The OIMB Approach (in the beginning). Student assistant. Ariel software. Flatbed scanner. 100 pages per hour. Reviewed by staff. OCR by Adobe. Uploaded
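
The 100-pages-per-hour rate gives a rough way to budget student-assistant time for a document. A minimal sketch, assuming the rate covers scanning only (staff review, OCR, and uploading are not included); the function name is hypothetical.

```python
def estimate_scan_hours(pages: int, pages_per_hour: int = 100) -> float:
    """Estimate scanning time in hours at the flatbed rate quoted on the slide."""
    if pages < 0 or pages_per_hour <= 0:
        raise ValueError("pages must be >= 0 and pages_per_hour must be > 0")
    return pages / pages_per_hour
```

At this rate a 250-page thesis would take about two and a half hours on the flatbed scanner.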

23 Example 1:

24 Example 2:

25 Example 3: 1941 Printing. OCLC: 15 libraries. Z39.50 Distributed Library: AIMS, Hopkins, MBL/WHOI. Aquatic Commons: submitted 10/2007

26 The OIMB Approach (refined). Same as part two with improvements: document feeder with duplex capability (Epson GT-2500); native scanner interface or Ariel interface; also inputting into Aquatic Commons. Challenges still exist: lack of dithering option; still scanning at 300 dpi, b/w and grayscale; OCR and collating documents

