“Think. Learn. Succeed.” Ver 1.2 Methods for Knowledge Management & Digital Preservation The Theory and Practice of Digital History Carl A. Young, M.A.


1 “Think. Learn. Succeed.” Ver 1.2 Methods for Knowledge Management & Digital Preservation The Theory and Practice of Digital History Carl A. Young, M.A. in waiting 1 December 2009

2 Project Overview
Challenge
Resource and skill-constrained historians and archivists require efficient methods for capturing, analyzing, and sharing original artifacts.
Methodology
– Multi-phase project
– Develop a low-cost process for digitally archiving documents
– Store them in a standards-based data storage platform
– Set the conditions to scale with future phases
– Creating a collaborative, accessible, online digital repository
Major Phases
– Phase I: Prototyping
– Phase II: Capture
– Phase III: Web Access
– Phase IV: Initial Expansion
– Phase V: Infinite Expansion

3 Phase I: Prototype
Completed in November 2009, this phase established a usable, affordable methodology for project development by prototyping the capture and conversion of an original artifact for testing and exploration purposes.

4 Phase I: Prototype (cont.): Demonstration
– Original: digital camera, .JPG file format, 2 MB
– Treatment w/Photoshop: .TIFF, 29 MB
– Adobe conversion: .pdf, 278 KB
Time elapsed:
– Photo: <1 min
– Treatment: ~3 min
– Conversion: <1 min

6 Phase I: Prototype (cont.): Process Flowchart (figure with legend; not included in the transcript)

7 Phase II: Capture
Completed in November 2009, this phase performed and documented a low-budget workflow for document capture, artifact preservation, and conversion to a distributable format. A historic text is extracted from the original document, archived, and presented to the user in both the original capture format (.jpg or .tiff) and the distributable formats (.pdf and .xml). The phase also evaluated optical character recognition (OCR) and transcription requirements.

8 Phase II: Capture (cont.): Image Treatment
Select Area
Image
– Adjustments
– Curves ("Digitization"): Channel RGB, Output 203, Input 160
Filter
– Blur
– Smart Blur: Radius 100, Threshold 100, Quality High, Mode Normal
– Surface Blur: Radius 100, Threshold 25
– Surface Blur (if needed): Radius 100, Threshold 25
– Lens Blur: Shape Octagon, Radius 5, Blade Curve 50, Rotation 300, Brightness 10, Threshold 75, Noise 3, Distribution Uniform
Select
– Color Range: Shadows, no Invert
– Modify: Expand 2
Cut
File
– New*: Width 1600, Height 2500, Resolution 300, Color Mode RGB 16-bit
Paste
Flatten
Clean up as needed
Save As .TIFF
* Recommend saving as a preset.

9 Phase II: Capture (cont.): Image Treatment
Select Area
Image
– Adjustments
– Curves ("Digitization"): Channel RGB, Output 203, Input 160
Filter
– Blur
– Smart Blur: Radius 100, Threshold 100, Quality High, Mode Normal
– Surface Blur: Radius 100, Threshold 25
– Surface Blur: Radius 100, Threshold 25
– Lens Blur: Shape Octagon, Radius 5, Blade Curve 50, Rotation 300, Brightness 10, Threshold 75, Noise 3, Distribution Uniform
Select
– Color Range: Shadows
– No Invert
– Modify: Expand 2
Cut
New File: Width 1600, Height 2500, Resolution 300, Color Mode RGB 16-bit, CP Pix - Sq
Paste
Flatten
Delete
Clean up as needed
Save As .TIFF
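
To make the Curves step concrete, the sketch below shows what mapping Input 160 to Output 203 does to pixel values. It is only an approximation: Photoshop fits a smooth curve through its control points, whereas this sketch interpolates linearly. The fixed endpoints (0, 0) and (255, 255) and the names `curve_lut` and `apply_curve` are assumptions made for illustration and do not come from the slides.

```python
def curve_lut(points):
    """Build a 256-entry lookup table by linear interpolation between control points.

    `points` is a list of (input, output) pairs covering 0..255.
    This is a simplification: Photoshop fits a smooth curve instead.
    """
    pts = sorted(points)
    lut = []
    for x in range(256):
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= x <= x1:
                t = (x - x0) / (x1 - x0)
                lut.append(round(y0 + t * (y1 - y0)))
                break
    return lut


# Only the (160, 203) point is from the slide; the two endpoints are assumed.
LUT = curve_lut([(0, 0), (160, 203), (255, 255)])


def apply_curve(pixels, lut=LUT):
    """Map a sequence of 8-bit channel values through the lookup table."""
    return [lut[p] for p in pixels]
```

Because the control point lies above the diagonal, midtones are brightened while pure black and pure white stay fixed, which is the usual way to lift faded paper toward white without clipping the ink.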

10 Phase II: Capture (cont.): OCR and Transcription Demo
Panels: OCR, Transcription
Time elapsed:
– OCR: <1 min
– Transcription: ~5 min

11 OCR and Transcription (demo images not included in the transcript)
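
One way to put a number on the gap between raw OCR output and a corrected transcription is a character-level comparison. The sketch below uses Python's standard `difflib` module; the function names are hypothetical, and the slides do not say how the OCR evaluation was actually carried out.

```python
from difflib import SequenceMatcher


def ocr_similarity(ocr_text, reference_text):
    """Return a 0..1 similarity ratio between OCR output and a manual transcription."""
    matcher = SequenceMatcher(None, ocr_text, reference_text, autojunk=False)
    return matcher.ratio()


def differing_spans(ocr_text, reference_text):
    """List (ocr_fragment, reference_fragment) pairs wherever the two texts disagree."""
    matcher = SequenceMatcher(None, ocr_text, reference_text, autojunk=False)
    return [
        (ocr_text[i1:i2], reference_text[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

Running `differing_spans` on a page of OCR output and its corrected transcription gives the proofreader a list of suspect fragments, and the ratio gives a rough per-page measure for deciding whether OCR plus correction is faster than transcribing from scratch.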

12 Phase II: Capture (cont.): TEI Demo
Time elapsed:
– Preliminary data: ~45 min
– Page: ~5 min
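
For readers unfamiliar with TEI, the sketch below builds a minimal TEI-style XML document with Python's standard library: a header (roughly the "preliminary data") followed by page breaks and paragraphs (the per-page work). It assumes the TEI P5 namespace and the smallest header the guidelines require; the function name `build_tei` and the placeholder publication statement are hypothetical, and the slides do not show the markup actually used.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"  # TEI P5 namespace (assumed)


def build_tei(title, source_note, pages):
    """Return a minimal TEI document as a string.

    `pages` is a list of (page_number, [paragraph, ...]) tuples.
    """
    ET.register_namespace("", TEI_NS)

    def el(parent, tag, text=None, **attrs):
        node = ET.SubElement(parent, f"{{{TEI_NS}}}{tag}", attrs)
        node.text = text
        return node

    root = ET.Element(f"{{{TEI_NS}}}TEI")

    # Header: one-time metadata about the document.
    file_desc = el(el(root, "teiHeader"), "fileDesc")
    el(el(file_desc, "titleStmt"), "title", title)
    el(el(file_desc, "publicationStmt"), "p", "Unpublished prototype transcription.")
    el(el(file_desc, "sourceDesc"), "p", source_note)

    # Body: a page-break marker followed by that page's paragraphs.
    body = el(el(root, "text"), "body")
    for number, paragraphs in pages:
        el(body, "pb", n=str(number))
        for paragraph in paragraphs:
            el(body, "p", paragraph)

    return ET.tostring(root, encoding="unicode")
```

Generating the header once and appending pages in a loop mirrors the timing on the slide: a fixed up-front cost, then a small repeated cost per page.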

13 Phase II: Capture (cont.): Methodology Flow Chart (figure with legend; not included in the transcript)

14 Phase II: Capture (cont.): Labor Estimates
Militiaman's Guide: 155 pages total, type text, fair condition
40 hours (optimal) / 5 GB

Task                             Per Page Estimates          Case Estimates
Photography                      ~30 sec, 2.5 MB @ 5 Mpxl    ~1:15, ~400 MB
.tiff conversion                 ~3 min, 23 MB               ~7:45, 3.5 GB
.pdf conversion                  ~1 min, 300 KB              ~2:30, 50 MB
OCR                              ~45 sec                     ~2 hours
Error correction/transcription   ~5 min                      ~13 hrs
TEI                              ~5 min (~45 min overhead)   ~14 hrs
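
The case estimates follow from multiplying the per-page figures by the 155-page count and adding the one-time TEI overhead. The sketch below reproduces that arithmetic so it can be rerun for a document of a different length. The per-page values are taken from the slide; the dictionary and function names are illustrative, and 300 KB is treated as 0.3 MB.

```python
PAGES = 155  # page count of the Militiaman's Guide, from the slide

# task: (seconds per page, one-time overhead in seconds), from the slide
TASK_SECONDS = {
    "photography": (30, 0),
    "tiff_conversion": (3 * 60, 0),
    "pdf_conversion": (1 * 60, 0),
    "ocr": (45, 0),
    "transcription": (5 * 60, 0),
    "tei": (5 * 60, 45 * 60),
}

# Per-page file size in megabytes, from the slide (300 KB taken as 0.3 MB).
FILE_MB = {"photo": 2.5, "tiff": 23.0, "pdf": 0.3}


def labor_hours(pages=PAGES):
    """Return estimated hours per task for a document of `pages` pages."""
    return {
        task: (per_page * pages + overhead) / 3600
        for task, (per_page, overhead) in TASK_SECONDS.items()
    }


def storage_mb(pages=PAGES):
    """Return estimated storage in megabytes per file type."""
    return {kind: size * pages for kind, size in FILE_MB.items()}


if __name__ == "__main__":
    hours = labor_hours()
    for task, value in hours.items():
        print(f"{task:16s} {value:5.1f} h")
    print(f"{'total':16s} {sum(hours.values()):5.1f} h")
    print(f"{'storage':16s} {sum(storage_mb().values()) / 1000:5.1f} GB")
```

With these inputs the labor total comes to roughly 40 hours, in line with the slide's "40 hours (optimal)", and the three image formats sum to about 4 GB against the slide's rounded 5 GB.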

15 Equipment Baseline
Consumer-grade HP 5 Mpxl digital camera ($125)
Slightly above consumer-grade PC ($1100)
– 4 GB RAM
– 1 GB VRAM
– 500 GB SATA HD
– Dual screens
Consumer software ($600)
– Adobe Creative Suite 3

16 Lessons Learned
– Use a tripod/mount
– Use consistent lighting
– Safely flatten pages as much as possible
– Use a mounting frame
– Use the highest resolution available
– OCR is NOT reliable
– Need an efficient method for TEI

17 Phase III: Web-Access
This phase is the subject of this grant funding request. A team of professional developers will construct a suitable multi-media database for storage and access of original artifact captures, distributable .pdf versions, and XML-based data and metadata derived from the original. The team will also develop a working prototype web site to access the data. Fundamental to this phase will be data archiving and disaster recovery for the data. Successful conclusion of this phase will yield a working version 1.0 available for release and continued development.
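
As a rough illustration of what such a repository might track, the sketch below registers each file belonging to an artifact together with a SHA-256 checksum, so that a restored backup can be verified during disaster recovery testing. Everything here is hypothetical: the slides do not name a database engine or schema, and SQLite is used only because it ships with Python.

```python
import hashlib
import sqlite3

# Hypothetical schema: one row per artifact, one row per stored file.
SCHEMA = """
CREATE TABLE artifact (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE capture (
    id          INTEGER PRIMARY KEY,
    artifact_id INTEGER NOT NULL REFERENCES artifact(id),
    page        INTEGER NOT NULL,
    kind        TEXT NOT NULL CHECK (kind IN ('jpg', 'tiff', 'pdf', 'xml')),
    path        TEXT NOT NULL UNIQUE,
    sha256      TEXT NOT NULL
);
"""


def open_repository(db_path=":memory:"):
    """Create the tables and return a connection."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def register_file(conn, artifact_id, page, kind, path, data):
    """Record a file and its checksum; return the checksum."""
    digest = hashlib.sha256(data).hexdigest()
    conn.execute(
        "INSERT INTO capture (artifact_id, page, kind, path, sha256) "
        "VALUES (?, ?, ?, ?, ?)",
        (artifact_id, page, kind, path, digest),
    )
    return digest


def verify_file(conn, path, data):
    """Return True if `data` matches the checksum recorded for `path`."""
    row = conn.execute(
        "SELECT sha256 FROM capture WHERE path = ?", (path,)
    ).fetchone()
    return row is not None and row[0] == hashlib.sha256(data).hexdigest()
```

Keeping the checksum next to the file path means a recovery drill reduces to re-reading each file and calling `verify_file`, which flags both missing and silently corrupted captures.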

18 Phase III: Web-Access (cont.): Flow Chart (figure not included in the transcript)

19 Phase III: Web-Access (cont.): Work Breakdown Structure
Database Development Prototype Evaluation Prototype Web Development Alpha Test & Mod Beta Test & Mod RC1 Test & Mod v1.0 Documentation Disaster Recovery Testing
Estimated Cost: $52,000

20 Phase III: Web-Access (cont.): Project Gantt Chart (figure not included in the transcript)

21 Phase IV: Initial Expansion
Beyond the scope of this grant request, this phase seeks to develop partnerships and data shares across multiple institutions with similar projects in development or production. The level of participation directly influences the scale of this phase. It is anticipated that the minimal costs will be shared across participating institutions.

22 Phase IV: Initial Expansion (cont.): Work Breakdown Structure
Conduct Lifecycle Management Review Documentation Disaster Recovery Testing Publish Methodology Find Partners Large Scale Capture Leverage v1.0 Update Code and Processes
Estimated Cost: $8,000

23 Phase V: Infinite Expansion
Optionally, and depending on the success of the earlier phases, this phase will greatly expand collaborative efforts by potentially making this capability available to amateur and resource-constrained archivists and historians. It would provide a standards-based methodology and data capture technique, along with a collaborative platform for sharing the data once stored. This aspect of the final phase will be limited only by technology maintenance and scalability costs.

24 Phase V: Infinite Expansion (cont.): Work Breakdown Structure
Publish Updated Methodology Publish Membership Schema Open Data Models Leverage Current Version Conduct Lifecycle Management Review Documentation Disaster Recovery Testing Release New Version(s)
Estimated Cost: $82,000

25 Summary
Project Summary
5-Phase Approach
"How-To"
– Digitization
– TEI
– Manage the project
Sets the stage
– Broad/ambitious goals and plan
– Manageable pieces
Grant Request / Funding Summary
Phase III support:
– $51,733.33
– Prototype Validation
– Database Development
– Web Development
– Hosting
– Disaster Recovery
Phase IV and V templates
– Future expansion as desired
– Flexible Planning

26 QUESTIONS

27 CONCLUSION

28 Dead Guy Quote
"Man had always assumed that he was more intelligent than dolphins because he had achieved so much... the wheel, New York, wars, and so on, whilst all the dolphins had ever done was muck about in the water having a good time. But conversely the dolphins believed themselves to be more intelligent than man for precisely the same reasons." - Douglas Adams

29 BACKUP

30 Phase I: Prototype (cont.): Work Breakdown Structure
Image Capture Image Preservation Image Manipulation Database Development TEI Process Development Data Development Static Web-Page Prototyping Documentation Disaster Recovery Testing
Estimated Cost: $5,000

31 Phase I: Prototype (cont.): Gantt Chart (figure not included in the transcript)

32 Phase II: Capture (cont.): Work Breakdown Structure
Image Capture TEI Prototype Database Input Documentation Disaster Recovery Testing
Estimated Cost: $2,000

33 Phase II: Capture (cont.): Gantt Chart (figure not included in the transcript)

