Metadata Tools for JISC Digitisation Projects of still images and text
Ed Fay, BOPCRIS, Hartley Library, University of Southampton

1. Metadata Tools for JISC Digitisation Projects of still images and text
Ed Fay
BOPCRIS, Hartley Library
University of Southampton

2. Overview: BOPCRIS today
- Move to work natively with standards
  - Interoperability
  - Preservation
- Design project procedures from ground up with metadata in mind
  - File-naming and directory structuring
  - Metadata capture processes
- Production workflow that automates where possible
  - Minimize possibility for human error / subjectivity
- Final package of digital object that records preservation information on the digital shelf and aims for maximum interoperability between systems, all in one place

3. Overview: technical details
- File-naming / directory structure
  - Incorporating project-specific unique IDs
- Final package (digital object)
  - Internally consistent tarball [*.TAR]
  - Relative path-naming conventions
  - METS wrapper
  - Extension formats for metadata: descriptive (MODS); technical (MIX); process (PREMIS)
- Production workflow
  - Automated production of final package
  - Metadata recording
    - Dynamic input by scanner operators
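The "internally consistent tarball" with relative path names can be illustrated with a minimal Python sketch. It is not the BOPCRIS production code: the function name `build_package` and the directory layout it expects (one directory per digital object) are assumptions made for illustration. Only the standard-library `tarfile` and `pathlib` modules are used.

```python
import tarfile
from pathlib import Path


def build_package(object_dir: Path, tar_path: Path) -> list[str]:
    """Pack one digital object directory into an uncompressed TAR.

    Member names are stored relative to the parent of object_dir, so the
    archive unpacks to the same structure wherever it is extracted.
    (Hypothetical helper; the layout is an assumption, not the project's.)
    """
    root = object_dir.parent
    members = []
    with tarfile.open(tar_path, "w") as tar:
        for path in sorted(object_dir.rglob("*")):
            if path.is_file():
                # Relative, forward-slash member name: no absolute paths in the archive.
                arcname = path.relative_to(root).as_posix()
                tar.add(path, arcname=arcname)
                members.append(arcname)
    return members
```

Because every member name is relative, a METS file inside the package can point at its sibling files by relative path and remain valid after the package is moved or unpacked elsewhere.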

4. History
- Eighteenth Century Parliamentary Papers Project under Phase 1 of JISC Digitization Programme
- Proprietary system and data formats (Agora)
- Manual input of metadata
  - Descriptive and Structural
- Advantages and Disadvantages

5. History: Advantages
- Proprietary system with advanced functionality:
  - OCR workflow
  - Web presentation
- Highly customizable
  - Metadata fields specified and modified at will

6. History: Disadvantages
- Non-standard metadata fields
  - No mapping to standard formats, causing difficulties with interoperability and metadata harvesting
- Translation
  - Translating between systems, or between use and archive formats, introduces the possibility of versioning issues
- No scope for preservation metadata
  - Separation between workflow / presentation system and preservation strategy
  - Resulted in a disparate collection of scripts and tools to manage data

7. Present: Metadata Standards
- Bibliographic database export
- File-system level
  - Directory structure
  - File-naming conventions
- Scanning level
  - TIFF headers
  - Additional descriptive metadata
- METS profile
  - Tailored to project needs
  - Extension formats (MODS, MIX, PREMIS)
  - Checksums (MD5)
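As an illustration of the MD5 checksums mentioned on this slide, the sketch below computes a file's MD5 digest with Python's standard `hashlib` module. The function name `md5_of_file` and the chunk size are illustrative choices, not part of the project's tooling.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks.

    Chunked reading keeps memory use flat for large TIFF masters.
    """
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The resulting hex string is the kind of value that can be recorded against each file in a METS document and recomputed later to detect corruption.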

8. Present: Metadata Origins
Diagram; labels in extraction order:
- Scanned Images
- TIFF headers
- METS
- OCR (Agora / ABBYY)
- MIX (Z39.87)
- File-naming
- Directory structure (TAR)
- Other metadata
- Process
- Additional descriptive
- PREMIS
- Bibliographic Metadata
- MARC21 / MODS / etc.
- File formats
- TIFF master / Derived JPEG
- Flat text (TXT) & Word-co-ordinated OCR
- Custom dmdSec
- PRECURSORS
- GENERATED
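File-naming appears on this slide as one origin of metadata. A minimal sketch of that idea follows, assuming a purely hypothetical convention of `<project>_<volume>_<page>.tif`; the slides do not state the actual BOPCRIS naming scheme, so the pattern, the `PageId` type, and `parse_page_name` are all illustrative.

```python
import re
from typing import NamedTuple

# Hypothetical convention (an assumption, not the project's real scheme):
# <project>_<4-digit volume>_<4-digit page>.tif
NAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_(?P<volume>\d{4})_(?P<page>\d{4})\.tif$"
)


class PageId(NamedTuple):
    project: str
    volume: int
    page: int


def parse_page_name(filename: str) -> PageId:
    """Derive identifiers from a file name, rejecting names that break the convention."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"file name does not follow the convention: {filename!r}")
    return PageId(match["project"], int(match["volume"]), int(match["page"]))
```

Rejecting non-conforming names at this point is one way to reduce the scope for human error before any metadata is generated from them.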

9. Future
- One tool for entire process, from scanned images to METS
- Tool would:
  - Extract technical metadata
  - Include descriptive metadata
  - Build flat-structure METS
- Tool would require:
  - File-naming, directory-structuring conventions
  - Image file sources
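A rough idea of what "build flat-structure METS" could look like is sketched below with Python's standard `xml.etree.ElementTree`. This is not the proposed tool: the function name `build_flat_mets`, the `USE="master"` group, the `FILE0001`-style IDs, and the assumption that the images are `*.tif` files directly inside one directory are all illustrative. The output is not validated against the METS schema or any project profile, and the MODS, MIX, and PREMIS sections are omitted.

```python
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def _tag(name: str) -> str:
    """Qualify an element name with the METS namespace."""
    return f"{{{METS_NS}}}{name}"


def build_flat_mets(object_dir: Path) -> ET.Element:
    """Build a METS skeleton with one file entry and one page div per TIFF."""
    mets = ET.Element(_tag("mets"))
    file_sec = ET.SubElement(mets, _tag("fileSec"))
    file_grp = ET.SubElement(file_sec, _tag("fileGrp"), {"USE": "master"})
    struct_map = ET.SubElement(mets, _tag("structMap"), {"TYPE": "physical"})
    root_div = ET.SubElement(struct_map, _tag("div"))

    for order, path in enumerate(sorted(object_dir.glob("*.tif")), start=1):
        file_id = f"FILE{order:04d}"  # hypothetical ID scheme
        # read_bytes() loads the whole file; acceptable for a sketch only.
        checksum = hashlib.md5(path.read_bytes()).hexdigest()
        file_el = ET.SubElement(file_grp, _tag("file"), {
            "ID": file_id,
            "MIMETYPE": "image/tiff",
            "CHECKSUM": checksum,
            "CHECKSUMTYPE": "MD5",
        })
        # Relative href, so the document stays valid inside a relocatable package.
        ET.SubElement(file_el, _tag("FLocat"), {
            "LOCTYPE": "URL",
            f"{{{XLINK_NS}}}href": path.relative_to(object_dir).as_posix(),
        })
        # Flat structure: a single level of page divs under the root div.
        page_div = ET.SubElement(root_div, _tag("div"), {"ORDER": str(order)})
        ET.SubElement(page_div, _tag("fptr"), {"FILEID": file_id})
    return mets
```

Calling `ET.tostring(build_flat_mets(some_dir), encoding="unicode")` would serialize the result. The only inputs are a directory and a file-naming pattern, which matches the slide's stated requirements of naming conventions and image file sources.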

10. Future: Advantages
- Abstraction = standardization
- All digitization projects will produce metadata in similar formats, supporting interoperability
- Certain technical base-standards will be present, supporting preservation
- Any centrally developed preservation or presentation systems would be able to ingest output from any project
- Saves wasted effort developing similar solutions many times, when one solution can be developed once and adapted

11. Future: Questions…
- Usefulness of such a tool?
- Relevance to your project?
- Problems / obstacles?
- How much flexibility is necessary?
  - Manual input / editing?
- Main points: Abstraction, functionality, flexibility

12. Further information
Ed Fay, Software Developer
BOPCRIS, Hartley Library
University of Southampton
ef1@soton.ac.uk
023 8059 3575

