Putting together a METS profile
Questions to ask when setting down the METS path
– Should you design your own profile?
– Should you use someone else's off...
What's in a METS profile?
– URI
– Short Title
– Abstract
– Creation Date
– Contact Information
– Related Profiles
– Extension Schema
– Rules of Description
– Controlled Vocabularies
– Structural Requirements
– Technical Requirements of Content, Behavior and Metadata Files
– Tools and Applications
– Appendix: Example Document
Currently registered profiles
– Oxford Digital Library METS Profile
– UCB Imaged Object Profile
– UCB Paged Text Object Profile
– Model Imaged Object Profile
– Model Paged Text Object Profile
– The University of Waikato Digital Library Group - Greenstone Project METS Profile [Draft]
– Library of Congress METS Profile for Audio Compact Discs
Putting together your own profile
– Descriptive metadata
– Administrative metadata
– File section
– Structural map
Descriptive metadata
– Embed within the METS file, or hold externally and reference from it?
– One metadata section or several?
– Which schemes?
– Which content rules to follow (AACR2, ISAD(G), etc.)?
Embed or reference?
Referencing
– Allows metadata that is not in XML to be used (as a last resort)
– Allows metadata files to be distributed and held anywhere (including different repositories)
– Means that when metadata is updated, only the referenced file is changed, not the METS file
Embedding
– Requires metadata to be in XML
– Keeps everything in one place for easier archiving (OAIS)
– Prevents dead links
– Allows easier processing
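The two options above can be sketched with a small METS fragment read by Python's standard library. This is only an illustration under assumptions: the IDs, the title and the example.org URL are invented, and the fragment is not a complete, schema-valid METS document. Embedded metadata sits in an mdWrap/xmlData wrapper, while referenced metadata is pointed at with an mdRef.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical record: identifiers, title and URL are invented for illustration.
DOC = f"""
<mets:mets xmlns:mets="{METS}" xmlns:xlink="{XLINK}" xmlns:dc="{DC}">
  <!-- Embedded: the Dublin Core record travels inside the METS file. -->
  <mets:dmdSec ID="DMD1">
    <mets:mdWrap MDTYPE="DC">
      <mets:xmlData>
        <dc:title>Example title</dc:title>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
  <!-- Referenced: only a pointer is stored; the MARC record lives elsewhere. -->
  <mets:dmdSec ID="DMD2">
    <mets:mdRef LOCTYPE="URL" MDTYPE="MARC"
                xlink:href="http://example.org/records/1234.mrc"/>
  </mets:dmdSec>
</mets:mets>
""".strip()


def describe_sections(xml_text):
    """Return (ID, 'embedded' or 'referenced', MDTYPE) for each dmdSec."""
    root = ET.fromstring(xml_text)
    found = []
    for sec in root.findall(f"{{{METS}}}dmdSec"):
        wrap = sec.find(f"{{{METS}}}mdWrap")
        ref = sec.find(f"{{{METS}}}mdRef")
        if wrap is not None:
            found.append((sec.get("ID"), "embedded", wrap.get("MDTYPE")))
        elif ref is not None:
            found.append((sec.get("ID"), "referenced", ref.get("MDTYPE")))
    return found


sections = describe_sections(DOC)
```

Note that the referenced record here is binary MARC, which could not be embedded as XML; this is the "last resort" case mentioned above.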
One metadata section?
Multiple sections are allowed in a METS file
Possible uses of multiple sections:
– Multilingual objects, with descriptions in each language in separate sections
– Different schemes revealing different facets of the object (iconography, intellectual content, etc.)
– A simple main description and more detailed supplementary descriptions
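As a rough sketch of the last case (a simple main description plus a more detailed one), the fragment below has two descriptive sections cited from a single structural division. The IDs and label are hypothetical, and the metadata bodies are left empty, so this is a skeleton rather than a valid METS file.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"

# Hypothetical skeleton: IDs and labels are invented; metadata bodies are omitted.
DOC = f"""
<mets:mets xmlns:mets="{METS}">
  <mets:dmdSec ID="DMD_MAIN">
    <mets:mdWrap MDTYPE="DC"><mets:xmlData/></mets:mdWrap>
  </mets:dmdSec>
  <mets:dmdSec ID="DMD_DETAIL">
    <mets:mdWrap MDTYPE="MODS"><mets:xmlData/></mets:mdWrap>
  </mets:dmdSec>
  <mets:structMap>
    <!-- DMDID is a space-separated list, so one div can cite several sections. -->
    <mets:div LABEL="Whole object" DMDID="DMD_MAIN DMD_DETAIL"/>
  </mets:structMap>
</mets:mets>
""".strip()


def schemes_for_div(xml_text, label):
    """List the metadata schemes (MDTYPE) of every section a div cites."""
    root = ET.fromstring(xml_text)
    by_id = {sec.get("ID"): sec for sec in root.findall(f"{{{METS}}}dmdSec")}
    div = root.find(f".//{{{METS}}}div[@LABEL='{label}']")
    schemes = []
    for dmd_id in div.get("DMDID", "").split():
        wrap = by_id[dmd_id].find(f"{{{METS}}}mdWrap")
        schemes.append(wrap.get("MDTYPE"))
    return schemes


schemes = schemes_for_div(DOC, "Whole object")
```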
Which schemes to use?
If possible, use schemes recommended by the METS Editorial Board (METS Extenders)
– Dublin Core
– MARC 21 XML Schema (MARCXML)
– Metadata Object Description Schema (MODS)
Dublin Core
– 15 basic fields: Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights
– Can be qualified
– A set of suggested qualifiers is published by DC
Problems:
– Unqualified DC is too vague for detailed descriptions
– Qualifying DC reduces its interoperability
MARCXML
– A translation of MARC into XML, defined by an XML schema
– Can move losslessly from MARC to MARCXML and vice versa
MODS
– Metadata Object Description Schema
– A subset of MARC intended particularly for digital items
– Richer than unqualified Dublin Core, but more interoperable than qualified Dublin Core
– Easier for non-librarians than MARCXML
– Generally seen as a good compromise solution for digital objects
Content rules
To ensure interoperability, metadata content should be controlled if possible
Some possibilities:
– AACR2, particularly if the collection digitizes library materials (allowing compatibility with the OPAC)
– LCNAF for name authorities
– ISAD(G) for archival materials
– National Council on Archives rules for name authorities?
Administrative metadata
Most of the same considerations apply to administrative as to descriptive metadata
– Embed within the METS file, or hold externally and reference from it?
– One metadata section or several?
– Which schemes?
Schemas for administrative metadata
– Still images: MIX (NISO Technical Metadata for Digital Still Images Standards Committee)
– Text: Schema for Technical Metadata for Text
– Video: VIDEOMD (Video Technical Metadata Extension Schema)
What files will you include in your file section, and how will they be arranged?
Archival images
– Uncompressed TIFFs (colour or greyscale)
– Group IV compressed TIFFs (bitonal)
– Held on archival file server
Deliverable images
– JPEGs or GIFs
– Possibly more than one, to allow viewing at differing resolutions
Thumbnails
– JPEGs or GIFs
How will you arrange your structural map?
Probably no internal structure if each METS file contains metadata for a single image only
Possibly treat the METS file as a holder for a collection of images
– Group into categories?
– Work out a logical sequence
The file inventory
Which files to include, and in what format?
– Image files: archival format (TIFF), delivery format (JPEG), thumbnails (JPEG)
– Text: XML-marked-up text (preferably in TEI); Word files etc.?
– AV materials: video files (MPEG, MOV, WMV), sound files (WAV, MP3?)
The file inventory
Embed or reference?
– Content may be embedded within the METS file (as XML or Base64-encoded data)
– Embedding allows all data and metadata to be held together for archival purposes, but files can be huge!
– Embedding is feasible with text, probably best avoided with image, sound, or video!
How to organise them?
– Group by referent?
– Or by file type?
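To make the size trade-off concrete, here is a minimal sketch that builds one referenced and one embedded file entry. The helper names, file IDs and URL are hypothetical, and a 3,072-byte payload stands in for a (much larger) real image. Base64 turns every 3 bytes into 4 characters, so embedded binary content grows by about a third before any XML overhead.

```python
import base64
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)


def referenced_file(file_id, mimetype, url):
    """Build a file entry that only points at the content (FLocat)."""
    f = ET.Element(f"{{{METS}}}file", {"ID": file_id, "MIMETYPE": mimetype})
    ET.SubElement(f, f"{{{METS}}}FLocat",
                  {"LOCTYPE": "URL", f"{{{XLINK}}}href": url})
    return f


def embedded_file(file_id, mimetype, payload):
    """Build a file entry that carries the content as Base64 (FContent/binData)."""
    f = ET.Element(f"{{{METS}}}file", {"ID": file_id, "MIMETYPE": mimetype})
    content = ET.SubElement(f, f"{{{METS}}}FContent")
    bin_data = ET.SubElement(content, f"{{{METS}}}binData")
    bin_data.text = base64.b64encode(payload).decode("ascii")
    return f


# Invented stand-in data: 3,072 bytes instead of a real TIFF.
payload = bytes(range(256)) * 12
ref = referenced_file("F1", "image/tiff", "http://example.org/archive/0001.tif")
emb = embedded_file("F2", "image/tiff", payload)

ref_size = len(ET.tostring(ref))   # stays small whatever the image size
emb_size = len(ET.tostring(emb))   # grows with the payload
encoded = emb.find(f"{{{METS}}}FContent/{{{METS}}}binData").text
```

The referenced entry stays the same size however large the image is, which is why embedding is usually reserved for text.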
Grouping by referent
– Each <fileGrp> element contains the files for a given unit (page of a book, slide, section of video)
– Point at the <fileGrp> element from the <div> within the structural map corresponding to this unit
– Use the GROUPID attribute to differentiate between the types of file
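A minimal sketch of grouping by referent, following the slide's convention that GROUPID marks the file type. All IDs and GROUPID values are hypothetical, file locations are omitted, and pointing an fptr at a file group rather than at a single file is an assumption taken from the description above, so check it against your own profile before relying on it.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
NS = {"mets": METS}

# Hypothetical two-page object: one file group per page.
DOC = f"""
<mets:mets xmlns:mets="{METS}">
  <mets:fileSec>
    <mets:fileGrp ID="PAGE1">
      <mets:file ID="P1_TIFF" GROUPID="archive" MIMETYPE="image/tiff"/>
      <mets:file ID="P1_JPEG" GROUPID="delivery" MIMETYPE="image/jpeg"/>
    </mets:fileGrp>
    <mets:fileGrp ID="PAGE2">
      <mets:file ID="P2_TIFF" GROUPID="archive" MIMETYPE="image/tiff"/>
      <mets:file ID="P2_JPEG" GROUPID="delivery" MIMETYPE="image/jpeg"/>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap>
    <mets:div TYPE="book">
      <mets:div TYPE="page" ORDER="1"><mets:fptr FILEID="PAGE1"/></mets:div>
      <mets:div TYPE="page" ORDER="2"><mets:fptr FILEID="PAGE2"/></mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>
""".strip()


def files_for_page(xml_text, order, kind):
    """Follow a page div to its file group, then filter by GROUPID (file type)."""
    root = ET.fromstring(xml_text)
    groups = {g.get("ID"): g for g in root.iterfind(".//mets:fileGrp", NS)}
    div = root.find(f".//mets:div[@ORDER='{order}']", NS)
    group = groups[div.find("mets:fptr", NS).get("FILEID")]
    return [f.get("ID") for f in group.findall("mets:file", NS)
            if f.get("GROUPID") == kind]


delivery_for_page_2 = files_for_page(DOC, 2, "delivery")
```

Each page needs only one pointer in the structural map; the choice between archival and delivery copies is made inside the group.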
Grouping by file type
All files of the same type are listed under the same <fileGrp>, e.g.
– All archival images
– All delivery images
– All thumbnails
The GROUPID attribute is used to indicate the referent (e.g. page) of each image
Each file is referenced separately in the structural map
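The alternative layout can be sketched the same way. Here the file groups hold one file type each, GROUPID names the page, and every file gets its own pointer from the page's division. The IDs and GROUPID values are again hypothetical and file locations are omitted.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
NS = {"mets": METS}

# Hypothetical two-page object: one file group per file type.
DOC = f"""
<mets:mets xmlns:mets="{METS}">
  <mets:fileSec>
    <mets:fileGrp ID="ARCHIVE">
      <mets:file ID="A1" GROUPID="page1" MIMETYPE="image/tiff"/>
      <mets:file ID="A2" GROUPID="page2" MIMETYPE="image/tiff"/>
    </mets:fileGrp>
    <mets:fileGrp ID="DELIVERY">
      <mets:file ID="D1" GROUPID="page1" MIMETYPE="image/jpeg"/>
      <mets:file ID="D2" GROUPID="page2" MIMETYPE="image/jpeg"/>
    </mets:fileGrp>
    <mets:fileGrp ID="THUMBS">
      <mets:file ID="T1" GROUPID="page1" MIMETYPE="image/jpeg"/>
      <mets:file ID="T2" GROUPID="page2" MIMETYPE="image/jpeg"/>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap>
    <mets:div TYPE="book">
      <mets:div TYPE="page" ORDER="1">
        <mets:fptr FILEID="A1"/><mets:fptr FILEID="D1"/><mets:fptr FILEID="T1"/>
      </mets:div>
      <mets:div TYPE="page" ORDER="2">
        <mets:fptr FILEID="A2"/><mets:fptr FILEID="D2"/><mets:fptr FILEID="T2"/>
      </mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>
""".strip()


def files_by_referent(xml_text):
    """Collect file IDs under the GROUPID (page) they belong to."""
    root = ET.fromstring(xml_text)
    pages = {}
    for f in root.iterfind(".//mets:file", NS):
        pages.setdefault(f.get("GROUPID"), []).append(f.get("ID"))
    return pages


def files_for_page(xml_text, order):
    """Map file type (fileGrp ID) to file ID for one page div."""
    root = ET.fromstring(xml_text)
    type_of = {f.get("ID"): grp.get("ID")
               for grp in root.iterfind(".//mets:fileGrp", NS)
               for f in grp.findall("mets:file", NS)}
    div = root.find(f".//mets:div[@ORDER='{order}']", NS)
    return {type_of[p.get("FILEID")]: p.get("FILEID")
            for p in div.findall("mets:fptr", NS)}


by_page = files_by_referent(DOC)
page_1 = files_for_page(DOC, 1)
```

Compared with grouping by referent, the structural map carries more pointers, but all files of one type can be found by reading a single group.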
Organising the structural map
Need to work out how users will want to browse through the item, and design the structure accordingly
– Images: should these be put into a sequence or collated into collections?
– Book -> chapters -> sub-chapters -> page
– Video -> sections -> segments -> timecodes
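A browsing hierarchy such as the book example above maps directly onto nested divisions. The sketch below uses an invented three-page book (labels and TYPE values are assumptions) and walks the nesting to produce the outline a user would browse.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"

# Hypothetical logical structure for a small book.
DOC = f"""
<mets:structMap xmlns:mets="{METS}" TYPE="logical">
  <mets:div TYPE="book" LABEL="Example book">
    <mets:div TYPE="chapter" LABEL="Chapter 1">
      <mets:div TYPE="page" LABEL="Page 1"/>
      <mets:div TYPE="page" LABEL="Page 2"/>
    </mets:div>
    <mets:div TYPE="chapter" LABEL="Chapter 2">
      <mets:div TYPE="page" LABEL="Page 3"/>
    </mets:div>
  </mets:div>
</mets:structMap>
""".strip()


def outline(node, depth=0):
    """Return one indented line per div, depth-first, in document order."""
    lines = []
    for div in node.findall(f"{{{METS}}}div"):
        lines.append("  " * depth + f"{div.get('TYPE')}: {div.get('LABEL')}")
        lines.extend(outline(div, depth + 1))
    return lines


lines = outline(ET.fromstring(DOC))
```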
One structural map or many?
Do you need separate hierarchies?
– e.g. physical vs logical hierarchies
Usually one is sufficient if hierarchies nest neatly
If more than one hierarchy is used, how are they linked together?