Adventures in Digital Asset Management: Fedora at the National Library of Wales
Glen Robson
National Library of Wales

Contents
The National Library of Wales
Why the NLW chose Fedora
The pilot
Theoretical look into preservation
Data Models

The National Library of Wales
Nature of NLW
Collecting
– Variety of data types and formats
Preserving
– Obsolescence
– Lack of context information
– Persistent identifiers
– Integration
Access
– Open collections

Why we chose Fedora
Comparison with DSpace
Fundamental issues
– Suitability for wide range of data types
– Suitability for distribution of data types
– Support for collection structures
– Scalability
– 'Future-proof' architecture

The Pilot
Understand the Fedora system
Experiment with different data types
Allow access to digital assets
Investigate workflows for moving digital material into the repository

Examples
Ingested digitised images
E-Thesis
Ingested web pages
Born-digital object
Basic authentication and rights management

E-Thesis
Abstract Word Document
Thesis
– Original, PDF, Text, HTML and TIFF page images
Video Composition
– Original, DivX and Web Viewer

Web Pages
Complex Digital Objects
– Arrive in a compressed file
Dissemination 1: Uncompress tar and serve
– Simple
– Difficult to migrate formats
Dissemination 2: Extracted and ingested into Fedora
– More complicated
– Can do format migration without breaking links
– HTML converted to XHTML
– Metadata can be assigned to each page, image or movie
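
The "uncompress tar and serve" approach of Dissemination 1 can be sketched with Python's standard `tarfile` module. This is only an illustration, not the pilot's code: the archive contents, the `site/...` paths and both function names are assumptions made up for the example.

```python
import io
import tarfile


def build_demo_archive() -> bytes:
    """Create a small in-memory .tar.gz standing in for a harvested website."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, body in {
            "site/index.html": b"<html><body>Home</body></html>",
            "site/images/logo.gif": b"GIF89a",
        }.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(body)
            archive.addfile(info, io.BytesIO(body))
    return buffer.getvalue()


def serve_from_archive(archive_bytes: bytes, path: str) -> bytes:
    """Return the bytes of one file inside the archive, as a web server would."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as archive:
        try:
            member = archive.getmember(path)
        except KeyError:
            raise FileNotFoundError(path) from None
        extracted = archive.extractfile(member)
        if extracted is None:  # directories and special files have no content
            raise FileNotFoundError(path)
        return extracted.read()


ARCHIVE = build_demo_archive()
print(serve_from_archive(ARCHIVE, "site/index.html").decode())
```

The sketch also shows why this route is hard to migrate: every file stays locked inside the archive, so changing one format means unpacking and repacking the whole object.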

Digitised Images
Problems:
– Obsolete formats
– Loss of context information
– Persistent identifiers and URLs
– Integration
– Access

Digital Images – Obsolete Formats
What if we move from JPEG to JPEG 2000?
– Website would have to be updated
– All links would break:
– Special viewer?
Fedora's Solution
– Find all images
– Add a disseminator to convert JPEG files to JPEG 2000
– Links not file specific:
– Record conversion in metadata
– History automatically saved
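
The idea that links are not file specific can be illustrated with a small in-memory model. It is a sketch under stated assumptions, not Fedora's disseminator mechanism: the `ImageObject` class, the `demo:1` identifier and the placeholder `convert_jpeg_to_jp2` function are all hypothetical, and the conversion is simulated rather than done with a real image library.

```python
from dataclasses import dataclass, field


@dataclass
class ImageObject:
    """One digital object: a stable ID plus the file currently behind it."""
    pid: str
    mime_type: str
    content: bytes
    history: list = field(default_factory=list)  # earlier (mime_type, content) pairs


def convert_jpeg_to_jp2(data: bytes) -> bytes:
    """Placeholder conversion; a real system would call an image library here."""
    return b"JP2:" + data


def migrate(repo: dict, old_type: str, new_type: str, convert) -> int:
    """Convert every object of old_type, keeping the previous version in history."""
    migrated = 0
    for obj in repo.values():
        if obj.mime_type == old_type:
            obj.history.append((obj.mime_type, obj.content))
            obj.content = convert(obj.content)
            obj.mime_type = new_type
            migrated += 1
    return migrated


def resolve(repo: dict, pid: str):
    """Links name only the PID, so they keep working after a migration."""
    obj = repo[pid]
    return obj.mime_type, obj.content


repo = {"demo:1": ImageObject("demo:1", "image/jpeg", b"page-one")}
link = "demo:1"
before = resolve(repo, link)
count = migrate(repo, "image/jpeg", "image/jp2", convert_jpeg_to_jp2)
after = resolve(repo, link)
print(before[0], "->", after[0], "| migrated:", count)
```

The same `link` value resolves before and after the migration, and the earlier version stays available in `history`.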

Digital Images – Context Information
Fedora's Solution
– METS document in object as datastream
– Version history so changes saved
– Can store any type of metadata:
  – METS Rights
  – PREMIS preservation metadata
– Could even store the intro page located on the Digital Mirror
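
A versioned metadata datastream can be pictured as an append-only list of revisions. The class below is a hypothetical illustration of that idea only; its name and methods are assumptions and do not correspond to Fedora's API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VersionedDatastream:
    """A labelled datastream that never overwrites an earlier revision."""
    label: str
    versions: List[bytes] = field(default_factory=list)

    def update(self, content: bytes) -> int:
        """Store a new revision and return its version index."""
        self.versions.append(content)
        return len(self.versions) - 1

    def current(self) -> bytes:
        """Return the latest revision (raises IndexError if nothing is stored)."""
        return self.versions[-1]

    def at(self, version: int) -> bytes:
        """Return the revision stored under a given version index."""
        return self.versions[version]


mets = VersionedDatastream("METS")
first = mets.update(b"<mets>draft</mets>")
second = mets.update(b"<mets>corrected</mets>")
print(mets.current(), mets.at(first))
```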

Digital Images – Persistent Identifiers
Fedora's Solution
– Data-type-independent URLs
– Fedora PID constant even through upgrades
– Can add any identifiers using Fedora relationships
– URLs link to servlets for redirection
GetMedium Servlet 1
– Find PID llgctest1:189
– Get mid-sized image datastream
– Convert to JPEG 2000
– Return image
GetMedium Servlet 2
– Find PID llgctest1:189
– Resize large image datastream
– Convert to JPEG 2000
– Return image
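
The two GetMedium variants differ only in how they obtain the mid-sized image, so the caller never needs to know which one is in use. The following sketch models that with plain functions. The in-memory repository, the `mid` and `large` datastream labels, and the stand-in `to_jpeg2000` and `resize_to_medium` functions are assumptions for illustration; no real image processing or servlet code is involved.

```python
from typing import Callable, Dict

# Hypothetical in-memory repository: PID -> {datastream label: bytes}.
Repository = Dict[str, Dict[str, bytes]]


def to_jpeg2000(data: bytes) -> bytes:
    """Stand-in for a real JPEG 2000 encoder."""
    return b"JP2(" + data + b")"


def resize_to_medium(data: bytes) -> bytes:
    """Stand-in for a real image resize."""
    return b"MID(" + data + b")"


def get_medium_from_stored(repo: Repository, pid: str) -> bytes:
    """Variant 1: fetch the stored mid-sized datastream, then convert it."""
    return to_jpeg2000(repo[pid]["mid"])


def get_medium_from_large(repo: Repository, pid: str) -> bytes:
    """Variant 2: derive the mid-sized image from the large one, then convert it."""
    return to_jpeg2000(resize_to_medium(repo[pid]["large"]))


def handle_request(
    repo: Repository,
    pid: str,
    strategy: Callable[[Repository, str], bytes],
) -> bytes:
    """The request names only the PID; the variant behind it can be swapped freely."""
    if pid not in repo:
        raise KeyError(f"unknown PID: {pid}")
    return strategy(repo, pid)


repo = {"llgctest1:189": {"large": b"L", "mid": b"M"}}
from_stored = handle_request(repo, "llgctest1:189", get_medium_from_stored)
from_large = handle_request(repo, "llgctest1:189", get_medium_from_large)
print(from_stored, from_large)
```

Because the request carries only the PID, the stored mid-sized datastream could later be dropped in favour of on-the-fly resizing without changing any published URL.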

Digital Images – Integration
Existing Digital Content from the Digital Mirror &locale=en&mode=thumbnail
– Ingest existing METS documents into Fedora
  – No change to existing workflow
– Ingest images into Fedora
  – Better preservation
– Allow original look and feel to website
  – One-line change to configuration file
– Enhanced Version (PDF of Book)

Digital Images – Access
3 Types
– Through Catalogue
  – Difficult with Geac
  – New System
  – OAI Harvesting?
– By Browsing
  – Current Digital Mirror
  – Relationships – View all digitised collection:
– By searching repository
  – Ambfish
  – Indexes METS Documents

Data Models
Object: Fish Book
PID: llgctest1:108
DS1.0 Page 1 Large
DS2.0 Page 1 Mid
DS3.0 Page 2 Large
DS4.0 Page 2 Mid
DS5.0 Page 3 Large
DS6.0 Page 3 Mid
DS7.0 MIX Meta for Page 1
DS8.0 MIX Meta for Page 2
DS9.0 MIX Meta for Page 3
DS10.0 METS Document

New Model
Object: Page1 Fish Book
PID: llgctest1:109
DS1.0 Image Large
DS2.0 Image Mid
DS3.0 MIX Meta about DS1.0
Object: Page2 Fish Book
PID: llgctest1:110
DS1.0 Image Large
DS2.0 Image Mid
DS3.0 MIX Meta about DS1.0
Object: Fish Book
PID: llgctest1:108
DS1.0 METS Doc
Relationship: each page object "Is Part Of" the Fish Book object
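
The new model, with one object per page linked to the book by an "Is Part Of" relationship, can be written down as a small data structure. The PIDs and datastream labels come from the slide; the `DigitalObject` class, its field names and the `pages_of` helper are hypothetical and only meant to make the structure concrete.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DigitalObject:
    """A repository object with labelled datastreams and an optional parent."""
    pid: str
    label: str
    datastreams: Dict[str, str] = field(default_factory=dict)
    is_part_of: Optional[str] = None  # PID of the parent object, if any


def pages_of(objects: List[DigitalObject], book_pid: str) -> List[DigitalObject]:
    """Follow the 'is part of' relationship backwards to list a book's pages."""
    return [obj for obj in objects if obj.is_part_of == book_pid]


page_streams = {
    "DS1.0": "Image Large",
    "DS2.0": "Image Mid",
    "DS3.0": "MIX Meta about DS1.0",
}
objects = [
    DigitalObject("llgctest1:108", "Fish Book", {"DS1.0": "METS Doc"}),
    DigitalObject("llgctest1:109", "Page1 Fish Book", dict(page_streams), "llgctest1:108"),
    DigitalObject("llgctest1:110", "Page2 Fish Book", dict(page_streams), "llgctest1:108"),
]

page_pids = [page.pid for page in pages_of(objects, "llgctest1:108")]
print(page_pids)
```

Compared with the single-object model, adding a page here means adding one object and one relationship, and the numbering of the book's datastreams does not change.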

Summary
Fedora as DAMS
Fedora Community
Moving towards OAIS and Trusted Digital Repository status

Questions and Answers
Glen Robson