Developing an Ingest Service for Fedora Ryan Scherle Muzaffer Ozakca.



IUDL infrastructure project
2-year project funded by University Information Technology Services to reengineer digital library infrastructure around Fedora
Builds on experience with Fedora in context of EVIA Digital Archive (ethnomusicology video)
2 full-time staff, plus part-time from many others
Dozens of legacy collections with roughly 100,000 objects
New collections: some content-focused, some research-focused

Diversity
Multiple media types
Multiple brands
Multiple tools

The goal
[Diagram] Label: Ingest (the remaining slide text is placeholder filler)

Required features
Ingest common content types:
  ▫ Images
  ▫ Paged documents
  ▫ Textual documents
Allow for easy creation of new content types
Must support several workflows
  ▫ Metadata or media may be primary
  ▫ Most objects include derived media
  ▫ Systematic changes to metadata may be desired
  ▫ May need to connect with external tools for metadata generation, validation, etc.
  ▫ A workflow engine may sit on top of the ingest system

Existing Ingest Tools

Criteria
Ease of install
Native content models
Custom content models (e.g. paged)
Workflow neutrality, including object modification
Batch ingest
Remember, we’re evaluating object ingest only, not object delivery!

But first, some disclaimers…
This is not an objective evaluation, just our experiences
We’re not experts in these systems
We’re evaluating ingest only, not delivery!
We’re evaluating ingest with a focus on our needs
We believe in community

Fedora admin client
Comes with Fedora
Geared towards admins rather than end users
No systematic way of entering data or attaching files
Very flexible
The only way to create disseminators
Tedious

Fez
End-to-End GUI system
Highly customizable content models, workflow, security
Customizable role and group based access control
Growing community
Originally developed as an Institutional Repository
Many preset content models
Can create “extension” metadata based on an XSD
External MySQL database for workflow/vocabulary data
GPL

Fez - ingest
Single object ingest
  ▫ Through Web UI
  ▫ ImageMagick/JHOVE integration
Bulk ingest:
  ▫ Upload files to a directory
  ▫ Can also import existing Fedora objects in bulk
  ▫ Templates for metadata common to all objects, manual updates for the rest
  ▫ Batches possible, but only one file per object
No disseminators
Custom metadata can be stored as a simple XML file
Objects must use “compound” content model
[Diagram] Labels: File, Custom MD, Fedora

Fez – object organization
[Diagram] Levels: Community level, Collection level, Content level. Objects shown: Community, Collection, Image DO, Paper DO, Collection DO with Custom MD

Elated overview
End to end complete system for digital collections
Simple customizable metadata and a simple workflow supported
GPL
“Elated is a lightweight, general-purpose application for managing digital files. ELATED is built on top of the Fedora Repository System, and could be used as a digital assets management system, an institutional repository, or to meet other collection archiving, publishing and searching needs.”

Elated ingest
Single object ingest
  ▫ Through Web UI
  ▫ Focused on DC metadata, custom fields can be added
Multi object ingest via zipped folders and files
  ▫ Metadata template + manually
  ▫ Batches possible, but only one file per object
Simple content model
Manually-attached disseminators
[Diagram] Labels: File, DC + Custom MD, Fedora

Elated object organization
[Diagram] Levels: Top level, Level 1, Level 2, Level n. Objects shown: Collection, Folder, Image DO, PDF DO

Valet for ETDs
A component of the VTLS VITAL product focused on ETD submission
Allows submission of thesis and a simple workflow for approval
Part of a larger framework
Highly focused on ETDs

DirIngest overview
Ingests objects from a structured ZIP file
Highly flexible
User must create METS structure by hand
Doesn’t handle disseminators
Can create some RELS-EXT data, but not fully flexible
Cannot modify existing objects/collections
Easy to use OhioLink Bulk Ingest

DirIngest
[Diagram] Zip Archive labels: Collection, Images, Image File, Texts, Text File, METS.xml, Crules.xml. Fedora labels: Top level (Collection), Folder level (Images, Texts), Content level (Image DO, Text DO)

Batch modify
A method of controlling API-M with simple XML statements
Can create “empty” objects and change them in systematic ways
Requires manual (or programmatic) creation of the modify scripts
Can be used in conjunction with other tools…

Summary
[Table; cell ratings not preserved in the transcript]
Columns: Fez, Elated, Valet, DirIngest, Batch Modify, Admin Client
Rows: Ease of install, Native CM, Custom CM, Workflow Neutrality, Batch ingest

Indiana Ingest Tool

A structured interface between a workflow management or repository management GUI and the Fedora repository
Focused on simple input formats for maximum flexibility
Keeps the tools independent of the repository architecture
Builds the FOXML, rather than requiring a full structure to be pre-built
Binds disseminators
Creates RELS-EXT relationships
Can create and/or alter items in a collection
Auto-generates technical metadata with JHOVE or XSLT
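To make the "Creates RELS-EXT relationships" feature concrete, here is a minimal Python sketch of the kind of RDF/XML an ingest tool might generate for a RELS-EXT datastream. It is not the Indiana tool's code: the function name `build_rels_ext` and the object PID `iudl:1001` are hypothetical, and the only relationship shown is `isMemberOfCollection` from Fedora's relations-external namespace. The collection PID `iudl:6` is borrowed from the sample config slide.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
REL_NS = "info:fedora/fedora-system:def/relations-external#"


def build_rels_ext(pid: str, collection_pid: str) -> str:
    """Return RDF/XML stating that `pid` is a member of `collection_pid`.

    Hypothetical helper for illustration; a real tool would likely emit
    more relationships (e.g. page-to-parent links for paged objects).
    """
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("rel", REL_NS)
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    # The subject of the description is the object being ingested.
    desc = ET.SubElement(
        rdf,
        f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": f"info:fedora/{pid}"},
    )
    # The object of the relationship is the parent collection.
    ET.SubElement(
        desc,
        f"{{{REL_NS}}}isMemberOfCollection",
        {f"{{{RDF_NS}}}resource": f"info:fedora/{collection_pid}"},
    )
    return ET.tostring(rdf, encoding="unicode")


# "iudl:1001" is an invented PID; "iudl:6" comes from the sample config slide.
xml_text = build_rels_ext("iudl:1001", "iudl:6")
print(xml_text)
```

Generating the relationship from two identifiers, rather than asking the cataloging tool to supply RDF, is one way to keep the input formats simple, which is the design goal the slide describes.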

[Diagram] Labels: Image Cataloging Tool, Sheet Music Cataloging Tool; MODS, EAD, PDF, JPG, SIP; Ingest Tool; FOXML, Datastreams; Fedora

Performing an ingest
Place source metadata in an accessible location (filesystem, website)
Place media files (both master and derivative) in an accessible location
Define the "collection configuration"
Run the ingest process
Receive report

Sample collection config file
[XML sample; element names not preserved in the transcript]
Values: Hoagy Carmichael Correspondence; paged; hoagy; iudl:6; {path to master images}; .tif; {path to derivative images here}; -thumb.jpg; -screen.jpg; -full.jpg; {path to ead}
Slide annotations: Collection defn; File defn; Desc. metadata; Tech. metadata; What to do if item exists
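The values on the sample config slide suggest a structure along the following lines. This Python sketch is only an illustration: the element names of the real config file are not preserved, so every field name here (`CollectionConfig`, `master_suffix`, `derivative_suffixes`, and so on) is an assumption, and the directory paths are invented stand-ins for the slide's `{path to ...}` placeholders. It shows how derivative filenames could be computed from a master filename and the three suffixes on the slide.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class CollectionConfig:
    """Hypothetical in-memory form of a collection configuration."""
    name: str
    content_model: str
    collection_id: str
    collection_pid: str
    master_dir: str          # stands in for "{path to master images}"
    master_suffix: str
    derivative_dir: str      # stands in for "{path to derivative images here}"
    derivative_suffixes: list = field(default_factory=list)


def derivative_paths(config: CollectionConfig, master_filename: str) -> list:
    """Map one master filename to its expected derivative paths."""
    if not master_filename.endswith(config.master_suffix):
        raise ValueError(f"not a master file: {master_filename}")
    stem = master_filename[: -len(config.master_suffix)]
    return [
        str(PurePosixPath(config.derivative_dir) / f"{stem}{suffix}")
        for suffix in config.derivative_suffixes
    ]


cfg = CollectionConfig(
    name="Hoagy Carmichael Correspondence",
    content_model="paged",
    collection_id="hoagy",
    collection_pid="iudl:6",
    master_dir="/data/hoagy/masters",          # invented path
    master_suffix=".tif",
    derivative_dir="/data/hoagy/derivatives",  # invented path
    derivative_suffixes=["-thumb.jpg", "-screen.jpg", "-full.jpg"],
)

for path in derivative_paths(cfg, "hoagy-001.tif"):
    print(path)
```

Declaring suffixes once per collection, rather than listing every file, is one plausible reading of why the config stays short even for collections with many pages.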

Example – Sheet Music
[Diagram] Labels: Ingest Config, MODS, Images, Link to Parent; Ingest Tool; Tech MD, FOXML, Datastreams: Images, METS, RELS-EXT; Fedora

Example – preservation package
[Diagram] Labels: Ingest Config, AES31 Metadata, Audio, Link to Parent, SIP; Ingest Tool; Tech MD, FOXML, Datastreams: Images, METS, RELS-EXT; Fedora

Summary
[Table; cell ratings not preserved in the transcript]
Columns: Fez, Elated, Valet, DirIngest, Batch Modify, Admin Client, IU Tool
Rows: Ease of install, Native CM, Custom CM, Workflow Neutrality, Batch ingest

Major difficulties in any ingest tool
Providing flexibility in “style” of content model
Matching filenames with metadata records
Indicating the sequence of files in complex objects
Abstracting over differing local metadata standards (even in our own collections)
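Two of these difficulties, matching filenames with metadata records and indicating file sequence, can be illustrated with a short Python sketch. It assumes a purely hypothetical naming convention, `<item id>-<page number>.tif`, and invented identifiers; the slides do not say how the Indiana tool handles this, and real collections may follow no single convention.

```python
import re
from collections import defaultdict

# Assumed convention (hypothetical): "<item id>-<page number>.tif"
PAGE_PATTERN = re.compile(r"^(?P<item>.+)-(?P<seq>\d+)\.tif$")


def group_pages(filenames):
    """Group page files by item id and order them by numeric sequence."""
    pages = defaultdict(list)
    unmatched = []
    for name in filenames:
        match = PAGE_PATTERN.match(name)
        if match:
            pages[match.group("item")].append((int(match.group("seq")), name))
        else:
            unmatched.append(name)
    # Sort numerically so that page 10 follows page 2, not page 1.
    ordered = {
        item: [name for _, name in sorted(entries)]
        for item, entries in pages.items()
    }
    return ordered, unmatched


def match_records(record_ids, grouped):
    """Report records without media and media without records."""
    missing_media = sorted(set(record_ids) - set(grouped))
    orphan_media = sorted(set(grouped) - set(record_ids))
    return missing_media, orphan_media


files = ["LMC1234-10.tif", "LMC1234-2.tif", "LMC1234-1.tif",
         "LMC5678-1.tif", "notes.txt"]
grouped, unmatched = group_pages(files)
missing, orphans = match_records(["LMC1234", "LMC9999"], grouped)
print(grouped)
print(unmatched, missing, orphans)
```

Reporting unmatched files and records, instead of silently skipping them, fits the "Receive report" step of the ingest process described earlier.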

Topics for future discussion
What is the best structure for an ingest tool?
  ▫ Is our tool of interest to others?
  ▫ Would it be better to combine our capabilities with an existing tool?
Can we agree on some core content models?

Thank You!
Infrastructure project wiki:
Contact info:
  ▫ Ryan Scherle
  ▫ Muzaffer Ozakca