A Multi-Tiered Architecture for Distributed Data Collection and Centralized Data Delivery
Stacy Kowalczyk and James Halliday
April 28, 2008

Presentation transcript:

Project Overview
IN Harmony is
- An IMLS-funded grant
  - Awarded in Fall 2004
  - To be completed in Fall 2008
- A partnership of
  - Indiana University Digital Library Program
  - Indiana University Lilly Library
  - Indiana State Library
  - Indiana State Museum
  - Indiana Historical Society
April 28, 2008, IN Harmony – DLP Spring Forum 2008

Project Goals
1. To provide a model for fostering collaborative digital library development by partnering with institutions with complementary collections;
2. To digitize a portion of the sheet music from these collections and offer access to these materials free of charge on the web;
3. To bring these materials and their attendant metadata together on a single web site, offering both federated searching of the entire collection and searching of one or more selected collections.

Deliverables
- Tools to
  - Process the images
  - Capture metadata
  - Provide search and display functions
- 10,000 pieces of sheet music scanned and cataloged
  - 4,000 Indiana University Lilly Library
  - 2,000 Indiana State Library
  - 2,000 Indiana State Museum
  - 2,000 Indiana Historical Society

Cataloging and Imaging Workflow Goals
- Data integrity
  - Quality of the scans
  - Quality of the metadata
  - Accuracy of the links between page images
  - Accuracy of the links between metadata and images
- Simplicity of use
- Balance of flexibility and constraints

Cataloging and Imaging Use Cases
1. Catalog first
2. Scanning first
3. Metadata created in another system and imported into IN Harmony

Digitizing Quality Control
Two-phase quality control process
Automated QC process verifies:
- All TIFF tags of every digital file
- That each TIFF is uncompressed
- File names
- Embedded profile appropriate to the file's bit depth
- Consistency of pixel dimensions within a score
- Appropriate resolution
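The automated checks listed above can be sketched as a function over TIFF metadata that has already been extracted. This is a minimal illustration in Python, not the project's own code: the `PageInfo` fields, the file-name pattern, the 400 dpi threshold, and the pairing of bit depth with profile type are all assumptions made for the example, and "consistency" is read here as identical pixel dimensions. The only TIFF fact relied on is that a Compression tag value of 1 means uncompressed.

```python
import re
from dataclasses import dataclass

# Hypothetical naming convention and threshold, chosen only for this sketch.
NAME_PATTERN = re.compile(r"^[a-z]+-\d{4}-\d{3}\.tif$")
MIN_DPI = 400
# Assumed pairing of bit depth with embedded profile type.
EXPECTED_PROFILE = {8: "gray", 24: "rgb"}


@dataclass
class PageInfo:
    filename: str
    compression: int   # TIFF Compression tag; 1 means uncompressed
    width: int
    height: int
    dpi: int
    bit_depth: int
    profile: str       # "gray" or "rgb" in this sketch


def check_score(pages):
    """Return a list of human-readable problems; an empty list means pass."""
    problems = []
    for p in pages:
        if p.compression != 1:
            problems.append(f"{p.filename}: TIFF is compressed")
        if not NAME_PATTERN.match(p.filename):
            problems.append(f"{p.filename}: unexpected file name")
        if EXPECTED_PROFILE.get(p.bit_depth) != p.profile:
            problems.append(f"{p.filename}: profile does not match bit depth")
        if p.dpi < MIN_DPI:
            problems.append(f"{p.filename}: resolution below {MIN_DPI} dpi")
    # Pixel dimensions must be identical across all pages of one score.
    if len({(p.width, p.height) for p in pages}) > 1:
        problems.append("score: page pixel dimensions differ")
    return problems
```

Returning a list of problems rather than a single pass/fail flag lets a reviewer see every failed check for a score in one run.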

Digitizing Quality Control (2)
Manual QC: at 100% pixel display, verify:
- Correct page orientation and order
- Correct color balance
- Sharp and in-focus scan
- No digital artifacts
When all QC is passed, derivative files are created:
- Large and small JPGs for screen delivery
- PDF sized for 8.5 x 11 printing

Digitizing Quality Control Software

Designing the metadata model
- User studies
- Work with the partners
- Define fields
- Write cataloging guidelines with partner input
- Representation in MODS

Types of fields
- Title elements
- Name elements
- Publication elements
- Subject elements
- Identification elements
- Note elements
- Cover information

Metadata Collection Tool

Public Search and Discovery System Demo

Architecture Overview
Jim Halliday

IN Harmony Technical Overview
[Architecture diagram. Labels: Fedora; Web Browser; SRU and HTTP; Mass Storage System; Oracle; Cataloging Client; Quality Control; Scanner; Authentication Service; Java Swing; MODS Export; FTP; Perl Web Application]

Getting Data Into IN Harmony
- 2 primary data sources
  - Cataloging client
  - Image QC/upload application
- Other data sources
  - XML data exported from other cataloging systems
  - Score images exported from older systems

Image QC/upload application
1. User scans scores and uploads to IN Harmony server
2. User accesses Perl-based web application to initiate automated quality control
3. A second user proceeds with manual QC, then uses web application to signal that manual QC is finished
4. The application moves and backs up the files, creates derivatives, and alerts both Fedora and the internal database that the process is complete
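The four steps above form a strictly ordered pipeline, which can be modelled as a small state machine. The sketch below is illustrative only: the stage and event names are hypothetical labels, and nothing here reflects how the Perl application actually tracks progress.

```python
from enum import Enum


class Stage(Enum):
    UPLOADED = 1
    AUTO_QC_PASSED = 2
    MANUAL_QC_PASSED = 3
    COMPLETE = 4


# Allowed transitions, keyed by (current stage, event name).
# Event names are hypothetical labels for the steps above.
TRANSITIONS = {
    (Stage.UPLOADED, "auto_qc_ok"): Stage.AUTO_QC_PASSED,
    (Stage.AUTO_QC_PASSED, "manual_qc_ok"): Stage.MANUAL_QC_PASSED,
    (Stage.MANUAL_QC_PASSED, "finalized"): Stage.COMPLETE,
}


def advance(stage, event):
    """Return the next stage, or raise if the event is out of order."""
    try:
        return TRANSITIONS[(stage, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed from {stage.name}") from None
```

Keeping the transitions in one table makes it impossible to signal manual QC before automated QC has passed, which matches the order the slide describes.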

IN Harmony Derivatives
- Three sizes of JPGs produced per page
  - Full (1200px high)
  - Screen (600px high)
  - Thumb (200px high)
- Multi-page, playable PDF
  - Approx. 1 MB for an average score
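Because each JPG size is specified by height only, the width has to be derived from the master image. The sketch below assumes the aspect ratio is preserved, which the slide does not state; the function name and the master dimensions in the usage note are hypothetical.

```python
# Target heights in pixels, taken from the slide above.
DERIVATIVE_HEIGHTS = {"full": 1200, "screen": 600, "thumb": 200}


def derivative_sizes(width, height):
    """Map each derivative name to (width, height), preserving aspect ratio."""
    sizes = {}
    for name, target_h in DERIVATIVE_HEIGHTS.items():
        # Scale the width by the same factor as the height; never below 1px.
        target_w = max(1, round(width * target_h / height))
        sizes[name] = (target_w, target_h)
    return sizes
```

For a hypothetical 3600 x 4800 master, `derivative_sizes(3600, 4800)` gives 900 x 1200, 450 x 600, and 150 x 200.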

IN Harmony cataloging client
- Standalone Java Swing based client
- Connects to Oracle database and outputs MODS for Fedora ingestion
- Implemented as a client-server application via web services using Axis
- Specialized UI components (such as ‘smart’ combo boxes) assist with quick, correct data entry

Internal IN Harmony database
- Oracle database stores record and user data in our own internal format
- Communicates with upload/QC application, and cataloging client
- Cataloging client and internal scripts can output to MODS format for ingestion into Fedora
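Converting an internal record to MODS amounts to mapping internal fields onto MODS elements. The sketch below is a rough illustration in Python using only the standard library; the record layout (a dict with `title` and `names`) is an assumption, since the internal format is not described in the slides. The MODS namespace and the `titleInfo`, `title`, `name`, and `namePart` elements are standard MODS.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)


def record_to_mods(record):
    """Build a small MODS document from a dict with 'title' and 'names'."""
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = record["title"]
    # One <name> element per person; the record layout is hypothetical.
    for person in record.get("names", []):
        name = ET.SubElement(mods, f"{{{MODS_NS}}}name")
        ET.SubElement(name, f"{{{MODS_NS}}}namePart").text = person
    return ET.tostring(mods, encoding="unicode")
```

A real export would cover the other field groups as well (publication, subject, identification, notes, cover information); they are omitted here to keep the example short.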

IN Harmony authentication
- CAS (IU’s Central Authentication Service) is used to authenticate all users
- Non-IU users must create IU Guest Accounts to authenticate
- All account/password maintenance is in the user’s control

Fedora and IN Harmony
- Fedora used as a single storage and infrastructure solution for Digital Library Program projects at IU
- Data (score images and metadata) ingested into Fedora and referenced as METS objects
  - Master images sent to IU’s mass storage system
  - Derivatives stored internally
- Objects indexed using Lucene for SRU-based searching

Fedora Object Model
[Diagram. Object levels: Collection; Sheet music; Copy; Page]
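One way to picture the four object levels is as nested containers. The sketch below reads the diagram as strict containment (a collection holds pieces of sheet music, each with one or more copies, each made of pages); that reading, and every class and field name, is an assumption and not a description of the actual Fedora content models.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Page:
    sequence: int


@dataclass
class Copy:
    pages: List[Page] = field(default_factory=list)


@dataclass
class SheetMusic:
    title: str
    copies: List[Copy] = field(default_factory=list)


@dataclass
class Collection:
    name: str
    items: List[SheetMusic] = field(default_factory=list)

    def page_count(self):
        """Total pages across every copy of every piece in the collection."""
        return sum(len(c.pages) for s in self.items for c in s.copies)
```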

IN Harmony end-user interface
- Java Struts based web application
- Offers searching, browsing, and record display
- Each partner institution is offered a personalized view of their data only
Interaction with Fedora
- Application sends CQL queries to Fedora and retrieves MODS data, which is transformed via XSLT
- PURLs (persistent URLs) are used to access image derivatives
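A CQL query travels to an SRU server as the `query` parameter of a `searchRetrieve` request. The sketch below builds such a URL in Python; the endpoint address and the `dc.title` index in the usage note are placeholders, since the slides do not give the real service URL or say which indexes it exposes. The parameter names (`operation`, `version`, `query`, `startRecord`, `maximumRecords`) are standard SRU.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the real service address is not given in the slides.
SRU_BASE = "http://example.org/sru"


def build_sru_url(cql_query, max_records=10, start_record=1):
    """Return an SRU 1.1 searchRetrieve URL for the given CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": max_records,
    }
    # urlencode percent-encodes the CQL so quotes and '=' survive transport.
    return f"{SRU_BASE}?{urlencode(params)}"
```

For example, `build_sru_url('dc.title = "waltz"', max_records=5)` requests the first five records whose title matches "waltz".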

METS Navigator
- METS Navigator is used to page through scores online
- Uses the METS structMap to facilitate navigation
- Allows views of multiple sizes of images
- Released by IU as open source
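Paging through a score from its structMap comes down to reading the page-level `div` elements in `ORDER` sequence and following each `fptr` to a file. The sketch below shows the idea on a hand-written, heavily trimmed METS fragment; the sample document and the function name are made up for illustration, and this is not METS Navigator's own code.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"

# Hypothetical, heavily trimmed METS fragment for illustration.
SAMPLE = """
<mets xmlns="http://www.loc.gov/METS/">
  <structMap TYPE="physical">
    <div TYPE="score">
      <div TYPE="page" ORDER="2" LABEL="Page 1"><fptr FILEID="f2"/></div>
      <div TYPE="page" ORDER="1" LABEL="Cover"><fptr FILEID="f1"/></div>
    </div>
  </structMap>
</mets>
"""


def page_sequence(mets_xml):
    """Return (label, file id) pairs sorted by the ORDER attribute."""
    root = ET.fromstring(mets_xml)
    pages = []
    for div in root.iter(f"{METS}div"):
        fptr = div.find(f"{METS}fptr")
        # Only divs that point directly at a file and carry ORDER are pages.
        if fptr is not None and div.get("ORDER"):
            pages.append((int(div.get("ORDER")), div.get("LABEL"), fptr.get("FILEID")))
    return [(label, file_id) for _, label, file_id in sorted(pages)]
```

Sorting on `ORDER` rather than document position means the viewer shows pages in reading order even if the divs were written out of sequence.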

IN Harmony Links
- IN Harmony Public Interface
- IN Harmony Project Information
- Cataloging Tool
  - Release date: June 2008

Questions?