Digital Collections: Storage and Access Jon Dunn Assistant Director for Technology IU Digital Library Program


1 Digital Collections: Storage and Access
Jon Dunn
Assistant Director for Technology
IU Digital Library Program
jwd@indiana.edu

2 October 2, 2003 ALI Digital Library Workshop
Storage
- Why is storage an issue?
  - Space requirements
  - Persistence
  - Accessibility
- Needs depend on purpose of storage
  - Capture/encoding
  - Access/delivery
  - Preservation

3 Storage: Working Space
- Space for storage of digital files during capture/encoding/quality control process
- Possibilities
  - PC hard drive
  - File server / LAN
- Issues
  - Capacity, backup, speed, accessibility

4 Storage: Access/Delivery
- Storage of derivative files for web delivery
  - Image, audio, video, text files, etc.
- Possibilities
  - Local web server
  - Commercially hosted web site
  - Consortial service provider
- Issues: capacity, backup, performance, software integration, maintenance/migration

5 Storage: Preservation
- Much harder problem
- Longer term
  - Issues of longevity of media, hardware, file format
  - “Where did we put the files?”
- Larger files
  - Hard disk storage, traditional backup methods not cost-effective
- Infrequency of access
  - Problems do not become immediately evident

6 Long-Term Storage Options
- Removable media stored offline
  - Optical
    - CD-R (CD-Recordable)
    - DVD-R (DVD-Recordable), DVD+R, DVD+RW, DVD-RW, …
  - Tape
    - DLT, 8mm, DAT, …
  - Pros: cheap, easy, produces tangible item
  - Cons: low capacity, physical space requirements, unknown longevity, migration, potential format obsolescence
- Online/nearline storage systems
  - HSM: Hierarchical Storage Management
    - Combine disk and automated tape storage with software to keep track of where files are located
  - Locally managed or remote provider
  - Pros: high capacity, migration can be handled by software
  - Cons: expensive, complex, network bandwidth issues, must trust service provider, potential single point of failure
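The HSM idea on this slide, software that keeps track of which tier a file currently lives on, can be illustrated with a toy catalog. This is only a sketch under stated assumptions: the class name `HsmCatalog`, the `max_idle_days` policy, and the use of plain day numbers are all hypothetical, and real HSM products (including HPSS) use far richer policies and interfaces.

```python
class HsmCatalog:
    """Toy catalog recording whether each file is on disk or on tape.

    Hypothetical illustration only; not the API of any real HSM system.
    """

    def __init__(self, max_idle_days):
        # Files idle for more than this many days are moved to tape.
        self.max_idle_days = max_idle_days
        self._entries = {}  # file name -> {"tier": str, "last_access": int}

    def store(self, name, today):
        # New files land on the fast (disk) tier.
        self._entries[name] = {"tier": "disk", "last_access": today}

    def migrate_idle(self, today):
        # Move idle disk files to tape; return the names that moved.
        moved = []
        for name, entry in self._entries.items():
            idle = today - entry["last_access"]
            if entry["tier"] == "disk" and idle > self.max_idle_days:
                entry["tier"] = "tape"
                moved.append(name)
        return moved

    def recall(self, name, today):
        # Accessing a file brings it back to disk and resets its idle clock.
        entry = self._entries[name]
        entry["tier"] = "disk"
        entry["last_access"] = today

    def locate(self, name):
        # The catalog, not the user, answers "where did we put the files?"
        return self._entries[name]["tier"]


catalog = HsmCatalog(max_idle_days=30)
catalog.store("page_0001.tif", today=0)
catalog.store("page_0002.tif", today=25)
moved_to_tape = catalog.migrate_idle(today=40)
```

The point of the design is that users ask the catalog for a file by name and never need to know which tier holds it.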

9 HSM Example: IU’s Massive Data Storage Service (MDSS)
- HPSS (High Performance Storage System) software
  - Developed as collaboration of IBM and US national labs
- Four tape robots
  - 2 in Bloomington, 2 in Indianapolis
  - Data can be mirrored
- 540 terabytes (TB) total storage
  - ~75 TB used as of April 2001

10 A digital object is more than just a file!
- Hi-res page image files (TIFF)
- Delivery page image files (JPEG)
- Text file (TEI/XML)
- Metadata

11 A digital object is more than just a file!
- EAD Finding Aid

12 DL Objects
- Digital library “objects” have many parts
  - Metadata
  - Preservation/archival files
  - Delivery files
- How do we keep them connected?
  - Now: good practice in file naming, directory organization, project documentation (not scalable!)
  - Future: digital object repository
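As a rough illustration of the "good practice in file naming" approach, the sketch below reconnects the parts of an object purely from file names. The naming convention (`<object-id>_<page>.<ext>`, with the page part optional), the extension-to-role mapping, and the sample file names are all assumptions made for this example, not conventions stated in the presentation.

```python
import re
from collections import defaultdict

# Hypothetical convention: <object-id>[_<page>].<ext>, e.g. "letter001_001.tif".
NAME_PATTERN = re.compile(
    r"^(?P<obj>[A-Za-z0-9]+)(?:_(?P<page>\d+))?\.(?P<ext>[A-Za-z0-9]+)$"
)

# Assumed mapping from file extension to the role a file plays in the object.
ROLES = {
    "tif": "preservation",
    "tiff": "preservation",
    "jpg": "delivery",
    "jpeg": "delivery",
    "xml": "metadata",
}


def group_by_object(filenames):
    """Group file names by object id and role; collect names that do not fit."""
    objects = defaultdict(lambda: defaultdict(list))
    unmatched = []
    for name in filenames:
        match = NAME_PATTERN.match(name)
        role = ROLES.get(match.group("ext").lower()) if match else None
        if role is None:
            unmatched.append(name)
            continue
        objects[match.group("obj")][role].append(name)
    # Return plain dicts so callers do not create entries by accident.
    return {obj: dict(roles) for obj, roles in objects.items()}, unmatched


files = ["letter001_001.tif", "letter001_001.jpg", "letter001.xml", "notes.txt"]
objects, unmatched = group_by_object(files)
```

Any file that breaks the convention drops out of its object without warning, which is one concrete reason the slide calls this approach not scalable.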

13 Data Persistence
- Key is migration
- Keeping the bits alive
  - Physical media
  - Logical media format
- Keeping the bits understandable
  - File format
  - Metadata
- Small “pockets” of digital content pose a problem for migration

14 DL Object Repository
[Diagram labels]
- Preservation version in HSM
- Delivery version(s) on web server
- Metadata records
- Repository System
- Users and applications
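One way to picture what a repository system adds is a single record per object that points to its preservation copy, its delivery copies, and its metadata. The following is a minimal in-memory sketch; the class names, field names, and sample values are hypothetical and do not describe any particular repository product.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """One repository record tying together the parts of a digital object."""
    identifier: str
    preservation_location: str                          # e.g. a path in an HSM system
    delivery_urls: list = field(default_factory=list)   # derivative(s) on a web server
    metadata: dict = field(default_factory=dict)        # descriptive metadata fields


class Repository:
    """Toy repository: users and applications go through it, not the file system."""

    def __init__(self):
        self._objects = {}

    def add(self, obj):
        self._objects[obj.identifier] = obj

    def get(self, identifier):
        return self._objects[identifier]

    def search(self, field_name, text):
        # Case-insensitive substring match on one metadata field.
        needle = text.lower()
        return [
            obj.identifier
            for obj in self._objects.values()
            if needle in str(obj.metadata.get(field_name, "")).lower()
        ]


repo = Repository()
repo.add(DigitalObject(
    identifier="obj-0001",
    preservation_location="/hsm/archive/obj-0001/master.tif",        # assumed path
    delivery_urls=["http://example.org/images/obj-0001.jpg"],        # assumed URL
    metadata={"title": "Sample letter", "date": "1931"},
))
hits = repo.search("title", "letter")
```

Because every part is reachable from one identifier, moving the preservation copy means updating one field rather than renaming files.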

15 Web Delivery Functions
- Searching
  - Metadata
  - Full text
- Browsing
  - By subject, date, author, …
- Navigation
  - Page turning, image panning/zooming, …
- Streaming
  - For audio/video
- Reuse
  - Downloading, format conversion
  - Linking, persistent naming
- Access control
  - If necessary

16 Digital Collection Delivery Software
- Very complex systems
  - Need to integrate data from databases, full-text search engines, file systems, and other sources
  - Cross-collection searching
- Commercial
  - ContentDM, Luna Insight, various library management system add-ons
- Open source
  - UMich DLXS, Greenstone, Eprints, MIT DSpace, …
- Homegrown

18 Demonstration
- Hoagy Carmichael Collection, IU Digital Library Program
  http://www.dlib.indiana.edu/collections/hoagy/

20 Exposing Digital Resources Broadly
- Pay services
  - RLG Cultural Materials, Archival Resources
- Free services
  - University of Michigan OAIster: www.oaister.org
  - UIUC Digital Gateway to Cultural Heritage Materials: oai.grainger.uiuc.edu
- OAI-PMH
  - Open Archives Initiative Protocol for Metadata Harvesting
  - www.openarchives.org
- Google

21 OAI Metadata Harvesting
- Extract metadata from various sources
- Build services on local copies of metadata
[Diagram: a user searches for “Indiana” at the service provider, which holds a local copy of metadata harvested offline from several data providers; all searching, browsing, etc. is performed on the metadata at the service provider.]
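The harvesting flow on this slide can be sketched in a few lines: build an OAI-PMH `ListRecords` request, parse the returned records into a local copy, and search that copy. The base URL, the sample response, and the function names below are assumptions made for illustration; the sketch parses an inline sample rather than contacting a server, and it omits details a real harvester needs, such as `resumptionToken` paging and error responses.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for a data provider."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


# Made-up response standing in for what a data provider might return.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Indiana sheet music</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Ohio River photographs</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Turn a ListRecords response into a local list of simple dicts."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        titles = [t.text for t in record.iter(f"{{{DC_NS}}}title")]
        records.append({"identifier": identifier, "titles": titles})
    return records


def search_local(records, term):
    """Search the harvested copy only; no data provider is contacted."""
    needle = term.lower()
    return [
        r["identifier"]
        for r in records
        if any(needle in (t or "").lower() for t in r["titles"])
    ]


request_url = list_records_url("http://example.org/oai")  # hypothetical provider
local_copy = parse_records(SAMPLE_RESPONSE)
matches = search_local(local_copy, "Indiana")
```

The search function touches only `local_copy`, which mirrors the slide's point that all searching and browsing happens at the service provider.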

22 More Information
- Bibliography to be made available at: http://www.dlib.indiana.edu/workshops/alioct03/

