
Storage Why is storage an issue? Space requirements Persistence Accessibility Needs depend on purpose of storage Capture/encoding Access/delivery Preservation.


1 Storage
  Why is storage an issue?
    Space requirements
    Persistence
    Accessibility
  Needs depend on purpose of storage
    Capture/encoding
    Access/delivery
    Preservation

2 Storage: Working Space
  Space for storage of digital files during the capture/encoding/quality control process
  Possibilities
    PC hard drive
    File server, e.g. marengo (LIT)
    DLP file server
  Issues
    Capacity, backup, speed, accessibility

3 Storage: Access/Delivery
  Storage for web delivery of images, audio, text, etc.
  Possibilities
    UITS web server, under library account
    UITS streaming media server (audio/video)
    DLP web server
  Issues
    Capacity, backup, performance, software integration, maintenance/migration

4 Storage: Preservation
  Much harder problem
  Longer term
    Issues of longevity of media, hardware, file format
    Where are the files?
  Larger files
    Hard disk storage, traditional backup methods not cost-effective
  Infrequency of access
    Problems do not become immediately evident

5 Long-Term Storage Options
  Removable media, e.g. CD-R, DVD-R
    Pros: cheap, easy, produces tangible item
    Cons: low capacity, physical space requirements, unknown longevity, migration
  Nearline storage
    UITS Massive Data Storage Service

6 UITS MDSS
  Massive Data Storage Service
  HPSS (High Performance Storage System) software
    Developed as collaboration of IBM and US national labs
  Four tape robots (two at IUB, two at IUPUI)
    Data can be mirrored
  540 TB total storage
    ~75 TB used as of April 2001

7 MDSS – A Sense of Scale
  2 Kilobytes: A typewritten page
  5 Megabytes: Complete works of Shakespeare OR 30 seconds of TV quality video
  1 Gigabyte (1000 MB): 1 pickup truck filled with paper OR a symphony in hi-fi sound
  1 Terabyte (1000 GB): All the X-ray films in a large hospital OR paper from 50,000 trees
  10 Terabytes: The printed collection of the US Library of Congress
  50 Terabytes: The contents of a large mass store system
  8 Petabytes (8000 TB): All information available on the web
  200 Petabytes: All the printed material (in the world!)
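
The scale comparisons can be checked with a little arithmetic. The sketch below assumes the decimal units the slide uses (1 GB = 1000 MB); the helper names `to_bytes` and `how_many` are illustrative and not part of any MDSS tool.

```python
# Decimal (SI) units, matching the slide's "1 Gigabyte (1000MB)".
UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12, "PB": 10**15}


def to_bytes(value, unit):
    """Convert a whole-number quantity in the given unit to bytes."""
    return value * UNITS[unit]


def how_many(item_value, item_unit, total_value, total_unit):
    """Return how many items of one size fit into a total of another size."""
    return to_bytes(total_value, total_unit) // to_bytes(item_value, item_unit)


# Typewritten pages (2 KB each) that fit in the 540 TB MDSS capacity.
pages_in_mdss = how_many(2, "KB", 540, "TB")

# Fraction of MDSS capacity in use (~75 TB of 540 TB, April 2001 figures).
fraction_used = to_bytes(75, "TB") / to_bytes(540, "TB")
```

With the April 2001 figures, roughly 14% of the 540 TB capacity was in use.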

8 MDSS Storage Infrastructure

9 MDSS Access
  FTP/PFTP: (Parallel) File Transfer Protocol
  DFS: Distributed File System (being phased out)
  HSI
  Not practical for delivery
    Hierarchical storage (metadata on disk, data on tape -> 30-90 seconds to start transfer)
  File size – chunks of 50 MB or greater work best
    Small files aggregated into larger .tar or .zip files
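
Aggregating small files into archives of about 50 MB or more before transfer can be sketched as follows. This is a minimal illustration using Python's standard `tarfile` module, not an MDSS-provided tool; the function names, the `batch` prefix, and the simple in-order grouping strategy are all assumptions.

```python
import tarfile
from pathlib import Path

# 50 MB threshold from the slide, in decimal units (an assumption).
MIN_BATCH_BYTES = 50 * 1000 * 1000


def plan_batches(sizes, min_bytes=MIN_BATCH_BYTES):
    """Group (name, size) pairs, in order, into batches of at least min_bytes.

    The final batch may be smaller if the files run out first.
    """
    batches, current, total = [], [], 0
    for name, size in sizes:
        current.append(name)
        total += size
        if total >= min_bytes:
            batches.append(current)
            current, total = [], 0
    if current:
        batches.append(current)
    return batches


def write_archives(src_dir, out_dir, prefix="batch", min_bytes=MIN_BATCH_BYTES):
    """Write the regular files in src_dir into numbered .tar files in out_dir."""
    src, out = Path(src_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    files = sorted(p for p in src.iterdir() if p.is_file())
    sizes = [(p, p.stat().st_size) for p in files]
    archives = []
    for i, batch in enumerate(plan_batches(sizes, min_bytes), start=1):
        archive = out / f"{prefix}_{i:04d}.tar"
        # Mode "w" writes an uncompressed tar archive.
        with tarfile.open(archive, "w") as tar:
            for path in batch:
                tar.add(path, arcname=path.name)
        archives.append(archive)
    return archives
```

The resulting archives could then be sent with any of the access methods listed on the slide. Keeping a record of which file went into which archive is left to the caller.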

10 DL Objects
  Digital library "objects" have many parts
    Metadata
    Preservation files
    Delivery files
  How do we keep them connected?
    Now: Good practice in file naming, directory organization, project documentation (not scalable!)
    Future: Digital object repository
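
One way to picture the "good practice in file naming" approach is a script that groups an object's parts by a shared identifier and reports which parts are missing. The naming pattern `<object-id>_<role>.<ext>` and the role names below are hypothetical, chosen only for illustration; the slides do not specify a convention.

```python
from collections import defaultdict

# Hypothetical role suffixes for the three kinds of parts named on the slide.
ROLES = ("metadata", "preservation", "delivery")


def group_parts(filenames):
    """Map each object id to a {role: filename} dict.

    Files that do not match the assumed <object-id>_<role>.<ext> pattern
    are ignored.
    """
    objects = defaultdict(dict)
    for name in filenames:
        stem = name.rsplit(".", 1)[0]
        object_id, _, role = stem.rpartition("_")
        if object_id and role in ROLES:
            objects[object_id][role] = name
    return dict(objects)


def missing_parts(objects):
    """Return {object_id: [missing roles]} for incomplete objects only."""
    report = {}
    for object_id, parts in objects.items():
        missing = [role for role in ROLES if role not in parts]
        if missing:
            report[object_id] = missing
    return report
```

A check like this only works while every project follows the same convention, which is the scalability limit the slide points out.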

11 Data Persistence
  Key is migration
  Keeping the bits alive: MDSS responsibility
    Physical media
    Logical media format
  Keeping the bits understandable: MDSS user responsibility
    File format
    Metadata
  Small "pockets" of digital content pose a problem for migration

12 DL Object Repository
  Preservation version in MDSS
  Delivery version on web server
  Metadata records
  Repository System
  Users and applications

13 DL Repository Models
  OAIS: Open Archival Information System
    Reference model
  Fedora: Flexible and Extensible Digital Object and Repository Architecture
    Developed at Cornell and UVa
    IU DLP in deployment group

14 DLP Storage Services
  Consulting
  Server space for production and access
  Persistent naming service (PURL server)
  Facilitation of access to UITS services
    Streaming media
    MDSS
  Developing repository service
  Contact: diglib@indiana.edu

