Besser--Planning (Brazil) 31/5/01 1 Planning to Maximize Longevity of Digital Information Howard Besser UCLA School of Education & Information

1 Besser--Planning (Brazil) 31/5/01 1 Planning to Maximize Longevity of Digital Information Howard Besser UCLA School of Education & Information http://www.gseis.ucla.edu/~howard

2 Planning to Maximize Longevity of Digital Info
• Access and Preservation
• Why are you Managing this Information?
• Key Considerations for Imaging Projects
• Important Planning Considerations
• Models for Digital Collections
• Importance of Metadata Standards
• Digital Longevity Issues
• More Planning Issues

3 Access and Preservation
• Digitizing can serve both Access and Preservation
  – E.g. Access to digital surrogates saves wear & tear on originals
• But Digitization for Access can be quite different than Digitization for Preservation
  – Level of detail, scanning quality, extensiveness of resources
  – And long-term retention of digital works is still an open issue

4 Why are you Managing this Information?
• Organizational mission & type
• Users
• Uses

5 Key Considerations for Imaging Projects
• Users' Needs
• Image Quality
• Intellectual Property
• Standards
• Topology
• Tools & Processes

6 Key Considerations for Imaging Projects (1 of 3)
• Users' Needs
  – Quality of Digital Surrogate
  – Interoperable desktop applications
• Image Quality
  – Archival
  – Current online delivery

7 Key Considerations for Imaging Projects (2 of 3)
• Intellectual Property
• Standards
  – Modular and Layered Architecture
  – Terminology
  – Technical imaging information
• Topology

8 Key Considerations for Imaging Projects (3 of 3)
• Tools & Processes
  – Scanners
  – Compression techniques
  – Linking files
  – Workflow
  – Interoperable desktop applications

9 Some nuts-and-bolts Planning Considerations
• Think about users (and potential users), uses, and type of material/collection
• Scan at the highest quality that does not exceed the needs of the likely users, uses, and material
• Do not let today’s delivery limitations influence your scanning file sizes; understand the difference between digital masters and derivative files used for delivery
• Many documents that appear to be bitonal are actually better represented with greyscale scans
• Include a color bar and ruler in the scan
• Use objective measurements to determine scanner settings (do NOT attempt to make the image look good on your particular monitor or use image processing to color correct)
• Don’t use lossy compression
• Store in a common (standardized) file format
• Capture as much metadata as is reasonably possible (including metadata about the scanning process itself)
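
The advice above on masters, lossless storage, and scan-process metadata can be sketched as a sidecar record kept next to each digital master. This is only an illustration: the `ScanRecord` fields, the `check_master` helper, and the `LOSSY_SCHEMES` set are hypothetical names chosen for this example, not a schema from the presentation or from any standard.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ScanRecord:
    """Hypothetical sidecar record describing one digital master."""
    identifier: str       # local ID for the scanned item
    file_format: str      # a common, standardized format, e.g. "TIFF"
    compression: str      # "none" or a lossless scheme for masters
    bit_depth: int        # bits per channel
    color_mode: str       # "bitonal", "greyscale", or "color"
    resolution_ppi: int   # capture resolution
    scanner_model: str    # device used for capture
    has_color_bar: bool   # color bar and ruler included in the frame
    scan_date: str        # ISO 8601 date of capture


# Illustrative only; a real project would maintain its own list.
LOSSY_SCHEMES = {"jpeg", "jpeg2000-lossy"}


def check_master(record: ScanRecord) -> list[str]:
    """Return warnings where a record departs from the checklist above."""
    warnings = []
    if record.compression.lower() in LOSSY_SCHEMES:
        warnings.append("master uses lossy compression")
    if not record.has_color_bar:
        warnings.append("no color bar/ruler in scan")
    return warnings


record = ScanRecord(
    "photo-0001", "TIFF", "none", 8, "greyscale",
    600, "example-scanner", True, "2001-05-31",
)
print(json.dumps(asdict(record), indent=2))
print(check_master(record))
```

Writing the record as JSON is an arbitrary choice for the sketch; the point is that the scanning parameters travel with the master rather than living only in an operator's notes.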

10 Why Scale is important

11 Important Planning Considerations
• File Formats
• Choosing Interoperable Systems
• Adhere to standards
• Vendors with large installed base
• Refreshing and/or Migration

12 Key problems we’re facing
• Discovery
• Longevity
• Interoperability

13 Serious Longevity Problems
• What we know from prior widespread digital file formats
• Images separating from their metadata
• Inaccessibility of software needed to view an image
• Inability to even decode the file format of an image
• …return to Longevity problem later

14 Traditional Digital Library Model (diagram labels: DL, user, search & presentation)

15 Ideal Digital Library Model (diagram labels: DL, user, search & presentation)

16 For Interoperability Digital Libraries Need Standards
• Descriptive Metadata for consistent description
• Discovery Metadata for finding
• Administrative Metadata for viewing and maintaining
• Structural Metadata for navigation ...
• Terms & Conditions Metadata for controlling access ...

17 Why are Standards and Metadata consensus important?
• Managing digital files over time
• Longevity
• Interoperability
• Veracity
• Recording in a consistent manner
• Will give vendors incentive to create applications that support this

18 Why Standards?
• Why do we need standards?
  – To make information universally available to users
  – To facilitate sharing and interchange of information
  – To preserve information (make it safe from changes in hardware and software)
• Standards only work if communities widely accept them, but they’re necessary for communities to work together

19 Questions to Ask
• What communities is this standard designed for?
• What type of information is this standard designed to handle?
• What functions is this standard designed to serve?
• What previous standards is it built upon?
• Does the standard prescribe how to create new records (or parts of records), or how to map from existing records?
• How far does the standard go? Semantics: Does it define element sets? Rules? Syntax?

20 Semantics/Syntax/Structure
• Semantics – meaning, as defined by a community to meet their particular needs (DC)
• Syntax – a systematic arrangement of data elements for machine processing – facilitates the exchange and use of metadata among various applications (HTML, XML, RDF)
• Structure – a formal arrangement of the syntax with the goal of consistent representation of the semantics (rules defining field contents like 1/11/99)
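
The three layers can be illustrated with a small sketch: Dublin Core element names supply the semantics, XML supplies the syntax, and a content rule for the date field supplies the structure. The `record` wrapper element and the `to_dublin_core_xml` function are assumptions made for this example, and ISO 8601 is picked here as one possible content rule; the slide does not name a specific rule. A rule is needed because a value like 1/11/99 can be read as either 11 January or 1 November.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace of the Dublin Core element set (the "semantics" layer).
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_dublin_core_xml(title: str, creator: str, created: date) -> str:
    """Serialize three Dublin Core elements as XML (the "syntax" layer)."""
    root = ET.Element("record")  # hypothetical wrapper element
    fields = (
        ("title", title),
        ("creator", creator),
        # "Structure" layer: dates are always written as YYYY-MM-DD.
        ("date", created.isoformat()),
    )
    for name, value in fields:
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_dublin_core_xml(
    "Planning to Maximize Longevity of Digital Information",
    "Howard Besser",
    date(2001, 5, 31),
)
print(xml_text)
```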

21 The Short Life of Digital Info: Digital Longevity Problems
• Disappearing Information
• The Viewing Problem
• The Scrambling Problem
• The Inter-relation Problem
• The Custodial Problem
• The Translation Problem

22 The Viewing Problem
• Digital Info requires a whole infrastructure to view it
• Each piece of that infrastructure is changing at an incredibly rapid rate
• How can we ever hope to deal with all the permutations and combinations?

23 The Scrambling Problem
Dangers from:
• Compression to ease storage & delivery
• Container Architecture to enhance digital commerce

24 The Inter-relation Problem
• Info is increasingly inter-related to other info
• How do we make our own Info persist when it points to and integrates with Info owned by others?
• What is the boundary of a set of information (or even of a digital object)?

25 The Custodial Problem
• How do we decide what to save?
• Who should save it?
• How should they save it?
  – methods for later access: emulation, migration, etc.
  – issues of authenticity and evidence

26 The Translation Problem
• Content translated into new delivery devices changes meaning
  – A photo vs. a painting
  – If Info is produced originally in digital form in one encoded format, will it be the same when translated into another format?
  – Behaviors

27 Pieces of the Solution (1/2)
• We need to insist upon clearly readable standardized ways for digital objects to self-identify their formats
• We should discourage scrambling
• We need to better understand how information inter-relates to other Info, and what constitutes “boundaries” of Info objects
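
One existing form of format self-identification is the signature ("magic number") that many file formats place in their first bytes. The sketch below, offered only as an illustration of the first point above, checks a file header against a few well-known signatures. The `SIGNATURES` table and `identify_format` function are names invented for this example, and the table is deliberately tiny.

```python
# (signature bytes, format name) pairs for a few widely used formats.
SIGNATURES = [
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"\xff\xd8\xff", "JPEG/JFIF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"%PDF-", "PDF"),
]


def identify_format(header: bytes) -> str:
    """Guess a format from the leading bytes of a file."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first 16 bytes of a file and identify it."""
    with open(path, "rb") as handle:
        return identify_format(handle.read(16))


print(identify_format(b"II*\x00\x08\x00\x00\x00"))
print(identify_format(b"no signature here"))
```

A file whose header matches nothing in the table is exactly the "inability to even decode the file format" case, which is why a signature check works best alongside recorded metadata rather than instead of it.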

28 Pieces of the Solution (2/2)
• People and organizations wishing to make information persist need guidelines on how to go about doing it
• We need to better understand how translating from one storage or display format to another affects the meaning of a work
• We need to save the “behaviors” of a digital object, not just its “contents”

29 Conceptual Approaches to Digital Preservation
• Refreshing always necessary due to volatility of physical strata
  – Impact on evidential value
• Migration: advantages & disadvantages
• Emulation: advantages & disadvantages
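
Refreshing, in the sense of copying the same bits onto a new storage medium, can be sketched as a copy followed by a check that nothing changed. The checksum comparison is an assumption added for this example (the slide does not prescribe a verification method), and `sha256_of` and `refresh` are hypothetical helper names. Temporary files stand in for the old and new media.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute a SHA-256 digest by reading the file in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh(source: Path, target: Path) -> str:
    """Copy source to target and confirm the bits are unchanged."""
    before = sha256_of(source)
    shutil.copyfile(source, target)
    after = sha256_of(target)
    if before != after:
        raise IOError(f"copy of {source} does not match the original")
    return after


with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "old_medium.bin"
    new = Path(tmp) / "new_medium.bin"
    old.write_bytes(b"digital master bytes")
    checksum = refresh(old, new)
    refreshed_ok = new.read_bytes() == old.read_bytes()

print(checksum, refreshed_ok)
```

Keeping the recorded digest alongside the file also gives later custodians a way to show that a refreshed copy is bit-identical to its predecessor, which bears on the evidential-value concern noted above. Migration changes the bits by design, so a check like this does not carry over to it unchanged.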

30 Metadata can be the first line of defense
• Can tell you
  – where the file is (if you can’t find the file)
  – where more info about the file is (if you have the file but most other metadata has become separated)
  – what the file format is
  – what the compression scheme is
  – what application program and version is needed for the file

31 Groups Working on the Big Problem
http://sunsite.berkeley.edu/Longevity/
• CPA Task Force
• Getty “Time & Bits” Conference & Follow-ups
• Emulation experiments in US and Europe
• NEDLIB, CURL, Michigan
• Mellon-funded E-Journal Archive experiments
• Internet Archive
• Long Now

32 Time & Bits

33 Time & Bits Participants
• Stewart Brand
• Howard Besser
• Brian Eno
• Danny Hillis
• Peter Lyman
• Brewster Kahle
• Kevin Kelly
• Jaron Lanier
• Doug Carlston
• John Heilemann
• Ben Davis
• Margaret MacLean
• Bruce Sterling
• Paul Saffo

34 Groups Working on Pieces of the Big Problem
http://sunsite.berkeley.edu/Longevity/
• Internet Archive
• Long Now
• Emulation experiments in US and Europe
• NEDLIB, CURL, Michigan

35 Journal Archiving
• License, don’t own; may not even be able to obtain the right to make an archival copy
• Increasingly no paper back-up at all
• Usually we don’t have the important redundancy factor
• Stanford’s LOCKSS Project (Lots of Copies Keeps Stuff Safe) and its problems (http://lockss.stanford.edu)

36 Migration/Refreshing
• Impact on evidential value

37 More Planning Issues
• Image Families
• Behaviors
• Persistent Identification

38 Identification/Provenance (Images)
• The number of variant forms of a work can be enormous
• Image Families
• A digital image frequently has many layers of parentage
• Information about the parentage that can indicate the quality and veracity of the image (Dublin Core "Source" and "Relation")
• How to deal with different versions derived from the same scan or different encoding schemes
• Vocabulary Standards to express this

39 The number of variant forms of a work can be enormous
• different views of the same object
• different scans of the same photo
• different resolutions
• different compression schemes
• different compression ratios
• different file storage formats
• different details of the same image
• ...

40 Image Families

41 Identification/Provenance
• How to deal with different versions (browse, hi-res, medium res) derived from the same scan or different encoding schemes (TIFF, PICT, JFIF)
• Vocabulary Standards to express this
  – VRA Surrogate Categories
  – CIMI's "Image Elements"
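
An image family with recorded parentage can be sketched as a set of versions that each point at the version they were derived from, in the spirit of the Dublin Core "Source" element mentioned earlier. The `ImageVersion` class, its field names, and the `lineage` function are hypothetical and do not reproduce the VRA or CIMI vocabularies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageVersion:
    """One member of an image family (hypothetical model)."""
    identifier: str
    role: str                     # e.g. "master", "hi-res", "browse"
    file_format: str              # e.g. "TIFF", "PICT", "JFIF"
    source: Optional[str] = None  # identifier of the parent version


def lineage(versions: dict[str, ImageVersion], identifier: str) -> list[str]:
    """Walk from a version back to its ultimate parent."""
    chain: list[str] = []
    current = versions.get(identifier)
    # Stop at the root, at a missing parent, or if a cycle is detected.
    while current is not None and current.identifier not in chain:
        chain.append(current.identifier)
        current = versions.get(current.source) if current.source else None
    return chain


family = {
    v.identifier: v
    for v in [
        ImageVersion("img1-master", "master", "TIFF"),
        ImageVersion("img1-hires", "hi-res", "JFIF", source="img1-master"),
        ImageVersion("img1-browse", "browse", "JFIF", source="img1-hires"),
    ]
}
print(lineage(family, "img1-browse"))
```

Recording the parent on each derivative, rather than a list of children on the master, means a derivative that gets separated from the collection still carries a pointer back toward its origin.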

42 MOA II Behaviors
• Navigation
• Display/Print

43 MOA II Best practices
• Use/Users/Collection
• Benchmarking
• Masters vs. Derivatives
• Scanning
• Administrative Metadata
• Structural Metadata

44 To deal with Immediately
• Persistent IDs
• Metadata

45 Persistent IDs: the Problem
• Need to separate work ID from work location
• URNs probably won’t be ready until 2003
• Becomes a business process issue when one organization maintains the resource and another organization references it (i.e., licensed from vendors or managed by separate administrative structures)

46 More Persistent IDs: the Approach for today
• PURLs
• Handles
• HTTP redirects
• And worry about costs now and conversion costs when URNs become feasible
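
What these approaches share is a level of indirection: references cite a stable identifier, and a resolver maps it to wherever the resource currently lives. The sketch below models that idea with an in-memory table and an HTTP-style redirect response. It is not the PURL or Handle software; `RESOLVER_TABLE`, `resolve`, `relocate`, and the example.org URLs are all made up for illustration.

```python
from typing import Optional

# Persistent path -> current location. Only this table changes on a move.
RESOLVER_TABLE = {
    "/id/photo-0001": "http://images.example.org/2001/photo-0001.tif",
}


def resolve(persistent_path: str) -> tuple[int, Optional[str]]:
    """Return an HTTP-style (status, Location) pair for an identifier."""
    location = RESOLVER_TABLE.get(persistent_path)
    if location is None:
        return 404, None
    return 302, location  # redirect the client to the current location


def relocate(persistent_path: str, new_location: str) -> None:
    """Record a move without changing the identifier others have cited."""
    RESOLVER_TABLE[persistent_path] = new_location


print(resolve("/id/photo-0001"))
relocate("/id/photo-0001", "http://archive.example.org/masters/photo-0001.tif")
print(resolve("/id/photo-0001"))
```

The table itself becomes something an organization has to keep running and up to date, which is one concrete form of the ongoing cost the slide warns about.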

47 Data Set Management: More issues with referencing IDs
• References for mirror sites
• References for back-up sites when main site is down or bottle-necked
• References for off-site copies and archival copies
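
One way to picture these referencing issues is an identifier that maps to an ordered list of copies, with the reference falling back to the next copy when the preferred one is unavailable. This is a sketch under assumptions: the `REPLICAS` table, the `first_available` function, and the example.org URLs are invented, and the availability check is passed in as a function so the example does not touch the network.

```python
from typing import Callable, Optional

# Identifier -> locations in order of preference (main, mirror, archive).
REPLICAS = {
    "photo-0001": [
        "http://main.example.org/photo-0001.tif",
        "http://mirror.example.org/photo-0001.tif",
        "http://archive.example.org/photo-0001.tif",
    ],
}


def first_available(identifier: str,
                    is_up: Callable[[str], bool]) -> Optional[str]:
    """Return the first location for which the availability check passes."""
    for url in REPLICAS.get(identifier, []):
        if is_up(url):
            return url
    return None


def main_site_down(url: str) -> bool:
    """Stand-in availability check: pretend the main site is unreachable."""
    return not url.startswith("http://main.")


print(first_available("photo-0001", main_site_down))
```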

48 One Final Question: Who will collect the digital works of today that should become the Special Collections of tomorrow?
• web sites
• zines
• electronic journals
• listserv and email discussions
• drafts of works that later become famous

49 Planning to Maximize Longevity of Digital Information
Howard Besser, UCLA School of Education & Information
• http://sunsite.berkeley.edu/Longevity/
• http://www.gseis.ucla.edu/~howard
• http://sunsite.berkeley.edu/moa2
• http://lockss.stanford.edu
• http://www.longnow.com/10klibrary/TimeBitsDisc/
• http://www.archive.org/

