National Digital Information Infrastructure and Preservation Program by the Library of Congress (US)

1 National Digital Information Infrastructure and Preservation Program by the Library of Congress (US)

2 Since the digital age became a reality, publishing electronically has become much cheaper and easier than publishing by traditional techniques. As a result, the number of publications in new media has grown exponentially.

3 The problem has already become urgent: information that is so easy to publish is also very likely to disappear quickly. Experts estimated that 44% of the websites existing in 1998 had disappeared one year later, in 1999. The average life span of a website at that time was 44 days!

4 The challenge is to decide which information should be stored, and how. The Library of Congress therefore presented a plan that offers some considerations and suggests a few solutions to these problems. Authorized by the U.S. Congress, the Library describes in this plan how to capture, select, and organize digital objects of historical and social significance, preserve them, and make them accessible.

5 The plan looks at objects that are "born digital" (created and existing only in a digital format) and at objects that currently exist as digital content. Both are at risk of disappearing. Digital information is extremely fragile, inherently impermanent, and difficult to assess for long-term value. In addition, almost every kind of digital information raises questions of intellectual property rights that are more complex than in traditional information media.

6 The LC brought together stakeholders representing different departments and interviewed experts in a number of relevant fields. The meetings established some baseline areas of consensus on:
- the need for a national preservation initiative,
- the need for a distributed or decentralized solution,
- the need for more research into the technologies for digital preservation,
- the recognition that technology is an important part of the solution.

7 Difference between analog and digital preservation: Digital preservation involves larger amounts of information, created in a greater variety of formats and distributed in new venues to a broader and more heterogeneous user base. This complex environment demands an infrastructure that will:
- support the needs of multiple communities over long periods of time,
- respond to rapidly changing technologies and innovative behaviors,
- be transparent and trustworthy.

8 Plans for action in the NDIIPP
- Selecting and Collection Department
- Intellectual Property Department
- Business Models Department
- Standards and Best Practices Department
- Communication and Outreach Department
- Digital Preservation Architecture Department

9 Selecting and Collection Department
- the scope of collecting national materials,
- developing agreements with libraries, archives, and other institutions,
- developing guidelines for assessing content of enduring value,
- examining curatorial best practices for selecting dynamic objects,
- defining the boundaries of Web-based content for preservation purposes.

10 Intellectual Property Rights Department
- investigation of the options and authorities necessary to preserve digital content,
- development of acceptable methods of access to digital content,
- investigation of the effects of obligatory deposit for digital content,
- investigation of the effects of security and protection devices on preservation,
- development of a better understanding of the international context of copyright and of the scope, responsibility, and reach of applicable law.

11 Standards and Best Practices Department
- coordinating and documenting standards that support key preservation services,
- cultivating research and best-practice recommendations for formats and encoding schemes,
- cultivating research and development of strategies, such as migration and emulation, that will ensure the sustainability of digital content,
- developing a communication strategy to track technology changes and their impact on preservation.

12 Communication and Outreach Department
- maintaining the NDIIPP Web site, featuring current information on the program's status,
- outreach to professional groups through participation in professional meetings and contributions to professional literature,
- outreach to the public through print and Web-based general interest publications and through the broadcast media.

13 Digital Preservation Architecture Department
- convene a design group to further develop the components of the preservation architecture,
- ask for proposals to test and model components of the system,
- evaluate project outcomes to inform a next generation of implementations.

14 A proposed preservation architecture has four layers:
- a Repository layer, for the long-term storage of digital data,
- a Gateway layer, which provides protection and control for the Repositories,
- a Collection layer, where agreements and decisions about the acquisition, access, and context of preserved digital materials are made,
- an Interface layer, where the materials that visitors are allowed to access are made available.
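The four layers can be pictured as a chain in which each layer only talks to the one beneath it. The following is a minimal sketch under that assumption, not the architecture specified in the plan: the class names mirror the slide, but every method, field, and identifier (put, fetch, read, view, trusted_callers, public_ids, "site-001") is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Repository layer: long-term storage of digital data."""
    _store: dict = field(default_factory=dict)

    def put(self, object_id: str, data: bytes) -> None:
        self._store[object_id] = data

    def get(self, object_id: str) -> bytes:
        return self._store[object_id]


@dataclass
class Gateway:
    """Gateway layer: protects and controls access to a repository."""
    repository: Repository
    trusted_callers: set  # hypothetical: names of collections allowed through

    def fetch(self, caller: str, object_id: str) -> bytes:
        if caller not in self.trusted_callers:
            raise PermissionError(f"{caller} may not reach the repository")
        return self.repository.get(object_id)


@dataclass
class Collection:
    """Collection layer: holds the access decisions for preserved material."""
    name: str
    gateway: Gateway
    public_ids: set = field(default_factory=set)  # hypothetical access policy

    def read(self, object_id: str) -> bytes:
        if object_id not in self.public_ids:
            raise PermissionError(f"{object_id} is not open to visitors")
        return self.gateway.fetch(self.name, object_id)


@dataclass
class Interface:
    """Interface layer: exposes only what visitors are allowed to access."""
    collection: Collection

    def view(self, object_id: str) -> str:
        return self.collection.read(object_id).decode("utf-8")


def build_demo() -> Interface:
    """Wire the four layers together with two invented objects."""
    repo = Repository()
    repo.put("site-001", b"archived home page")
    repo.put("site-002", b"restricted deposit")
    gateway = Gateway(repo, trusted_callers={"web-archive"})
    collection = Collection("web-archive", gateway, public_ids={"site-001"})
    return Interface(collection)
```

Calling build_demo().view("site-001") returns the stored text, while view("site-002") raises PermissionError at the Collection layer before the Gateway or Repository is touched. Keeping the access decision out of the storage layer is the point of the separation.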

15 Challenges of Collecting and Preserving Digital Content Today
- rapid changes in technology,
- the multiplicity of formats,
- hardware and software obsolescence,
- storage media that promise stability, such as CDs, but are subject to unpredictable degradation,
- technical differences among the formats and technical standards,
- for film, television, and sound, as well as early computer files, problems associated with playback.
It is important to clarify what content is being created with new technologies and what problems arise from the nature of these digital objects.

16 Transforming Content for the Future: The formats explored are e-books, e-journals, digital music, digital television, digital video, and Web sites. They present enormous technical challenges in their creation, distribution, and preservation:
- They redefine genres: even such things as books and journals are redefined online.
- They blur the line between published and unpublished.

17 - The new formats demand new approaches to selecting and cataloging.
- Some genres break traditional ties between ownership and preservation.
- They require careful thought about what constitutes the so-called "best edition" of a given work, that is, what should be deposited for copyright, in what format, and under what conditions.
- There is no longer a single preservation solution that works for all formats.

18 E-BOOKS AND E-JOURNALS
Differences between e-books and printed books:
- Because e-books are often read on handheld devices, the size of an e-book screen differs from that of a printed page, so even elementary things such as pages and page numbers need to be rethought for the e-book.
- New standards for online books are needed that allow conformity among different proprietary software approaches.
- The problem is competing standards: the future will be shaped by competition.

19 Intellectual Property Rights Problems with E-Books and E-Journals
- Sharing of text has been the norm in print, protected by the doctrine of first sale.
- Advertising rates for journals, for example, are calculated from an estimate of how many times one copy will be shared.
Current solution:
- Protection is provided through proprietary software and hardware devices, but these are barriers to cost-effective and scalable preservation approaches.

20 Journals: Definition: serial publications that aggregate articles by different authors. Journals usually comprise a great variety of information, from articles and short features to editorial board listings, graphics, photographs, and advertising.

21 Intellectual Property Rights Problems with E-Journals:
- What is a library to preserve?
- What are the implications for science and the need to test and reproduce results if linked data in one article are later unavailable?
- To preserve an article, must one preserve all the links? Does one have a right to?
- What should a publisher do about correcting errata online?
- Does the corrected version supersede the first one?
- Are they both necessary for the historical record?

22 Online advertisements:
- Online ads are targeted at specific audiences, often created "on the fly" (dynamically), and frequently updated.
- In printed magazines, advertisements are rich original resources that provide social, economic, artistic, and other context for the contemporary content they accompany.

23 Problems with advertisements:
- How should libraries preserve the technically complex advertising that supports so many digital periodicals?
- The growing variety of supplemental materials is much more complicated than text.
- All of these should be prime collecting targets for libraries.

24 DIGITAL SOUND RECORDINGS
- The preservation of digital music recordings is exponentially more complex than that of print materials.
- The problems are technical, legal, and economic in scope.
- The technology and media of sound recording are more complicated and fragile than print on paper, and the rights regime surrounding their use is more layered than for print publications.

25 DIGITAL SOUND RECORDINGS II
- It is difficult to determine who owns the rights in the recording itself (as opposed to the music), because there is no central registry of such information before 1972.
- Because of the fragility of analog tape, wax cylinders, acetate discs, and other media on which sounds have been recorded, reformatting is necessary to secure access to all forms of analog recordings into the future.
- Preservation reformatting therefore should and will be digital.

26 Proposed Solution (Property Rights): Creation of protective digital rights management systems such as the Secure Digital Music Initiative (SDMI). SDMI is a digital watermark system, developed to be read by compatible hardware in an effort to prevent illegal duplication of files.

27 Technical Solution:
- The future of audio preservation lies in systematically managing files in a repository (digital mass-storage systems).
- Standards for preservation and repository-related metadata are now being developed. This work will result in refinements of DC definitions as they relate to sound, and in guidelines for documenting technical preservation information.
- In the field of repository management, the Digital Library Federation's Metadata Encoding and Transmission Standard (METS) project is especially promising.

28 DIGITAL TELEVISION AND VIDEO Contemporary history cannot be told without a full record of television (examples: the Gulf War or September 11).

29 DIGITAL TELEVISION AND VIDEO II
- Preserving digital television and video is bound up with machine and media dependencies.
- The problems are a rapid and expensive cycle of technical innovation and obsolescence, the constant need to refresh and reformat from one medium and machine dependency to another, the need for massive storage systems, and a succession of formats that demand new standards.

30 DIGITAL TELEVISION AND VIDEO III
- Digital TV and video nearly always embed a mix of elements, each with its own preservation requirements, which together demand very large-scale storage systems.
- Additional information needs to be preserved with the files.
- Digital rights management systems need to be integrated into the management infrastructure of archives.

31 Solutions: An ‘item-free’ method of distribution will have a great impact on preservation. Instead of moving digital information to tapes, distribution will simply consist of a file transfer to some temporary storage device, which might periodically be wiped clean. Failure to assign clear responsibility for preserving these materials may result in losses. A strong network of institutions willing to claim responsibility for preservation is therefore important.

32 Storage of Videos: MPEG-7 recognizes the value of metadata and provides intellectual property protection for the descriptors themselves as well as for the video content. Of even greater interest will be information-visualization schemes that collect metadata from numerous video clips and summarize those descriptors together.

33 Storage of Television
There are two approaches to storing digital video images:
- We can store whole programs and create databases that contain metadata.
- We can store all of the clips included in a program as separate files and then rely on edit decision lists (EDLs) to serve as blueprints for our broadcasts.
Both options rely on stratification, a system of video annotation that uses time codes to identify marking points within an audio or video object. Descriptions can be linked to these points by storing them with the time-code information.
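Stratification can be illustrated with a few lines of code. The sketch below assumes that each description (a "stratum") is stored with a start and an end time code in plain HH:MM:SS form, and that layers may overlap. The Stratum class, the describe function, and the sample broadcast are invented for illustration and do not come from any standard or from the plan.

```python
from dataclasses import dataclass


def timecode_to_seconds(timecode: str) -> int:
    """Convert an "HH:MM:SS" time code to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in timecode.split(":"))
    return hours * 3600 + minutes * 60 + seconds


@dataclass(frozen=True)
class Stratum:
    """One description attached to a time-code range (hypothetical structure)."""
    start: str
    end: str
    description: str

    def covers(self, timecode: str) -> bool:
        # The start point is included, the end point is not.
        t = timecode_to_seconds(timecode)
        return timecode_to_seconds(self.start) <= t < timecode_to_seconds(self.end)


def describe(strata, timecode):
    """Return every description whose range contains the given time code."""
    return [s.description for s in strata if s.covers(timecode)]


# Invented sample annotations for one stored program.
STRATA = [
    Stratum("00:00:00", "00:30:00", "evening news broadcast"),
    Stratum("00:02:10", "00:05:45", "report: election results"),
    Stratum("00:03:00", "00:03:40", "interview with a candidate"),
]
```

With these sample annotations, describe(STRATA, "00:03:15") returns all three descriptions, while describe(STRATA, "00:10:00") returns only the broadcast-level one. The same lookup works whether the archive keeps whole programs or separate clips, as long as the descriptions are stored with their time codes.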

34 WEB SITES
It is very important for libraries to collect and preserve Web content that is appropriate to the institution and to cultural memory.
- Anyone can create a publicly available Web page; no prior authorization is needed.
- In 2002, the Web comprised more than 550 billion public pages and linked documents.
- The Web grows by 7 million pages a day.
- The mortality rate of Web sites is high: 44% of the sites available in 1998 were gone by 1999.

35 Problems:
- The "surface Web" is limited in many cases, because the linked materials require a license or other authorization to enter.
- The deep Web, where much of the complex and culturally rich material lies, is largely inaccessible to harvesting technologies.
- An average Web page contains 15 links.
- How does one define where one Web site ends and another begins?
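The boundary question becomes concrete as soon as a harvester has to decide which of a page's links to follow. The sketch below shows one possible rule, chosen purely as an assumption for illustration: a link belongs to the site if it resolves to the same host name as the seed page. The plan does not prescribe this rule, and the function names and example URLs are hypothetical.

```python
from urllib.parse import urljoin, urlsplit


def in_site(seed_url: str, link: str) -> bool:
    """Assumed boundary rule: same host as the seed page, reachable over HTTP(S)."""
    seed = urlsplit(seed_url)
    target = urlsplit(urljoin(seed_url, link))  # resolve relative links first
    return target.scheme in ("http", "https") and target.hostname == seed.hostname


def partition_links(seed_url, links):
    """Split a page's links into those inside and outside the site boundary."""
    inside, outside = [], []
    for link in links:
        resolved = urljoin(seed_url, link)
        (inside if in_site(seed_url, link) else outside).append(resolved)
    return inside, outside
```

A same-host rule is easy to apply but crude: it cuts off content a site hosts on another domain and pulls in unrelated material that happens to share a host, which is exactly why the slide treats the boundary as an open problem.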

36 The Web is familiar to librarians as a content-neutral medium that carries text, numbers, and images. The challenge of selecting from the Web may sound as simple as selecting from other media, but the scope of materials is far vaster. The biggest challenge is to identify and capture content of enduring value on the Web. Proposal: it may be advisable to start by capturing online journals, government information, and other items of known value.

37 Intellectual Property Rights: While we can safely say that content on the Web is protected by copyright, can we determine whether or not a document on the Web is published or unpublished? The answer will have significant impact on a library’s ability to capture, preserve, and provide access to that site.

38 OTHER MEDIA
Other media that need to be collected and preserved are even more complex:
- those that produce documents “on the fly” as the result of a query to a database;
- for example, geographic information systems (GIS), which will displace the mass production of maps on paper.
Other formats that currently fill the stacks and storage shelves of libraries are disappearing. Example: correspondence is being replaced by e-mail.

39 Implications of these changes
- Definitions of genres: What is a digital object and what are its boundaries?
- Dynamism of data: How does one select and curate digital objects built of dynamic data?
- Assessment of value: How does one identify enduring value?
- Intellectual property rights: How does one comply with the terms and conditions of use and payment when necessary?

40 - What are the advantages and disadvantages of Web harvesting versus deposit of the source file?
- Best editions: What should be deposited for copyright and in which format(s)?
- User studies: Who is using digital content and how? In what formats do they prefer it?

41 Thinking about Solutions
The Open Archival Information System (OAIS) reference model is a powerful abstract model for digital archiving. OAIS defines roles for three players in archiving: creators (who are not only authors), archive operators (in this case possibly more than one company or institution), and end users.

42 Technical Solutions: We consider archiving more than the raw content. Formats used for ready rendering on the Web frequently differ from the format of content in the underlying publishing system. A publisher may have text marked up in SGML or XML in its asset management system, but deliver HTML or PDF formats to users. The SGML or XML marked-up text will be less sensitive to technological change, but ensuring the ability to re-render it as it was originally displayed will be technically complex.

43 Should Content Be Normalized?
Two levels of normalization:
1. File formats: Controlling the number of file formats will reduce the complexity of format monitoring and migration.
2. Document formats: An archive may choose to normalize all XML-marked-up documents into a common DTD, reducing the complexity of documentation, migration, and interface software.
However, normalization and translation always involve the risk of information loss.
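The trade-off in file-format normalization can be made concrete with a small sketch. It assumes an archive keeps a policy table that maps each incoming format to an archival target and marks whether the conversion may lose information. The table, its lossy flags, and the function names are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical policy: source format -> (archival format, conversion may lose information)
POLICY = {
    "doc": ("pdf", True),
    "html": ("xml", True),
    "sgml": ("xml", False),
    "xml": ("xml", False),
    "pdf": ("pdf", False),
}


def normalize(holdings):
    """Apply the policy to (name, format) pairs.

    Returns the normalized holdings and the names of files whose conversion
    is flagged as a possible source of information loss. Formats without a
    policy entry are kept unchanged.
    """
    normalized, at_risk = [], []
    for name, fmt in holdings:
        target, lossy = POLICY.get(fmt, (fmt, False))
        normalized.append((name, target))
        if lossy:
            at_risk.append(name)
    return normalized, at_risk


def formats_to_monitor(holdings):
    """Return the set of distinct formats the archive has to keep track of."""
    return {fmt for _, fmt in holdings}
```

Running normalize over holdings in five different formats might leave only three formats to monitor and migrate, at the price of a list of files whose originals should probably be retained because their conversion was flagged as lossy.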

44 Conclusion
By reaching out to an ever larger number of creators, conservators, and publishers, who make decisions about digital content every day, the Library of Congress will begin to transform the US information infrastructure. The Library is helping to make decisions that determine whether or not that information will survive into the future. But the Library alone cannot achieve any of these aspirations. Its power lies in its ability to influence the efforts of others and to create an environment in which collaboration is effective.

45 Thank You Questions?