
Mass digitisation? Astrid Verheusen, Project Manager, Research & Development Division, National Library of the Netherlands. LIBER-EBLIDA Workshop on Digitisation.


Presentation transcript:

1 Mass digitisation? Astrid Verheusen, Project Manager, Research & Development Division, National Library of the Netherlands. LIBER-EBLIDA Workshop on Digitisation, Royal Library, Copenhagen, Denmark, 25 October 2007.

2 Koninklijke Bibliotheek – National Library of the Netherlands, LIBER-EBLIDA Workshop on Digitisation. What is mass digitisation? Millions of books rather than millions of pages; no selection/no collections (digitise everything!); mainly books; exclusion of special collections; low quality standards; ignore copyright issues; ignore long-term preservation issues.

3 Koninklijke Bibliotheek - Digitisation in the past. Experience with digitisation since 1995; web exhibitions / highlights of collections; small-scale digitisation projects; mainly visually attractive images; emphasis on techniques / trial and error; exploration of possibilities; co-operation on a small scale.

4 [Slide without text]

5 Koninklijke Bibliotheek - Digitisation 2000-2005. Shift in emphasis: from highlights to larger collections; project based; (inter)national co-operation; established methods and techniques; awareness of digital preservation; more text material & audio/video; further exploration of possibilities → applications made with the digitised material.

6 Memory of the Netherlands

7 Koninklijke Bibliotheek - present & future - 1. Strategic plan 2006-2009: "Development of a national programme for the mass digitisation of sources for research in the humanities". Target audience: scientific research, public at large. Development of standards and services; particular attention for digital preservation; preservation imaging; no commercial partners for funding.

8 Koninklijke Bibliotheek - present & future - 2. Text digitisation: until recently on a small scale; printed and typed sources (not handwritten). Issues differ from images: structure / navigation; conversion to full text (OCR); scanning from microfilm; search & retrieval.

9 Koninklijke Bibliotheek - Projects 2007-2011
Project | Number of pages | Budget
Dutch parliamentary papers 1814-1995 | 2,300,000 | M€ 10.5
Dutch daily newspapers 1618-1995 | 8,000,000 | M€ 12.5
Special collections – books before 1800 | 1,300,000 | M€ 3.0
Radio news bulletins | 1,500,000 | M€ 0.5
Metamorfoze - preservation imaging | 28,000,000? | M€ 24
Atjeh | 200,000 | M€ 0.3
Memory of the Netherlands | 350,000 | M€ 3.5
Total | 42,150,150 | M€ 54.3
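The project figures on this slide can be turned into a rough cost per page. The sketch below is illustrative only: it takes the page counts and budgets as transcribed, treats the uncertain Metamorfoze page count (marked "?" on the slide) as exact, and uses variable names that are not from the presentation. Note that the rows as transcribed sum to 41,650,000 pages, slightly less than the total printed on the slide.

```python
# Page counts and budgets (in million euros) as transcribed from the slide.
# The Metamorfoze page count carries a "?" on the slide; it is assumed exact here.
projects = {
    "Dutch parliamentary papers 1814-1995": (2_300_000, 10.5),
    "Dutch daily newspapers 1618-1995": (8_000_000, 12.5),
    "Special collections - books before 1800": (1_300_000, 3.0),
    "Radio news bulletins": (1_500_000, 0.5),
    "Metamorfoze - preservation imaging": (28_000_000, 24.0),
    "Atjeh": (200_000, 0.3),
    "Memory of the Netherlands": (350_000, 3.5),
}


def cost_per_page(pages: int, budget_meur: float) -> float:
    """Return the budget in euros divided by the number of pages."""
    return budget_meur * 1_000_000 / pages


# Cost per page for each project, plus the overall average across all rows.
per_project = {name: cost_per_page(p, b) for name, (p, b) in projects.items()}
total_pages = sum(p for p, _ in projects.values())
total_budget_meur = sum(b for _, b in projects.values())
overall = cost_per_page(total_pages, total_budget_meur)

for name, euros in per_project.items():
    print(f"{name}: EUR {euros:.2f} per page")
print(f"Overall: EUR {overall:.2f} per page over {total_pages:,} pages")
```

Under these assumptions the overall average comes out at roughly € 1.30 per page, with considerable variation between projects.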

10 Koninklijke Bibliotheek - Issues. Costs of digitisation: € 1.3 per page; costs of exploitation: millions per year from 2011 onwards. Technical infrastructure: storage (1 PB needed); processing 2 million files per month. Search & retrieval is not effective enough; organisational infrastructure is not efficient; the process is too slow, we want to digitise faster and more...
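To give a feel for the scale of these infrastructure figures, here is a back-of-the-envelope calculation. It is only a sketch and rests on assumptions that are not in the slides: a 30-day month, a decimal petabyte (10^15 bytes), one file per page, and the 1 PB covering the page total printed in the project table.

```python
# All constants below except the slide figures are assumptions for illustration.
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30                 # assumption: a 30-day month
BYTES_PER_PETABYTE = 10**15         # assumption: decimal petabyte

files_per_month = 2_000_000         # from the slide
storage_bytes = 1 * BYTES_PER_PETABYTE  # "1 PB needed", from the slide
total_pages = 42_150_150            # total printed in the project table

# Sustained processing rate implied by 2 million files per month.
files_per_day = files_per_month / DAYS_PER_MONTH
files_per_second = files_per_day / SECONDS_PER_DAY

# Average storage per page, assuming the 1 PB covers exactly these pages.
avg_mb_per_page = storage_bytes / total_pages / 10**6

# Months needed to process every page, assuming one file per page.
months_to_process = total_pages / files_per_month

print(f"{files_per_day:,.0f} files per day, {files_per_second:.2f} files per second")
print(f"about {avg_mb_per_page:.1f} MB per page on average")
print(f"about {months_to_process:.1f} months to process all pages")
```

Under these assumptions the pipeline has to handle a file roughly every 1.3 seconds around the clock, which illustrates why manual handling does not scale.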

11 We cannot slow down to make things perfect. The rising tide will lift all boats. (Mass Digitization: Implications for Information Policy. Report from “Scholarship and Libraries in Transition: A Dialogue about the Impacts of Mass Digitization Projects”, symposium held on March 10-11, 2006, University of Michigan, Ann Arbor.)

12 Project management & Organization; Finance

13 Content: Selection & Preparation. Old approaches: much effort spent on selection; ignorance of copyright issues…; minute assessment of missing material; replacement of torn pages.

14 Content: Selection & Preparation. New approaches: less effort on the selection process (integral collections); negotiation/co-operation with publishing sector; limited effort on retrieving missing pages/issues; limited effort on restoration.

15 Content: Digital imaging & metadata. Old approaches: very high quality images; capture as much detail from the original as possible; minimize damage to the original; master & access images; lossless compression (TIFF); experiment with our own scanners.

16 Content: Digital imaging & metadata. New approaches: one format for both access and preservation; new formats to save storage (JPEG2000); outsource all imaging activities; consider .txt as a master…

17 Processing: Quality assurance. Old approaches: high standards for quality assurance (often manual); expensive Document Management System for quality control.

18 Processing: Quality assurance. New approaches: not realistic to check quality for all files; we need automatic quality assurance tools; OCR often not involved in quality assurance.

19 Search & Retrieval. Old approaches: find the best search engine; search in metadata; digitise text without OCR; we decide what the user wants.

20 Search & Retrieval. New approaches: all text digitisation projects include OCR; search through millions of pages of text; experiment with tools for enhanced access & text mining; growing awareness that we have to involve our users.

21 Storage. Old approaches: storage on CD-ROM and DVD; master files in e-Depot: 1 petabyte needed; storage of all master files for the long term; access files are stored in a different system.

22 Storage. New approaches: storage strategy which balances costs, access and preservation; alternative file formats to minimize storage costs & increase throughput for delivery and transfer; use one file both as master and access file.

23 Finance. All costs are now specified. Division of budget: 30 % staff; 10 % hard- & software; 10 % research & development; 50 % digitisation, OCR & metadata. Exploitation costs are becoming ‘dramatic’; new business models.
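As a rough illustration of what this division means in absolute terms, the sketch below applies the percentages to the M€ 54.3 total from the 2007-2011 project table. Applying the split to that particular total is an assumption made here for illustration; the slides do not say which budget the percentages refer to.

```python
# Budget shares in percent, as listed on the slide.
shares_percent = {
    "Staff": 30,
    "Hard- & software": 10,
    "Research & Development": 10,
    "Digitisation, OCR & metadata": 50,
}

# Assumption: the split applies to the M EUR 54.3 total of the 2007-2011 projects.
total_budget_meur = 54.3

# Sanity check: the shares should cover the whole budget.
if sum(shares_percent.values()) != 100:
    raise ValueError("budget shares do not add up to 100 %")

# Amount per category in million euros.
allocation_meur = {
    category: total_budget_meur * percent / 100
    for category, percent in shares_percent.items()
}

for category, amount in allocation_meur.items():
    print(f"{category}: M EUR {amount:.2f}")
```

On that assumption, only about half of the total goes to the digitisation work itself; the rest covers staff, infrastructure and research.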

24 Organisation. All digitisation activities in R&D department: involvement of other parts of the library is necessary. Digitisation & digital preservation are separate activities: integration is necessary. Digitisation activities are all project based: integration with standing organisation is necessary.

25 Conclusion. ‘Holding out for an ideal solution is often not feasible; moreover, implementing less-than-perfect solutions can enable us to be flexible, modular, and nimble so that we can continue to refine our strategies as new options become available’. (Preservation in the Age of Large-scale Digitization, a white paper by Oya Y. Rieger, Council on Library and Information Resources.)

26 Thank you! Astrid.Verheusen@kb.nl

