Digital library projects in the Nordic national libraries
Juha Hakala, Helsinki University Library – The National Library of Finland



Contents
Introduction
Current projects
  – Nordic Web Archive
  – Scandinavian Virtual Union Catalogue
  – Identification of electronic resources
Some shared challenges
  – Legal deposit
  – Long-term preservation of electronic resources

Introduction
Nordic national libraries have important roles in their communities
  – Format & cataloguing rules maintenance: all but DK are shifting towards MARC21
  – National bibliography: all (DK: music)
  – Article index: FI, IS, NO, SE (as part of Libris)
  – Union catalogue host: FI, IS, NO, SE
  – Large-scale digitisation: all (especially NO); project TIDEN

Introduction (2)
There is a long tradition of co-operation between the Nordic libraries in general
  – Meetings of the national librarians (NORON)
  – Topical meetings since at least the 1920s, ranging from library science students to diverse professional communities, including e.g. ILL specialists
  – Joint projects, funded e.g. by Nordinfo & Nordunet2

Nordic Web Archive
Partners: all Nordic national libraries
Funding: libraries + Nordunet2
Aim: archive the freely available Web documents for future generations as part of each library's legal deposit obligation
Free access to the index, limited access to the deposited documents

NWA – Background
Kulturarw3 project, Kungliga biblioteket
  – proved the feasibility of Web archiving
  – the Swedish Web space has been archived several times with a modified Combine harvester
NEDLIB project
  – EU-funded initiative, many national libraries involved
  – developed the NEDLIB harvester, using the KB experiences as the starting point

NWA tools
Web harvesting and archiving is done by the NEDLIB harvester (except in SE)
  – It is open source and optimised for Web archiving purposes: archiving module, weird scheduling principles, archive metadata (MD5 checksum, time stamp, original URL)
  – Multiple users: better maintenance and development
  – “Combat proven”: strengths and weaknesses are known reasonably well
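The archive metadata listed above (MD5 checksum, time stamp, original URL) can be illustrated with a minimal Python sketch. This is not the NEDLIB harvester's code; the record layout, the field names and the function `describe_harvested_file` are assumptions made only for illustration.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ArchiveRecord:
    """Hypothetical metadata record for one harvested file."""
    original_url: str   # where the file was fetched from
    md5_checksum: str   # hex digest of the stored bytes
    harvested_at: str   # ISO 8601 time stamp


def describe_harvested_file(original_url: str,
                            content: bytes,
                            harvested_at: Optional[datetime] = None) -> ArchiveRecord:
    """Build the three metadata fields for a harvested file."""
    if harvested_at is None:
        # Assumption: time stamps are recorded in UTC.
        harvested_at = datetime.now(timezone.utc)
    return ArchiveRecord(
        original_url=original_url,
        md5_checksum=hashlib.md5(content).hexdigest(),
        harvested_at=harvested_at.isoformat(),
    )
```

Storing the checksum next to the URL and time stamp makes it possible to verify later that the archived bytes have not changed.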

NWA tools (2)
Indexing is done by a search engine built by the Norwegian company FAST
  – Can process billions of files (present need: tens of millions of files)
  – Can handle >200 file formats via conversion to HTML prior to indexing
  – Can recognise a large number of languages

NWA tools (3)
Diverse additional modules are under development in national libraries in order to facilitate access to the archived files
These tools will most likely be available in the public domain, like the harvester
  – The search engine is the only commercial module in the package

Archiving results: Finland
Harvesting of *.fi was completed in 3/2002
  – A few weeks of processing with Sun E million URLs, 9.4 million files
  – Same proportion of duplicates as in Iceland
After compression, the archive is 340 GB
  – Storage on tape robot in CSC (Finnish NCSA)
Next step: Finnish pages in other domains
  – Co-operation with InfoCenter Finland
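The slide does not say how duplicates were identified. A minimal sketch, assuming that two harvested files count as duplicates when their content checksums match, could look like this; the function names and the MD5-based comparison are illustrative assumptions, not a description of the harvester.

```python
import hashlib
from collections import defaultdict


def group_by_checksum(files):
    """Map each MD5 hex digest to the sorted list of URLs sharing that content.

    `files` is a mapping from URL to the harvested bytes.
    """
    groups = defaultdict(list)
    for url, content in files.items():
        groups[hashlib.md5(content).hexdigest()].append(url)
    return {digest: sorted(urls) for digest, urls in groups.items()}


def duplicate_ratio(files):
    """Share of files whose content repeats that of another harvested file."""
    if not files:
        return 0.0
    unique_contents = len(group_by_checksum(files))
    return (len(files) - unique_contents) / len(files)
```

With four harvested files of which two have identical content, `duplicate_ratio` reports that one file in four is redundant.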

Archiving: experiences
The Internet is a dirty place
  – Quality of data and (some) applications is appalling
  – Any tool dealing with a large number of Web resources must be extremely robust
Very important to have an encompassing list of start pages

Archiving: problems
The cost of storing the bits is small, for now
  – How will the ratio between the size of the Web and the price of storage develop in the future?
Preserving access is easy for HTML, JPEG and GIF (97 % of the archive content)
  – The rest will be a problem in the future
How to get to the “deep Web”?
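A figure such as the 97 % share of HTML, JPEG and GIF can be derived from a per-file list of formats. The sketch below assumes that each archived file has a recorded MIME type and that the three "easy" formats correspond to the MIME types listed in `EASY_TYPES`; both are assumptions made for illustration.

```python
from collections import Counter

# Assumption: MIME types standing in for "HTML, JPEG and GIF".
EASY_TYPES = {"text/html", "image/jpeg", "image/gif"}


def format_shares(mime_types):
    """Return the fraction of files per MIME type."""
    counts = Counter(mime_types)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {mime: n / total for mime, n in counts.items()}


def easy_share(mime_types):
    """Fraction of files in formats assumed to be easy to preserve."""
    shares = format_shares(mime_types)
    return sum(share for mime, share in shares.items() if mime in EASY_TYPES)
```

The complement of `easy_share` is the portion of the archive whose long-term accessibility needs separate attention.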

Scandinavian Virtual Union Catalogue
Partners: national libraries, Bibsys (Norway) & Dansk BiblioteksCenter
Funding: partners + Nordinfo
Aim: free use of national union catalogues to the consortia maintaining these databases
  – Each partner “pays” with its data for access to all other systems
1st contract

SVUC – databases
DK: Danbib
FI: Linda & Manda
IS: Gegnir
NO: Bibsys, Sambok
SE: Libris
Approximately 15-20 million records
New databases and partners may be added in the future

SVUC – services
Searching: directly via Z39.50 connection, possibly also via Web portals
Copy cataloguing
  – Via Z39.50, using the Bath profile; One-2 profile support also possible
Future extensions (e.g. ILL and document delivery) are likely; no schedule yet

Identification of electronic resources
Nordic national libraries participate actively in the development of e.g. ISBN, ISSN and Uniform Resource Names (URNs)
Common principles and some shared software development in the implementation of URNs based on national bibliography numbers
  – Initial development of e.g. the URN generator software in co-operation with Netlab
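As an illustration of URNs based on national bibliography numbers, here is a minimal Python sketch. It assumes the basic form described in RFC 3188 (`urn:nbn:`, an ISO 3166 country code, a hyphen, and the NBN string). It is not the URN generator mentioned above; the function name, the simplified character check and the sample NBN string are hypothetical.

```python
import re

# Assumption: a two-letter ISO 3166 country code, written in lower case.
_COUNTRY = re.compile(r"[a-z]{2}")
# Assumption: a simplified character set for the NBN string.
_NBN = re.compile(r"[A-Za-z0-9._:-]+")


def make_urn_nbn(country_code: str, nbn: str) -> str:
    """Build a URN of the form urn:nbn:<country>-<NBN string>."""
    country = country_code.strip().lower()
    nbn = nbn.strip()
    if not _COUNTRY.fullmatch(country):
        raise ValueError(f"not a two-letter country code: {country_code!r}")
    if not _NBN.fullmatch(nbn):
        raise ValueError(f"unsupported characters in NBN string: {nbn!r}")
    return f"urn:nbn:{country}-{nbn}"
```

For example, `make_urn_nbn("FI", "fe20021234")` returns `urn:nbn:fi-fe20021234`, where the NBN string is invented for the example.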

Shared challenges: legal deposit
All Nordic countries have either recently revised their legal deposit acts or are in the midst of the process
  – Lots of sharing of ideas is taking place
Revision of the Copyright Act (in order to align it with the EU Copyright Directive) is under way
Legal platform for deposit and preservation of electronic resources will be built; lobbying needed to guarantee its suitability for (national) libraries

Long-term preservation
NEDLIB provided a good starting point; since then the activities have shifted to the domestic level
  – There is a risk of re-inventing the wheel, e.g. in the development of preservation metadata
Need for European / global co-operation
  – OCLC/RLG Preservation metadata WG
  – development and evaluation of tools

Summary
The shift from bibliographic data to “full text” is well under way
This creates legal and technical challenges, which the Nordic national libraries are solving together
For historical and organisational reasons local priorities differ, but there are a lot of shared activities

Links
NWA -
NEDLIB -
NEDLIB harvester
SVUC -
URN - charter.html