WEB ARCHIVING IN THE BRITISH LIBRARY. John Tuck, Head of British Collections. February 2004.

1 WEB ARCHIVING IN THE BRITISH LIBRARY
John Tuck, Head of British Collections
February 2004

2 BRITISH LIBRARY: CONTEXT
• Created by British Library Act 1972.
• National Library of the United Kingdom.
• Origins from 1753.
• One of the world’s greatest research libraries.
• 160 million collection items.

3 BRITISH LIBRARY: COLLECTION DEVELOPMENT
• Building as completely as possible the UK national published archive: current and retrospective gap filling; print and electronic.
• Collecting research-level English-language material published world-wide in the humanities, social sciences, STM.
• Buying foreign-language material selectively.
• Material acquired through: legal deposit, voluntary deposit from publishers, purchase, donation, exchange.

4 LEGISLATION
• Legal Deposit Libraries Act 2003: enabling legislation.
• VDEP: Voluntary Deposit of Electronic Publications.

5 DOMAIN.UK
• Six-month experiment to select and capture 100 UK web-sites, 2001.
• Audit change, loss, links, etc.
• Determine next steps.

6 DOMAIN.UK: Why?
• Short-lived nature/changing content of many web-sites.
• Loss of information.
• Increasing reference to web-sites in research/scholarship.

7 DOMAIN.UK: Voluntary/Rights Cleared Approach
• Voluntary.
• Requiring explicit agreement of website publishers to take part in pilot.
• No public access.

8 DOMAIN.UK: Selection
• Websites of historical or cultural significance.
• Cross-section of Dewey Decimal Classification.

9 DOMAIN.UK: Process
• E-mail selected sites for approval and to check whether already archived.
• Measure sites for links, size, change, etc.
• Frequency of visits: every three weeks or more in some cases.
• Supported by those sites approached.
• Report recommended scaling up.
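The "measure sites for links, size, change" step can be pictured with a small sketch. This is not the tooling used in the Domain.uk pilot, which the slides do not describe. It only illustrates one plausible way to record size and link count for a captured page and to detect change between two visits by comparing content hashes. The names `LinkCounter`, `measure` and `has_changed`, and the sample HTML strings, are hypothetical.

```python
import hashlib
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Collects the href value of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def measure(html: str) -> dict:
    """Return size, link count and a content hash for one captured page."""
    parser = LinkCounter()
    parser.feed(html)
    data = html.encode("utf-8")
    return {
        "size_bytes": len(data),
        "link_count": len(parser.links),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def has_changed(earlier: dict, later: dict) -> bool:
    """Two captures differ if their content hashes differ."""
    return earlier["sha256"] != later["sha256"]


# Assumed sample data: the same page captured on two visits some weeks apart.
visit_1 = measure('<p>News</p><a href="/about">About</a>')
visit_2 = measure('<p>News</p><a href="/about">About</a><a href="/new">New</a>')

print(visit_1["link_count"], visit_2["link_count"])
print(has_changed(visit_1, visit_2))
```

A byte-level hash flags any difference, including trivial ones such as a changed timestamp in the page footer, so a real audit would probably need a more tolerant comparison.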

10 BRITISH LIBRARY WEB ARCHIVING PROGRAMME
• Building on Domain.uk.
• BL to play leading role in collecting UK web presence in partnership with other institutions nationally and internationally.
• Selective approach.

11 BRITISH LIBRARY WEB ARCHIVING PROGRAMME contd.
• Co-ordinate a snapshot of entire UK web presence at occasional intervals.
• Achieve more regular capture of limited and well-defined range of sites.
• Sites judged to be research-level, whether in terms of stated intentions of sites themselves or of potential to be primary resources for research.

12 WEB ARCHIVING PROGRAMME
• Comprises a series of complementary projects and activities.
• Operates entirely on a voluntary, rights-cleared basis pending secondary legal deposit legislation.
• Aims to embed web archiving within the BL's overall collection development policy.
• Aims to provide the infrastructure to collect, preserve and make accessible web-site material alongside material in other formats.

13 WEB ARCHIVING PROGRAMME STRANDS
Four main strands:
• Definition of collection development policy.
• UK Web Archiving Consortium.
• International Internet Preservation Consortium.
• Internet Archive: incunabula of the internet.

14 COLLECTION DEVELOPMENT
• Appointment of Curator, Web Archiving.
• Extension of policy defined for Domain.uk.
• Sites of national, historical and cultural significance.
• Research level now/in the future.

15 UK WEB ARCHIVING CONSORTIUM
• Two-year project.
• Six partners: BL (lead), National Library of Scotland, National Library of Wales, National Archives, Joint Information Systems Committee, Wellcome Library.
• Plan to use PANDAS software developed by National Library of Australia.
• Rights to use individual sites to be cleared with rights-holders.

16 UK WEB ARCHIVING CONSORTIUM contd.
• Procurement exercise in progress to recruit supplier to host service.
• Intention to let contract in April 2004 and to be operational in summer 2004.
• Sites to be made accessible to users.
• Each partner to collect up to 500 sites per year, i.e. 6,000 during project.
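The 6,000 figure follows from the numbers on the two consortium slides: six partners, up to 500 sites per partner per year, over a two-year project. A minimal sketch of that arithmetic, with a hypothetical function name chosen only for illustration:

```python
PARTNERS = 6                      # BL plus five other consortium members
SITES_PER_PARTNER_PER_YEAR = 500  # stated as an upper limit ("up to")
PROJECT_YEARS = 2                 # two-year project


def project_ceiling(partners: int, sites_per_year: int, years: int) -> int:
    """Upper bound on sites collected across all partners for the whole project."""
    return partners * sites_per_year * years


total = project_ceiling(PARTNERS, SITES_PER_PARTNER_PER_YEAR, PROJECT_YEARS)
print(total)  # 6000
```

Because each partner's quota is a ceiling, 6,000 is a maximum rather than a forecast.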

17 INTERNATIONAL INTERNET PRESERVATION CONSORTIUM
• Project involving national libraries.
• Led by Bibliothèque nationale de France.
• Also includes BL, Library of Congress, Library and Archives of Canada, Nordic countries, Italy, Australia, Internet Archive.

18 INTERNATIONAL INTERNET PRESERVATION CONSORTIUM contd.
• Aims to develop automated web-crawler mechanism.
• Open-source tools to search web at regular intervals matching agreed collection development policies.
• Working groups in: access tools, content management, deep web, framework, metrics and test-beds, researcher requirements.
• Developmental at this stage.

19 INTERNET ARCHIVE
• Collecting and saving sites since 1997.
• Wayback Machine.
• Legal, technical and procurement issues.

20 SOME CHALLENGES
• Defining UK.
• Rapid technology change.
• Third party rights (not always subject to UK law).
• Libel/defamation issues.
• Software issues / which platform?
• Validity of a snapshot.

21 SOME CHALLENGES contd.
• Formats for archiving.
• Metadata standards.
• Archiving ‘look and feel’.
• Authenticity.

