Web archiving at the NLA: ‘Archiving the music web’. Music Council of Australia Annual Assembly, 28 September 2009. Paul Koerbin, Manager Digital Archiving.


1 Web archiving at the NLA
  ‘Archiving the music web’
  Music Council of Australia Annual Assembly, 28 September 2009
  Paul Koerbin, Manager Digital Archiving, National Library of Australia

2 1. Background – the what, why and how
  2. What makes a valuable resource for archiving?
  3. What can you do to help?

3 What is web archiving about and why do it?
  Archiving = long-term preservation and access
  Building collections
  Building ‘documentary’ historical record
  Creating artefacts from the web experience
  Discovering what is produced online
  An act of consciousness

4 What’s involved in web archiving?
  At the NLA it’s:
  Identifying, selecting, scoping
  Seeking permission to collect and make accessible
  Creating and recording metadata
    – administrative, descriptive, preservation
  Crawling/harvesting (including scheduling)
  Processing for quality assurance (best effort)
  Storing and maintaining the data
  Planning and implementing preservation strategies
  Preparing and rendering for public display
  Providing access and discovery mechanisms

5 What is the NLA doing?
  PANDORA Archive 1996→
    – PANDORA participants: NLA, state libraries (not Tas), NFSA, AWM, AIATSIS (and soon the NGA)
    – Highly selective, small scale, ‘quality’ collection, open access
    – PANDAS workflow management system, 2001→
  Australian (.au) domain harvests
    – Annual since 2005
    – Internet Archive
    – No access (yet)

6 Comparative statistics of NLA web collections
  PANDORA (selective)
    Files: 73 million
    Size:  3.26 TB
  .au Domain Harvests, by year
                   2005         2006         2007         2008
    Unique files   185 million  596 million  516 million  1 billion
    Hosts crawled  811,523      1,046,038    1,247,614    3,038,658
    Size           6.69 TB      19.04 TB     18.47 TB     34.55 TB
  .au Domain Harvests, total
    Files: 2.3 billion
    Size:  78.75 TB
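As a quick consistency check on the figures above, here is a minimal Python sketch that sums the four annual .au harvests and compares the result with the stated totals. The variable names are illustrative only. The numbers come from the slide, on the assumption that the 2006 size is in TB like the other years and that "1 billion" means 1000 million.

```python
# Annual .au domain harvest figures as given on the slide.
# Assumption: the 2006 size (19.04) is in TB, like the other years.
sizes_tb = {2005: 6.69, 2006: 19.04, 2007: 18.47, 2008: 34.55}

# Unique files in millions; "1 billion" is taken as 1000 million.
unique_files_millions = {2005: 185, 2006: 596, 2007: 516, 2008: 1000}

# Round to two decimals to avoid floating-point noise in the sum.
total_size_tb = round(sum(sizes_tb.values()), 2)
total_files_billions = sum(unique_files_millions.values()) / 1000

print(f"Total size:  {total_size_tb} TB")
print(f"Total files: {total_files_billions:.1f} billion")
```

Under those assumptions the yearly sizes add up to the stated 78.75 TB, and the yearly file counts come to roughly the stated 2.3 billion.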

7 Music in the PANDORA Archive
  500+ titles available from the PANDORA public listing of music
    – NFSA 33%
    – NLA 30%
    – Others 37%
  Musicians, bands, orchestras, composers, organisations, festivals, blogs, instrument makers, magazines …
  Plus 280 considered but not available
    – 35% (no permission, rejected, yet to be selected)

8 What makes a valuable resource for archiving?
  Content
    – substantial, original
  Provenance
  ‘Long-term research value’
  Cultural or social significance and interest
    – including events
  Curatorial/expert suggestion (e.g. Music Australia)
  Different collecting approaches based on ‘value’
  Priorities, but never say never

9 How can you help? 10 tips:
  1. Think about the issue of long-term access
     – what is your intention?
  2. Communicate interest and intentions
     – with collecting institutions; let us know about your site
     – respond to requests for permission
  3. Organise and structure sites simply
     – it’s all about links
  4. Comply with standards
     – limit use of proprietary technology if possible
  5. Make it robot friendly
     – indexing, discovery, capture

10 How can you help? 10 tips:
  6. Keep contributors informed and involved
     – make sure contributors understand and agree to long-term preservation and access from the beginning
  7. Clear copyright, rights and contact information
     – it helps to know what and who (oh, and trust us too)
  8. Maintain content online as much as possible
     – increases the chance of it being collected
  9. Learn to love and live with your past
     – archives are not the same as the ‘live’ web
     – archived versions cannot be altered
  10. Do your own backup, of course
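One way a site owner might act on tip 5 ("Make it robot friendly") is to check that the site's robots.txt does not accidentally block crawlers from the content they would like archived. The sketch below uses Python's standard `urllib.robotparser` on an in-memory sample file. The user agent name `example-archive-bot`, the `band.example` URLs and the sample rules are all hypothetical; the slides do not name the crawler used for PANDORA or the domain harvests.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() accepts the robots.txt content as a list of lines,
    # so no network request is needed for this check.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt: everything is open except /private/.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical crawler name and site URLs, for illustration only.
print(is_crawlable(SAMPLE_ROBOTS_TXT, "example-archive-bot",
                   "https://band.example/music/album.html"))
print(is_crawlable(SAMPLE_ROBOTS_TXT, "example-archive-bot",
                   "https://band.example/private/drafts.html"))
```

With the sample rules, the first URL is reported as crawlable and the second is not. To check a live site, paste in the contents of its own robots.txt and the pages that matter most.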

11 PANDORA: Australia’s Web Archive
  http://pandora.nla.gov.au/

