1 Separating the wheat from the chaff: Identifying key elements in the NLA .au domain harvest
Preservation for Ongoing Accessibility research group: Professor Ross Harvey, Dr Bob Pymm, Dr Anne Lloyd, Geoff Fellows, Jake Wallis

3 Pandora - http://pandora.nla.gov.au
NLA solution to website preservation
Archive of over 1.7 terabytes of data
Selective - identifies specific sites for harvest and gains permission to archive

5 Internet Archive - http://www.archive.org/
Automated
Harvests ‘the web’
Issues?
– cost
– reliability of the crawl, e.g. the deep web

6 .au Harvest
By Internet Archive
First ran 2005 - producing 6.9 terabytes of data, 185 million unique files
Issues?
– difficulties with certain file types
– password-protected sites
– difficulty in accessing the ‘deep’ web

7 .au Harvest
September 2006 – more sophisticated crawl
19 terabytes of data, 596 million files
Predominant dataset for POA group

8 Research potential?
Digital preservation
Australian digital culture

9 Three broad questions
What are the contents of the harvests?
How can access be provided to this content?
What is the value of the domain harvests in relation to the NLA’s overall web preservation interests?

11 Blogs
low skill threshold technology
as barometer of engagement
social space
catalyst for online community
a new and important collecting point for digital cultural heritage

12 Archiving and preserving blogs
How to identify Australian-specific material?
What to capture
– selection criteria?
– linked material?
Frequency of capture to ensure accurate representation
Provision of access to harvested blog content

13 Aspirations
a conceptual framework for studies in digital anthropology
a broadening of voices within the Australian public sphere

14 Questions/comments?

