Caught in the Web: Web Archiving at U of A Libraries Geoff Harder and Kenton Good Digital Preservation Seminar | March 5, 2010 | University of Alberta.

Official children’s site of the 2000 Sydney Olympics - MIA:

GeoCities: ocities_we_forgot_you_still_existed.html

Mind the Gap - UK
“If websites continue to disappear in the same way as those on President Bush and the Sydney Olympics - perhaps exacerbated by the current economic climate that is killing companies - the memory of the nation disappears too. Historians and citizens of the future will find a black hole in the knowledge base of the 21st century.”
Quote: ernet-heritage

“New definitions need to be created for determining the scope of digital special collections, so that stakeholders can understand the nature of special collections professionals’ responsibilities. These include a responsibility for harvesting and preserving endangered web sites, wikis and other dynamic information resources.”
Digital Special Collections
Special Collections in ARL Libraries – March 2009
A Discussion Report from the ARL Working Group on Special Collections

Looking ahead…
• 234 million – The number of websites as of December
• 47 million – Added websites in
• 126 million – The number of blogs on the Internet (as tracked by BlogPulse).
• 27.3 million – Number of tweets on per day (November, 2009)
• 350 million – People on
• 4 billion – Photos hosted by (October 2009).
• 12.2 billion – Videos viewed per month on in the US (November 2009).

Does the web matter? Only if our cultural, historical, political, economic, and social memories matter.
• Valuable BUT vulnerable – e.g. a foundation loses funding; can only afford digital publishing.
• Research and analysis – a longitudinal view requires a complete picture.
• SOMEONE needs to take responsibility for it.

Web Archiving
“Web archiving is the process of collecting portions of the World Wide Web and ensuring the collection is preserved in an archive, such as an archive site, for future researchers, historians, and the public. Due to the massive size of the Web, web archivists typically employ web crawlers for automated collection.”
Wikipedia, “Web Archiving”

How web archiving works
A web crawler (ant, bot) is a computer program that browses and harvests (captures, collects) the World Wide Web in a methodical, automated manner.
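The crawler description above can be illustrated with a minimal sketch. It is not how Archive-It or any production crawler is implemented; it only shows the basic loop of fetching a page, capturing it, extracting its links, and queueing those that stay on the seed's host. The names `LinkExtractor`, `crawl`, and `FAKE_SITE`, and all the example.org URLs, are hypothetical, and the "site" is an in-memory dictionary so the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=50):
    """Breadth-first harvest from a seed URL, staying on the seed's host.

    fetch is any callable mapping a URL to HTML text, or None on failure.
    Returns a dict of URL -> captured HTML.
    """
    host = urlparse(seed).netloc
    queue = deque([seed])
    seen = {seed}
    captured = {}
    while queue and len(captured) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # page could not be retrieved; skip it
        captured[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return captured


# Hypothetical in-memory site standing in for real HTTP requests.
FAKE_SITE = {
    "http://example.org/": '<a href="/about.html">About</a> '
                           '<a href="http://other.example/">Elsewhere</a>',
    "http://example.org/about.html": '<a href="/">Home</a> '
                                     '<a href="/news.html#top">News</a>',
    "http://example.org/news.html": "<p>No links here.</p>",
}

archive = crawl("http://example.org/", FAKE_SITE.get)
print(sorted(archive))
```

In this sketch the off-host link is never followed, which is a simplified stand-in for the scoping decisions a web archivist makes when choosing what a crawl should collect.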

ARCHIVE-IT

Web Archive Admin Screen

HCF Collection

Seed Management

Reports

Reports

File Type Report

Blocked Content Robots.txt
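
Content can be blocked from a crawl when a site's robots.txt file disallows it. As a rough illustration only, and not a reproduction of the report shown on this slide, the sketch below uses Python's standard `urllib.robotparser` to check URLs against a set of rules. The rules, the `example-archiver` user agent, the `allowed` helper, and the URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules: every crawler is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(url, user_agent="example-archiver", robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed("http://example.org/index.html"))
print(allowed("http://example.org/private/report.html"))
```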

Web Archive Launch Page

Exposing Hidden Content

U of A Web Archive
Partner with Internet Archive on the use of Archive-It
Three targets (criteria: thematic, regional, event-based, organizational):
1) Heritage Community Foundation (collection at risk)
2) University of Alberta websites
3) Western Canadian materials (e.g. political websites)

A few resources
University of Alberta Web Archive:
Archive-it! and Wayback Machine
IIPC – International Internet Preservation Consortium
Use Cases for Access to Internet Archives, IIPC Access Working Group,
Special Collections in ARL Libraries, Report March 2009
GoC Web Archive

Thanks
Geoff Harder, Digital Initiatives Coordinator
Kenton Good, Web Development Librarian