1 What is the Internet Archive We are a Digital Library Mission Statement: Universal access to human knowledge Founded in 1996 by Brewster Kahle in San.

Presentation transcript:

1 What is the Internet Archive?
We are a digital library
Mission statement: Universal access to human knowledge
Founded in 1996 by Brewster Kahle in San Francisco, California
Officially designated a library by the state of California (2007)

2 Archive-It
First deployed in February 2006
Web-based application that allows users to create, manage and preserve collections of digital web content
Functions include: selection and scoping, harvesting, reports and analysis of captures, cataloging with metadata, full-text search
Archived content includes: text, HTML, video, audio, images, PDF, online newspapers, social networking and more
Includes hosting, access and storage (primary and back-up)
Archived content is available for viewing 24 hours after a crawl has completed

3 The Tools Behind Archive-It
Open source technology primarily developed by the Internet Archive, the open source community, and the IIPC
Heritrix: web crawler that crawls and captures pages
Wayback Machine: access tool for rendering and viewing archived web pages; surf the web as it was
NutchWAX: open source search engine providing standard full-text search
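
As a rough illustration of "surf the web as it was", the sketch below builds a replay address for a page at a given moment. It assumes the URL pattern of the public Wayback Machine (a prefix, a 14-digit UTC timestamp, then the original URL); Archive-It collections are served from their own access pages, so the prefix and the helper name `wayback_url` are assumptions for illustration only.

```python
from datetime import datetime, timezone

# Assumed prefix: the public Wayback Machine. An Archive-It collection
# would use a different, collection-specific prefix.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url: str, when: datetime) -> str:
    """Build a replay URL for original_url as captured around `when`.

    `when` must be timezone-aware; it is converted to UTC and rendered
    as a 14-digit YYYYMMDDhhmmss timestamp.
    """
    timestamp = when.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


if __name__ == "__main__":
    moment = datetime(2008, 7, 1, 12, 0, 0, tzinfo=timezone.utc)
    print(wayback_url("http://example.com/", moment))
```

If no capture exists at exactly the requested time, the replay tool generally serves the nearest capture it holds.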

4 Who Uses Archive-It
130 partners in 42 states and 12 countries
35% University and College Libraries
30% State Archives and Libraries
15% Non-Government Non-Profits
9% National Libraries/Federal Institutions
7% K-12 Schools
2% Cities and Public Libraries
2% Museums and Art Libraries

6 Archive-It Web Application

7 Why Archive Social Networking Sites?
State agencies and officials: An increasing number have decided that the content on these sites is a record and needs to be archived.
University libraries: These sites are used to share information with students and alumni, and contain important records about a school's culture, student body and campus events.
Researchers: Archives preserve valuable social reactions and change on topics of interest.
Currently about 20 Archive-It partners are archiving content from these sites.

8 North Carolina State Archives & State Library of North Carolina
Purpose: archive state agency websites and publications
Includes pages in a variety of formats: text, images, audio, video and social networking sites
Archive-It partner since 2005 (pilot partner)

9 North Carolina State Archives & State Library of North Carolina

10 North Carolina State Archives & State Library of North Carolina

11 Library of Virginia
Purpose: preserve websites relating to Virginia government and elections
Collection on the current Governor includes Twitter and Flickr sites
Collection on Twitter, Flickr, and Facebook sites of politicians and political organizations in Virginia

12

13

14 Stanford University, Islamic and Middle Eastern Collection
Purpose: harvest and preserve Iranian blogs
Archiving over 300 blogs written by and for Iran and the Iranian people
Archiving sites from Twitter, Facebook, and YouTube selected by the collection's curators
Partner since February 2008, funded by the Library of Congress

16

17 University of Texas, San Antonio
Purpose: archive university websites, student organizations, academic departments, and other local topics important to their university
Archiving blogs, Facebook, Twitter, Flickr, MySpace
Partner since 2008

18

19

20 Typical Challenges
Content behind log-ins cannot be archived.
Content can be blocked by robots.txt files (which our crawlers respect by default).
Some parts of sites are not "archive-friendly" (e.g. complex JavaScript, Flash).
These sites tend to change both their technical structure and their policies quickly and often.
The structure of the sites and their URLs means users need to add scoping rules to capture only the content they are interested in.
Each site has its own unique set of challenges.
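
The robots.txt and scoping points can be made concrete with a small sketch. It uses Python's standard `urllib.robotparser` to honor a robots.txt file and a regular expression as a stand-in for a scoping rule. The robots.txt content, the `example.com/exampleuser` profile, the user agent string, and the helper names are all hypothetical; this is not how Heritrix or the Archive-It application implements either check.

```python
import re
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt served by the site being archived.
ROBOTS_TXT = """\
User-agent: *
Disallow: /exampleuser/private/
"""

# Hypothetical scoping rule: capture only pages under one profile,
# not the rest of the social networking site.
IN_SCOPE = re.compile(r"^https?://(www\.)?example\.com/exampleuser(/|$)")


def build_robot_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set that can be queried."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def should_capture(url: str, rules: RobotFileParser,
                   user_agent: str = "example-crawler") -> bool:
    """Capture a URL only if it is in scope and robots.txt allows it."""
    return bool(IN_SCOPE.match(url)) and rules.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_robot_rules(ROBOTS_TXT)
    for url in (
        "https://example.com/exampleuser/photos",
        "https://example.com/exampleuser/private/notes",
        "https://example.com/otheruser/photos",
    ):
        print(url, should_capture(url, rules))
```

Checking scope first keeps the crawl narrow before robots.txt is consulted; the second URL shows content that is in scope but still blocked by the site's robots.txt.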

21 Overall Approaches
Trial and error: try to harvest with a variety of settings
Quality review: review archived content thoroughly
Collaborate: compare approaches and results with other Archive-It users
Document: write up detailed instructions, lessons learned, and best practices for other partners

22 Thank you!
Kate Odell
Partner Specialist, Internet Archive