Joanne Archer University of Maryland Libraries

Presentation on theme: "Joanne Archer University of Maryland Libraries"— Presentation transcript:

1 Problems and Issues in Selecting, Harvesting, and Cataloging Web Resources
Joanne Archer, University of Maryland Libraries

2 Web Harvesting Jargon
Crawler, Seed, Crawl, Harvest

3 Wayback Machine

4 Options for Web Harvesting
In House Program (e.g. Pandora, Web Curator Tool); Pro: flexibility; Con: $$$
Third Party Subscription (e.g. Web Archiving Service, Archive-It); Pro: ease of use; Con: $
Off the Shelf Software (e.g. HTTrack, Adobe Web Capture); Pro: inexpensive; Con: not scalable

5 Key Questions for Harvesting Projects
uniqueness, ephemerality, research value, harvest frequency, scope

6 Maryland’s Pilot Harvests (2008-2010)
Maryland State Documents Historic Preservation

7 Why harvest these areas?
Builds on existing strengths in print collections
Collections are unique
Large amount of material migrating to the web

8 Key Questions for Harvesting Projects
uniqueness, ephemerality, research value, harvest frequency, scope

9 Harvesting

10 Harvesting Challenges:
JavaScript
Streaming media
Form and database driven content
Password protected sites
robots.txt files
Multiple hosts/subdomains
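As a rough illustration of the robots.txt challenge, the sketch below uses Python's standard `urllib.robotparser` to test whether a crawler would be permitted to fetch a given URL. The robots.txt content, the user-agent name `example-harvester`, and the `www.example.org` URLs are all hypothetical and are not taken from the presentation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text that has already been retrieved."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(SAMPLE_ROBOTS_TXT)
    for url in (
        "http://www.example.org/index.html",
        "http://www.example.org/private/report.pdf",
    ):
        print(url, allowed(rules, "example-harvester", url))
```

A harvest that honors these rules would skip everything under `/private/`, which is one way a robots.txt file can leave gaps in an archived copy of a site.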

11 Single host = www.preservemd.org
Multiple hosts =
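The single-host versus multiple-host distinction can be sketched as a scope check that a crawler might apply to each discovered link. This is a minimal illustration, not the behavior of any particular harvesting tool; the function name `in_scope`, its parameters, and the `files.preservemd.org` subdomain are assumptions made up for the example.

```python
from urllib.parse import urlparse


def in_scope(url: str, seed_host: str, include_subdomains: bool = False) -> bool:
    """Decide whether a URL belongs to the harvest defined by seed_host.

    With include_subdomains=False only the exact host matches (single host).
    With include_subdomains=True any host under the same base domain matches.
    """
    host = (urlparse(url).hostname or "").lower()
    seed_host = seed_host.lower()
    if host == seed_host:
        return True
    if include_subdomains:
        # Treat "www.example.org" and "example.org" as the same base domain.
        base = seed_host[4:] if seed_host.startswith("www.") else seed_host
        return host == base or host.endswith("." + base)
    return False


if __name__ == "__main__":
    seed = "www.preservemd.org"
    # "files.preservemd.org" is a hypothetical subdomain used for illustration.
    link = "http://files.preservemd.org/report.pdf"
    print(in_scope(link, seed))                           # single-host scope
    print(in_scope(link, seed, include_subdomains=True))  # multi-host scope
```

Under a single-host scope, documents served from other subdomains are silently left out of the harvest, which is why scoping decisions need to be made before the crawl starts.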

12 End-User Access

13 End-User Access
general material designation, collection note, URLs, subject heading, uniform title

14 Conclusions
Challenges: start-up costs, what to collect, metadata creation
BUT we are well prepared to meet the challenges

