How many PURLs would an URL Checker check… Millennium URL Checker in the Real World Mary M. Strouse Catholic University of America ILUG 2004.


1 How many PURLs would an URL Checker check… Millennium URL Checker in the Real World Mary M. Strouse Catholic University of America ILUG 2004

2 Evolution
1994: Field 856 |u designated for URLs
1998: MARCxGen debuts
2000: URLVerify (telnet version, Rel. 2000)
2002: |u added to other MARC fields
2003: Millennium URL Checker (Rel. 2002 ph. 2)

3 URLVerify Web Report: http://[catalog]/screens/urlverify.html

4 Millennium URL Checker Report. Summary of error types (uncheck to hide). Integrated with MilCat.

5 Sort by column headers. Resize columns (no truncation).

6 Highlight a row and click “Edit” to open the record. Click “Edit” to get the MARC record. Access public view.

7 Clicking “GO” opens the URL in a browser window (for rechecking).

8 Locating Missing Links

9 Automatic Substitution of New URL. Check boxes to select, then click the Preview tab.

10 Uncheck any errors, click “Process”. Summary screen.

11 Correcting URL Directly in Report
1) Type in “New URL”
2) Check replace box
3) Preview & process

12 Copying Old URL to Edit Window
1) Check replace box (must do first)
2) Select Old URL - New URL
3) Edit in new URL window
4) Preview & process

13 Find and Replace (New URL)

14 Interactive Reports. Create new interactive report. Toggle between most recent Automatic and Interactive reports.

15 Interactive report can run against the entire database, a review file, an index range, or a keyword search.

16 Monday Morning Recheck

17 Can’t minimize or work with desktop while report is running

18 Error Types

19 Malformed URL (-2): htp://app.comm.uscourts.gov
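A malformed URL like the one on this slide can be caught before any network request is made. The following is a minimal sketch, not the URL Checker's actual logic: the function name and the set of accepted schemes are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Assumption: only these schemes count as well formed for catalog links.
ALLOWED_SCHEMES = {"http", "https", "ftp"}


def is_malformed(url: str) -> bool:
    """Return True when the URL has an unknown scheme or no host part."""
    parts = urlsplit(url.strip())
    return parts.scheme.lower() not in ALLOWED_SCHEMES or not parts.netloc


print(is_malformed("htp://app.comm.uscourts.gov"))   # True (scheme is "htp")
print(is_malformed("http://app.comm.uscourts.gov"))  # False
```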

20 Network is unreachable (-7). New error type in Phase 3 (Millennium report only).

21 http://public.afca.scott.af.mil/public….

22 PURLs and Other Redirects. Every server redirection is reported as an error.

23 Redirection can be a sign that a resource has moved and that maintenance is warranted.

24 Missing slash after directory name is reported as a permanent redirect (301). Edit to eliminate from future reports.
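Redirects of this kind are easy to recognize because the old and new URLs differ only by the final slash. A small sketch of that comparison follows; the function name and the example.org URLs are hypothetical and not part of the URL Checker.

```python
def is_trailing_slash_redirect(old_url: str, new_url: str) -> bool:
    """True when the redirect target is the old URL plus a final slash."""
    return not old_url.endswith("/") and new_url == old_url + "/"


# Hypothetical directory URL recorded without its trailing slash.
print(is_trailing_slash_redirect("http://www.example.org/docs",
                                 "http://www.example.org/docs/"))  # True
print(is_trailing_slash_redirect("http://www.example.org/docs",
                                 "http://www.example.org/new/"))   # False
```

Rows that pass this test could be accepted in bulk, since adding the slash to the record removes them from later reports.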

25 Server-side redirect to add timestamp: http://library.nps/navy.mil/uhtbin.cgisirsi/Sun+Apr+20+22:28:15+PDT+2003/0/520/nss.pdf

26 All PURLs are identified as redirects and not checked further. True also of 3rd-party link checkers (except Xenu).
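Checking a PURL "further" means following the redirect chain until a non-redirect response arrives, then reporting the status of that final target. The sketch below shows one way to do this under stated assumptions: `fetch` is a hypothetical callable returning a status code and an optional Location value, and the hop limit is arbitrary. It does not describe how Millennium or Xenu work internally.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin

# Hypothetical fetcher: takes a URL, returns (status, Location header or None).
Fetch = Callable[[str], Tuple[int, Optional[str]]]

REDIRECT_CODES = {301, 302, 303, 307, 308}


def resolve(url: str, fetch: Fetch, max_hops: int = 10) -> Tuple[str, int]:
    """Follow redirects and return (final URL, final status)."""
    seen = set()
    for _ in range(max_hops + 1):
        if url in seen:
            raise ValueError("redirect loop at " + url)
        seen.add(url)
        status, location = fetch(url)
        if status not in REDIRECT_CODES or not location:
            return url, status
        url = urljoin(url, location)  # Location may be relative
    raise ValueError("too many redirects")


# Canned responses standing in for real servers (all URLs are made up).
responses = {
    "http://purl.example.org/abc": (302, "http://www.example.org/doc"),
    "http://www.example.org/doc": (301, "/doc/"),
    "http://www.example.org/doc/": (200, None),
}
print(resolve("http://purl.example.org/abc", responses.__getitem__))
# ('http://www.example.org/doc/', 200)
```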

27 I-Hate-PURLs Workflow. Use automatic substitution to replace PURL with (current) underlying URL. Replace box can’t be batch-selected.

28 Beware the “Leaving GPO” Message

29 URL Checker reports the entire frwebgate “wrapper” as the new URL: http://frwebgate.access.gpo.gov/cgi-bin/leaving.cgi?from=exitpurl.html&to=http%3A//www.uscourts.gov/ttb/index.html
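The real destination sits in the `to` parameter of the wrapper URL, so it can be recovered before substitution. This is a minimal sketch that assumes the wrapper always has the shape shown on the slide; the function name is made up.

```python
from urllib.parse import parse_qs, urlsplit


def unwrap_leaving_gpo(url: str) -> str:
    """Return the target inside a leaving.cgi wrapper, else the URL unchanged."""
    parts = urlsplit(url)
    if parts.path.endswith("/leaving.cgi"):
        target = parse_qs(parts.query).get("to")
        if target:
            return target[0]  # parse_qs has already percent-decoded the value
    return url


wrapped = ("http://frwebgate.access.gpo.gov/cgi-bin/leaving.cgi"
           "?from=exitpurl.html&to=http%3A//www.uscourts.gov/ttb/index.html")
print(unwrap_leaving_gpo(wrapped))  # http://www.uscourts.gov/ttb/index.html
```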

30 Library-editable URLBlock File. Not a substitute for honoring “no robots” conventions!

31 Block can be a full URL, domain name or text string (PURL.ACCESS.GPO.GOV). III-specified blocks for major aggregators.
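Since a block may be a full URL, a domain name or any text string, one plausible reading is a substring match against each URL. The sketch below assumes case-insensitive substring matching; the slides do not say how Millennium actually compares entries, and the PURL path in the example is invented.

```python
from typing import Iterable


def is_blocked(url: str, blocks: Iterable[str]) -> bool:
    """True when any non-empty block entry occurs anywhere in the URL."""
    lowered = url.lower()
    return any(b.strip().lower() in lowered for b in blocks if b.strip())


blocks = ["PURL.ACCESS.GPO.GOV"]
print(is_blocked("http://purl.access.gpo.gov/GPO/LPS0000", blocks))  # True
print(is_blocked("http://www.uscourts.gov/ttb/index.html", blocks))  # False
```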

32 Trust-the-Government Workflow
1. Unblock GPO PURLs and run interactive report monthly (e.g., after Marcive load)

33 2. Exclude working redirects, troubleshoot others. Must load entire report before excluding redirects (slow).

34 WAIS database searches are reported as timeout errors (-6)

35 WAM Proxy-Rewrite URLs Not Checked: Host Unreachable (-5). 3rd-party link checkers report all proxy-rewrite URLs OK even if nonexistent.

36 Fool-the-System Workflow
856 41|u http://heinonline.org/HeinOnline/CollectionIndex.pl?journal-cjtl |z View via Hein Online
Underlying URL in |u; PURL or proxy-rewrite URL within anchor tag in |z.
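The workflow relies on the checker reading only subfield |u. A rough sketch of that selection follows, assuming the pipe-delimited display form shown on the slide; the parser, function names and example URLs are illustrative and are not Millennium code.

```python
import re
from typing import Dict, List


def parse_subfields(field: str) -> Dict[str, List[str]]:
    """Split a pipe-delimited field into {subfield code: [values]}."""
    subfields: Dict[str, List[str]] = {}
    for code, value in re.findall(r"\|(\w)\s*([^|]*)", field):
        subfields.setdefault(code, []).append(value.strip())
    return subfields


def urls_to_check(field: str) -> List[str]:
    """Only subfield u is checked; URLs embedded in |z are ignored."""
    return parse_subfields(field).get("u", [])


# Hypothetical 856 field: stable URL in |u, PURL inside an anchor tag in |z.
field = ('41|u http://www.example.org/journal '
         '|z <a href="http://purl.example.org/abc">View via Hein Online</a>')
print(urls_to_check(field))  # ['http://www.example.org/journal']
```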

37 “Multi-threading” Rate
The number of simultaneous “calls” sent to servers at a given time
URL Checker: > 100
3rd-party link checkers: 20-30 (often user-configurable)
At issue when many resources are concentrated on a few servers
URL Checker activity may be perceived as an “attack”
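One way to keep a checker from looking like an attack is to cap both the total number of simultaneous calls and the number sent to any single host. The sketch below illustrates this with Python's standard thread pool. The `check` callable, the default of 20 workers (taken from the low end of the 3rd-party range above) and the per-host limit of 2 are all assumptions, not settings of any real product.

```python
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable
from urllib.parse import urlsplit


def check_all(urls: Iterable[str], check: Callable[[str], int],
              max_workers: int = 20, per_host: int = 2) -> Dict[str, int]:
    """Run check(url) for every URL with global and per-host concurrency caps."""
    host_limits = defaultdict(lambda: threading.Semaphore(per_host))
    lock = threading.Lock()

    def limited(url: str):
        host = urlsplit(url).netloc.lower()
        with lock:                 # guard creation of per-host semaphores
            slot = host_limits[host]
        with slot:                 # at most per_host calls per server at once
            return url, check(url)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(limited, urls))


# Stand-in checker that touches no network and reports every URL as 200.
results = check_all(["http://www.example.org/a", "http://www.example.org/b"],
                    lambda url: 200)
print(results)
```

The per-host cap matters most in the situation the slide describes, where many catalog links point at a handful of aggregator servers.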

38 Summary: What URL Checker Checks
URLs in subfield u of 856 fields in bib. records (but not URLs in other subfields)
URLs in 956 fields in electronic reserves (Millennium Media) records

39 And What It Doesn’t…
URLs or domains in the URLBlock file (aggregators, etc.)
PURLs and other redirects
Proxy-rewrite URLs in WAM
Electronic journal issue URLs in checkin boxes
URLs in bibliographic record notes

40 Suggestions for Further Development – Reports & Editing
Pre-configure large interactive reports (faster loading)
Allow minimization during report prep
Bypass summary of attached items
Improve copy & paste, batch select & replace
Interactive checking of “New URL” column

41 Suggestions for Further Development – Functionality
Follow redirects to final destination
Honor page-level and server-level robot exclusions, and report with a unique status code
Customize multi-threading rate
Output report in CSV (comma-delimited) format
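Two of these suggestions, honoring robot exclusions with a unique status code and writing the report as CSV, can be sketched with Python's standard library. Everything specific here is an assumption made for illustration: the status value -100, the "URLChecker" agent name, the column names and the record number are invented and do not come from Millennium.

```python
import csv
import io
from urllib.robotparser import RobotFileParser

ROBOT_EXCLUDED = -100  # hypothetical status code for "blocked by robots.txt"


def robots_allow(robots_txt: str, url: str, agent: str = "URLChecker") -> bool:
    """Apply server-level robots.txt rules (given as text) to one URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def report_csv(rows) -> str:
    """Render (record, url, status) rows as comma-delimited text."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["record", "url", "status"])
    writer.writerows(rows)
    return out.getvalue()


robots_txt = "User-agent: *\nDisallow: /private/\n"
url = "http://www.example.org/private/page.html"
status = 200 if robots_allow(robots_txt, url) else ROBOT_EXCLUDED
print(report_csv([("b0000000", url, status)]))
```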

42 URL Checker Documentation: Millennium Manual (Rls. 2003)
Permissions (#105370)
Reports (#105371)
Edit/Replace capability (#105372)
URLBlock (#105373)

43 URLVerify Documentation
Innopac manual, pages 102151-102153
Maintaining Hyperlinks in the WebPac: Tools and Tradeoffs (IUG 8, May 2000) http://www.du.edu/~ttyler/iug2000/ctw/index.html
Tom Tyler’s freeware http://www.du.edu/~ttyler/freeware/

44 URL Display WWWOptions
DISPLAY_856 – Defines the order and placement of subfields that form the hypertext link in an OPAC display (default is |z then |u)
Multiple subfields (including access and usage notes) display as a single underlined link.
Enhancement request: separate WWWOptions to control display of link and notes.

45 URL Display WWWOptions
LINK856TEXT – Defines the phrase that appears above the hypertext link in a full display (default is “Click here to:”)
ICON_856LINK – Controls display of 856 link in a brief display (Manual #102168)

46 Contact:
Mary M. Strouse
DuFour Law Library, Catholic University of America
strouse@law.cua.edu
Thank you!

