Kathryn Lybarger, Fourth Friday, February 22, 2013
MARC: a data format used to encode and share bibliographic data. Developed in the 1960s, still quite popular.
Catalog: Library of Congress, OCLC or SkyRiver, Original Cataloging
Vendor MARC Catalog
Title: CESMM3 price database 2009, edited by Franklin + Andrews
100 1_ Franklin
245 CESMM3 price database 2009 ‡h [electronic resource] / ‡c edited by Franklin and Andrews.
500 __ Ebook.
516 __ Document.
538 __ PDF: Adobe PDF
700 1_ Andrews …
Data may be unhelpful or misleading. Links may not work. This may change over time.
From one book: title, author, series, subject headings. From another book: notes, ISBN, link to e-book.
Provider-neutral records may have URLs from multiple vendors. An OCLC search for records with URLs from eblib, ebrary, ebscohost AND myilibrary returned over 43,000. Even if they are labeled, patrons don’t know which vendor we’re using.
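One way to cope with provider-neutral records is to keep only the URLs that belong to the vendor you actually subscribe to. The following is a minimal sketch, assuming a simple keyword match on the hostname is good enough; the function name and the sample URLs are hypothetical and only illustrate the idea.

```python
from urllib.parse import urlparse


def urls_for_vendor(urls, vendor_keyword):
    """Keep only URLs whose hostname contains the vendor keyword."""
    kept = []
    for url in urls:
        host = urlparse(url).hostname or ""
        if vendor_keyword.lower() in host.lower():
            kept.append(url)
    return kept


# Hypothetical URLs of the kind one provider-neutral record might carry.
record_urls = [
    "http://site.ebrary.com/lib/example/docDetail.action?docID=1",
    "http://search.ebscohost.com/login.aspx?direct=true&AN=2",
    "http://www.myilibrary.com?id=3",
]

print(urls_for_vendor(record_urls, "ebrary"))
```

Matching on the hostname rather than the whole URL avoids false hits when a vendor name happens to appear in a query string.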
Some ebooks on a frontlist may never appear on the site. Individual ebooks may just disappear.
But not forthcoming. You may have to periodically dig several levels deep on the website to get them:
After we confronted one vendor: “Of the 15 accounts that I spot checked the most usage for (TITLE) was 5 views. However, I see that the 2 titles listed below are in the top 5 most viewed for UKY in 2012.”
Start with the best records you can find. Edit MARC records to conform with local standards. Verify access to all titles (periodically). Report problems when you find them.
But how do you predict what you will need?
MarcEdit: developed by Terry Reese. MARC editing in a friendly yet powerful text editor. Z39.50 client. (Binary editor!)
Efficiency, consistency. The .mrk format is text, so you can process it with your favorite programming language. Don’t have a favorite language (yet)?
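Because .mrk is plain text, a few lines of code can pull fields out of a whole file. The sketch below assumes the usual mnemonic layout (each field on its own line as `=TAG`, two spaces, indicators, then `$`-prefixed subfields, with a blank line between records). The function names and the sample records are made up for illustration, and escaped dollar signs are not handled.

```python
def parse_mrk(text):
    """Split .mrk text into records; each record is a list of (tag, data)."""
    records, current = [], []
    for line in text.splitlines():
        if line.startswith("="):
            # "=245  10$aTitle": tag is characters 1-3, data starts at index 6.
            current.append((line[1:4], line[6:]))
        elif not line.strip() and current:
            # A blank line closes the current record.
            records.append(current)
            current = []
    if current:
        records.append(current)
    return records


def subfield_values(data, code):
    """Return the values of one subfield code from a field's data string."""
    # The first chunk before any "$" holds the indicators, so skip it.
    return [part[1:] for part in data.split("$")[1:] if part.startswith(code)]


# Invented sample data in mnemonic form.
sample = """=LDR  00000nam a2200000 a 4500
=245  10$aFirst book$h[electronic resource]
=856  40$uhttp://example.com/1

=LDR  00000nam a2200000 a 4500
=245  10$aSecond book
=856  40$uhttp://example.com/2
"""

for record in parse_mrk(sample):
    for tag, data in record:
        if tag == "856":
            print(subfield_values(data, "u"))
```

The same loop could just as easily count records, list titles, or collect every link for later checking.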
NORMalization and Access Checking. Open source software, releasing soon. Configure per vendor: fields to add/delete, changes to make, how to check links.
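To make "configure per vendor" concrete, here is a rough sketch of what a vendor profile and a normalization pass could look like. This is not the tool's actual configuration format: the profile keys, the `normalize` function, the 590 note, and the proxy prefix are all hypothetical, and records are represented simply as lists of (tag, data) pairs.

```python
# Hypothetical vendor profile; keys and values are illustrative only.
PROFILE = {
    # Fields to delete outright.
    "delete_tags": ["500", "516"],
    # Fields to add; "\\\\" is two backslashes, the .mrk way to write blank indicators.
    "add_fields": [("590", "\\\\$aLocal ebook package.")],
    # Changes to make: (tag, old text, new text). Here, a made-up proxy prefix.
    "replace": [("856", "$u", "$uhttp://proxy.example.edu/login?url=")],
}


def normalize(record, profile):
    """Apply one vendor profile to a record given as (tag, data) pairs."""
    out = []
    for tag, data in record:
        if tag in profile["delete_tags"]:
            continue
        for rule_tag, old, new in profile["replace"]:
            if tag == rule_tag:
                data = data.replace(old, new)
        out.append((tag, data))
    # Added fields go at the end; a real tool would likely sort by tag.
    out.extend(profile["add_fields"])
    return out


record = [
    ("245", "10$aSome book"),
    ("500", "\\\\$aEbook."),
    ("856", "40$uhttp://vendor.example.com/1"),
]

for tag, data in normalize(record, PROFILE):
    print(f"={tag}  {data}")
```

Keeping the rules in data rather than code means a new vendor needs only a new profile, not a new script.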
Ebook errors can be valid web pages, and errors don’t mean you should give up! HTTP 200 OK: full-text ebook, or a web site form to buy the book. HTTP 404 Not Found: no such page on server, or a broken DOI (that you should report).
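The practical consequence is that a checker cannot trust the status code alone; it also has to look at what the page says. A minimal sketch of that idea follows, assuming the page body is already available as text. The function name, the labels it returns, and the "buy this book" phrase are assumptions for illustration.

```python
def classify(status, body, no_access_phrases):
    """Label a link using both the HTTP status and the page text."""
    if status == 404:
        # No such page, or a broken DOI that should be reported.
        return "broken"
    if status == 200:
        text = body.lower()
        # A 200 page can still be a purchase form or a subscription notice.
        if any(phrase.lower() in text for phrase in no_access_phrases):
            return "no access"
        return "ok"
    return "unknown"


# Phrases are vendor-specific; the second one is a made-up example.
PHRASES = ["This is not part of your subscription", "buy this book"]

print(classify(200, "<h1>Chapter 1</h1> Full text ...", PHRASES))
print(classify(200, "<form>Buy this book today</form>", PHRASES))
print(classify(404, "", PHRASES))
```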
Database holds a list of links to be checked. Script checks each according to site profile (pausing 10 seconds between links): Is it a PDF? Does it contain the phrase “This is not part of your subscription”? Can you click through to full-text chapters?
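The checking loop described here might be sketched as below. It is an illustration under stated assumptions, not the actual script: the hostnames, profile keys, and function names are hypothetical, the "click through to chapters" check is left out, and the page fetcher is passed in as a function so the example can run against canned responses instead of a live site.

```python
import time
from urllib.parse import urlparse

# Hypothetical site profiles keyed by hostname.
PROFILES = {
    "vendor-a.example.com": {"expect_pdf": True, "error_phrase": None},
    "vendor-b.example.com": {
        "expect_pdf": False,
        "error_phrase": "This is not part of your subscription",
    },
}


def check_link(url, profile, fetch):
    """Return True if the link looks like working full text for this site."""
    status, content_type, body = fetch(url)
    if status != 200:
        return False
    if profile["expect_pdf"] and "application/pdf" not in content_type:
        return False
    phrase = profile["error_phrase"]
    if phrase and phrase in body:
        return False
    return True


def check_all(links, profiles, fetch, pause=10, sleep=time.sleep):
    """Check each link against its site's profile, pausing between links."""
    results = {}
    for index, url in enumerate(links):
        if index:
            sleep(pause)  # be polite to the vendor's server
        profile = profiles[urlparse(url).hostname]
        results[url] = check_link(url, profile, fetch)
    return results


# Canned responses standing in for real HTTP requests: (status, type, body).
PAGES = {
    "http://vendor-a.example.com/1.pdf": (200, "application/pdf", ""),
    "http://vendor-a.example.com/2.pdf": (200, "text/html", "Please log in"),
    "http://vendor-b.example.com/3": (
        200, "text/html", "This is not part of your subscription"),
}

results = check_all(list(PAGES), PROFILES, PAGES.get, sleep=lambda seconds: None)
for url, ok in results.items():
    print("OK    " if ok else "BROKEN", url)
```

In real use the `fetch` argument would wrap an HTTP client, and the default `sleep` would enforce the 10-second pause.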
Let me know! I have to know what broken ebooks look like (from a given vendor) before I can detect them. If a vendor has many broken ones, I can do a systematic check.
MarcEdit: dit/html/index.php; Code Academy; my GitHub repository: https://github.com/zemkat/