Through the Bytes Darkly Through the Bytes Darkly, Management Information and the Digital Library Joe Zucca Assessment, Planning and Publications Librarian.


1 Through the Bytes Darkly
Through the Bytes Darkly, Management Information and the Digital Library
Joe Zucca, Assessment, Planning and Publications Librarian, University of Pennsylvania Library
Information Technology Interest Group, ACRL, New England Chapter
May 17, 2002

2 Through the Bytes Darkly
Four Sections of This Presentation:
1. Environmental Audit: Key Factors That Influence Our Ability to Measure Digital Information Use
2. From Low Resolution to High Resolution Data: Mining the Server Logs
3. The Data Farm Experiment: Tools That Serve Access Can Also Serve Measurement
4. Why the Data Are Important

3 Through the Bytes Darkly
Measuring Electronic Use at Penn: Environmental Influences
1. Organization and Culture
Strategic Focus: Base planning, goal setting, and assessment on empirical evidence. Since 1996, an element of Penn’s Strategic Plan.
Operational Imperatives:
1) Make evaluation and measurement a component of each program and project
2) Construct relays that feed data to people who need quantitative information to strategize and manage
Experimental Attitude: Leverage the data you have; usually they’re “good enough” to validate organizational experience and knowledge.

4 Through the Bytes Darkly
Measuring Electronic Use at Penn: Environmental Influences
2. Proliferation of Electronic Resources
Article indexes, e-journals and other full-text resources

5 Through the Bytes Darkly
Measuring Electronic Use at Penn: Environmental Influences
2.1. Growth of Expenditures for Electronic Resources
E-Resources as a percent of acquisitions budget:
Year    Pct
1991    3.7%
1993    3.2%
1996    5.5%
1999    13.2%
2000    13.9%
2001    15.7%
[Chart: Annual Growth of Expenditures for Electronic Information, Based on 1991. Y-axis: Pct Increase in Expenditure (0% to 1000%); x-axis: 1991, 1993, 1996, 1999, 2000, 2001.]

6 Through the Bytes Darkly
Measuring Electronic Use at Penn: Environmental Influences
3. Technology’s Hostility to Measurement
- Volatile metrics (“The new system doesn’t count that way!”)
- Ever-changing data elements (“sets” are out, “searches” are in)
- No common metrics (log-ins, sessions, searches, browses, page hits…)
- No measurement standards (What’s a “search”? What’s a Web “session”?)
- Nonexistent or inaccessible data (the vendor problem)
- Approximate and hard-to-obtain statistics (lots of data, no information)
- Fleeting benchmarks

7 Through the Bytes Darkly
From Low Resolution to High Resolution Data: Mining the Server Logs for Descriptive Statistics
dial-123-130.dial.indiana.edu - - [04/Feb/2001:00:18:02 -0500] "GET /special/photos/theater/504.html HTTP/1.0" 200 3247 "http://www.library.upenn.edu/special/photos/theater/503.html" "Mozilla/4.7C-CCK-MCD {C-UDP; EBM-APPLE} (Macintosh; I; PPC)"
dialin1085.upenn.edu - - [04/Feb/2001:00:18:04 -0500] "GET /facilities/count_use.html?resource=China%20Economic%20Review&method=ejs&url=http://www.sciencedirect.com/science/journal/1043951X HTTP/1.0" 200 2027 "http://www.library.upenn.edu/webbin5/resources/ejspublic5.cgi?homepage=http://www.library.upenn.edu/lippincott/&community=Business" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt; SPIKE 5)"
203.197.226.240 - - [04/Feb/2001:00:18:07 -0500] "GET /etext/sasia/aiis/architecture/khajuraho/010a.jpg HTTP/1.0" 200 89117 "http://www.library.upenn.edu/etext/sasia/aiis/architecture/khajuraho/010.html" "Mozilla/4.7 [en] (Win95; I)"

8 Through the Bytes Darkly
Low Resolution: Inputs
Records in locally-managed databases (including the OPAC): 26,332,138
Number of journal article indexes & full-text files (e.g. Academic Index): 267
Number of e-journals (from publishers such as Elsevier and free sources): 6,608
Number of digital books (locally created, aggregated and licensed): 110,000
Number of locally digitized and accessible images (e.g. fine art slides, ms facsimiles): 82,356
Number of records in the OPAC: 2,879,696
Number of pages, forms and directories constituting the library web site: 32,000

9 Through the Bytes Darkly
Low Resolution: The Load on Our Machines
[Chart: Web Pages Served 1995-2001 from www.library.upenn.edu, 3-month moving average.]
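The 3-month moving average named on this slide can be sketched as follows. This is a minimal illustration that assumes a trailing window (the slide does not say whether the average was trailing or centered), and the monthly counts are invented placeholders, not Penn data.

```python
def moving_average(values, window=3):
    """Return trailing moving averages; the first window-1 points are skipped."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical monthly page counts, for illustration only.
monthly_pages = [120_000, 150_000, 180_000, 210_000]
print(moving_average(monthly_pages))
```

Smoothing of this kind damps the term-time peaks and vacation troughs that dominate raw monthly counts, which makes the long-run trend easier to read.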

10 Through the Bytes Darkly
Low Resolution: Changing Machine Demand
[Chart: Pages Served by the Main Library Web Server + OPAC Server, 1996-2001, with 2002 projected. Series: OPAC, Web, BlackBoard. Y-axis: 0 to 25,000,000.]

11 Through the Bytes Darkly
Low Resolution: Search Activity Over Time
[Chart: Annual Searches in Licensed Databases (e.g., MEDLINE), FY97-01. Y-axis: searches.]

12 Through the Bytes Darkly
Correlation Matrix of Use Metrics Available for Ovid Files
Pearson r for Sessions, Connect Time, Sets, Documents Viewed (99 cases)
                    Sessions   Time    Sets    Docs. Viewed
Sessions            1.00
Time                .980       1.00
Sets                .905       .971    1.00
Documents Viewed    .844       .932    .983    1.00

13 Through the Bytes Darkly
Correlation Matrix of Use Metrics Available for SilverPlatter Files
Pearson r for Sessions, Connect Time, Searches, Documents Viewed (94 cases)
                    Sessions   Time    Searches   Abs. Viewed
Sessions            1.00
Time                .975       1.00
Searches            .899       .901    1.00
Abstracts Viewed    .840       .870    .855       1.00
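For readers who want to reproduce matrices like the two above from their own vendor reports, the Pearson r between two use metrics can be computed as in this minimal sketch. The function name and the sample figures are illustrative assumptions; the slides do not say what software produced the original coefficients.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-database sessions and searches, for illustration only.
sessions = [120, 340, 560, 90, 410]
searches = [300, 900, 1500, 200, 1000]
print(round(pearson_r(sessions, searches), 3))
```

Coefficients as high as those on the slides suggest the metrics largely track one another, so a library that can obtain only one of them from a vendor still has a usable proxy for the rest.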

14 Through the Bytes Darkly
High Resolution Data + User Input + Good Program Liaison and Knowledge Support Resource Management, and Inform Basic Questions, e.g.:
- Are we choosing the right information sources for our audiences?
- …optimizing the delivery of electronic information?
- …making access as easy and seamless as possible?
- …spending our dollars wisely?
- …able to detect and respond to change in the patterns of resource use?

15 Through the Bytes Darkly Using the Architecture of the Web to Increase Data Resolution www.library.upenn.edu/facilities/count_use.html
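The log excerpts in this presentation show requests to count_use.html carrying `resource`, `method`, and `url` parameters, which suggests that links to licensed resources pass through this page so that each click leaves a record in the server log. A minimal sketch of pulling those three values out of a request path follows; the function name is hypothetical, and the handling of the `url` parameter is an assumption based on the excerpts, where the target URL appears unescaped.

```python
from urllib.parse import unquote


def parse_count_use(request_path):
    """Split a count_use.html request into resource, method, and target URL.

    Returns None for any request that is not a count_use.html click-through.
    """
    path, _, query = request_path.partition("?")
    if not path.endswith("/count_use.html"):
        return None
    # Everything after "&url=" is treated as the target, because the target
    # URL may itself contain unescaped "?" and "&" characters.
    head, sep, target = query.partition("&url=")
    if not sep:
        return None
    fields = dict(pair.split("=", 1) for pair in head.split("&") if "=" in pair)
    return {
        "resource": unquote(fields.get("resource", "")),
        "method": unquote(fields.get("method", "")),
        "url": target,
    }


example = (
    "/facilities/count_use.html?resource=China%20Economic%20Review"
    "&method=ejs&url=http://www.sciencedirect.com/science/journal/1043951X"
)
print(parse_count_use(example))
```

A standard query-string parser would split the embedded target URL at its own `&` characters, which is why this sketch cuts the string at `&url=` instead.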

16 Through the Bytes Darkly
Beginning with a stream of unprocessed log data...
dial-123-130.dial.indiana.edu - - [04/Feb/2001:00:17:38 -0500] "GET /special/photos/theater/505.html HTTP/1.0" 200 3086 "http://www.library.upenn.edu/special/photos/theater/504.html" "Mozilla/4.7C-CCK-MCD {C-UDP; EBM-APPLE} (Macintosh; I; PPC)"
recrawler1.bos2.fastsearch.net - - [04/Feb/2001:00:18:21 -0500] "GET /etext/sasia/skt-mss/1549/15a.html HTTP/1.0" 200 2736 "-" "FAST-WebCrawler/2.2-pre27 (crawler@fast.no; http://www.fast.no/faq/faqfastwebsearch/faqfastwebcrawler.html)"
130.91.196.245.in-addr.arpa - - [04/Feb/2001:00:17:40 -0500] "GET /facilities/count_use.html?resource=ABI/Inform%20%20Ovid&method=Ovid&url=http://www.abi-ovid.library.upenn.edu/ovidweb/ovidweb.cgi?T=JS&PAGE=main&MODE=ovid&D=infoz HTTP/1.1" 200 2039 "http://www.library.upenn.edu/webbin5/resources/databases.cgi?business" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0)"
203.197.226.240 - - [04/Feb/2001:00:17:41 -0500] "GET /etext/sasia/aiis/architecture/khajuraho/010.html HTTP/1.0" 200 4427 "http://www.library.upenn.edu/etext/sasia/aiis/architecture/khajuraho/" "Mozilla/4.7 [en] (Win95; I)"
203.197.226.240 - - [04/Feb/2001:00:17:44 -0500] "GET /images/banner.gif HTTP/1.0" 404 2814 "http://www.library.upenn.edu/etext/sasia/aiis/architecture/khajuraho/010.html" "Mozilla/4.7 [en] (Win95; I)"
pub237.lib.upenn.edu - - [04/Feb/2001:00:17:48 -0500] "GET / HTTP/1.0" 200 8070 "-" "WebTrends Alert"
dial-123-130.dial.indiana.edu - - [04/Feb/2001:00:18:02 -0500] "GET /special/photos/theater/504.html HTTP/1.0" 200 3247 "http://www.library.upenn.edu/special/photos/theater/503.html" "Mozilla/4.7C-CCK-MCD {C-UDP; EBM-APPLE} (Macintosh; I; PPC)"
dialin1085.upenn.edu - - [04/Feb/2001:00:18:04 -0500] "GET /facilities/count_use.html?resource=China%20Economic%20Review&method=ejs&url=http://www.sciencedirect.com/science/journal/1043951X HTTP/1.0" 200 2027 "http://www.library.upenn.edu/webbin5/resources/ejspublic5.cgi?homepage=http://www.library.upenn.edu/lippincott/&community=Business" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt; SPIKE 5)"
203.197.226.240 - - [04/Feb/2001:00:18:07 -0500] "GET /etext/sasia/aiis/architecture/khajuraho/010a.jpg HTTP/1.0" 200 89117 "http://www.library.upenn.edu/etext/sasia/aiis/architecture/khajuraho/010.html" "Mozilla/4.7 [en] (Win95; I)"
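The entries in this stream follow the layout commonly called the Combined Log Format: host, identity, user, timestamp, request line, status, bytes, referrer, and user agent. A minimal sketch of splitting one entry into named fields is shown below; the field names and the `parse_line` helper are illustrative assumptions rather than the tooling actually used at Penn.

```python
import re

# One Combined Log Format entry: host ident user [time] "request" status size
# "referrer" "user agent".
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log entry, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


SAMPLE = (
    '203.197.226.240 - - [04/Feb/2001:00:18:07 -0500] '
    '"GET /etext/sasia/aiis/architecture/khajuraho/010a.jpg HTTP/1.0" '
    '200 89117 '
    '"http://www.library.upenn.edu/etext/sasia/aiis/architecture/khajuraho/010.html" '
    '"Mozilla/4.7 [en] (Win95; I)"'
)
print(parse_line(SAMPLE)["path"])
```

Once entries are split this way, crawler traffic (the FAST-WebCrawler entry) and monitoring traffic (the WebTrends Alert entry) can be filtered out by user agent before any use is counted.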

17 Through the Bytes Darkly
…and information culled from databases that generate our Web pages...
Æ |http://www.uqtr.uquebec.ca/AE/index.html|World||||History of Art|F-T|No|07-16-1999 : 11:11|10-25-2000 : 11:30||
ABA Bank Compliance |http://proquest.umi.com/pqdlink?Ver=1&Exp=07-01-2003&REQ=3&PUB=14954&Cert=0CEccdp7aMS6kuCDmdhPNL%2bQ2tTOLTrDEHAz%2bYmHN172RUqZPCJ2SvATX%2bFGA7htIYkVlFVWSyawE0NvKlpBZ%2bO%2f%2bLEWBnchnwLT9%2b%2fdGGHSlx0PO3dxUQd3g2S9QP2FghKaQ2ncl5EdDKBum2vykhvxsyRQutjuMGKfxAKHOA4-|Penn|ABI/Inform|||Business,Finance|F-TPI|No|03-13-2001 : 00:01|03-14-2001 : 11:31|mw|
ABA Journal |http://proquest.umi.com/pqdlink?Ver=1&Exp=07-012003&REQ=3&PUB=27585&Cert=PfySiFXf10i6kuCDmdhPNL%2bQ2tTOLTrDEHAz%2bYmHN172RUqZPCJ2SvATX%2bFGA7ht1pGvDP%2bFxrGwE0NvKlpBZ%2bO%2f%2bLEWBnchnwLT9%2b%2fdGGHSlx0PO3dxUQd3g2S9QP2FghKaQ2ncl5EdDKBum2vykhvxsyRQutjuAyIsegc4Y7Y-|Penn|ABI/Inform|||Finance|F-TPI|No|03-13-2001 : 00:01||mw|
ABI/Inform |http://www.umi.com/pqdauto|Penn||||Biomedical Research,Management,Business,Clinical Medicine,Clinical Medicine,Nursing,Economics,Health Care Policy & Management|F-TSDb|No|07-16-1999 : 11:11|02-09-2001 12:14||
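The records on this slide are pipe-delimited, so they can be split into fields by position. The sketch below is only a guess at the layout: the slide does not document the fields, and the names used here (`title`, `url`, `scope`, `subjects`) are assumptions inferred from the sample values.

```python
def parse_resource_record(line):
    """Split one pipe-delimited resource record into a few named fields.

    Field positions and names are assumptions inferred from the sample
    records; the slide does not document the layout.
    """
    fields = [field.strip() for field in line.split("|")]
    if len(fields) < 7:
        return None
    return {
        "title": fields[0],
        "url": fields[1],
        "scope": fields[2],  # appears to be "Penn" or "World"
        "subjects": [s.strip() for s in fields[6].split(",") if s.strip()],
    }


record = (
    "ABI/Inform |http://www.umi.com/pqdauto|Penn||||"
    "Management,Business|F-TSDb|No|07-16-1999 : 11:11|02-09-2001 12:14||"
)
print(parse_resource_record(record))
```

Joining subject lists like these to click-through counts is one plausible way to arrive at breakdowns of use by subject area.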

18 Through the Bytes Darkly …to extracting, parsing, storing, and mining for significant content.
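The storing and mining steps can be sketched with an in-memory SQLite table. The table name, column names, and sample events below are hypothetical; the presentation does not describe the storage layer actually used.

```python
import sqlite3


def count_logins(events):
    """Store (resource, host) events in SQLite and count log-ins per resource."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE logins (resource TEXT, host TEXT)")
    conn.executemany("INSERT INTO logins VALUES (?, ?)", events)
    rows = conn.execute(
        "SELECT resource, COUNT(*) AS n FROM logins "
        "GROUP BY resource ORDER BY n DESC, resource"
    ).fetchall()
    conn.close()
    return rows


# Hypothetical click-through events, for illustration only.
events = [
    ("ABI/Inform", "host-a.example.edu"),
    ("China Economic Review", "host-b.example.edu"),
    ("ABI/Inform", "host-b.example.edu"),
]
print(count_logins(events))
```

Keeping one row per click-through, rather than pre-computed totals, leaves the data open to later questions about time of day, location, or subject.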

19 Through the Bytes Darkly
Use of Licensed Resources: What Databases Do Our Clients Use at What Cost?
15 Most Frequently Used Index/Abstract/Full-text Databases in FY 2001
Database                      Log-ins    Pct Total   Cost Per Login
MEDLINE                       205,150    22.9%       $0.10
LEXIS/NEXIS                   63,817     7.1%        $0.42
Academic Index                52,407     5.9%        $0.58
Dow Jones                     39,828     4.5%        $0.68
ISI Citation Indexes          39,753     4.4%        $2.75
ABI/Inform                    36,190     4.0%        $1.09
PsycINFO                      27,636     3.1%        $0.89
Investext                     17,695     2.0%        $0.68
Business & Industry           16,797     1.9%        $0.55
CINAHL/Nursing                16,232     1.8%        $0.36
PubMed                        15,610     1.7%        $ -
MLA International             13,359     1.5%        $0.41
Multex                        12,196     1.4%        $0.10
ERIC                          10,852     1.2%        $0.54
EconLit                       8,940      1.0%        $0.80
Hoovers Online                8,905      1.0%        $0.22
Inter Bibliog Soc Science     8,152      0.9%        $0.38
Sociological Abstracts        7,703      0.9%        $1.58
S&P Industry Surveys          7,346      0.8%        $0.63
D&B Million $ Database        6,376      0.7%        $1.74
All others                    894,416    100.0%

20 Through the Bytes Darkly
Use of Licensed Resources: What Are the High Use E-Journals, Data for FY2001
Title                                                    Log-ins   Pct Total Log-ins   Log-ins On Campus   Log-ins Off Campus
Science                                                  4,232     1.5%                3,114               1,057
Nature                                                   4,081     1.4%                2,880               1,173
Journal of Biological Chemistry                          2,408     0.8%                1,883               519
Journal of the American Chemical Society                 2,405     0.8%                2,153               247
New England Journal of Medicine                          1,994     0.7%                1,359               620
Angewandte Chemie (international edition)                1,836     0.6%                1,665               167
Journal of Organic Chemistry                             1,660     0.6%                1,504               150
Proceedings of the National Academy of Sciences          1,608     0.6%                1,246               360
Tetrahedron Letters                                      1,361     0.5%                1,218               143
Organic Letters                                          1,308     0.5%                1,208               99
Proceedings of the National Academy of Sciences, U.S.    1,285     0.5%                1,017               266
Journal of Molecular Biology                             1,060     0.4%                850                 210
JAMA: The Journal of the American Medical Association    1,023     0.4%                650                 352
Journal of Chemical Physics                              992       0.3%                819                 172
Journal of Finance                                       887       0.3%                423                 378
Lancet                                                   867       0.3%                637                 227
American Journal of Sociology                            860       0.3%                384                 373
Medicine                                                 849       0.3%                580                 263
Applied Physics Letters                                  834       0.3%                751                 83
Physical Review B                                        826       0.3%                727                 98

21 Through the Bytes Darkly
Use of Licensed Resources: How Much Bang Do We Get on the Dollar For E-Journals?
E-Journal Subscription Costs Per Log-In, FY2002 (July-April)
Publisher               Log-ins    Pct of Total   Cost Per Login
ScienceDirect           139,727    27.1%          $0.63
ECO                     70,730     13.7%          $0.09
JSTOR                   48,668     9.4%           $0.35
Wiley                   38,255     7.4%           $0.09
ACS                     31,865     6.2%           $0.12
Ideal                   30,568     5.9%           $5.51
Blackwell/Munksgaard    28,940     5.6%           $0.27
Journals@Ovid           26,982     5.2%           n/a
Oxford                  14,819     2.9%           $0.20
SpringerLINK            13,507     2.6%           n/a
ABI/Inform              12,785     2.5%           $3.08
Project Muse            11,438     2.2%           $1.22
AIP                     7,873      1.5%           $5.01
Cambridge               7,835      1.5%           n/a
Annual Reviews          7,215      1.4%           $0.08
IEEE                    7,132      1.4%           $6.73
RSC                     5,661      1.1%           n/a
Others†                 11,451     2.2%
Total                   515,451    100%
† 11 publishers
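The two derived columns in tables like this one are simple ratios: a publisher's share of all log-ins, and its subscription cost divided by its log-ins. A minimal sketch follows; the log-in counts are taken from the table above, while the subscription cost is a made-up figure because the slides do not give the underlying costs.

```python
def share_of_total(logins, total_logins):
    """Percent of all log-ins attributable to one resource, to one decimal."""
    return round(100 * logins / total_logins, 1)


def cost_per_login(cost, logins):
    """Cost divided by log-ins, in dollars; None when there was no use."""
    if logins <= 0:
        return None
    return round(cost / logins, 2)


# Log-in counts from the table: ScienceDirect and the overall total.
print(share_of_total(139_727, 515_451))

# Hypothetical subscription cost, for illustration only.
print(cost_per_login(1_000.00, 400))
```

Because this table covers only July through April, any cost used in the ratio would need to cover the same ten months for the figure to be comparable across publishers.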

22 Through the Bytes Darkly
Use of Licensed Resources: How Does Use Scatter Across Databases?
[Chart: Use Measured in Log-ins for FY 2001.]

23 Through the Bytes Darkly
Use of Licensed Resources: How Does Database Use Distribute By Communities?
[Chart: Per Capita Use of Databases by Penn’s Schools and Centers, FY 2001. Y-axis: Log-ins Per Capita (0 to 55); x-axis: School and Center Domains (ASC, SSW, MED, NUR, WHRT, VET, ADM, SEAS, SAS, GSE, GSFA, DENTAL, LAW†).]
† Does not include resources licensed by the Law Library for Law school affiliates
Database Use by Penn’s Schools & Centers
School           Pct of Log-ins
Dorms            9.3%
Law              0.5%
Dental           0.9%
Fine Arts        1.0%
Communication    1.3%
Social Work      1.8%
Education        1.4%
Veterinary       2.5%
Nursing          3.5%
Engineering      4.4%
Admin            4.8%
Wharton          12.3%
In-Library       12.8%
Arts & Sci       20.4%
Medicine         23.2%

24 Through the Bytes Darkly
Use of Licensed Resources
Database & E-Journal Log-ins by Subject (based on log samples from FY2001)
Rows: Network Domain. Columns: Subject focus.
Network Domain        Humanities   Life Science   Social Science   Business   Physical Science   Total
Administration        21.1%        36.5%          13.9%            7.0%       21.6%              100.0%
Wharton               2.9%         74.3%          3.2%             19.2%      0.5%               100.0%
Annenberg             15.2%        32.1%          42.3%            8.9%       1.5%               100.0%
Medical               2.3%         86.0%          1.9%             1.0%       8.8%               100.0%
Dental                1.8%         87.7%          8.9%             0.2%       1.4%               100.0%
Veterinary            1.7%         96.0%          0.6%             0.4%       1.3%               100.0%
Dialin                8.5%         63.2%          9.9%             15.4%      2.9%               100.0%
Education             24.6%        13.1%          61.5%            0.8%       0.0%               100.0%
Fine Arts             29.0%        18.5%          45.7%            5.6%       1.2%               100.0%
Law                   13.0%        26.6%          20.9%            37.0%      2.4%               100.0%
Library               21.3%        54.8%          9.1%             8.5%       6.3%               100.0%
Nursing               15.9%        73.1%          7.8%             3.2%       0.0%               100.0%
Student Residences    18.9%        57.0%          12.6%            9.0%       2.5%               100.0%
Arts and Sciences     8.2%         26.3%          5.7%             9.9%       49.9%              100.0%
Engineering           1.5%         29.5%          2.3%             1.2%       65.6%              100.0%
Social Work           20.6%        29.1%          41.6%            6.1%       2.7%               100.0%
Unresolved            18.9%        44.7%          17.8%            10.0%      8.6%               100.0%
Total                 14.7%        50.7%          11.9%            8.6%       14.1%              100.0%

25 Through the Bytes Darkly
Use of Licensed Resources: Where Do Our Clients Access Information?
Database Log-ins by Domain, FY2001
On-Campus Depts      50%
In-Library           25%
Off-Campus           15%
Campus Residences    10%
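A breakdown by domain like this one depends on mapping each requesting host to a location category. The sketch below shows one way that could work using host-name suffixes; every rule in it is a hypothetical placeholder, since the presentation does not say how Penn's network domains were actually resolved.

```python
from collections import Counter

# Hypothetical suffix rules, checked in order; not Penn's real network layout.
RULES = [
    (".lib.upenn.edu", "In-Library"),
    (".resnet.upenn.edu", "Campus Residences"),
    (".upenn.edu", "On-Campus Depts"),
]


def classify_host(host):
    """Map a host name to a location category by its suffix."""
    host = host.lower()
    for suffix, label in RULES:
        if host.endswith(suffix):
            return label
    return "Off-Campus"


def location_shares(hosts):
    """Percent of log-ins per location category, to one decimal place."""
    counts = Counter(classify_host(h) for h in hosts)
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


hosts = [
    "pub237.lib.upenn.edu",
    "room12.resnet.upenn.edu",
    "lab3.chem.upenn.edu",
    "dial-123-130.dial.indiana.edu",
]
print(location_shares(hosts))
```

Rule order matters: the most specific suffixes come first so that a library workstation is not counted as a generic campus department.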

26 Through the Bytes Darkly
Use of Licensed Resources: Where Do Communities of Clients Work?
[Chart: Database Log-ins from Off Campus as a Percent of Total Log-ins, FY2001. Y-axis: Pct. of Log-ins (0% to 100%); x-axis: School or Center (SSW, GSE, WHRT, NURS, SAS, DENTL, GSFA, SEAS, ASC, MED, LAW, ADM†, VET); series: On Campus, Off-Campus.]

27 Through the Bytes Darkly
Use of Licensed Resources: When Are They Working?
[Chart: Database Use by Time of Day, FY2001. Y-axis: Attempted Logons (0 to 25,000); x-axis: hour of day, 12-1 AM through 11-12 PM; series: In-Library, Student Houses, Schools, Campus Modem Pool.]
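A time-of-day profile like this can be derived from the bracketed timestamp in each log entry. The sketch below bins timestamps into 24 hourly counts; the function names are illustrative assumptions, and the timestamp format is the one shown in the log excerpts earlier in the presentation.

```python
from collections import Counter
from datetime import datetime

# Timestamp layout used in the server log, e.g. "04/Feb/2001:00:18:02 -0500".
LOG_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def hour_of_day(timestamp):
    """Return the local hour (0-23) recorded in a log timestamp."""
    return datetime.strptime(timestamp, LOG_TIME_FORMAT).hour


def logins_by_hour(timestamps):
    """Count log-ins in each of the 24 hourly bins, midnight first."""
    counts = Counter(hour_of_day(t) for t in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


stamps = [
    "04/Feb/2001:00:18:02 -0500",
    "04/Feb/2001:13:05:00 -0500",
    "04/Feb/2001:13:59:59 -0500",
]
print(logins_by_hour(stamps))
```

The hour is read as recorded, in the server's local time, which is what a chart of working patterns on one campus calls for.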

28 Through the Bytes Darkly
Use of Licensed Resources: How Does Audience Composition Change Through the Day?
[Chart: Database Use by Hour, FY2001.]

29 Through the Bytes Darkly The Data Farm Experiment: Tools That Serve Information Access Can Also Serve Measurement

30 Through the Bytes Darkly Schematic of the Data Farm As of May 2002

31 Through the Bytes Darkly
Why Are the Data Important?
- To Demonstrate Accountability: Is the library spending the Schools’ money effectively? (Pressures of Penn’s responsibility center budget environment)
- To Understand and Describe the Transfer of Technology: Is the academic information universe a digital universe (as some at Penn believe)? Is the digital universe more cost efficient than the paper one (as some at Penn believe)?
- To Guide the Improvement of Existing and the Development of New Services
- To Ensure the Successful Fulfillment of Our Mission
“If you don’t know where you’re going, you’ll probably end up somewhere else” - Casey Stengel

32 Through the Bytes Darkly Through the Bytes Darkly, Management Information and the Digital Library Joe Zucca University of Pennsylvania Library zucca@pobox.upenn.edu

