What we can learn about virtual scholars from usage data obtained from deep log analysis Professor David Nicholas, Dr Tom Dobrowolski and Paul Huntington CIBER, University College London http://www.ucl.ac.uk/ciber/
Structure of talk Why we are studying the virtual scholar The techniques we use (DLA) Research projects and analyses undertaken What we have discovered Implications of our research
The problem: everything has changed and got really big From control to no control, from mediated to non-mediated From bibliographic systems to full-text, visual, interactive ones From niche to universal systems From a few searchers to everybody From little choice to massive choice From little change to constant change
Which can mean – paradigm shift, no grip, floundering Existing knowledge base obsolescent, flawed, wholly inadequate And there are huge issues to deal with – OA, IR, Big Deals We don't even know what questions to ask anymore We are left generalising about too many people Should be spending lots of time and money researching the user…but are not
Mechanisms needed to provide grip and understanding – deep log analysis (DLA)
Digital fingerprints/CCTV – refine and relate
– Proprietary software too limiting, misleading and report structure insufficiently focused on your needs
– With DLA raw logs are edited/parsed and directly imported into SPSS and usage (and search) data are analysed according to (bespoke) need
– Log data then related to demographic datasets – generated by subscriber/user databases or questionnaires – and then triangulated with focus group/observation etc data
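The parse-then-relate steps above can be sketched in a few lines. This is only an illustration in Python, not the SPSS workflow the slide describes: the log layout (Common Log Format), the 30-minute inactivity cut-off used to split sessions, the use of the host address as a user key, and all function names are assumptions made for the example.

```python
import re
from datetime import datetime, timedelta

# Hypothetical log layout (Common Log Format); real publisher logs differ.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)
SESSION_GAP = timedelta(minutes=30)  # assumed inactivity cut-off


def parse_line(line):
    """Return (host, timestamp, path) for a successful request, else None."""
    match = LOG_PATTERN.match(line)
    if match is None or match.group("status") != "200":
        return None
    when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return match.group("host"), when, match.group("path")


def build_sessions(lines):
    """Group requests per host into sessions split by SESSION_GAP."""
    hits = sorted(
        (hit for hit in map(parse_line, lines) if hit is not None),
        key=lambda h: (h[0], h[1]),
    )
    sessions = []
    for host, when, path in hits:
        last = sessions[-1] if sessions else None
        if last and last["host"] == host and when - last["end"] <= SESSION_GAP:
            last["items"].append(path)
            last["end"] = when
        else:
            sessions.append(
                {"host": host, "start": when, "end": when, "items": [path]}
            )
    return sessions


def attach_demographics(sessions, subscribers):
    """Relate each session to a subscriber record keyed by host, if any."""
    return [{**s, "user": subscribers.get(s["host"])} for s in sessions]
```

Calling `build_sessions` on raw log lines and then `attach_demographics` with a dictionary built from a subscriber database gives one record per session, ready for the use and user breakdowns. Sessions with no matching subscriber keep `user` set to `None` rather than being dropped.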
Deep log analysis: attractions Size and reach. Enormous data set; no need to take a sample Direct & immediately available record of what people have done: not what they say they might, or would, do; not what they were prompted to say, not what they thought they did Data are unfiltered and provide a reality check sometimes missing from questionnaire and focus group Data real-time and continuous. Creates a digital lab environment for innovation and the monitoring of change Raises the questions that need to be asked by questionnaire, focus group and interview
CIBER deep log studies
1. Maximising library investments in digital collections through better data gathering and analyses (MaxData): OhioLINK study. Institute of Museum and Library Services, 2005-2007.
2. Virtual Scholar research programme – use and impact of digital libraries in academe. Blackwell/Emerald, 2003-2004.
3. Characterising open access journal users and establishing their information seeking behaviour using deep log analysis: case study OUP Open. OUP, 2005-2006.
4. Physics journals: a deep log analysis of IoPP journals. Institute of Physics, 2005-2006.
5. Core scholarly research trends study: deep log analysis of Elsevier ScienceDirect users. Elsevier, 2005.
6. Digital journals – site licensing, library consortia deals and journal use statistics. The Ingenta Institute, 2002.
Kinds of analysis conducted Use analysis By number of items viewed, number of sessions conducted, site penetration, repeat visits, time online, kind of items viewed, pattern of item use (TOC, abstract, full-text) User analysis By age, gender, occupation (student, practitioner), organisational affiliation, heavy/light use, referral link used, type of university (research/teaching), subject/discipline of journal, subject/discipline of the user, department of the subnet, search approach adopted, geographical location, whether purchased online or not, use of additional functions
What have we learnt We have never had such a large set of usage data. From the digital fingerprints of millions of users and tens of millions of transactions from a wide range of digital journal platforms we have drawn some interesting and controversial conclusions about the behaviour of the virtual scholar. "I don't recognise the users you are describing."
Information seeking characteristic 1 Phenomenally active and interested In the case of Blackwell Synergy, about half a million people used the site a month; nearly 5 million items viewed during the same period In the case of OhioLINK, 6000 journals available and all but about 5 used within a month Two-thirds of EmeraldInsight visitors non-subscribers
Information seeking characteristic 2 Shallow searchers, suggesting a checking-comparing, dipping sort of behaviour that is a result of easy access, a shortage of time and huge digital choice Flicking: over two thirds typically view no more than three items in a session and then leave; scientists view fewer (66% view no more than three items) and humanities scholars more (56%); overall just 10% view more than ten items Differences in what they view when online
A digital consumer trait…scholarly journal users

Type of user/session   Number of items viewed   Emerald Insight (Jan-Dec 2002), %   Blackwell Synergy (February 2004), %
Bouncer/checker        1 to 3                   70                                  67
Moderately engaged     4 to 10                  20                                  26
Engaged                11 to 20                 6                                   5
Seriously engaged      Over 21                  4                                   2
Total                                           100                                 100
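A minimal sketch of how sessions might be assigned to the engagement bands in the table and turned into percentage shares. The band boundaries come from the table; the function names are hypothetical, and placing a session of exactly 21 items in the top band is an assumption, since the table reads "Over 21".

```python
from collections import Counter

# Upper bounds follow the table; a session of exactly 21 items falls into
# the top band here, which is an assumption (the table reads "Over 21").
BANDS = [
    (3, "Bouncer/checker"),
    (10, "Moderately engaged"),
    (20, "Engaged"),
]
TOP_BAND = "Seriously engaged"


def classify(items_viewed):
    """Map the number of items viewed in one session to a band label."""
    if items_viewed < 1:
        raise ValueError("a session must contain at least one item view")
    for upper, label in BANDS:
        if items_viewed <= upper:
            return label
    return TOP_BAND


def band_shares(items_per_session):
    """Percentage of sessions per band, rounded to whole numbers."""
    counts = Counter(classify(n) for n in items_per_session)
    total = sum(counts.values())
    labels = [label for _, label in BANDS] + [TOP_BAND]
    return {label: round(100 * counts[label] / total) for label in labels}
```

`band_shares` expects a non-empty list of per-session item counts. Because each share is rounded separately, the four values may not sum to exactly 100 on real data.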
Type of item requested by subject category of article (Synergy)
Information seeking characteristic 3 Unpredictable form of behaviour in which there appears to be little user loyalty, repeat behaviour or use of memory Within a year it appeared that two-thirds of people did not come back Some more likely to return….
Some more likely to return (Synergy)
Information seeking characteristic 4 Search a variety of sites to find what they want…together with characteristic 2 this makes them promiscuous in information seeking terms Younger scholars more promiscuous
Information seeking characteristic 5 A bouncing, checking, promiscuous and consumer form of behaviour creates enormous volatility and unpredictability Digital visibility, sales mentality I may read books, surf, ask, watch telly even - the answer could come from anywhere
Information seeking characteristic 6 Increased visibility leads to increased exposure and use of older scientific material Share of downloads going to material more than 5 years old: History 54% (the same for Language and Literature); Materials Science 59%; Physiology 64%
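The per-subject shares quoted above are a simple ratio: downloads of material more than five years old divided by all downloads in that subject. A rough sketch follows, assuming each download is recorded as a (subject, publication year) pair and that age is measured against a single reference year; the function name and record layout are illustrative, not taken from the studies.

```python
from collections import defaultdict


def older_material_share(downloads, reference_year, age_threshold=5):
    """Percentage of downloads per subject going to material older than
    age_threshold years, measured against reference_year (assumed inputs)."""
    totals = defaultdict(int)
    older = defaultdict(int)
    for subject, publication_year in downloads:
        totals[subject] += 1
        if reference_year - publication_year > age_threshold:
            older[subject] += 1
    return {subject: 100 * older[subject] / totals[subject] for subject in totals}
```

Material exactly five years old is counted as recent in this sketch; whether the original analysis drew the line the same way is not stated on the slide.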
Information seeking characteristics 7-9 Untrusting: trust up for grabs, authority to be won (and checked). Brand problems - Tesco Seemingly lazy and easily led in retrieval terms - determined by digital visibility, promotion, search engines and poorly thought-through search expressions Search approach/form of navigation taken has an enormous impact on what is seen/used. People using the search engine were far more likely to conduct a session that included a view of an old article, and they viewed more subjects, more journals, and more articles and abstracts.
Conclusions and implications Choice and a common, multi-function retrieval platform are changing us all, making us all a little bit more similar, and should make us question strongly our assumptions about the scholar We are not good at using the evidence…digital concrete and digital fog…so big questions here for our funders, libraries etc We need to get closer to the user but we are moving further apart; data enable us to get closer Evaluation is actually part of a system and not separate from it
Number of different journals viewed by access method (OhioLINK)
Journal Subject categories viewed by Sociology (OhioLINK)
Subject of journal by date of material viewed (OhioLINK)