Managing the Electronic Collection with qualitative and quantitative data. A case study: the Wiley-Blackwell collection at the University of Milan. Tiziana Morocutti and Federica Zanardini
Part 1. Context and methodology SISTEMA BIBLIOTECARIO DI ATENEO
University of Milan (year 2009): 65,000 students; 2,500 professors and researchers; 2,000 staff; 9 faculties; 139 programs of study; 20 doctoral schools and 73 specialization schools
UniMi Library system (SBA*): 250 FTE staff; 1.5 M books; 25,500 print journals (7,500 current subscriptions); 120,000 loans/y; 9,000 purchased e-journals; 170 databases (bibliographic and FT); 1.6 M FT downloads (e-journals)/y; SFX / Metalib / Ezproxy. * Sistema Bibliotecario di Ateneo
SBA and Digital Library costs. 2009 budget: – UniMi 710 M – SBA 8.1 M (permanent staff costs excluded). Bibliographic materials: – ER: 1 M (2005), 3.2 M (2009) – Print (print&online included): 5.1 M (2005), 3.7 M (2009). ER acquisitions: – Directly from publishers 40% – CILEA DL (consortium) 30% – CARE (national contracts) 30%
Managing the economic crisis prolonged reduction in budget levels cancellation process / reshaping of the electronic collection development a cross-functional multi-skilled work team for supporting decisions a deeper knowledge of the collection, beginning with the Big Deal portfolio an assessment method to identify strengths and weaknesses of e-journal packages
Analysis of the relationships among… usage perceived value subject coverage content usefulness prices IF behaviours users (gears or mines?) impact
The first case study: the Wiley-Blackwell collection. Contract: print subscriptions + online access to the Full collection (1,188 titles in 2009). Renewal: – E-only deal – consortium purchase (mirroring and backfiles ownership). Data: – Usage statistics (2008 JR1) – Economic data (2009 price list, contract terms) – Bibliographic data (2009 titles, subject coverage) – Demographic data about users (2009) – Results of a qualitative survey among users (2009)
A web-based survey to assess the perceived value. Population: entire faculty (2,440 units). Respondents: 25% in 40 days. Limits: Subjective evaluation. Ambiguity (does importance refer to users' actual activity or to relevance in the research field?). A generic perceived value that could be used together with usage data to measure usefulness. No information about impact. Users were asked to select the Wiley titles they consider important, specifying whether each is essential or simply useful.
Part 2. The Wiley-Blackwell collection: data analysis
Subject coverage
Title price ranges
Usage distribution. 2008 downloads = 157,606. 30% of titles => 85% of usage (subscribed and unsubscribed). 4% of titles never used. Below the threshold value of 100, downloads follow a long tail-like distribution (unlike Anderson's model). [Chart; axis label: journals]
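The concentration figures on this slide (share of usage produced by the top share of titles, share of titles never used) can be recomputed from a list of per-title JR1 download counts. The following is a minimal sketch, not the authors' procedure; the function names and the sample counts are hypothetical.

```python
def usage_concentration(downloads, top_fraction=0.30):
    """Share of all downloads produced by the most-used `top_fraction` of titles."""
    ranked = sorted(downloads, reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0.0
    k = round(len(ranked) * top_fraction)  # number of titles in the top group
    return sum(ranked[:k]) / total


def never_used_share(downloads):
    """Share of titles with zero downloads in the period."""
    return sum(1 for d in downloads if d == 0) / len(downloads)


# Hypothetical per-title download counts (illustrative only, not the 2008 data).
sample = [500, 300, 100, 40, 30, 20, 5, 5, 0, 0]
top_share = usage_concentration(sample, top_fraction=0.30)
unused = never_used_share(sample)
```

Sorting before slicing matters: the top group is defined by usage rank, not by the order in which titles appear in the price list.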
Perceived value distribution (qualitative survey). Which Wiley-Blackwell journals are important for you? 94% of titles were selected by users. 35% of titles were selected by at least 10 users.
Relationship between usage and perceived value. Are usage and perceived value related? Data are displayed on a scatter plot as a collection of points corresponding to titles => higher density in the area of low-usage, low-valued journals. [Scatter plot; axis labels: selections, downloads]
Correlation between usage and perceived value. Is there a linear correlation between the two variables? Pearson's index of linear correlation (−1 ≤ R ≤ 1 in general; here 0 < R < 1): – R = 0.55 for titles with more than 100 downloads – R = 0.35 for titles with 100 downloads or fewer. Could the higher linear correlation between usage (actual usefulness) and perceived value (perceived usefulness) for high-usage journals be interpreted in terms of conscious usage of the resources?
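As an illustration of the split used on this slide, the sketch below computes Pearson's coefficient separately for titles above and at or below a download threshold. It assumes each title is a (downloads, selections) pair; the helper names and sample pairs are hypothetical, and the slides do not say how the original figures were calculated.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's linear correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def r_by_usage_band(titles, threshold=100):
    """Return (R for downloads > threshold, R for downloads <= threshold)."""
    high = [(d, s) for d, s in titles if d > threshold]
    low = [(d, s) for d, s in titles if d <= threshold]
    r_high = pearson_r([d for d, _ in high], [s for _, s in high])
    r_low = pearson_r([d for d, _ in low], [s for _, s in low])
    return r_high, r_low


# Hypothetical (downloads, selections) pairs, for illustration only.
sample = [(150, 3), (300, 6), (450, 9), (10, 5), (50, 3), (100, 1)]
r_high, r_low = r_by_usage_band(sample)
```

Each band needs at least two titles with non-constant values, otherwise the denominator is zero and the coefficient is undefined.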
Ratio between downloads and selections. Are there anomalies in the relationship between usage and perceived value? Extreme values of the ratio M = (downloads)/(selections) give interesting information: M = ∞ => titles used but not selected = underestimation? Niche journals whose users did not answer the survey? M = 0 => titles not used but selected = overestimation? [Chart: anomaly index]
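The two extreme cases of M can be flagged mechanically. The sketch below is one possible reading of the ratio: a title with downloads but no selections is treated as infinite M, and a title with neither downloads nor selections is left undefined. That last choice, the function names, and the labels are assumptions, not taken from the slides.

```python
import math


def anomaly_ratio(downloads, selections):
    """M = downloads / selections, with explicit handling of zero selections."""
    if selections == 0:
        # Used but never selected: M grows without bound.
        # Neither used nor selected: no signal, so M is left undefined (assumption).
        return math.inf if downloads > 0 else None
    return downloads / selections


def classify(downloads, selections):
    """Label a title by the extreme values of M discussed on the slide."""
    m = anomaly_ratio(downloads, selections)
    if m is None:
        return "no signal"
    if m == math.inf:
        return "used but not selected"
    if m == 0:
        return "selected but not used"
    return "regular"
```

For example, `classify(120, 0)` marks a possible niche journal whose readers did not answer the survey, while `classify(0, 4)` marks a title whose perceived value is not matched by any recorded usage.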
Journal ranking 1/2. Usefulness is defined through an algorithm combining data about usage and perceived value: U = (e + 0.2u) * downloads, where U = usefulness, e = number of selections as essential, u = number of selections as useful. Titles were given a score and ranked. By adding prices (p+e), the ranking list can be used to calculate savings in relation to cancellations.
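The score U = (e + 0.2u) * downloads and the resulting ranking can be sketched as follows. The formula and the 0.2 weight come from the slide; the dictionary layout, field names, and sample titles are hypothetical.

```python
def usefulness(essential, useful, downloads, useful_weight=0.2):
    """U = (e + 0.2u) * downloads, with the 0.2 weight exposed as a parameter."""
    return (essential + useful_weight * useful) * downloads


def rank_titles(titles):
    """Sort titles from most to least useful according to U."""
    return sorted(
        titles,
        key=lambda t: usefulness(t["essential"], t["useful"], t["downloads"]),
        reverse=True,
    )


# Hypothetical titles, for illustration only.
sample = [
    {"title": "A", "essential": 3, "useful": 5, "downloads": 100},
    {"title": "B", "essential": 0, "useful": 10, "downloads": 500},
    {"title": "C", "essential": 1, "useful": 0, "downloads": 0},
]
ranking = [t["title"] for t in rank_titles(sample)]
```

Because the score is a product, a title scores zero if it has either no downloads or no selections at all, so heavily used titles that nobody selected in the survey fall to the bottom of the list.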
Journal ranking 2/2. Remarkable savings can only be obtained by giving up a significant number of titles. The first 300 titles cost as much as the entire Big Deal.
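The comparison behind this slide can be reproduced by walking down the ranked list and accumulating individual title prices until the package price is reached. This is a minimal sketch with hypothetical names and prices; it assumes prices are already ordered by rank and expressed in the same currency as the package price.

```python
def titles_affordable_within(ranked_prices, package_price):
    """Count how many top-ranked titles fit within the package price.

    Returns (number of titles retained, their cumulative price).
    """
    total = 0.0
    count = 0
    for price in ranked_prices:
        if total + price > package_price:
            break  # the next title would exceed the package price
        total += price
        count += 1
    return count, total


# Hypothetical list prices in rank order and a hypothetical package price.
kept, spent = titles_affordable_within([400, 300, 200, 100], package_price=900)
```

The saving from cancelling everything below the cut-off is then the package price minus the cumulative price of the retained titles.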
Part 3. Conclusions
Results and findings Performance indicators Journal ranking Hypothesis: hit content usage nonhit content usage
The long tail-like usage of nonhit content Are information needs atomized or elastic (depending on perceptions of availability)?
After the building blocks… tasks for the future. Enhancing the assessment method: – measurement of content usefulness – evaluation of content impact on research activity – development of proper statistical methods to analyze qualitative and quantitative data together. Supplying practical instruments for supporting collection-development decisions: – starting a benchmark activity of Big Deal packages
Thank you!