How Does Digitization Affect Scholarship? Mark McCabe University of Michigan Roger Schonfeld Ithaka Christopher Snyder Dartmouth College December 11, 2007.

Presentation transcript:


What Characteristics Are Important to Authors?

Journal Characteristics Important to an Author
When it comes to influencing decisions about journals in which to publish an article of yours, how important to you is each of the following possible characteristics of an academic journal?
a) The journal makes its articles freely available on the Internet, so there is no cost to purchase or to read.
b) The journal permits scholars to publish articles for free, without paying page or article charges.
c) Measures have been taken to ensure the protection and safeguarding of the journal’s content for the long term.
d) The current issues of the journal are circulated widely, and are well read by scholars in your field.
e) The journal is highly selective; only a small percentage of submitted articles are published.
f) The journal is available to readers not only in developed nations, but also in developing nations.

Preferences for Academic Journals, 2006: Percent of faculty who believe that each characteristic is “very important” in influencing their decisions about where to publish their articles

Background on the Present Study

Objectives What are the scholarly impacts of various business models for journal publishing? How do various business models for journal publishing affect the value derived by authors and readers?

Natural Experiment Beginning in 1995 publishers and content aggregators began digitizing current and archival content and placing it online. However, as late as 2005 (the endpoint of our analysis) backfiles for many journals (and current content in some cases) remained offline. We exploit this heterogeneous chronology to explore the impact of online access.
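The staggered timing described above can be turned into a treatment indicator. The following is a minimal sketch, assuming a hypothetical lookup of the first year each journal's content went online (None if it was still offline in 2005); the journal names and years are invented for illustration and are not taken from the study's data.

```python
# Hypothetical first-online years per journal (None = not online by 2005).
FIRST_ONLINE = {"Journal A": 1997, "Journal B": 2003, "Journal C": None}

def online_dummy(journal, citation_year, first_online=FIRST_ONLINE):
    """Return 1 if the journal's content was online in citation_year, else 0."""
    year = first_online.get(journal)
    return int(year is not None and citation_year >= year)

def treatment_panel(journals, start=1995, end=2005):
    """Build {(journal, citation_year): dummy} for start..end inclusive."""
    return {
        (j, y): online_dummy(j, y)
        for j in journals
        for y in range(start, end + 1)
    }

panel = treatment_panel(FIRST_ONLINE)
```

Because journals switch from 0 to 1 in different years, and some never switch, the dummy varies both across journals and over time, which is the heterogeneity the slide refers to.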

Previous Studies
Many previous studies of this relationship find large effects.
Common flaws: these studies do not adequately control for potential selection problems affecting article quality, do not use adequate statistical methods, or both. For example, did the best journals, at least in some disciplines, gain an online presence earlier?
This study avoids these problems: variation in journal quality for content published prior to 1995 is unlikely to be related to online strategies adopted by publishers after 1995.

Some Empirical Questions What is the impact of online access on journal citation rates? Are the benefits greater for newer or older content? Are the effects discipline-specific? Which online “channels” have the greatest impact? Is the geographic and institutional distribution of citing authors influenced by online access?

People, Funding, and Timeline
Researchers:
Mark McCabe, Professor of Economics, University of Michigan – Principal Investigator
Chris Snyder, Professor of Economics, Dartmouth – Co-Principal Investigator
Roger Schonfeld, Manager of Research, Ithaka
Funded by a grant from The Andrew W. Mellon Foundation.
Data collection is completed, analysis is underway, and full findings are expected to become available by mid-2008.

Our Data

Three Disciplines
History; Economics and Business; Biological and General Sciences.
Hundreds of publishers, aggregators, and archives provided data.
100 journals in each discipline, comparing journal-year by journal-year: 50 that were digitized early on, and 50 that were digitized only more recently or not at all.
Examine citations TO these journals that appeared in ANY journal from 1980 to 2005.
Complete citation databases obtained from ISI.
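The unit of observation implied here is a journal-publication-year observed in a given citation year. The following is a rough sketch of how raw citation records might be collapsed into that panel; the record layout and the sample rows are hypothetical, not the ISI format.

```python
from collections import Counter

# Hypothetical raw citation records: (cited journal, year the cited article
# was published, year the citing article appeared).
records = [
    ("Journal A", 1980, 1983),
    ("Journal A", 1980, 1983),
    ("Journal A", 1980, 1990),
    ("Journal A", 1985, 1990),
    ("Journal B", 1980, 1983),
]

def build_panel(records, first_year=1980, last_year=2005):
    """Count citations per (journal, publication year, citation year) cell."""
    counts = Counter(r for r in records if first_year <= r[2] <= last_year)
    panel = []
    for (journal, pub_year, cite_year), n in sorted(counts.items()):
        panel.append({
            "journal": journal,
            "pub_year": pub_year,
            "cite_year": cite_year,
            "age": cite_year - pub_year,  # years since publication
            "citations": n,
        })
    return panel

panel = build_panel(records)
```

This sketch only emits cells that received at least one citation. A full panel would also need explicit zero-citation cells, which the later slides show are common.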

Descriptive Statistics
Columns: Obs, Mean, Std dev, Min, Max

ECONOMICS
Year journal first published
Publication year: 3,
Citation year: 58,
Citations to journal-publication-year in a year: 58,

SCIENCE
Year journal first published
Publication year: 3,
Citation year: 71,
Citations to journal-publication-year in a year: 71, , ,589

Skewed Distribution of Citations in Economics
Figure: frequency of citations to a journal-publication-year in a year. About 4,700 zeros; one had 771 cites.

Skewed Distribution of Citations in Science
Figure: frequency of citations to a journal-publication-year in a year. About 5,500 zeros; one had 32,500 cites.
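Count data with many zeros and a long right tail typically have a variance far above the mean (overdispersion), which is a standard reason to prefer a negative binomial model over a Poisson one. The sketch below summarizes that skew for a small set of invented counts; the numbers are illustrative only and do not come from the study.

```python
from statistics import mean, pvariance

# Invented citation counts: mostly zeros and small values, one large outlier.
counts = [0, 0, 0, 0, 1, 1, 2, 3, 5, 88]

def skew_summary(values):
    """Summarize zero share and dispersion for a list of counts."""
    m = mean(values)
    v = pvariance(values)
    return {
        "share_zero": values.count(0) / len(values),
        "mean": m,
        "variance": v,
        "dispersion_ratio": v / m,  # close to 1 for Poisson-like data
    }

summary = skew_summary(counts)
```

A dispersion ratio well above 1, as in this toy example, signals that a single outlier-heavy tail dominates the variance.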

Online Availability for 1980 Content
Columns: Titles, Mean, St Dev, Min, Max
Economics (82 journals published in 1980): JSTOR, ProQuest, Ebsco, Publisher Website
Science (74 journals published in 1980): JSTOR, Ebsco, PubMed Central, Publisher Website

Geographic Distribution of First Authors of Articles that Cite Other Articles
Columns: Science Cites (000), %, Econ Cites (000), %
English-Speaking Countries*: 9, ,
Non-English-Speaking Western Europe**: 3,
Rest of the World: 2,
Total Cites: 15,521 (Science); 1,687 (Econ)
* US, England, Canada, Australia, Scotland, New Zealand, Wales, Ireland, Northern Ireland
** Germany, Netherlands, France, Spain, Italy, Sweden, Belgium, Norway, Switzerland, Denmark, Finland, Austria, Greece, Portugal, Czech Republic, Slovakia

Challenges
ISI data requires extensive clean-up and quality control.
Many publishers and aggregators maintain poor records of their journals’ online histories.
First authors are confusing and require more consideration.

Findings

Regression Outputs
Four Stata outputs appear, each produced by the same command:

. xtreg lncit1 age* cyr* d2* js2* ow2*, i(articlegroup) fe robust;

Common to all four outputs:
Fixed-effects (within) regression; group variable: articlegroup; number of groups = 99; Std. Err. adjusted for clustering on articlegroup.
Reported columns: Coef., Robust Std. Err., t, P>|t|, [95% Conf. Interval].
Regressors: age1 to age49, cyr1981 to cyr2005, d21995 to d22005, js21995 to js22005, ow21995 to ow22005, _cons.
Dropped: js21995, js21996, ow21995, ow21996, ow21997.
Also reported: sigma_u, sigma_e, rho (fraction of variance due to u_i).

Outputs and labels, in the order they appear:
Output 1: Obs per group: min = 52, max = 975; F(102,54464)
USA.
Output 2: Obs per group: min = 136, max = 975; F(102,57725)
Non_USA English.
Output 3: Obs per group: min = 136, max = 975; F(102,56959)
Non_English_Non_Europe.
Output 4: Obs per group: min = 104, max = 975; F(102,53138)
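The `fe` option in the xtreg command requests a fixed-effects (within) estimator: each variable is demeaned within its group (here, articlegroup) before ordinary least squares is applied, so time-invariant group differences drop out. The following is a minimal sketch of that idea with a single regressor and toy data. It is not the authors' code, it does not compute cluster-robust standard errors, and treating the outcome as a log citation measure is an assumption based on the variable name lncit1.

```python
from collections import defaultdict

def within_slope(rows):
    """Fixed-effects (within) slope for one regressor.

    rows: iterable of (group, x, y). Each variable is demeaned within its
    group, then y is regressed on x without an intercept.
    """
    groups = defaultdict(list)
    for g, x, y in rows:
        groups[g].append((x, y))
    sxy = sxx = 0.0
    for obs in groups.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    return sxy / sxx

# Toy data: y = group effect + 0.2 * x, with very different group effects.
rows = [
    ("g1", 0, 1.0), ("g1", 1, 1.2),
    ("g2", 0, 5.0), ("g2", 1, 5.2),
]
slope = within_slope(rows)
```

In the toy data the two groups differ in level by 4.0, yet the within slope recovers the common effect of x because the group means are subtracted first.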

Science Journal Citations Peak in Year Three
Figure: citations relative to age 49, by years since publication, with 95% confidence interval.
Notes: Results from negative binomial regression with age dummies, digital dummy aggregated across channels for any presence, restricted to publication years

Economics Journal Citations Peak in Year Five
Figure: citations relative to age 49, by years since publication, with 95% confidence intervals for science and for economics.
Notes: Results from negative binomial regression with age dummies, digital dummy aggregated across channels for any presence, restricted to publication years

Preliminary General Findings
Citation levels more than double in both disciplines over the sample period.
There is an increase in citations as a result of digitization and online availability. The effect is highly significant, both for pre-1995 content (digitized backfiles) and for born-digital periods.

Disciplinary Differences
Citation rates peak earlier in science (3 years) than in economics (5 years); the subsequent decline in citations is more rapid in science.
Online access is associated with an average increase in citations of about 10% for economics and 20% for science titles.
However, the changes in citations observed over time are an order of magnitude larger than the measured impact of online access.
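In a log-linear or negative binomial model, a dummy coefficient b corresponds to a proportional change of exp(b) - 1 in the expected count. The estimated coefficients are not included in this transcript, so the sketch below only backs out the coefficient sizes that the reported 10% and 20% effects would imply under that standard interpretation; the function names are illustrative.

```python
import math

def pct_change_from_coef(b):
    """Percent change in expected citations implied by a dummy coefficient."""
    return 100.0 * (math.exp(b) - 1.0)

def coef_from_pct_change(pct):
    """Dummy coefficient implied by a given percent change."""
    return math.log(1.0 + pct / 100.0)

# Coefficients implied by the effects reported on the slide.
implied = {
    name: coef_from_pct_change(pct)
    for name, pct in [("economics", 10), ("science", 20)]
}
```

Under this reading, a 20% increase corresponds to a coefficient of roughly 0.18 and a 10% increase to roughly 0.10, so for effects of this size the coefficient and the percent change are close but not identical.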

For Science, Online Access Boosts Citations 20% Overall
Figure: citations relative to age 49, by years since publication, for online and offline content.
Notes: Results from negative binomial regression with age dummies, digital dummy aggregated across channels for any presence, restricted to publication years

For Economics, Online Access Boosts Citations 10% Overall
Figure: citations relative to age 49, by years since publication, for online and offline content.
Notes: Results from negative binomial regression with age dummies, digital dummy aggregated across channels for any presence, restricted to publication years

Channel Effects
For Science: JSTOR and publisher portals are important, but other third-party channels are not (except in the period 1995-97).
For Economics: all types of channels have a significant impact.
Longer embargo periods clearly decrease the ability of a given channel to increase citations.

HIGHLY PRELIMINARY: Geographic Effects on Citation Growth over Time
Rate of citation growth for biology is much higher (double) in non-English-speaking countries.
Rate of citation growth for economics is moderately higher in non-English-speaking countries.
Implication: Are these disciplines growing faster in non-English-speaking countries?

Impact of Digitization for Science – Publisher Website

Impact of Digitization for Science – JSTOR

Impact of Digitization for Science – Aggregators

Impact of Digitization for Economics – Publisher Website

Impact of Digitization for Economics – JSTOR

Impact of Digitization for Economics – Aggregators

HIGHLY PRELIMINARY: Geographic Effects on Citation Patterns
Science: The channel impact is about twice as large in the non-English-speaking countries (e.g. overall a 30% increase versus 15%).
Economics: The channel impact is about twice as large outside of the developed English-speaking countries (~20% increase versus less than 10%).
There is much we can learn from various models for the distribution of content and their relative strengths over time.

Further Questions and Discussion

Further Questions Does year of source-item publication matter? Will references to older articles increase more than references to more recently published articles? Have self-citation patterns changed? Presumably we will find no effect, an important confirmation of our data and analytical framework.

Findings and Discussion We find a consistent significant impact from digitization. At the same time, it is an order of magnitude less than the changes observed over time. Is the impact “large” or “small” and what implications if any are there? The impact is greater in science than in economics. Why? What are the implications? The impact is greater outside of the English-speaking countries. Why? What are the implications? Channel effects are dramatic. What are the implications?

How Does Digitization Affect Scholarship? Roger C. Schonfeld (212) 500 –