Disintermediation of Academic Publishing through the Internet: An Intermediate Report from the Front Line Thomas Krichel

1 Disintermediation of Academic Publishing through the Internet: An Intermediate Report from the Front Line Thomas Krichel Simeon M. Warner

2 Nature of this talk intermediate report Interaction welcome, ample time Done by pioneer (Krichel) and practitioner (Warner) Normative rather than positive emphasis listen to the horse's mouth descriptive and speculative parts

3 The Internet threat The Internet is a relatively recent technology that threatens all sorts of businesses whose essential function is to act as an intermediary between different parties. These include estate agents, marital agencies, and academic publishers.

4 Esoteric authors An academic has little change but big ego. –No monetary reward for writings, therefore optimal for authors to allow free access. –But big ego only satisfied with quality certification. Social optimum reached when price is equal to marginal cost. Unclear if free access can become a reality
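
The pricing claim on this slide can be made explicit with a textbook welfare argument. This is only a sketch under assumptions that are not in the slides: an inverse demand function $P(q)$ for copies of a paper, a total cost function $C(q)$ for supplying them, and welfare measured as total willingness to pay minus cost.

```latex
\begin{align*}
W(q)   &= \int_0^{q} P(x)\,dx - C(q) \\
W'(q)  &= P(q) - C'(q) = 0
        \quad\Longrightarrow\quad P(q^{*}) = C'(q^{*})
\end{align*}
```

If the marginal cost $C'(q)$ of one more electronic copy is close to zero, the welfare-maximising price is also close to zero, which is the sense in which free access would be the social optimum.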

5 Problems for toll-gate publishers Static demand for material by libraries leads to an upward spiral in prices to raise profits. Remedy is pricing per customer and consortia deals. Risk of a downward spiral where poor dissemination may drive the best authors away to alternative venues.

6 Alternative venues on Internet Homepage on the web Some isolated Internet publishing ventures (budding electronic journals) Institutional multidisciplinary archive Formal Internet archiving and dissemination venues, essentially limited to the preprint disciplines.

7 The preprint disciplines A few disciplines have had a tradition of informal publication through –preprints –working papers and tech reports These are –Computing –Economics –Mathematics –Physics

8 Centralised and decentralised model
discipline    centralised   decentralised
Computing     CORR          NCSTRL
Economics     EconWPA       RePEc
Mathematics   arXiv         MathNet
Physics       arXiv         PhysNet
and then there is the web...

9 arXiv Oldest (1991) and best-known author self-archiving system, in fact the essence of an author self-archiving system –authors upload papers to a centralised system –the centralised system itself is mirrored founded by Paul Ginsparg at LANL

10 History Mail exchange (August 1991) ftp server (1992) web interface (December 1993) automatic PostScript generation from TeX source (June 1995) PDF generation (April 1996) web upload (June 1996) OAI interface (February 2000)

11 Statistics for ,000 users in over 100 countries 13,000,000 downloads of papers 30,000 submissions 3,500 additional new submissions per annum Over 98% of submissions are entirely automated: 68% via the web, 27% via e-mail and 5% via ftp. arXiv uses less than one full-time equivalent to deal with day-to-day operations.

12 Special strengths of arXiv Simple to understand concept Usage of TeX document formatting system indefinite funding horizon thanks to NSF and US DoE strong community support (e.g. volunteer moderators)

13 (minor) Weaknesses of arXiv Its model failed on other discipline-based attempts –cogprints –EconWPA –CORR not as well integrated as possible with other sources lack of important innovation in past few years

14 RePEc Comprehensive academic self-documentation system, in fact the very essence of an academic self-documentation system –run decentrally by academic volunteers –comprehensive picture of academic output activity originates with WoPEc project founded by Thomas Krichel in 1993

15 RePEc principle Many archives –archives offer metadata about digital objects (mainly working papers) One database –The data from all archives forms one single logical database despite the fact that it is held on different servers. Many services –users can access the data through many interfaces. –providers of archives offer their data to all interfaces at the same time. This provides for an optimal distribution.
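
The "many archives, one database, many services" idea can be illustrated with a small sketch. It is not how RePEc is implemented; the archive code `aaa`, its record, and all function names are invented for illustration, and only the `sur` handle and title come from the slides. Each archive holds its own records, the records are merged into one logical dataset keyed by handle, and every service reads that same dataset.

```python
from typing import Dict, List

# Hypothetical archives: each maps a handle to a metadata record.
# Only the "sur" entry is taken from the slides; "aaa" is invented.
archives: Dict[str, Dict[str, dict]] = {
    "sur": {
        "RePEc:sur:surrec:9601": {
            "Title": "Dynamic Aspect of Growth and Fiscal Policy"
        }
    },
    "aaa": {
        "RePEc:aaa:wpaper:0001": {"Title": "An Invented Working Paper"}
    },
}


def build_dataset(archives: Dict[str, Dict[str, dict]]) -> Dict[str, dict]:
    """Merge the records of all archives into one logical dataset."""
    dataset: Dict[str, dict] = {}
    for archive_id, records in archives.items():
        for handle, record in records.items():
            if handle in dataset:
                raise ValueError(f"duplicate handle: {handle}")
            # Remember which archive supplied the record.
            dataset[handle] = dict(record, archive=archive_id)
    return dataset


def list_titles(dataset: Dict[str, dict]) -> List[str]:
    """One possible user service: an alphabetical title listing."""
    return sorted(record["Title"] for record in dataset.values())


def find_by_archive(dataset: Dict[str, dict], archive_id: str) -> List[str]:
    """Another possible service: the handles supplied by one archive."""
    return sorted(
        handle
        for handle, record in dataset.items()
        if record["archive"] == archive_id
    )


dataset = build_dataset(archives)
```

Both services work on the merged `dataset`, so an archive that contributes a record once is visible through every interface at the same time.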

16 RePEc is based on 190+ archives WoPEc EconWPA DEGREE S-WoPEc NBER CEPR US Fed in Print IMF OECD MIT University of Surrey CO PAH

17 …to form one dataset... over 140,000 items in over 1,000 series, contains working paper, published paper, software, personal and institutional data largest distributed free source about online scientific publications, over 45,000 electronic papers data is encoded using the purpose-built ReDIF format all archives follow a convention called the Guildford protocol on how to store ReDIF files and other data on their servers. Therefore the archives can be mirrored.

19 … describes documents
Template-Type: ReDIF-Paper 1.0
Title: Dynamic Aspect of Growth and Fiscal Policy
Author-Name: Thomas Krichel
Author-Person: RePEc:per: :thomas_krichel
Author-
Author-Name: Paul Levine
Author-
Author-WorkPlace-Name: University of Surrey
Classification-JEL: C61; E21; E23; E62; O41
File-URL: pub/RePEc/sur/surrec/surrec9601.pdf
File-Format: application/pdf
Creation-Date:
Revision-Date:
Handle: RePEc:sur:surrec:9601

20 … describes persons (HoPEc)
Template-Type: ReDIF-Person 1.0
Name-Full: KRICHEL, THOMAS
Name-First: THOMAS
Name-Last: KRICHEL
Postal: 1 Martyr Court 10 Martyr Road Guildford GU1 4LF England
Homepage:
Workplace-Institution: RePEc:edi:desuruk
Author-Paper: RePEc:sur:surrec:9801
Author-Paper: RePEc:sur:surrec:9601
Author-Paper: RePEc:rpc:rdfdoc:concepts
Author-Paper: RePEc:rpc:rdfdoc:ReDIF
Handle: RePEc:per: :THOMAS_KRICHEL

21 … describes institutions (EDIRC)
Template-Type: ReDIF-Institution 1.0
Primary-Name: University of Surrey
Primary-Location: Guildford
Secondary-Name: Department of Economics
Secondary-Phone: (01483)
Secondary-
Secondary-Fax: (01483)
Secondary-Postal: Guildford, Surrey GU2 5XH
Secondary-Homepage:
Handle: RePEc:edi:desuruk
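
The ReDIF templates on the preceding slides are plain "Field: value" records, each opened by a `Template-Type` line. A minimal reading sketch follows. It assumes one field per line and ignores any multi-line values, so it is an illustration rather than a complete ReDIF reader; the function name `parse_redif` and the shortened sample are assumptions made for this example.

```python
from typing import Dict, List

# Shortened sample built from fields shown on the slides.
SAMPLE = """\
Template-Type: ReDIF-Paper 1.0
Title: Dynamic Aspect of Growth and Fiscal Policy
Author-Name: Thomas Krichel
Author-Name: Paul Levine
Classification-JEL: C61; E21; E23; E62; O41
File-Format: application/pdf
Handle: RePEc:sur:surrec:9601

Template-Type: ReDIF-Institution 1.0
Primary-Name: University of Surrey
Primary-Location: Guildford
Handle: RePEc:edi:desuruk
"""


def parse_redif(text: str) -> List[Dict[str, List[str]]]:
    """Split text into templates; each maps a field name to its values."""
    templates: List[Dict[str, List[str]]] = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank separator lines
        # Split on the first colon only, so handles keep their colons.
        field, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Field: value" line
        field, value = field.strip(), value.strip()
        if field.lower() == "template-type":
            templates.append({})  # a new template starts here
        if not templates:
            continue  # ignore fields before the first Template-Type
        # Fields such as Author-Name may repeat, so values are lists.
        templates[-1].setdefault(field, []).append(value)
    return templates


records = parse_redif(SAMPLE)
```

Storing values as lists keeps repeated fields such as `Author-Name` in their original order, which matters when the order of authors carries meaning.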

22 Weaknesses of RePEc No funding Difficult to grasp innovative concepts –relational database for the academic process –plethora of user and contributor services Testing the concept in another discipline (ReLIS) has so far produced limited results. Setting-up costs are large. Little support from the top of the academic food chain


24 Think forward... Optimisation over time involves finding the best path that leads to the desired outcome. That is the essence of Bellman's principle of intertemporal optimality. Therefore a realistic desired outcome has to be fixed first.
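
For readers who want the principle in symbols, a standard finite-horizon statement is sketched below. The notation is an assumption made for this illustration and does not appear in the talk: state $x_t$, action $a_t$, per-period payoff $u$, transition function $f$, discount factor $\beta$, and horizon $T$.

```latex
\begin{align*}
V_t(x_t) &= \max_{a_t} \bigl\{\, u(x_t, a_t) + \beta\, V_{t+1}(x_{t+1}) \,\bigr\},
\qquad x_{t+1} = f(x_t, a_t), \\
V_T(x_T) &\ \text{given by the value attached to the desired outcome.}
\end{align*}
```

The recursion is solved backwards from $V_T$, which is why the desired outcome has to be fixed before the best path towards it can be worked out.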

25 Think British... Extreme scenarios are unlikely Slow evolution Totally free access to scholarly documents unlikely Budding initiatives of free quality-controlled journals show that academics can do it themselves

26 One size does not fit all... There are important discipline-specific differences in scholarly communication that are likely to persist in the rise of Internet-mediated scholarly communication. This can already be demonstrated on current initiatives, all of which have a discipline anchoring. (talk about institutional archiving later)

27 Disciplines differ... communication patterns before Internet presence or absence of entrepreneurial pioneers reward systems sensitivity and contestability of material but all will have a free layer and a toll-gated layer

28 Scenario 1: vacuum cleaner Free academic layer dispersed and available with all the rest of the web. Toll-gated material much more quality controlled no free bibliographical database Scenario defended by Bill Arms. Impossible to build scholarly communication system on the free layer alone. Default scenario.

30 Scenario 3: Gosplan One central archive for the discipline with many of the papers available on it. Peer-review running as overlay to the central archive. Scenario of arXiv.

31 Suggestion to move forward Concentrate on the provision of contents. Don't waste so much time on –metadata schemes (adopt AMF) –user interfaces Use OAI protocols to export contents. Shift focus of attention away from works towards the persons who create the works.
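
As an illustration of what exporting contents over OAI looks like from the harvester's side, the sketch below builds a `ListRecords` request in the style of the OAI-PMH specification. The base URL is hypothetical, the helper name is invented, and the parameter names follow the current OAI-PMH 2.0 protocol, which may differ in detail from the version available at the time of the talk.

```python
from typing import Optional
from urllib.parse import urlencode

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "https://archive.example.org/oai"


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    from_date: Optional[str] = None,
    set_spec: Optional[str] = None,
) -> str:
    """Build a ListRecords request URL for a harvester."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # only records changed since this date
    if set_spec:
        params["set"] = set_spec  # restrict the harvest to one set
    return base_url + "?" + urlencode(params)


url = list_records_url(BASE_URL, from_date="2002-01-01")
```

A harvester would fetch this URL and parse the XML response; that step is left out here because it depends on the metadata format the archive chooses to expose.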


33 Conclusion When a technological shock (like the Internet) hits a social structure (like the scholarly communication system), then there is an opportunity for new entrants to come along. This opportunity is here today. Seize it. Thank you for listening.

