RePEc, a digital commons for economics Thomas Krichel 2013-02-05
??? RePEc is a congenial and highly original initiative. It is very poorly understood. It has been running for close to 20 years.
structure of talk –some history –extent of RePEc –reasons for success –challenges –soapbox
History It started with me as a research assistant in the Economics Department of Loughborough University of Technology in 1990. A predecessor of the Internet allowed me to download free software without effort, but academic papers had to be gathered in a painful way.
CoREJ published by HMSO –Photocopied lists of the contents tables of recently published economics journals received at the Department of Trade and Industry –Typed list of the working papers recently received by the University of Warwick library The latter was the more interesting.
working papers early accounts of research findings published by economics departments –in universities –in research centers –in some government offices –in multinational administrations disseminated through exchange agreements important because of 4 year publishing delay
1991-1992 I planned to circulate the Warwick working paper list over listserv lists. I argued it would be good for them –increase incentives to contribute –increase revenue for ILL After many trials, Warwick refused. Towards the end of that time, I was offered a lectureship, and decided to get working on my own collection.
1993: BibEc and WoPEc Fethy Mili of Université de Montréal had a good collection of papers and gave me his data. I put his bibliographic data on a gopher and called the service "BibEc" I also gathered the first ever online electronic working papers on a gopher and called the service "WoPEc".
NetEc consortium –BibEc: printed papers –WoPEc: electronic papers –CodEc: software –WebEc: web resource listings –JokEc: jokes –HoPEc A lot of Ec!
why? In the 90s it was clear to me that open access to scientific publication would bring tremendous benefits. The aim was to place open access documents in the same system with toll-gated access documents to allow the former to compete more effectively with the latter.
WoPEc to RePEc WoPEc was a catalog record collection. WoPEc remained the largest web access point, but getting contributions was tough. In 1996 I wrote the basic architecture for RePEc. –ReDIF –Guildford Protocol
1996: RePEc principle Many archives –archives offer metadata about digital objects (mainly working papers) One database –The data from all archives forms one single logical database despite the fact that it is held on different servers. Many services –users can access the data through many interfaces. –providers of archives offer their data to all interfaces at the same time. This provides for an optimal distribution.
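The "many archives, one database, many services" principle can be illustrated with a minimal sketch. This is not RePEc's actual software: the function names, the record layout and the second archive ("xxx") are assumptions made up for illustration. Only the Surrey handle and title come from the talk.

```python
from typing import Dict, List, Optional

Record = Dict[str, str]

# Hypothetical example data: each archive holds its own records on its own server.
EXAMPLE_ARCHIVES: Dict[str, List[Record]] = {
    "sur": [{"handle": "RePEc:sur:surrec:9601",
             "title": "Dynamic Aspect of Growth and Fiscal Policy"}],
    "xxx": [{"handle": "RePEc:xxx:wpaper:0001",   # made-up archive and paper
             "title": "A Made-Up Working Paper"}],
}

def merge_archives(archives: Dict[str, List[Record]]) -> Dict[str, Record]:
    """Many archives -> one logical database, keyed by handle."""
    database: Dict[str, Record] = {}
    for archive_name, records in archives.items():
        for record in records:
            # Keep the first record seen for a handle; remember its archive.
            database.setdefault(record["handle"],
                                {**record, "archive": archive_name})
    return database

# Many services: each one is just a different view on the same database.
def title_listing_service(database: Dict[str, Record]) -> List[str]:
    return sorted(record["title"] for record in database.values())

def lookup_service(database: Dict[str, Record], handle: str) -> Optional[Record]:
    return database.get(handle)

if __name__ == "__main__":
    db = merge_archives(EXAMPLE_ARCHIVES)
    print(title_listing_service(db))
    print(lookup_service(db, "RePEc:sur:surrec:9601"))
```

Because every service reads the same merged data, an archive that adds one record is visible in all interfaces at once.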
RePEc is based on 1400+ archives MPRA DEGREE S-WoPEc NBER CEPR US Fed in Print IMF OECD MIT arXiv CO PAH
to form a 1.2M item dataset 500,000 working papers 700,000 journal articles 3,000 software components 30,000 book and chapter listings 35,000 author contact and publication listings 12,000 institutional contact listings
RePEc is used in many services IDEAS EconPapers NEP: New Economics Papers Inomics RePEc Author Service IDEAS RuPEc EDIRC LogEc CitEc
… describes documents
Template-Type: ReDIF-Paper 1.0
Title: Dynamic Aspect of Growth and Fiscal Policy
Author-Name: Thomas Krichel
Author-Person: RePEc:per:1965-06-05:thomas_krichel
Author-Email: T.Krichel@surrey.ac.uk
Author-Name: Paul Levine
Author-Email: P.Levine@surrey.ac.uk
Author-WorkPlace-Name: University of Surrey
Classification-JEL: C61; E21; E23; E62; O41
File-URL: ftp://www.econ.surrey.ac.uk/pub/RePEc/sur/surrec/surrec9601.pdf
File-Format: application/pdf
Creation-Date: 199603
Revision-Date: 199711
Handle: RePEc:sur:surrec:9601
… describes institutions
Template-Type: ReDIF-Institution 1.0
Primary-Name: University of Surrey
Primary-Location: Guildford
Secondary-Name: Department of Economics
Secondary-Phone: (01483) 259380
Secondary-Email: email@example.com
Secondary-Fax: (01483) 259548
Secondary-Postal: Guildford, Surrey GU2 5XH
Secondary-Homepage: http://www.econ.surrey.ac.uk/
Handle: RePEc:edi:desuruk
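Both templates are plain "Field: value" text, so a reader can be sketched in a few lines. This is a simplified illustration, not the official ReDIF reader: it assumes one field per line, treats indented lines as continuations, lowercases field names, and keeps repeated fields (such as Author-Name) in a list. The function name `parse_redif` is hypothetical, and the real format has further rules that this sketch ignores.

```python
from collections import defaultdict
from typing import Dict, List, Optional

SAMPLE = """Template-Type: ReDIF-Paper 1.0
Title: Dynamic Aspect of Growth and Fiscal Policy
Author-Name: Thomas Krichel
Author-Name: Paul Levine
File-URL: ftp://www.econ.surrey.ac.uk/pub/RePEc/sur/surrec/surrec9601.pdf
Handle: RePEc:sur:surrec:9601
"""

def parse_redif(text: str) -> Dict[str, List[str]]:
    """Parse one template into {lowercased field name: [values in order]}."""
    fields: Dict[str, List[str]] = defaultdict(list)
    last_key: Optional[str] = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0].isspace() and last_key is not None:
            # Assumption: an indented line continues the previous value.
            fields[last_key][-1] += " " + line.strip()
            continue
        # Split on the FIRST colon only, so handles and URLs stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Field: value" line; ignored in this sketch
        last_key = key.strip().lower()
        fields[last_key].append(value.strip())
    return dict(fields)

if __name__ == "__main__":
    record = parse_redif(SAMPLE)
    print(record["handle"], record["author-name"])
```

Splitting on the first colon matters here because handles such as RePEc:sur:surrec:9601 contain colons themselves.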
summary about RePEc RePEc is not an open access archive. It is a free abstracting and indexing dataset collected in a collaborative fashion. It treats full-text locations as attributes of the document descriptions.
RePEc and institutional repositories If a RePEc archive is augmented with full text (which it can be), it is a true example of an institutional repository, albeit a discipline-limited one. RePEc is the living proof that an institutional repository (IR) system can thrive.
data and service providers In classical IR thinking there is a distinction between data providers and service providers. In RePEc many service providers act as data providers. We have more of a peer ecology than a standard institutional repository system.
key to success Have a small group of volunteers. All are technically competent. No stakeholder consultation talk. Disseminate as widely as possible. Demonstrate to authors and institutions that it works for them. –institutional registration –author registration
institutional registration It started with one sad geezer making a list of departments that have a web site. I persuaded him that his data would be more widely used if integrated into the RePEc database. Now he is a happy geezer and one of our three crucial volunteers.
author registration It started when funding allowed us to hire a crazy programmer to write an author registration system. The system went online as "HoPEc" in late 2000. It has been renamed "RePEc Author Service" (RAS). In 2003 a grant from OSI allowed for a rewrite and expansion.
RePEc Author Service RePEc document data has author names as strings. The authors register with RAS to list contact details and identify the papers they wrote. This is classic authority control, but done by the authors. In a ranking of 100 most important economists, over 80% are registered with RAS.
authors' incentives Authors perceive the registration as a way to achieve common advertising for their papers. Author records are used to aggregate usage logs across RePEc user services for all papers of an author. Stimulates an "I am bigger than you are" mentality. Size matters!
LogEc Despite the existence of many user services, a central service collects usage data from the most important ones. This data is then distributed to user services. They can then globally assess usage of an item. Created and maintained by Sune Karlsson.
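The aggregation idea, per item across services and then per author across papers, can be sketched as follows. This is not LogEc's actual code: the function names, the log layout and all numbers are assumptions for illustration, and one handle is made up.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical logs: service name -> (item handle, downloads) pairs.
SERVICE_LOGS: Dict[str, List[Tuple[str, int]]] = {
    "IDEAS": [("RePEc:sur:surrec:9601", 3)],
    "EconPapers": [("RePEc:sur:surrec:9601", 2),
                   ("RePEc:xxx:wpaper:0001", 5)],  # made-up handle
}

# Hypothetical author record: person handle -> claimed item handles.
AUTHOR_PAPERS: Dict[str, List[str]] = {
    "RePEc:per:1965-06-05:thomas_krichel": ["RePEc:sur:surrec:9601"],
}

def aggregate_item_usage(
        service_logs: Dict[str, Iterable[Tuple[str, int]]]) -> Counter:
    """Sum downloads per item over all user services."""
    totals: Counter = Counter()
    for log in service_logs.values():
        for handle, downloads in log:
            totals[handle] += downloads
    return totals

def aggregate_author_usage(item_totals: Counter,
                           author_papers: Dict[str, List[str]]) -> Dict[str, int]:
    """Sum item totals over the papers each registered author has claimed."""
    # A Counter returns 0 for items with no recorded usage.
    return {author: sum(item_totals[handle] for handle in handles)
            for author, handles in author_papers.items()}

if __name__ == "__main__":
    totals = aggregate_item_usage(SERVICE_LOGS)
    print(aggregate_author_usage(totals, AUTHOR_PAPERS))
```

The per-author step only works because author records link name strings to item handles, which is what registration provides.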
NEP: New Economics Papers This service collects data on new working papers in RePEc. It makes it available to editors to filter into close to 100 subject-specific reports. Editors are aided by machine learning. Created in 1998 by yours truly and maintained by yours truly. Server sponsored by Victoria University.
Citation in Economics CitEc CitEc is an autonomous citation system. We download available full texts, convert them to text, and parse the references to find citations. –458,247 documents processed –10,900,917 references found –4,411,202 citations found Created by Jose Manuel Barrueco Cruz in 1998 and maintained by him. Data is widely used across RePEc services.
CitEc and RAS RAS authors can claim citations from CitEc data. They can verify that the association between reference and cited document is correct.
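The step from references to citations can be illustrated with a deliberately naive sketch: a reference string becomes a citation when it can be matched to a document already in the dataset. The talk does not describe the matching method CitEc uses; the normalized-title matching here is a stand-in, and the function names and the sample reference strings are made up.

```python
import re
from typing import Dict, List, Optional

def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", text.lower()).split())

def match_references(references: List[str],
                     catalog: Dict[str, str]) -> List[Optional[str]]:
    """Return, per reference string, the handle of the cited document or None.

    catalog maps handle -> title. A reference counts as a citation only
    if the normalized title occurs as a whole phrase inside it.
    """
    index = {normalize(title): handle for handle, title in catalog.items()}
    results: List[Optional[str]] = []
    for reference in references:
        padded = f" {normalize(reference)} "
        hit = next((handle for title, handle in index.items()
                    if title and f" {title} " in padded), None)
        results.append(hit)
    return results

if __name__ == "__main__":
    catalog = {"RePEc:sur:surrec:9601":
               "Dynamic Aspect of Growth and Fiscal Policy"}
    references = [
        "Krichel, T. and Levine, P. (1996). Dynamic Aspect of Growth "
        "and Fiscal Policy. University of Surrey.",
        "Some unrelated book (1990).",
    ]
    print(match_references(references, catalog))
```

Many references point to documents outside the dataset, which is consistent with far fewer citations than references being found.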
EconAcademics.org A new service by Christian Zimmermann, sponsored by the St. Louis Fed. The service monitors blogs citing work in RePEc using links to RePEc documents. The service encourages discussion of research in RePEc and inbound links. Brings blog posts closer to formally published items. This is very slick!
MPRA The Munich Personal RePEc Archive is a repository for authors who are not affiliated with institutions that have a RePEc archive. Launched by Ekkehart Schlicht in 2006 and sponsored by Munich University, based on EPrints software. Currently over 23,000 items.
CollEc A full collaboration graph of the RePEc Author dataset. It maintains about 400,000,000 shortest paths in a rolling, continuously updated system. Started by yours truly in 2006, fully functional since 2012. Server sponsored by Symplectic.
RePEc genealogy service Another new service by Christian Zimmermann and his team at the St. Louis Fed. This builds a genealogy of RePEc authors through the use of a crowd-sourcing tool. Only RAS registrants may contribute.
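A shortest path in a collaboration graph can be found with breadth-first search when every co-authorship link counts the same. The sketch below shows only that basic idea; it says nothing about how CollEc stores or updates its paths, and the function names and the tiny example graph are made up.

```python
from collections import deque
from typing import Dict, List, Optional, Set

def build_graph(papers: List[List[str]]) -> Dict[str, Set[str]]:
    """Link every pair of authors who share at least one paper."""
    graph: Dict[str, Set[str]] = {}
    for authors in papers:
        for author in authors:
            graph.setdefault(author, set()).update(
                other for other in authors if other != author)
    return graph

def shortest_path(graph: Dict[str, Set[str]],
                  start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search; returns one shortest path or None."""
    if start not in graph or goal not in graph:
        return None
    previous: Dict[str, Optional[str]] = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            # Walk back through the predecessors to rebuild the path.
            path: List[str] = []
            node: Optional[str] = current
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph[current]):  # sorted for determinism
            if neighbour not in previous:
                previous[neighbour] = current
                queue.append(neighbour)
    return None

if __name__ == "__main__":
    # Hypothetical authors A to D; D has written only single-author papers.
    graph = build_graph([["A", "B"], ["B", "C"], ["D"]])
    print(shortest_path(graph, "A", "C"))
    print(shortest_path(graph, "A", "D"))
```

Each registered author needs a path to every other reachable author, so the number of paths grows roughly with the square of the number of authors.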
problems Over time, Google has become more slick in pointing directly to full text rather than to RePEc pages. While usage is still growing and RePEc is still growing, this is not reflected in the usage numbers. RePEc has expanded its user services but that may not be sufficient to guarantee data growth.
ArchEc ArchEc, created by yours truly in 2012, is an attempt to build a dark archive. Later, we can find agreements with archive maintainers to make it a light archive, and encourage links to ArchEc rather than to the original item. As an unfunded initiative it may take many months, if not years, to complete.
some thoughts When a new technology disrupts an established social system based on an old technology, change is slow. The main reason is that people are prisoners of thinking in concepts the relevance of which has passed.
overall move We are witnessing the transition from an economy of information to an economy of attention. Economy of information: information is scarce, attention plentiful. Economy of attention: information is plentiful, attention is scarce.
past thinking: peer review Peer review means reviewing material prior to publication. –This makes sense when publishing / updating is expensive. –It makes no sense when publishing / updating is cheap. Usage-based evaluation is the way to go.
past thinking: myth of industry Data providers think of the data they have collected as something they have to control. Ask Greg for public domain metadata, no response. No way to get a complete copy of EconBiz. Oh we have an API…
RePEc vs. myth of industry RePEc has an ftp server with almost all data it has. We aggressively try to give our data away because we believe that this is what the community wants. We work for the economy of attention. We need to get away from proprietary silos.
example RePEc gives its data to the American Economic Association. They run a multi-million dollar business out of selling a product called EconLit. We give them our data for free. We get nothing back. Well almost nothing…
past thinking: subscription model The distribution of scholarly material is mainly supported via massive transfers of funds from universities to toll-gating publishers. Open access ventures get a tiny share of these funds. As long as subscription persists there will not be much progress with open access.
problem 1 of subscription model The subscription model is a product of the economy of information. But research is fundamentally conducted to create attention to the university's work. When a university buys access to toll-gated material, it subsidizes attention to research conducted by others. The subscription model is individually irrational.
problem 2 of subscription model The subscription model is not only individually irrational, it is also collectively irrational. When all institutions switch funds from closed to open access we need no subscriptions any more.
Am I crazy? Money does not make the world go round. Ideas do. When RMS proposed a free replacement for UNIX in the early 80s, most people dismissed the idea. Today it is reality! Similarly, when I started to work on RePEc, a totally free and improved A&I dataset, in 1993, nobody gave it a high probability of success. It is a reality!
http://openlib.org/home/krichel Thank you for your attention!