Lessons from the Open Citation Project Presented by Steve Hitchcock, Southampton University These slides prepared for The Open Archives Initiative: application and exploitation, a one-day seminar on the application and exploitation of the OAI Protocol for Metadata Harvesting, May 14, 2003, London A joint JISC-NSF International Digital Libraries Project 1999-2002
A post-Google information environment Electronic journals exist in a post-Gutenberg and a post-Google information environment The ability to locate a specified item of information precisely and instantly among the mass of information available on the Web has profound implications. In the electronic environment the search engine has become the de facto interface to information, rather than the fragmented packages that have migrated from the print world.
About this presentation Citebase: citation-ranked search and impact discovery service – New scientometric indices – Evaluating Citebase EPrints.org software: free software to build and manage OAI-compliant eprint archives Growth of OAI, Eprints.org and institutional archives How to accelerate the growth of OAI eprint archives
Citebase, a discovery service with usage- and citation-based ranking http://citebase.eprints.org/ Google for the refereed literature Citebase is based on a citation database Harvests metadata using OAI-PMH Extracts and indexes citations from published research papers stored in the larger open access, OAI disciplinary archives - currently arXiv, CogPrints and BioMed Central Provides impact (and other)-ranked search based on reference data Re-exports metadata + references
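As a rough illustration of the harvesting step on this slide, the Python sketch below builds an OAI-PMH `ListRecords` request and reads Dublin Core titles and creators from a response. It is not Citebase's code: the base URL `http://archive.example.org/oai`, the sample response and the helper names `list_records_url` and `parse_records` are all hypothetical. The sample is parsed from a string, so no network access is needed.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for an OAI-PMH repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return identifier, title and creators for each harvested record."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "id": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext(".//dc:title", namespaces=NS),
            "creators": [c.text for c in rec.iterfind(".//dc:creator", NS)],
        })
    return records


# Hypothetical response, trimmed to the elements used above.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:archive.example.org:0001</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical eprint</dc:title>
          <dc:creator>Author, A.</dc:creator>
          <dc:creator>Author, B.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

if __name__ == "__main__":
    print(list_records_url("http://archive.example.org/oai"))
    for record in parse_records(SAMPLE_RESPONSE):
        print(record)
```

A real harvester would also follow the `resumptionToken` that repositories return for long result lists. Extracting references from the full text is a separate step that this sketch does not attempt.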
Some old and new scientometric (publish or perish) indices of research impact Quality-level and citation-counts of the journal in which the article appears Citation-counts for the article Citation-counts for the researcher Co-citations, co-text (cited with whom/what else?) Citation-counts for the preprint Usage-measures (hits, Webmetrics) Time-course analyses, early predictors, etc.
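Two of the indices listed here, per-article citation counts and co-citations, can be sketched in a few lines of Python. The reference lists in `REFERENCES` are invented toy data and the function names are hypothetical; this shows the idea only and is not how Citebase computes its rankings.

```python
from collections import Counter
from itertools import combinations

# Toy data (hypothetical): each citing paper mapped to the papers it cites.
REFERENCES = {
    "P1": ["A", "B"],
    "P2": ["A", "B", "C"],
    "P3": ["A"],
}


def citation_counts(references):
    """Count how many distinct papers cite each cited paper."""
    counts = Counter()
    for cited in references.values():
        counts.update(set(cited))  # set() ignores duplicate references
    return counts


def cocitation_counts(references):
    """Count how often each pair of papers appears in the same reference list."""
    pairs = Counter()
    for cited in references.values():
        for a, b in combinations(sorted(set(cited)), 2):
            pairs[(a, b)] += 1
    return pairs


def rank_by_citations(references):
    """Order cited papers by citation count, highest first, ties by name."""
    counts = citation_counts(references)
    return sorted(counts, key=lambda paper: (-counts[paper], paper))


if __name__ == "__main__":
    print(citation_counts(REFERENCES))
    print(cocitation_counts(REFERENCES))
    print(rank_by_citations(REFERENCES))
```

Citation counts for a researcher or a journal follow the same pattern once each cited paper is mapped to its author or venue.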
Citebase, a new interface to the scholarly literature
Time-Course of Citations (red) and Usage (hits, green) Witten, Edward (1998) String Theory and Noncommutative Geometry Adv. Theor. Math. Phys. 2 : 253 1. Preprint or Postprint appears. 2. It is downloaded (and sometimes read). 3. Eventually citations may follow (for more important papers). 4. This generates more downloads, etc. Perhaps the most important new information to become available for bibliometric studies is the per article readership information. Kurtz et al. (2003) "The NASA Astrophysics Data System: Sociology, Bibliometrics and Impact" http://cfa-www.harvard.edu/~kurtz/jasist-submitted.ps
Evaluating Citebase http://opcit.eprints.org/opcitevaluation.shtml First detailed user evaluation of an open access Web citation indexing service The evaluation was aimed at users of arXiv, and all others who use bibliographic services to access the refereed journal literature. Citebase was evaluated by nearly 200 users from different backgrounds between June and October 2002 Just prior to the evaluation Citebase had records for 230,000 papers, indexing 5.6 million references. By discipline, approximately 200,000 of these papers are classified within arXiv physics archives.
Results of Citebase evaluation Web-based citation indexing of open access eprint archives is closer to a state of readiness for serious use than had previously been realised Within the scope of its primary components, the search interface and services available from its rich bibliographic records, Citebase can be used simply and reliably for the purpose intended Tasks can be accomplished efficiently with Citebase regardless of the background of the user Links to citing and co-citing papers are features of Citebase that are valued by users Citebase compares favourably with other bibliographic services Coverage is seen as a limiting factor. Non-physicists were frustrated at the lack of papers from other sciences
Accomplishing tasks with Citebase Tasks can be accomplished efficiently with Citebase regardless of the background of the user. A key part of the evaluation assessed the usability of Citebase with a practical exercise to build a short bibliography based on a series of questions. [Figure: responses for all users and for physicists only; yellow line = true (T), blue = false (F), purple = no response (N)]
Most useful features of Citebase Links to citing and co-citing papers are features of Citebase that are valued by users
Citebase compares favourably with other bibliographic services
Growth of OAI, Eprints.org and Institutional Archives How OAI Archives for institutional research output have been growing – and how to accelerate their growth The following slides are taken from the presentation The Research Impact Cycle, which contains key data on the growth of open access through the self-archiving of institutional (peer-reviewed) research. These data can be freely used or adapted for other talks. Copy this PPT version for reuse. http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.ppt Data collected and analysed by Tim Brody, Electronics and Computer Science, Southampton University
Growth in number of OAI Archives (now 140+ Archives, but the average number of papers per Archive (9000) needs to grow faster!)
EPrints.org software http://www.eprints.org/ Generates eprint archives that are compliant with the OAI Protocol for Metadata Harvesting. Eprints.org software has been used to build institutional archives, and disciplinary archives. In conjunction with OAI, Eprints.org has been a primary motivator for institutional archives Eprints.org v. 2.0 released February 2002 (now on v. 2.2.1) EPrints is free (GPL) software, aimed at organisations and communities.
Growth in number of Eprints.org Archives (c. 70) (again, average number of papers per Archive [c. 120] needs to grow faster!)
Work that needs to be done to accelerate growth per archive These curves must become convex upward: Institutional self-archiving policies are needed
What have we learned from the Open Citation Project? OAI is gathering momentum Software for building OAI repositories is available Institutional archives are beginning to be created, but need to be filled by authors Attracting authors requires evidence of services that will improve the visibility and impact of their works Citation-ranked search and reference linking are examples of OAI services that do this
Online or Invisible? (Lawrence 2001) average of 336% more citations to online articles compared to offline articles published in the same venue Lawrence, S. (2001) Free online availability substantially increases a paper's impact. Nature, 411 (6837): 521 http://www.neci.nec.com/~lawrence/papers/online-nature01/
What is needed to fill the archives 1. Universities: Adopt a university-wide policy of self-archiving all university research output, e.g. Southampton (ECS) Research Self-Archiving Policy http://www.ecs.soton.ac.uk/~lac/archpol.html 2. Departments: Create Departmental OAI-compliant Eprint Archives 3. University Libraries: Provide digital library support for research self-archiving and archive-maintenance 4. Promotion Committees: Request a standardized online CV from all candidates, with refereed publications all linked to their full-texts in the Departmental Archives 5. Research Funders: Assess research impact online (from the online CVs)
Mandating online UK Research Assessment CVs linked to university eprint archives "will set an example for the rest of the world that will almost certainly be emulated in terms of research assessment and research access" Ariadne, issue 35, April 30, 2003 http://www.ariadne.ac.uk/issue35/harnad/
Exploiting OAI OAI has become the critical technical infrastructure for open access to author self-archived papers in institutional archives OAI enables cross-archive services such as Citebase Open access data and services promise increased visibility and impact for authors OAI resources will begin to grow significantly when authors realise this, and when research councils start mandating open access to the publication of results of funded research
Credits: Open Citation Project @ Southampton Principal Investigator is Stevan Harnad Technical development at Southampton is directed by Les Carr EPrints.org software is being developed by Chris Gutteridge Citebase is produced and managed by Tim Brody Project manager is Steve Hitchcock A copy of these slides can be found on the OpCit Web site http://opcit.eprints.org/ (look for Papers and Presentations) Contact Steve Hitchcock: email@example.com