From eprint archives to open archives and OAI: the Open Citation project By The Open Citation Project team Presented by Steve Hitchcock, Southampton University.


1 From eprint archives to open archives and OAI: the Open Citation project
By The Open Citation Project team
Presented by Steve Hitchcock, Southampton University
These slides were prepared for the JISC/NSF Digital Libraries Initiative (DLI) All Projects Meeting, Edinburgh, 24-25 June 2002
OpCit is a joint JISC-NSF International Digital Libraries Project, 1999-2002

2 About this presentation
The aim is to:
– Report progress since the Stratford All-Projects meeting in 2000
– Demonstrate new services developed by the project
– Highlight the role of the project in the Open Archives Initiative
– Outline key tasks remaining
– Look beyond the Open Citation Project

3 Recap 1: principal partners
– Southampton University, IAM (Intelligence, Agents, Multimedia) Research Group, PI Stevan Harnad: citation-ranked search, EPrints.org, user surveys
– Cornell University, Digital Library Research Group, PI Carl Lagoze: architecture for reference linking, experiments with the ACM Digital Library and D-Lib Magazine, OAI technical support center
– arXiv.org, Paul Ginsparg: now based at Cornell University, and still the largest archive of freely accessible author-deposited scientific papers

4 The Open Citation Project: deliverables
The Open Citation Project (OpCit) is developing software and services to support the Open Archives Initiative (OAI). OpCit can help OAI data providers and service providers with:
– Citebase: citation-ranked search
– EPrints.org software: free software to build and manage OAI-compliant eprint archives
– API for reference linking: an interface on which reference linking applications can be built

5 Recap 2: last time at Stratford
Reference links on PDF copies of papers

6 Citebase, a new interface to the scholarly literature

7 Citebase, a citation-ranked search engine
http://citebase.eprints.org/
Google for the refereed literature
– Citebase is based on a citation database
– Harvests metadata using OAI-PMH
– Extracts reference lists from arXiv papers
– Provides impact-ranked (and otherwise ranked) search based on reference data
– Re-exports metadata plus references
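To make the idea of citation-ranked search concrete, here is a minimal sketch in Python. It is not Citebase's actual code or ranking method: it assumes that impact is measured simply by counting how many other papers list a paper in their references, and the function names and the sample data are hypothetical.

```python
from collections import Counter


def citation_counts(reference_lists):
    """Count how many distinct papers cite each paper.

    reference_lists maps a citing paper id to the ids it references.
    """
    counts = Counter()
    for citing_id, cited_ids in reference_lists.items():
        # A set ignores repeated references inside one paper;
        # self-citations are skipped.
        for cited_id in set(cited_ids):
            if cited_id != citing_id:
                counts[cited_id] += 1
    return counts


def ranked_search(query, titles, counts):
    """Return ids whose title contains the query, most cited first."""
    hits = [pid for pid, title in titles.items()
            if query.lower() in title.lower()]
    # Ties are broken by id so the order is deterministic.
    return sorted(hits, key=lambda pid: (-counts[pid], pid))


# Hypothetical sample data, standing in for harvested metadata
# and extracted reference lists.
titles = {
    "a": "Quantum gravity in two dimensions",
    "b": "Lattice quantum gravity",
    "c": "String theory review",
    "d": "Notes on quantum gravity",
}
reference_lists = {
    "a": ["b"],
    "b": [],
    "c": ["a", "b"],
    "d": ["a", "b", "b"],
}

counts = citation_counts(reference_lists)
results = ranked_search("quantum gravity", titles, counts)
print(results)  # ['b', 'a', 'd']
```

Paper "b" comes first because three other papers reference it, "a" follows with two citations, and "d" matches the query but has not been cited.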

8 Evaluating Citebase
http://citebase.eprints.org/survey/
The evaluation is aimed at users of arXiv, and at all others who use bibliographic services to access the refereed journal literature. You can contribute (June-July 2002) using the form linked above.
Aims of the evaluation:
– Discover users' awareness of related services
– Assess usability with a practical exercise
– Invite users' views on the main features
– Assess the level of user satisfaction with the service

9 Citebase: further developments
– OpenURL-enabled: pointing Citebase links at library and journal services
– Google interface using DP9: getting Citebase results, and open archives, into Google
– Metadata format and XML schema for citations: making citation metadata harvestable via OAI-PMH. Possible formats include:
  – Academic Metadata Format: a local profile format, with some collaborative experiments performed within OpCit
  – OpenURL metadata, moving towards NISO standardisation
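As an illustration of what an OpenURL-enabled link might look like, the sketch below builds a resolver link from citation metadata. It is an assumption-laden example, not Citebase's implementation: the resolver address is hypothetical, and the key names (genre, aulast, title, volume, spage, date) follow the early key/value style of OpenURL as commonly used at the time, so they should be checked against whichever OpenURL version a resolver expects.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Keys assumed from the early key/value OpenURL convention;
# here "title" is taken to mean the journal title.
OPENURL_KEYS = ("genre", "aulast", "title", "volume", "spage", "date")


def make_openurl(resolver_base, citation):
    """Build a link that hands a citation to a library's link resolver."""
    # Only keys that are present and non-empty are sent.
    params = [(key, citation[key]) for key in OPENURL_KEYS if citation.get(key)]
    return resolver_base + "?" + urlencode(params)


# Hypothetical resolver and citation.
resolver = "http://resolver.example.org/openurl"
citation = {
    "genre": "article",
    "aulast": "Smith",
    "title": "Journal of Examples",
    "volume": "12",
    "spage": "345",
    "date": "2001",
}

link = make_openurl(resolver, citation)
print(link)

# The resolver would read the same values back from the query string.
query = parse_qs(urlsplit(link).query)
```

The point of the design is that the same citation metadata can be pointed at different resolvers simply by changing the base address, which is how one link can lead to different library or journal services.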

10 Recap 3: API for reference linking
– getLinkedText: contents of the paper, reference-linked, plus lots of metadata for the paper
– getReferenceList: this paper's references
– getCurrentCitationList: the list of works citing this paper (best knowledge)
– getMyData: metadata for this paper

11 Surrogates in the API
Based on an automatic analysis of the work, a surrogate for a scholarly work (and for other works, for citations) consists of the following three XML files:
– Bibliographic data for the scholarly work
– References contained in that work, and their contexts within the full text
– Citations of that work
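The three parts of a surrogate can be pictured as a small data structure. The sketch below is a hypothetical Python model for illustration only: the project's API is written in Java, and the class names, field names and sample values here are assumptions, not the project's own. The accessor methods loosely mirror getMyData, getReferenceList and getCurrentCitationList; getLinkedText is left out because it depends on the full text.

```python
from dataclasses import dataclass, field


@dataclass
class Reference:
    """One reference in a paper, with the places it is mentioned in the text."""
    cited_work: str                               # identifier of the referenced work
    contexts: list = field(default_factory=list)  # snippets of surrounding text


@dataclass
class SurrogateSketch:
    """Hypothetical model of the three parts of a surrogate."""
    bib_data: dict                                   # bibliographic data for the work
    references: list = field(default_factory=list)  # Reference objects found in the work
    citations: list = field(default_factory=list)   # ids of works known to cite it

    def get_my_data(self):
        return self.bib_data

    def get_reference_list(self):
        return [ref.cited_work for ref in self.references]

    def get_current_citation_list(self):
        # "Current" because more citing works may be discovered later.
        return list(self.citations)


# Hypothetical sample surrogate.
paper = SurrogateSketch(
    bib_data={"title": "An example paper", "creator": "A. Author"},
    references=[Reference("oai:example.org:7", ["as shown in [1]"])],
    citations=["oai:example.org:42"],
)

print(paper.get_reference_list())
print(paper.get_current_citation_list())
```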

12 API evaluation
API tested on D-Lib Magazine and the ACM Digital Library. Try the demo at http://cs-tr.cs.cornell.edu/RefLinkingDemo/
Performance (in terms of accuracy of data extracted):
– Reference analysis: 86.7%
– Item analysis (bib data, contexts, and references for a given paper): 82.42%
Implementability:
– Simple interface: Surrogate s = new Surrogate (some-url)
– Portable: written in Java, has run on Solaris, Win2K, and NT4
– Installation: API source code plus public domain jar files

13 EPrints.org software
http://www.eprints.org/
Generates eprint archives that are compliant with the Open Archives Initiative Protocol for Metadata Harvesting. EPrints is free (GPL) software. It is aimed at organisations and communities. EPrints v. 2.0 was released in February 2002 (now on v. 2.0.1, which fixes bugs and typos).
Features:
– Internationalised metadata stored as Unicode
– Support for multiple archives on one server
– Improved user interface

14 OpCit and OAI
– OAI Aggregator (Celestial): collecting and caching the results from OAI data providers to improve the efficiency of data harvesting. http://celestial.eprints.org
– OAI infrastructure: proxies, caches, gateways. Improve interoperability, scalability and reliability of OAI services. Joint work with Old Dominion University; see paper http://arxiv.org/abs/cs.DL/0205071
– OAI Registration and Validation: performed at Cornell. http://www.openarchives.org/Register/BrowseSites.pl
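The idea of an aggregator that harvests and caches can be sketched as follows. This is not Celestial's code: it assumes an OAI-PMH 2.0 style ListRecords request with unqualified Dublin Core (oai_dc), the archive address and the sample response are hypothetical, and the network call is replaced by a stand-in function so the example runs offline.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces assumed for OAI-PMH 2.0 responses and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request for an OAI data provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # harvest only records changed since this date
    return base_url + "?" + urlencode(params)


def parse_titles(xml_text):
    """Map each record identifier in a response to its dc:title."""
    root = ET.fromstring(xml_text)
    titles = {}
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        titles[identifier] = record.findtext(".//dc:title", namespaces=NS)
    return titles


class CachingHarvester:
    """Keeps responses so repeated requests do not hit the data provider again."""

    def __init__(self, fetch):
        self.fetch = fetch  # function taking a URL and returning response text
        self.cache = {}

    def harvest(self, url):
        if url not in self.cache:
            self.cache[url] = self.fetch(url)
        return self.cache[url]


# Hypothetical response from a data provider.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample eprint</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

calls = []


def fake_fetch(url):
    """Stand-in for an HTTP request; records each call it receives."""
    calls.append(url)
    return SAMPLE


url = list_records_url("http://archive.example.org/oai", from_date="2002-06-01")
harvester = CachingHarvester(fake_fetch)
first = parse_titles(harvester.harvest(url))
second = parse_titles(harvester.harvest(url))  # served from the cache
print(url)
print(first)
```

In this sketch the second harvest never reaches the data provider, which is the efficiency gain an aggregator aims for. A real aggregator would also need to handle resumption tokens, errors and cache expiry, none of which are shown here.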

15 EPrints and OAI
– EPrints feeds repository URLs straight into the OAI registration process (if so desired by the EPrints administrator)
– A scan of the OAI database of registered sites shows that many sites use EPrints software to create repositories

16 A repository administrator's view of OAI
"As we have introduced our repository to our faculty and staff, we have emphasized the point that because they would be depositing their material in an OAI-compliant archive, it would automatically and painlessly be discoverable from various other points around the globe. Luckily, we were right."
Roy Tennant, eScholarship, California Digital Library, June 2002
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2085.html

17 OpCit user surveys and data mining
– Maximising impact
– Maximising access
Results from Mining the Social Life of an Eprint Archive: http://opcit.eprints.org/tdb198/opcit/
When interoperability is not enough: show authors what users do when open access services are available

18 Key project tasks remaining
OpCit formally ends in September 2002. Before then:
– Evaluation and reporting of the results
– Programmer's guide to using the API
– Journal and conference papers
– Final reports to JISC and NSF

19 Beyond OpCit
Beyond the project, the following will continue to be developed:
– Citebase
– EPrints.org
– OAI
… and variously applied in the JISC FAIR programme (start 2002) http://www.jisc.ac.uk/dner/development/programmes/fair.html
– Targeting Academic Research for Deposit and Disclosure (lead institution: Southampton University)
– e-prints UK (RDN, King's College London), citation analysis service for eprints database
– Machine-readable rights metadata (Loughborough University)

20 What we have achieved; what we have learned
– OAI is gathering momentum
– Software for building OAI repositories is available
– Institutional archives are being created, but need to be filled by authors
– Attracting authors requires evidence of services that will improve the visibility and impact of their works
– Citation-ranked search and reference linking are examples of OAI services that do this
– The infrastructure supporting OAI services continues to be enhanced
– Resource discovery and current awareness are exemplar OAI services now. Future services may include preservation management and personalization

21 Credits
Other contributors to the project:
– Technical development at Southampton is directed by Les Carr
– Research at Cornell is by Donna Bergmark
– EPrints.org software is being developed by Chris Gutteridge
– Citebase is produced and managed by Tim Brody
– Project manager is Steve Hitchcock
A copy of these slides can be found on the OpCit Web site, http://opcit.eprints.org/ (look for Papers and Presentations).
Contact Steve Hitchcock: sh94r@ecs.soton.ac.uk

