Moving Forward with the OpenDOAR Directory

1 Moving Forward with the OpenDOAR Directory
Peter Millington
SHERPA Technical Development Officer
University of Nottingham, England
Update Seminar for Imperial College, 2nd Dec 04 Bill Hubbard, SHERPA

2 Outline
Brief introduction to OpenDOAR
  What it is
  Project time line
OAI-PMH harvesting exercise
  Modus operandi
  Results for re-use policies
  Technical issues & performance
  Conclusions & Recommendations
Prototype ‘policy generator’ tool
Questions & Feedback

3 What is OpenDOAR?
Directory of Open Access Repositories
Coverage
  Institutional & Subject-based repositories; Funders’ OA archives
  Not covering: OA journals – see DOAJ
Authoritative evaluated data
  More than auto-harvested OAI data
  Proactive - more than data supplied by repository administrators
  Periodic review for currency and functionality
Target users
  Search service providers, OA stakeholders, end-users
  Active dialogue with providers, administrators, funders, etc.


5 OpenDOAR Project Time Line
Started early 2005
  University of Nottingham & University of Lund
  Funded by: OSI, JISC, CURL & SPARC Europe
First public version January 2006
  Data built on work by Tim Brody, Southampton, & others
  380 repositories (04-May-2006)
Developing Version 2
  Additional fields & views
  Due summer 2006

6 Harvesting Modus Operandi
Aims
  Familiarisation with OAI-PMH
  Investigation of repositories’ policies
OAI-PMH protocol
  315 repositories in OpenDOAR with an OAI Base URL
  verb=Identify – policies from eprints.xsd schema
  Timings recorded & technical glitches noted
Microsoft Excel Macros
  Prompted for operator interventions
  Such events would hamper auto-harvesting
PHP
  Firewall problems – needed to use HTTP proxy server
  PHP functions would not handle HTTPS
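The Identify step described on this slide can be sketched in Python. This is an illustration only, not the Excel macros or PHP scripts used in the exercise; the function names, the example base URL and the proxy address are assumptions made for the sketch. It builds the verb=Identify request, optionally routes it through an HTTP proxy, and records how long the fetch took.

```python
import time
import urllib.request
from typing import Optional, Tuple
from urllib.parse import urlencode


def identify_url(base_url: str) -> str:
    """Append the OAI-PMH Identify verb to a repository's OAI base URL."""
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode({"verb": "Identify"})


def fetch_identify(
    base_url: str, proxy: Optional[str] = None, timeout: float = 30.0
) -> Tuple[bytes, float]:
    """Fetch the Identify response and return (body, seconds elapsed).

    `proxy` is a hypothetical proxy address such as "http://proxy.example.org:8080",
    for sites where a firewall blocks direct requests.
    """
    handlers = []
    if proxy:
        handlers.append(urllib.request.ProxyHandler({"http": proxy, "https": proxy}))
    opener = urllib.request.build_opener(*handlers)
    started = time.perf_counter()
    with opener.open(identify_url(base_url), timeout=timeout) as response:
        body = response.read()
    return body, time.perf_counter() - started


# Building the request URL needs no network access:
print(identify_url("http://repository.example.org/oai"))
# http://repository.example.org/oai?verb=Identify
```

Wrapping the call to fetch_identify in a try/except for urllib.error.URLError would be one way to log the "technical glitches" without stopping for operator intervention.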

7 eprints.xsd Policy Criteria
content
  Text and/or a URL linking to text describing the content of the repository
  It would be appropriate to indicate the language(s) of the metadata/data in the repository
metadataPolicy
  Text and/or a URL linking to text describing policies relating to the use of metadata harvested through the OAI interface
dataPolicy
  Text and/or a URL linking to text describing policies relating to the data held in the repository
  This may also describe policies regarding downloading data (full-content)
submissionPolicy
  Text and/or a URL linking to text describing policies relating to the submission of content to the repository (or other accession mechanisms)
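As a rough illustration of how these four fields might be read from an Identify response, the sketch below parses a made-up sample document. The sample XML, the repository name and the function names are assumptions; the namespace shown is the one I understand the eprints schema to use, and the code matches on local element names so that it does not depend on it.

```python
import xml.etree.ElementTree as ET

POLICY_FIELDS = ("content", "metadataPolicy", "dataPolicy", "submissionPolicy")

# Hypothetical Identify response, trimmed to the parts used here.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <description>
      <eprints xmlns="http://www.openarchives.org/OAI/1.1/eprints">
        <content><text>Research papers and theses</text></content>
        <metadataPolicy><text>undefined</text></metadataPolicy>
        <dataPolicy><URL>http://repository.example.org/policies.html</URL></dataPolicy>
        <submissionPolicy/>
      </eprints>
    </description>
  </Identify>
</OAI-PMH>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_policies(identify_xml: str) -> dict:
    """Return {field: {"text": ..., "URL": ...}} for the four policy fields."""
    policies = {field: {"text": None, "URL": None} for field in POLICY_FIELDS}
    root = ET.fromstring(identify_xml)
    for element in root.iter():
        if local_name(element.tag) != "eprints":
            continue
        for child in element:
            field = local_name(child.tag)
            if field not in policies:
                continue
            for part in child:
                key = local_name(part.tag)
                if key in ("text", "URL") and part.text and part.text.strip():
                    policies[field][key] = part.text.strip()
    return policies


policies = extract_policies(SAMPLE)
print(policies["dataPolicy"]["URL"])
```

A field with neither text nor URL stays as None in both slots, which keeps "no data provided" distinguishable from a literal 'undefined' entry.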

8 Metadata Policy Results

9 Metadata Policy Results
No policy info for two thirds of repositories
  Technical problems with 9%
  No data provided for 40%
  ‘Undefined’ for 17% - EPrints default settings
Policies given
  Nearly all permit re-use for non-commercial purposes
  A third seem to allow commercial re-use
  Many policies copied from other repositories e.g. CogPrints
Issues for service providers
  Lack of easily accessible policy statements
  Prohibited re-sale of metadata – Why prohibited?
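The four outcome categories on this slide could be tallied along the following lines. This is a sketch under stated assumptions, not the method used in the exercise: the category labels follow the slide, but the matching rule for 'undefined' and the sample records are invented for illustration.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def classify_policy(
    text: Optional[str], url: Optional[str], harvest_failed: bool = False
) -> str:
    """Assign one harvested policy field to a result category."""
    if harvest_failed:
        return "technical problem"
    text = (text or "").strip()
    url = (url or "").strip()
    if not text and not url:
        return "no data"
    # Assumed rule: a bare 'undefined' with no URL is an untouched default setting.
    if text.lower() == "undefined" and not url:
        return "undefined"
    return "policy given"


def tally(records: Iterable[Tuple[Optional[str], Optional[str], bool]]) -> Counter:
    """Count categories over (text, url, harvest_failed) records."""
    return Counter(classify_policy(*record) for record in records)


# Hypothetical records, one per repository.
records = [
    (None, None, True),
    (None, None, False),
    ("undefined", None, False),
    ("Metadata may be re-used for non-commercial purposes.", None, False),
    (None, "http://repository.example.org/policies.html", False),
]
print(tally(records))
```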

10 [Full] Data Policy Results

11 Full Data Policy Results
Also no policy info for two thirds of repositories
  Technical problems with 9%
  No data provided for 42%
  ‘Undefined’ for 17%
Policies given
  Re-sale of full items nearly universally prohibited
  Unclear policy in ~7% of cases
  7% prohibit harvesting by robots
Prohibited harvesting by robots
  Total prohibition prevents full text indexing and analysis
  Transient harvesting should be permitted – e.g. CalTech

12 Content Policies
Repository Type
  Institutional or departmental repository
  Multi-institution subject-based repository
Subject Specialities
  Up to three, or ‘many’
Type of Material
  e.g. Research papers, Theses, etc.
Publication Status
  Pre-prints (not peer-reviewed)
  Final peer-reviewed drafts (post-prints)
  Published versions
  Individual tagging with peer-review and publication status
Principal Languages
  Up to three

13 Submission Policies
Eligible Depositors
  Role and/or Organisation unit
  Or their delegated agents
Deposition Rules
  Who can deposit what – usually own work only
  Mandatory deposition of metadata
Moderation (vetting)
  What, if anything, is vetted by the administrator
  e.g. eligibility, relevance, valid layout
  Exclusion of spam
Content Quality Control (Peer review)
  Responsibility for the validity and authenticity of the content
  Not checked, or checking by internal subject specialists
Copyright Policy
  Responsibility for copyright clearance
  Dealing with proven copyright violations

14 Interim Conclusions
The eprints.xsd is not working
  Not used at all – or left ‘undefined’
  Muddled entries – e.g. items under wrong heading
Why?
  Lack of awareness of its existence
  Unsupported by repository software package
  Insufficient guidance – possible language issues
  Some policies not covered – e.g. preservation
But…
  Copying indicates a desire for model policies
  Plenty of good examples on which to base models
  Would be very useful to service providers, advocates, etc.

15 Recommendations
For Repository Administrators
  Ensure the eprints.xsd schema is in your OAI configuration
  Put real policy info in the schema – not just ‘undefined’
  Fix any technical issues
  Avoid using HTTPS
For OpenDOAR
  Encourage repository administrators to improve matters
  Provide model policies
  Provide a ‘policy generator’ tool for administrators
Future Work
  Update eprints.xsd or replace with something new
  Re-analyse annually to monitor progress

16 OpenDOAR Policy Generator
Aims
  Capturing policies using standard formulae
  Tool to help administrators formulate their policies
Analysis of policies
  Identification of recurring phrases and concepts
  Natural language cluster analysis
Selection of statements & options
  Appropriate to the policy type
  And meaningful
OpenDOAR policy recommendations
  Minimum options – achieving OA goals but restricted
  Optimum options – refinements for more use or better quality








24 Proposed Minimum Metadata Policy
Anyone may access the metadata free of charge.
The metadata may be re-used in any medium without prior permission for not-for-profit purposes provided the OAI Identifier and/or a link to the original metadata record are given.
The metadata must not be re-used in any medium for commercial purposes without formal permission.

25 Proposed Minimum Full Data Policy
Anyone may access full items free of charge.
Single copies of full items can be:
  Reproduced & displayed or performed in any format or medium for personal research or study, educational, or not-for-profit purposes without prior permission or charge.
Full items must not be harvested by robots except transiently for full-text indexing or citation analysis.
Full items must not be sold commercially in any format or medium without formal permission of the copyright holders.

26 Proposed Minimum Submission Policy
Items may only be deposited by accredited members of the organisation, or their delegated agents.
Authors/Depositors may archive only their own work.
The administrator only vets items for the exclusion of spam.
The validity and authenticity of the content of submissions is the sole responsibility of the depositor.
Any copyright violations are entirely the responsibility of the authors/depositors.
If the repository receives proof of copyright violation, the relevant item will be removed immediately.

27 Optimum Policy Ideas
Metadata Policy
  Allow re-sale of metadata
  Increased visibility outweighs ‘exploitation’
Full Data Policy
  Allow multiple copying – for educational purposes
  Allow full harvesting – LOCKSS-like preservation
Submission Policy
  Mandatory deposition of metadata
  Mandatory deposition of thesis full texts
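To make the minimum/optimum idea concrete, a policy generator could assemble statements from a small set of options. The sketch below is hypothetical and is not the OpenDOAR tool: the function name and flag are invented, the minimum wording is taken from the proposed minimum metadata policy (slide 24), and the wording for the commercial re-use option is an assumption, since the slides give no text for it.

```python
from typing import List

ACCESS = "Anyone may access the metadata free of charge."
ATTRIBUTION = (
    "provided the OAI Identifier and/or a link to the original "
    "metadata record are given."
)


def metadata_policy(allow_commercial_reuse: bool = False) -> List[str]:
    """Return metadata policy statements for the chosen option.

    allow_commercial_reuse=False gives the proposed minimum policy;
    True gives an assumed wording for the optimum idea of allowing
    re-sale of metadata.
    """
    statements = [ACCESS]
    if allow_commercial_reuse:
        # Assumed wording, not taken from the slides.
        statements.append(
            "The metadata may be re-used in any medium without prior permission, "
            "including for commercial purposes, " + ATTRIBUTION
        )
    else:
        statements.append(
            "The metadata may be re-used in any medium without prior permission "
            "for not-for-profit purposes " + ATTRIBUTION
        )
        statements.append(
            "The metadata must not be re-used in any medium for commercial "
            "purposes without formal permission."
        )
    return statements


for statement in metadata_policy():
    print(statement)
```

Keeping each statement as a separate string makes it straightforward to emit the result in other forms later, for example as text for a repository's configuration.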

28 What Next?
Consultation
  SHERPA partners
  Other interested parties
Policy generator
  End-user testing – volunteers needed
  Ideas for output – e.g. text for EPrints configuration
Refining recommended policies
  Ideas for minimum and optimum options
  Feedback on our proposals
Aiming for release summer 2006

29 Any Questions or Feedback?
Contact Peter Millington

30 OpenDOAR Organisation
The OpenDOAR Team
  University of Nottingham, England
    Bill Hubbard, Gareth Johnson, Peter Millington
  University of Lund, Sweden
    Lars Bjørnshauge, Kristoffer Lundqvist, Salam Baker Shanawa
Our Funders
  Open Society Institute (OSI)
  Joint Information Systems Committee (JISC)
  Consortium of Research Libraries (CURL)
  SPARC Europe
