CERN Document Server Software
Martin Vesely, CERN, Geneva, Switzerland
1st OAF Workshop, 13-14 May 2002, Pisa, Italy
http://cdsware.cern.ch/

Presentation transcript:


2 / 14 Overview
- CERN Document Server Software
- Services within the CDS
- Providing CERN metadata
- OAI-PMH Implementation
- OAI-PMH Evaluation
- Conclusions

3 / 14 CDS Introduction
- CDS Software runs at CERN on:
  - metadata records
  - full text documents
  - 330 data collections
  - with ~15% CERN original documents
- Repository
  - MySQL database system
  - MARC21 format
  - Apache Web Server
  - OAI Sets
  - OAI repository
- CDS Software is available under GPL

4 / 14 Services within the CDS
- Search engine
  - Google-like syntax
  - Designed for large data collections
  - Personal features (baskets, alerts)
- Document Submission (with flow control)
  - Peer reviewing for scientific notes
  - Approval of documents
  - 25 different types of submission
- Document Conversion Server
- Other services (Scan, Agenda, WebCast)

5 / 14 Data gathering before OAI
- Various types of resources
  - Structured metadata in various formats
  - Unstructured metadata (e.g. free text)
- Various transfer channels
  - HTTP and FTP transfers, mail subscriptions
  - Individual submissions
- Uploader application
[Diagram labels: XML, XML Schema, HTTP]

6 / 14 CDSware (metadata gathering)
- Harvesting model…
[Diagram labels: Resources; CDS (OAI compliant); Other repositories; BibConvert; BibHarvest; OAI-compliant repositories; Interactive submissions; WebSubmit]

7 / 14 CERN as metadata repository
- Centralized vs. distributed model
- Harvesting from multiple repositories
- Two-way traffic / metadata sharing
  - Hierarchical harvesting
  - Reciprocal harvesting
- Providing CERN metadata
  - Identifiers of value-added records
  - Maintain the most recent record
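The "maintain the most recent record" point can be illustrated with a minimal sketch. It assumes each harvested record is a plain dictionary with an `identifier` and a day-granularity `datestamp`; the function names and the record layout are hypothetical and are not taken from CDSware.

```python
from datetime import datetime


def _stamp(record):
    # Assumption: datestamps use day granularity (YYYY-MM-DD).
    return datetime.strptime(record["datestamp"], "%Y-%m-%d")


def merge_most_recent(*harvests):
    """Merge record lists from several repositories.

    When the same identifier arrives from more than one source
    (hierarchical or reciprocal harvesting), keep only the copy
    with the latest datestamp.
    """
    merged = {}
    for records in harvests:
        for record in records:
            current = merged.get(record["identifier"])
            if current is None or _stamp(record) > _stamp(current):
                merged[record["identifier"]] = record
    return merged
```

Keying on the identifier only works if identifiers stay stable across repositories, which is one reason the slides list OAI identifiers among the points still to be discussed.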

8 / 14 OAI-PMH Implementation
- CERN OAI Harvester (BibHarvest)
  - Modules
    - Metadata gatherer (crawler)
    - Scheduler
  - Python
- CERN OAI Repository (data provider)
  - Optional features
    - Data flow control
    - OAI Sets
    - Metadata formats
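A harvester's metadata gatherer essentially builds OAI-PMH requests and reads the responses. The sketch below shows one plausible shape for that step using only the Python standard library. It is not BibHarvest code; the function names and the base URL in the usage note are assumptions. The `ListRecords` verb, its arguments, and the XML namespace come from OAI-PMH 2.0.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 response documents.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     from_date=None, resumption_token=None):
    """Build a ListRecords request URL.

    In OAI-PMH 2.0 resumptionToken is an exclusive argument, so when it
    is given no other argument except the verb is sent.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
        if from_date:
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        header.findtext("oai:identifier", namespaces=OAI_NS)
        for header in root.iterfind(".//oai:record/oai:header", OAI_NS)
    ]
    token = root.findtext(".//oai:resumptionToken", default="",
                          namespaces=OAI_NS)
    # An empty or absent token means the list is complete.
    return identifiers, (token or "").strip() or None
```

A scheduler would call `list_records_url("http://example.org/oai", set_spec="physics")`, fetch the result, and repeat with the returned token until `parse_list_records` yields `None`.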

9 / 14 Data flow control
- Resumption tokens (optional)
- Expiration / lifetime
- Transfer failure resistant (not guaranteed)

Technique used                                  Notes
Complete snapshot (cache all metadata fields)   + database queried once
                                                - database replicated
Partial snapshot (no record caching)            + saves resources
                                                + database queried once
Individual query (for each request)             + saves resources
                                                - several database queries
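The partial snapshot technique can be sketched as follows. This is one interpretation under stated assumptions, not the CERN implementation: the identifier list is queried once and remembered under a resumption token with a lifetime, while the records themselves are not cached. The class name, `query_ids`, `fetch_record`, and the injected `clock` are all hypothetical.

```python
import time
import uuid


class PartialSnapshotProvider:
    """Serve ListRecords batches from a remembered list of identifiers."""

    def __init__(self, query_ids, fetch_record, batch_size=2,
                 lifetime=3600, clock=time.time):
        self.query_ids = query_ids        # runs the identifier query once
        self.fetch_record = fetch_record  # looks up one record by identifier
        self.batch_size = batch_size
        self.lifetime = lifetime          # token lifetime in seconds
        self.clock = clock
        self._snapshots = {}

    def list_records(self, resumption_token=None):
        now = self.clock()
        if resumption_token is None:
            ids, offset = list(self.query_ids()), 0
        else:
            snapshot = self._snapshots.get(resumption_token)
            if snapshot is None or snapshot["expires"] < now:
                # OAI-PMH 2.0 reports this case as badResumptionToken.
                raise ValueError("badResumptionToken")
            ids, offset = snapshot["ids"], snapshot["offset"]
        batch = [self.fetch_record(i)
                 for i in ids[offset:offset + self.batch_size]]
        next_offset = offset + self.batch_size
        if next_offset >= len(ids):
            return batch, None            # list complete, no further token
        token = uuid.uuid4().hex
        self._snapshots[token] = {"ids": ids, "offset": next_offset,
                                  "expires": now + self.lifetime}
        return batch, token
```

Because a token stays valid until it expires, a harvester whose transfer failed can reissue the same request. That gives some failure resistance, though no guarantee once the lifetime has passed.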

10 / 14 OAI Sets
- Semantics
  - Defined by data provider
  - Description in XML container (optional in v.2.0)
  - Human vs. machine readable
- Missing unification
  - Prevents cross-archive services
- Sets by subject category
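As a small illustration of sets by subject category: in OAI-PMH 2.0 a `setSpec` is a colon-separated path, so a record in a nested set also belongs to every enclosing set. The sketch below assumes invented subject categories such as `physics:hep` and hypothetical helper names.

```python
def enclosing_sets(set_spec):
    """Expand a colon-separated setSpec into itself plus all its ancestors."""
    parts = set_spec.split(":")
    return [":".join(parts[:i]) for i in range(1, len(parts) + 1)]


def in_set(record_set_specs, requested):
    """True if a record tagged with record_set_specs belongs to `requested`."""
    return any(requested in enclosing_sets(spec) for spec in record_set_specs)
```

Matching like this only works inside one repository. Since each data provider defines its own set semantics, `physics:hep` at one archive need not correspond to anything at another, which is the missing unification noted above.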

11 / 14 Metadata Formats
- Supported metadata formats
- Preferred metadata format
- Information loss within metadata transfer
- Conversion from native formats possible

Format              Count   Share
DublinCore (only)   44      64%
RFC_                        14%
MARC                8       12%
ETDMS               7       10%
OLAC                6       9%
Other (native)      9       13%
TOTAL               69
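The information loss mentioned above can be made concrete with a toy conversion from MARC-style fields to unqualified Dublin Core. The mapping table is a deliberately small, illustrative assumption; it is neither the CDSware conversion nor a complete crosswalk.

```python
# Illustrative MARC tag -> Dublin Core element mapping (an assumption).
MARC_TO_DC = {
    "245": "title",        # title statement
    "100": "creator",      # main author
    "700": "creator",      # added author: the main/added distinction is lost
    "650": "subject",
    "520": "description",  # summary / abstract
    "260": "publisher",    # publication information
}


def to_oai_dc(marc_fields):
    """Convert (tag, value) pairs to DC; also report tags with no DC target."""
    dc, dropped = {}, []
    for tag, value in marc_fields:
        element = MARC_TO_DC.get(tag)
        if element is None:
            dropped.append(tag)   # information that oai_dc cannot carry here
        else:
            dc.setdefault(element, []).append(value)
    return dc, dropped
```

In this sketch a report number in field 088 or a host-item entry in field 773 ends up in `dropped`, and both author fields collapse into `creator`. Those are the two kinds of loss a harvester accepts when it requests only `oai_dc`.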

12 / 14 OAI-PMH Evaluation
- Advantages
  - Low-barrier access
  - Unified metadata transfer
  - Many optional features
  - Metadata brokering support
- To be discussed
  - OAI identifiers
    - Persistent / dependent on enriched metadata
  - Application-level protocol
    - Proprietary solution
    - Direction of Web Services

13 / 14 Conclusions
- OAI-PMH v.2.0
- CDS Software is available under GPL
- Implements both data provider and service provider
- Metadata transfer using pure oai_dc causes loss of information
- Cross-archive searches based on sets are out of protocol scope

14 / 14 Further Information
- CERN Document Server
- CDSware sources and demo
- Contact