CERN Document Server Software
Martin Vesely, CERN, Geneva, Switzerland
1st OAF-Workshop, 13-14th May 2002, Pisa, Italy
http://cdsware.cern.ch/



2 / 14 Overview
- CERN Document Server Software
- Services within the CDS
- Providing CERN metadata
- OAI-PMH Implementation
- OAI-PMH Evaluation
- Conclusions

3 / 14 CDS Introduction
- CDS Software runs at CERN on:
  - 430,000 metadata records
  - 180,000 full-text documents
  - 330 data collections
  - with ~15% CERN original documents
- Repository
  - MySQL database system
  - MARC21 format
  - Apache Web Server
  - OAI Sets
  - OAI repository
- CDS Software is available under GPL

4 / 14 Services within the CDS
- Search engine
  - Google-like syntax
  - Designed for large data collections
  - Personal features (baskets, alerts)
- Document Submission (with flow control)
  - Peer reviewing for scientific notes
  - Approval of documents
  - 25 different types of submission
- Document Conversion Server
- Other services (Scan, Agenda, WebCast)

5 / 14 Data gathering before OAI
- Various types of resources
  - Structured metadata in various formats
  - Unstructured metadata (e.g. free text)
- Various transfer channels
  - HTTP and FTP transfers, mail subscriptions
  - Individual submissions
- Uploader application
  - XML
  - XML Schema
  - HTTP

6 / 14 CDSware (metadata gathering)
Harvesting model…
[Diagram labels: Resources; CDS (OAI compliant); Other repositories; BibConvert; BibHarvest; Repositories (OAI compliant); Interactive Submissions; WebSubmit]

7 / 14 CERN as metadata repository
- Centralized vs. distributed model
  - Harvesting from multiple repositories
  - Two-way traffic / metadata sharing
  - Hierarchical harvesting
  - Reciprocal harvesting
- Providing CERN metadata
  - Identifiers of value-added records
  - Maintain most recent record

8 / 14 OAI-PMH Implementation
- CERN OAI Harvester (BibHarvest)
  - Modules
    - Metadata gatherer (crawler)
    - Scheduler
  - Python
- CERN OAI Repository (data provider)
  - Optional features
    - Data flow control
    - OAI Sets
    - Metadata Formats
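
To make the harvester side of this slide concrete, here is a minimal sketch of how an OAI-PMH request can be assembled. It is not taken from BibHarvest: the base URL and the `build_request` helper are hypothetical names chosen for illustration. Only the six verbs and the argument names (`metadataPrefix`, `set`, `from`) come from the OAI-PMH v2.0 protocol.

```python
from urllib.parse import urlencode

# Hypothetical base URL, used only for illustration.
BASE_URL = "http://repository.example.org/oai2d"

# The six request verbs defined by OAI-PMH v2.0.
VERBS = {"Identify", "ListMetadataFormats", "ListSets",
         "ListIdentifiers", "ListRecords", "GetRecord"}


def build_request(verb, **args):
    """Return the GET URL for one OAI-PMH request (illustrative helper)."""
    if verb not in VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    params = {"verb": verb}
    for key, value in args.items():
        if value is not None:
            # 'from' is a Python keyword, so callers pass it as 'from_'.
            params[key.rstrip("_")] = value
    return BASE_URL + "?" + urlencode(params)


# Selective harvest: Dublin Core records of one set, changed since a date.
url = build_request("ListRecords", metadataPrefix="oai_dc",
                    set="physics", from_="2002-05-01")
print(url)
```

A scheduler module, as named on the slide, would issue such requests periodically and move the `from` date forward after each successful run.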

9 / 14 Data flow control
- Resumption tokens (optional)
  - Expiration / lifetime
  - Transfer failure resistant (not guaranteed)

Technique used                                   Notes
Complete snapshot (cache all metadata fields)    + database queried once; - database replicated
Partial snapshot (no record caching)             + saves resources; + database queried once
Individual query (for each request)              + saves resources; - several database queries
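
Seen from the harvester, resumption tokens work roughly as in the sketch below: each partial response may carry a token, and the next request sends only the verb and that token until an empty or absent token ends the list. This is a sketch under stated assumptions, not the CDS implementation. `harvest_identifiers` and the `fetch` callback are hypothetical names, and `fake_fetch` stands in for real HTTP access with two canned chunks. The element names, the namespace, and the `badResumptionToken` error code are those of OAI-PMH v2.0.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_identifiers(fetch, metadata_prefix="oai_dc"):
    """Collect all record identifiers, following resumption tokens.

    `fetch` is a caller-supplied function (an assumption of this sketch)
    that takes a dict of OAI-PMH arguments and returns the response XML.
    """
    identifiers = []
    args = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(args))
        error = root.find(OAI_NS + "error")
        if error is not None:
            # For example badResumptionToken once a token has expired.
            raise RuntimeError(error.get("code"))
        for header in root.iter(OAI_NS + "header"):
            identifiers.append(header.findtext(OAI_NS + "identifier"))
        token = root.find(f".//{OAI_NS}resumptionToken")
        # An absent or empty token marks the last chunk of the list.
        if token is None or not (token.text or "").strip():
            return identifiers
        # Follow-up requests carry only the verb and the token.
        args = {"verb": "ListIdentifiers",
                "resumptionToken": token.text.strip()}


def fake_fetch(args):
    """Stand-in for an HTTP GET; serves two canned chunks."""
    ns = 'xmlns="http://www.openarchives.org/OAI/2.0/"'
    if "resumptionToken" not in args:
        body = ("<header><identifier>oai:example.org:1</identifier></header>"
                "<header><identifier>oai:example.org:2</identifier></header>"
                '<resumptionToken expirationDate="2002-05-14T12:00:00Z">'
                "chunk2</resumptionToken>")
    else:
        body = ("<header><identifier>oai:example.org:3</identifier></header>"
                "<resumptionToken/>")
    return f"<OAI-PMH {ns}><ListIdentifiers>{body}</ListIdentifiers></OAI-PMH>"


print(harvest_identifiers(fake_fetch))
```

Because a token can expire, a harvester that receives `badResumptionToken` generally has to restart the list request from the beginning, which is one reason failure resistance is not guaranteed.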

10 / 14 OAI Sets
- Semantics
  - Defined by data provider
  - Description in XML container (opt. in v.2.0)
  - Human vs. machine readable
- Missing unification
  - Prevents cross-archive services
- Sets by subject category
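
The unification problem can be illustrated with a small sketch. In OAI-PMH v2.0 a `setSpec` is a colon-separated path through the provider's own set hierarchy. The code assumes that a request for a set also covers its sub-sets. The set names, record ids, and the `in_set` helper are all hypothetical.

```python
def in_set(record_set_specs, requested):
    """True if any setSpec of the record equals `requested` or lies below it."""
    return any(spec == requested or spec.startswith(requested + ":")
               for spec in record_set_specs)


# Hypothetical set names from two data providers covering the same subject.
provider_a = {"rec1": ["physics:hep"], "rec2": ["math"]}
provider_b = {"rec9": ["hep-ex"]}

matches_a = [r for r, specs in provider_a.items() if in_set(specs, "physics")]
matches_b = [r for r, specs in provider_b.items() if in_set(specs, "physics")]
print(matches_a, matches_b)
```

The same `set=physics` request finds the record at the first provider and nothing at the second, although both records concern high-energy physics. Without shared set semantics, a cross-archive service needs its own mapping for each provider.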

11 / 14 Metadata Formats
- Supported metadata formats
- Preferred metadata format
- Information loss within metadata transfer
- Conversion from native formats possible

Metadata format      Count   Share
DublinCore (only)    44      64%
RFC_1807             10      14%
MARC                 8       12%
ETDMS                7       10%
OLAC                 6       9%
Other (native)       9       13%
TOTAL                69
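
The information loss mentioned above can be shown with a deliberately small crosswalk from MARC21 to unqualified Dublin Core. This is a sketch, not the conversion used by CDSware: the mapping table covers only a handful of fields, the `marc_to_dc` helper is a hypothetical name, and the sample record is invented.

```python
# Simplified crosswalk from MARC21 (tag, subfield) to unqualified Dublin Core.
# Only a handful of fields are covered; this is not a complete mapping.
MARC_TO_DC = {
    ("245", "a"): "title",        # title statement
    ("100", "a"): "creator",      # main entry, personal name
    ("700", "a"): "creator",      # added entry, personal name
    ("520", "a"): "description",  # summary / abstract
    ("650", "a"): "subject",      # topical subject
    ("088", "a"): "identifier",   # report number
}


def marc_to_dc(marc_fields):
    """Convert (tag, subfield, value) triples; return (dc_record, dropped)."""
    dc, dropped = {}, []
    for tag, code, value in marc_fields:
        element = MARC_TO_DC.get((tag, code))
        if element is None:
            dropped.append((tag, code, value))
        else:
            dc.setdefault(element, []).append(value)
    return dc, dropped


# Hypothetical record, invented for this example.
record = [
    ("245", "a", "An example report title"),
    ("100", "a", "Doe, J"),
    ("100", "u", "Example University"),   # author affiliation
    ("700", "a", "Roe, R"),
    ("088", "a", "EXAMPLE-REPORT-001"),
]
dc, dropped = marc_to_dc(record)
print(dc)
print(dropped)
```

Two kinds of loss appear. The affiliation has no target element in this mapping and is dropped. The fields that do survive are flattened: first and additional authors both become `creator`, and the report number becomes a generic `identifier`.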

12 / 14 OAI-PMH Evaluation
- Advantages
  - Low-barrier access
  - Unified metadata transfer
  - Many optional features
  - Metadata brokering support
- To be discussed
  - OAI identifiers
    - Persistent / dependent on enriched metadata
  - Application-level protocol
    - Proprietary solution
    - Direction of Web Services

13 / 14 Conclusions
- OAI-PMH v.2.0
- CDS Software is available under GPL
- Implements both data provider and service provider
- Metadata transfer using pure oai_dc causes loss of information
- Cross-archive searches based on sets out of protocol scope

14 / 14 Further Information
- CERN Document Server: http://cds.cern.ch/
- CDSware sources and demo: http://cdsware.cern.ch/
- Contact: cds.support@cern.ch, martin.vesely@cern.ch

