The Open Archives Initiative Story


2 The Open Archives Initiative Story
Thomas Krichel Uni. of Surrey, Hitotsubashi Uni., Long Island Uni.

3 About this talk
Follows essentially a historical approach
mixes in a few digital library concepts; interrupt me if you do not get some of them
does not represent an official statement
botches together various ideas from different people
benefited from funding by DLF, LANL, CLIR, JISC, DINI

4 UPS call 1999-07 Ginsparg, Luce and Van de Sompel
“The purpose of this call is the mobilisation of a core group to work towards achieving a universal service for author-archived literature”
emphasis on a pragmatic level of interoperability

5 UPS protoproto By Krichel, Nelson and Van de Sompel
found that the main problems of interoperability between eprint initiatives are
poor metadata
no uniform identifier structure
unclear legal terms and conditions
lack of selective harvesting

6 Santa Fe meeting
Representatives of arXiv, cogprints, Highwire, NCSTRL, NDLTD, RePEc, SLAC/SPIRES and others
chaired by Lynch and Waters
sponsored by CLIR, LANL and SPARC

7 basic concepts
“Managed” or formal e-print archive; not papers on the web
Open e-print archive means that there is a machine interface
“record” can be metadata or metadata & full text
archive may be partitioned

8 business model Inspired by RePEc initiative
Separation between data providers and service providers
Many archives
Many metadata collections
Many services

9 requirements & realisations
requirement --> realisation
Metadata harvesting (not distributed database) --> OA Dienst subset
Namespace --> full id=archive|record
mandatory metadata & parallel sets --> OAMS and XML transport
acceptable use --> gentleperson’s agreement in a provider statement
registration --> primitive templates

10 technical model Subset of Dienst protocol used by NCSTRL
A compatible archive responds to 4 requests
List-Partitions
List-Meta-Formats
List-Contents (partitionspec, file-after, meta-format)
Disseminate (fullID, meta-format, content-type)

11 Dublin Core-ish Minimal Metadata for selective harvesting
mandatory: Title, Date of Accession, Full ID, Author [R]
optional: Display ID [R], Abstract, Subject [R], Comment [R], Date for Discovery [R]

12 Implementation efforts
Implementation of Dienst subset: arXiv.org done, Cornell NCSTRL server done, WCR done, RePEc fails
Harvesting arXiv, NCSTRL for a test library

13 Critique Why OAMS, not Dublin Core
Dienst subset carries a lot of legacy from the full Dienst protocol that.

14 development in DL community
Interest in interoperability for a long time; stated interest of the Digital Library Federation
trouble: two approaches
union catalogue: causes friction
distributed search: high entry requirement, problematic to implement

15 Harvard meeting
Vision statement: SFc a new way forward for interoperability
could the OAi develop in a more general fashion such that it can be used by different communities?
political agenda of OAi (free access) perceived as problem

16 San Antonio meeting
45 people show a broad range of interest, which leads to the problem of not getting lost.
View that SFc is a technical support infrastructure
Communities with different business and contents models can adopt the framework for interoperability

17 San Antonio meeting 2000-05 Carl’s reverse bubble
First there was the OAi that made the SFc. Now there is the SFc that is implemented by more than the original OAi
discussion of what changes are required to the OAi steering committee
attract funding to develop other application domains

18 Ithaca meeting
Experience gained with implementing & discussing the current SFc specs
aim: new spec by the end of 2000
stable for experimentation but not definite
hope to minimise risks for implementors, maximise chances for interoperability
SFc+ to translate from eprint domain interoperability towards general domain interoperability

19 Abstract concepts to keep
open eprint archive --> open archive data provider / service provider archive management issue of records needed to be discussed OAMS confuses metadata and full text

20 Implementation features to keep
Metadata harvesting
OAi namespace
shared metadata and parallel metadata
acceptable use
registration of data and service providers

21 All change please, all change...
OAi DIENST replaced by OA protocol
OAi ID revised
OAMS replaced by wrapped DC
introduction of the concept of native metadata
generalised and marginalised partitions
revisited registrations

22 New OAi metadata
Accession date to be renamed datestamp and stripped of semantic link to the records
Full ID kept, colon used as canonical separator
unqualified DC is mandatory, but empty DC may be returned
introduction of the idea of native metadata
OAMS scrapped, Krichel and Warner to lead an EPMS discussion

23 Solution: encapsulate metadata
<oai>
  <oai.fullid>dini:01</oai.fullid>
  <oai:datestamp>" "</oai:datestamp>
  <dc xmlns:dc="…">
    <dc.title> Someone’s paper </dc.title>
  </dc>
</oai>
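
The wrapping idea on this slide can be sketched in Python. This is only an illustration under assumptions: the function name `wrap_record`, the dotted element name `oai.datestamp` (the slide mixes `oai.` and `oai:` spellings), and the date value used in the usage note are not from the talk.

```python
import xml.etree.ElementTree as ET

def wrap_record(full_id, datestamp, title):
    """Wrap a Dublin Core title inside an outer record element.

    Element names loosely follow the slide; dotted names are used
    throughout (an assumption) so the output parses without any
    namespace declarations.
    """
    record = ET.Element("oai")
    ET.SubElement(record, "oai.fullid").text = full_id
    ET.SubElement(record, "oai.datestamp").text = datestamp
    dc = ET.SubElement(record, "dc")  # container for the wrapped DC fields
    ET.SubElement(dc, "dc.title").text = title
    return ET.tostring(record, encoding="unicode")
```

Calling `wrap_record("dini:01", "2000-09-01", "Someone's paper")` returns a single XML string in which the archive-level fields sit beside the wrapped `dc` element, not inside it.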

24 Identifier
Identifiers point to metadata records
Concatenate
case-sensitive archive name
delimiter is a colon
anything internal to the archive appearing after that
prefixed by OAI as a pointer to a resolution mechanism
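
A minimal sketch of the concatenation rule, assuming the prefix is the lowercase string `oai` joined with the same colon delimiter (the slide does not spell this out). The function names are hypothetical.

```python
def make_full_id(archive, local_id, prefix="oai"):
    """Concatenate prefix, archive name and archive-internal id with colons.

    The archive name is kept exactly as given, because the slide says
    it is case sensitive. The "oai" prefix spelling is an assumption.
    """
    if ":" in archive:
        raise ValueError("archive name must not contain the delimiter")
    return f"{prefix}:{archive}:{local_id}"

def split_full_id(full_id):
    """Split into (prefix, archive, local id); the local id may itself contain colons."""
    prefix, archive, local_id = full_id.split(":", 2)
    return prefix, archive, local_id
```

Splitting at most twice keeps everything after the archive name intact, which matches the rule that anything internal to the archive may appear after the delimiter.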

25 Sets replace partitions
ONLY for a local community to implement selective harvesting
there can be zero or more sets in an archive
records can exist at interior nodes in the set hierarchy
asking for records in a set returns records in the set and in all its subsets.
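
The subset rule can be illustrated with a small sketch. It assumes that a set path is written with colon-separated components (the protocol slide mentions a colon as separator for sets) and that each record carries exactly one set path; both the data layout and the function names are hypothetical.

```python
def in_set(record_set, requested):
    """True if record_set is the requested set or one of its subsets."""
    return record_set == requested or record_set.startswith(requested + ":")

def records_in_set(records, requested):
    """Select ids of records in the requested set and in all its subsets.

    `records` is a list of (record id, set path) pairs.
    """
    return [rid for rid, set_path in records if in_set(set_path, requested)]
```

With records `[("r1", "physics"), ("r2", "physics:hep"), ("r3", "math")]`, asking for `physics` selects `r1` (a record at an interior node) and `r2` (a record in a subset), but not `r3`.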

26 OA protocol
Identify (no arguments, no exceptions)
ListMetadataFormats ([fullId]), response is the same as for the SFc
ListSets (no arguments, empty response ok)
ListRecord ([Sets] colon as separator)

27 OA protocol
ListContents ([sets][recordbefore] [recordafter][metaformat])
response as before but may contain resumption token (set,recordbefore,recordafter)
errors 206,503,302
GetRecord (fullId)
response as before
error 404
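
How a harvester might follow a resumption token can be sketched as below. Everything here is an assumption for illustration: the token is modelled as a plain offset (the slide suggests it encodes set, recordbefore and recordafter), and `list_contents` is a stand-in for a real archive, not the protocol's actual interface.

```python
def list_contents(store, page_size=2, resumption_token=None):
    """Stand-in archive: return one page of records plus a token, or None when done."""
    start = int(resumption_token) if resumption_token else 0
    page = store[start:start + page_size]
    next_start = start + page_size
    token = str(next_start) if next_start < len(store) else None
    return page, token

def harvest_all(store, page_size=2):
    """Keep requesting pages until the archive stops returning a token."""
    harvested, token = [], None
    while True:
        page, token = list_contents(store, page_size, token)
        harvested.extend(page)
        if token is None:
            return harvested
```

The harvester treats the token as opaque: it only passes back whatever the archive returned with the previous page.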

28 Encoding via cgi General syntax
baseurl?verb=verbname&argname=argval...
baseurl is the location of the OA v1 protocol as registered at openarchives.org
verbname is the name of the verb
argname is the name of the attribute
argval is the value of the attribute
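
The general syntax can be sketched with Python's standard library. The function name `build_request` and the base URL in the usage note are hypothetical; the verb and argument names come from the protocol slides.

```python
from urllib.parse import urlencode

def build_request(base_url, verb, **args):
    """Build baseurl?verb=verbname&argname=argval... with URL-escaped values."""
    # The verb goes first; remaining arguments are sorted for a stable result.
    query = urlencode([("verb", verb)] + sorted(args.items()))
    return f"{base_url}?{query}"
```

For example, `build_request("http://example.org/oa", "GetRecord", fullId="dini:01")` gives `http://example.org/oa?verb=GetRecord&fullId=dini%3A01`; the colon in the identifier is percent-encoded by `urlencode`.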

29 Registration of archives
Metadata format registration as now, names alphanumeric and underscore
Self-description introduced in the OA protocol through the identify verb
Fields of data provider templates
Natural language name
description
url
archive id
maintainer (of OA interface)
version of OA protocol used
OA base url
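
The naming rule for metadata formats (alphanumeric and underscore) could be checked as in this sketch. Restricting the check to ASCII letters and digits is an assumption, and the function name is hypothetical.

```python
import re

def is_valid_format_name(name):
    """True if the name consists only of ASCII letters, digits and underscores."""
    return re.fullmatch(r"[A-Za-z0-9_]+", name) is not None
```

Under this reading `oai_dc` would be accepted, while a name containing a hyphen or a space would be rejected.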

30 Conclusion
After the Ithaca work, the OAi is set for another time of testing, with a broader set of tests than the first time. Many idiosyncrasies of the old SFc have been removed, and that will increase the overall acceptability. The new version one of the OAi protocol may be a bit more complicated than the SFc, but it is a lot more sound. It still is not definite.

