Andy Powell, Eduserv Foundation June 2006 Eprints Application Profile.

1 Andy Powell, Eduserv Foundation
andy.powell@eduserv.org.uk
www.eduserv.org.uk/foundation
June 2006
Eprints Application Profile

2 June 2006, Eprint Application Profile Meeting - London
Agenda
Welcome and introductions
Issues with current use of simple DC
Functional Requirements
Model
Lunch
Eprints Application Profile
Workplan

3 Issues (1)
dc:title - where multiple titles are provided, there is no way of determining the main title
dc:creator - recipient of metadata has no knowledge that normalised form of name has been used. Therefore difficult to disambiguate different author names or combine different names for the same author
dc:creator - there is no mechanism for providing the affiliation of the author(s)
dc:creator - there is no mechanism for indicating whether the author is a person or an organisation

4 Issues (2)
dc:subject - recipient of metadata has no knowledge about whether terms have been taken from controlled vocabularies. Therefore difficult to build browse interfaces based on knowledge of vocabulary hierarchies/relationships
dc:publisher - recipient of metadata has no knowledge that normalised form of name has been used. Therefore difficult to disambiguate different publisher names or combine different names for the same publisher
dc:publisher - there is no mechanism for indicating whether the publisher is a person or an organisation

5 Issues (3)
dc:contributor - recipient of metadata has no knowledge that normalised form of name has been used. Therefore difficult to disambiguate different contributor names or combine different names for the same contributor
dc:contributor - there is no mechanism for indicating whether the contributor is a person or an organisation
dc:contributor - there is no mechanism for recording the nature of the contribution made by the contributor (editor, illustrator, etc.)

6 Issues (4)
dc:date - recipient of metadata has no knowledge about what kind of date is being provided or how the date is formatted. Therefore difficult to make any reliable use of the date in user-interface or other applications
dc:type - recipient of metadata has no knowledge that the value has explicitly been taken from the controlled lists provided here and is therefore only able to infer (i.e. guess) that the originator system's use of, say, 'Preprint' corresponds to the use suggested in the guidelines
dc:type - the use of dc:type to carry 'status' information somewhat stretches the semantics of the property

7 Issues (5)
dc:format - recipient of metadata has no explicit knowledge that a MIME type is being provided
dc:format - not clear what is being described with recommended use of dc:format (1:1 problem). If the work is being described then use of dc:format is incorrect. If a single manifestation is being described, then dc:format shouldn't be repeated
dc:identifier - recipient of metadata has no explicit knowledge that a URI is being provided. Nor is it particularly clear whether the 'work' or a 'manifestation' of the work is being identified

8 Issues (6)
dc:source - where this property is used, the recipient of metadata has no explicit knowledge about whether a URI or title or bibliographic citation is being provided
dc:language - recipient of metadata has no explicit knowledge that an RFC 3066 language tag is being provided
dc:relation - recipient of metadata has no explicit knowledge that a URI is being provided. Nor is there any indication about the relationship between the eprint and the related resource. For example, in some cases the relationship will be 'isInstanceOf' but in others it could be 'isCitedBy'

9 Issues (7)
dc:coverage - recipient of metadata has no explicit knowledge that a term taken from the TGN has been used
dc:coverage - there is no mechanism for indicating whether coverage is spatial or temporal
dc:coverage - where coverage is temporal, there is no agreed explicit mechanism for recording dates and date ranges
dc:rights - recipient of metadata has no explicit knowledge about whether a human readable statement or a URI is being provided

10 Current issues
what’s the problem with using simple DC to describe eprints?
difficult to differentiate ‘works/expressions’ from ‘manifestations/items’
does dc:identifier identify the work/expression or a particular manifestation/item of the work?
–in ePrints UK guidelines, dc:identifier used to identify ‘work/expression’ and dc:relation used to identify ‘manifestation/item’
–but dc:relation may be used for other resources (e.g. cited works), therefore ambiguity in the metadata record
–and guidelines not widely implemented anyway…
–therefore difficult for software applications to move reliably from the metadata record to the full text
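The dc:relation ambiguity can be illustrated with a short Python sketch. The record, its URIs, and the helper name `full_text_candidates` are all hypothetical, chosen only for illustration; they are not taken from the ePrints UK guidelines or any DC toolkit.

```python
# Hypothetical simple DC record: each property maps to a list of plain strings.
simple_dc_record = {
    "dc:title": ["An example eprint"],
    "dc:identifier": ["http://repository.example.org/eprint/123"],
    "dc:relation": [
        "http://repository.example.org/eprint/123/paper.pdf",  # a copy of the full text
        "http://journal.example.org/article/456",              # a cited work
    ],
}


def full_text_candidates(record):
    """Return every URI that might point at the full text.

    Simple DC carries no information about what kind of relation each
    dc:relation value expresses, so all values have to be treated as
    candidates; the code cannot tell a copy from a cited work.
    """
    return list(record.get("dc:relation", []))


candidates = full_text_candidates(simple_dc_record)
print(candidates)
```

Both URIs come back as candidates: the comments in the record are visible to a human reader but not to the software consuming the metadata.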

11 Current issues (2)
not possible to determine whether subject terms are taken from a controlled vocabulary or not (e.g. is ‘Physics’ a free-text keyword or a term taken from Dewey?).
–therefore difficult to base subject-browse interfaces on controlled vocabulary hierarchy
not possible to disambiguate authors with same name or reconcile instances of the same author being given different form of name
–therefore difficult to build browse-by-author type interfaces
dates are ambiguous (either because of formatting and/or because type of date is not known)
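The formatting half of the date problem can be shown in a few lines of Python. The date string below is a made-up example; the point is only that, without a declared encoding, the same string parses to two different dates.

```python
from datetime import datetime

# A hypothetical dc:date value with no declared format.
raw_date = "01/02/2006"

# Read as day/month/year (common in the UK).
as_day_first = datetime.strptime(raw_date, "%d/%m/%Y").date()

# Read as month/day/year (common in the US).
as_month_first = datetime.strptime(raw_date, "%m/%d/%Y").date()

print(as_day_first.isoformat())    # 2006-02-01
print(as_month_first.isoformat())  # 2006-01-02
```

Even once the format is settled, the record still does not say whether this is a date of creation, submission, publication or modification.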

12 functional requirements
support search based on title, author, description, keyword, full text index
support browse by keyword and author
support rich subject browse based on knowledge of controlled vocabulary
support filtering of search results and browse tree by type, publisher, date range, status and version(?)
display title, author, publisher, keyword, full-text match in search results and browse tree
move reliably from search results and browse tree to available copies, filtered by format
move from search results and browse tree to OpenURL ‘link server’
support citation analysis (between works/expressions)

13 functional requirements (2)
enable capture of metadata about and relationships between different ‘versions’ of the same eprint
be suitable for use in the context of OpenURLs and OpenURL resolvers
i.e. support navigation/discovery of particular version of an eprint (e.g. most recent version of Author’s Original) and navigation/discovery of most appropriate copy of discovered ‘version’
be compatible with dc-citation WG recommendations
be compatible with preservation metadata approaches
be compatible with library cataloguing approaches

14 Functional assumptions
citations are made between eprint ‘expressions’ (in FRBR terms)
hypertext links tend to be made between eprint ‘items’ (in FRBR terms)
adopting a simple underlying model now may be expedient in the short term but costly to interoperability in the long term
the underlying model needs to be as complex as it needs to be, but not more so!
a complex underlying model may be manifest in relatively simple metadata and/or end-user interfaces

15 FRBR (1)
FRBR models the bibliographic world using 4 key entities - 'Work', 'Expression', 'Manifestation' and 'Item'.
–A work is a distinct intellectual or artistic creation. A work is an abstract entity
–An expression is the intellectual or artistic realization of a work in the form of alpha-numeric, musical, or choreographic notation, sound, image, object, movement, etc., or any combination of such forms. An expression is the specific intellectual or artistic form that a work takes each time it is "realized."
–A manifestation is the physical embodiment of an expression of a work. The entity defined as manifestation encompasses a wide range of materials, including manuscripts, books, periodicals, maps, posters, sound recordings, films, video recordings, CD-ROMs, multimedia kits, etc.
–An item is a single exemplar of a manifestation. The entity defined as item is a concrete entity.

16 FRBR (2)
FRBR also defines a set of additional entities that are related to the four entities above - 'Person', 'Corporate body', 'Concept', 'Object', 'Event' and 'Place' - and a set of relationships between each of the entities.
the key entity-relations appear to be:
–Work -- is realized through --> Expression
–Expression -- is embodied in --> Manifestation
–Manifestation -- is exemplified by --> Item
–Work -- is created by --> Person or Corporate Body
–Manifestation -- is produced by --> Person or Corporate Body
–Expression -- has a translation --> Expression
–Expression -- has a revision --> Expression
–Manifestation -- has an alternative --> Manifestation

17 FRBR (3)
Simple metadata standards like Dublin Core have traditionally tended to model the resources being described in a rather flat way - for example, as a set of relatively unrelated 'document-like objects'.
While this approach may be sufficient in the context of describing Web pages, it is rather limited in those cases, like scholarly publications, where the things being described are more complex.
For example, a typical eprint (the publisher's PDF file that is deposited in an eprint archive) is a single item that is an exemplar of a particular manifestation (the PDF manifestation) of a particular expression (the published version) of a work (the conceptual work that is the eprint).
There may be other items that are exemplars of the same manifestation (the PDF file as served from the publisher's Web site for example), other manifestations of the same expression (the HTML manifestation), and other expressions of the same work (the pre-print for example), and so on.

18 Model
based on FRBR but some of the labels have been changed
intention is to make things more intuitive but may not have succeeded!

19 Eprints model
[Diagram]
Eprint -- isExpressedAs (0..∞) --> Version
Version -- isManifestedAs (0..∞) --> Format
Format -- isAvailableAs (0..∞) --> Copy
isAuthoredBy (0..∞) and isPublishedBy (0..∞) link to Agent

20 Eprints model and FRBR
[Diagram: the Eprints model, with each entity labelled with its FRBR equivalent]
Eprint = FRBR Work
Version = FRBR Expression
Format = FRBR Manifestation
Copy = FRBR Item

21 Eprints model and FRBR
[Diagram: the Eprints model, annotated with examples]
Eprint: the eprint (an abstract concept)
Version: the ‘version of record’ or the ‘french version’ or ‘version 2.1’
Format: the PDF format of the version of record
Copy: the publisher’s copy of the PDF …
Agent: the author or the publisher
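The Eprint, Version, Format and Copy chain can be sketched as plain Python dataclasses. This is a minimal illustration under stated assumptions, not the application profile itself: the field names (`status`, `media_type`, `uri`), the helper `copies_in_format`, and all example URIs are hypothetical, and the Agent relationships are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Copy:
    uri: str  # where this particular copy can be fetched


@dataclass
class Format:
    media_type: str
    copies: List[Copy] = field(default_factory=list)  # isAvailableAs


@dataclass
class Version:
    status: str
    formats: List[Format] = field(default_factory=list)  # isManifestedAs


@dataclass
class Eprint:
    title: str
    versions: List[Version] = field(default_factory=list)  # isExpressedAs


def copies_in_format(eprint: Eprint, media_type: str) -> List[str]:
    """Walk Eprint -> Version -> Format -> Copy and collect matching URIs."""
    return [
        copy.uri
        for version in eprint.versions
        for fmt in version.formats
        if fmt.media_type == media_type
        for copy in fmt.copies
    ]


# Hypothetical example data; the URIs are placeholders.
pdf = Format("application/pdf", [
    Copy("http://publisher.example.org/paper.pdf"),
    Copy("http://repository.example.org/paper.pdf"),
])
html = Format("text/html", [Copy("http://publisher.example.org/paper.html")])
eprint = Eprint("An example eprint", [Version("Version of Record", [pdf, html])])

pdf_uris = copies_in_format(eprint, "application/pdf")
print(pdf_uris)
```

Because each Copy hangs off an explicit Format, a request such as "available copies, filtered by format" becomes a straightforward traversal rather than guesswork over untyped URIs.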

22 FRBR for eprints
[Diagram: example hierarchy]
eprint (work): The eprint – an abstract work
version (expression): Author’s Original 1.0, Author’s Original 1.1, Version of Record (French), …, Version of Record (English)
format (manifestation): html, pdf
copy (item): publisher’s copy, institutional repository copy
Here we are using FRBR to model eprints. A work is “a distinct intellectual or artistic creation”. An expression is “the intellectual or artistic realization of a work in the form of alpha-numeric … notation …”. A manifestation is “the physical [or digital] embodiment of an expression of a work”. Finally, an item is “a single exemplar of a manifestation”.
Note that “Author’s Original” and “Version of Record” (used below) are taken from the ALPSP/NISO ‘status’ vocabulary at http://www.niso.org/committees/Journal_versioning/TermsandDefinitionsdraft2006.pdf
Note 1: different languages modelled as versions as per FRBR sect 5.3.2
Note 2: orange parts used as basis for examples later…

23 Vertical vs. horizontal relationships
[Diagram]
Vertical: Eprint -- isExpressedAs --> Version (shown twice); Version -- isManifestedAs --> Format
Horizontal: hasVersion, hasFormat

24 Vertical vs. horizontal relationships (2)
[Diagram: Eprint -- isExpressedAs --> Version (shown twice); Version -- isManifestedAs --> Format]
hasVersion and hasFormat relationships inferred by following vertical relations
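One way to read "inferred by following vertical relations" is sketched below in Python. It assumes that hasVersion links Versions sharing the same Eprint and that hasFormat links Formats sharing the same Version; the slide text does not spell this out, so treat that reading, the identifiers, and the function name `infer_horizontal` as assumptions.

```python
# Vertical relations as (subject, predicate, object) triples.
# All identifiers are hypothetical placeholders.
vertical_triples = [
    ("eprint:1", "isExpressedAs", "version:authors-original"),
    ("eprint:1", "isExpressedAs", "version:of-record"),
    ("version:of-record", "isManifestedAs", "format:pdf"),
    ("version:of-record", "isManifestedAs", "format:html"),
]

# Assumed mapping from each vertical relation to the horizontal one it implies.
HORIZONTAL_FOR = {
    "isExpressedAs": "hasVersion",
    "isManifestedAs": "hasFormat",
}


def infer_horizontal(triples):
    """Derive horizontal relations between siblings under the same parent."""
    # Group children by (parent, vertical relation).
    children = {}
    for subject, predicate, obj in triples:
        if predicate in HORIZONTAL_FOR:
            children.setdefault((subject, predicate), []).append(obj)

    # Every pair of distinct siblings gets the matching horizontal relation.
    inferred = set()
    for (_parent, predicate), siblings in children.items():
        for a in siblings:
            for b in siblings:
                if a != b:
                    inferred.add((a, HORIZONTAL_FOR[predicate], b))
    return inferred


inferred = infer_horizontal(vertical_triples)
for triple in sorted(inferred):
    print(triple)
```

Under this reading the horizontal relations never need to be stored in the metadata: they can be recomputed from the vertical ones whenever an interface needs them.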

25 Attributes
Eprint: title, subject, abstract, identifier (URI)
Version: date issued, status, version number, language, type, copyright, identifier (URI)
Format: format, date modified, identifier (URI)
Copy: identifier (URI)
Agent: name, type, date of birth, affiliation, mailbox, homepage, identifier (URI)
Other labels on the slide: OpenURL or citation (string), is available as (URI), creator, is expressed as, publisher, is manifested as

26 Attributes
Eprint: title, subject, abstract, identifier (URI), creator, is expressed as
Format: format, date modified, publisher, is available as (URI)
Agent: name, type, date of birth, affiliation, mailbox, homepage
Version: date issued, status, version number, language, type, rights, OpenURL or citation (string), is manifested as

