Andy Powell, Eduserv Foundation June 2006 Eprints Application Profile.

Presentation transcript:

Andy Powell, Eduserv Foundation June 2006 Eprints Application Profile

June 2006 Eprint Application Profile Meeting - London
Agenda
Welcome and introductions
Issues with current use of simple DC
Functional Requirements
Model
Lunch
Eprints Application Profile
Workplan

Issues (1)
dc:title - where multiple titles are provided, there is no way of determining the main title
dc:creator - recipient of metadata has no knowledge that normalised form of name has been used. Therefore difficult to disambiguate different author names or combine different names for the same author
dc:creator - there is no mechanism for providing the affiliation of the author(s)
dc:creator - there is no mechanism for indicating whether the author is a person or an organisation

Issues (2)
dc:subject - recipient of metadata has no knowledge about whether terms have been taken from controlled vocabularies. Therefore difficult to build browse interfaces based on knowledge of vocabulary hierarchies/relationships
dc:publisher - recipient of metadata has no knowledge that normalised form of name has been used. Therefore difficult to disambiguate different publisher names or combine different names for the same publisher
dc:publisher - there is no mechanism for indicating whether the publisher is a person or an organisation

Issues (3)
dc:contributor - recipient of metadata has no knowledge that normalised form of name has been used. Therefore difficult to disambiguate different contributor names or combine different names for the same contributor
dc:contributor - there is no mechanism for indicating whether the contributor is a person or an organisation
dc:contributor - there is no mechanism for recording the nature of the contribution made by the contributor (editor, illustrator, etc.)

Issues (4)
dc:date - recipient of metadata has no knowledge about what kind of date is being provided or how the date is formatted. Therefore difficult to make any reliable use of the date in user-interface or other applications
dc:type - recipient of metadata has no knowledge that the value has explicitly been taken from the controlled lists provided here and is therefore only able to infer (i.e. guess) that the originator system's use of, say, 'Preprint' corresponds to the use suggested in the guidelines
dc:type - the use of dc:type to carry 'status' information somewhat stretches the semantics of the property
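
The dc:date point above can be illustrated with a small sketch. The date string and the three candidate formats are assumptions chosen for illustration; they are not taken from the slides or from any real record.

```python
from datetime import datetime

# A hypothetical dc:date value as it might arrive in a simple DC record.
raw_date = "03/04/05"

# Three plausible conventions a recipient might guess at (all assumptions).
candidate_formats = {
    "day/month/year": "%d/%m/%y",
    "month/day/year": "%m/%d/%y",
    "year/month/day": "%y/%m/%d",
}

# Parse the same string under each convention.
readings = {
    label: datetime.strptime(raw_date, fmt).date().isoformat()
    for label, fmt in candidate_formats.items()
}

for label, iso in readings.items():
    print(f"{label}: {iso}")
```

Each convention yields a different calendar date, and nothing in the record says which one was intended, nor whether the date refers to creation, issue or modification.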

Issues (5)
dc:format - recipient of metadata has no explicit knowledge that a MIME type is being provided
dc:format - not clear what is being described with recommended use of dc:format (1:1 problem). If the work is being described then use of dc:format is incorrect. If a single manifestation is being described, then dc:format shouldn't be repeated
dc:identifier - recipient of metadata has no explicit knowledge that a URI is being provided. Nor is it particularly clear whether the 'work' or a 'manifestation' of the work is being identified

Issues (6)
dc:source - where this property is used, the recipient of metadata has no explicit knowledge about whether a URI or title or bibliographic citation is being provided
dc:language - recipient of metadata has no explicit knowledge that an RFC 3066 language tag is being provided
dc:relation - recipient of metadata has no explicit knowledge that a URI is being provided. Nor is there any indication about the relationship between the eprint and the related resource. For example, in some cases the relationship will be 'isInstanceOf' but in others it could be 'isCitedBy'

Issues (7)
dc:coverage - recipient of metadata has no explicit knowledge that a term taken from the TGN has been used
dc:coverage - there is no mechanism for indicating whether coverage is spatial or temporal
dc:coverage - where coverage is temporal, there is no explicitly agreed mechanism for recording dates and date ranges
dc:rights - recipient of metadata has no explicit knowledge about whether a human readable statement or a URI is being provided

Current issues
what’s the problem with using simple DC to describe eprints?
difficult to differentiate ‘works/expressions’ from ‘manifestations/items’
does dc:identifier identify the work/expression or a particular manifestation/item of the work?
–in ePrints UK guidelines, dc:identifier used to identify ‘work/expression’ and dc:relation used to identify ‘manifestation/item’
–but dc:relation may be used for other resources (e.g. cited works), therefore ambiguity in the metadata record
–and guidelines not widely implemented anyway…
–therefore difficult for software applications to move reliably from the metadata record to the full text
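
A minimal sketch of the dc:relation ambiguity described above. The record layout (a flat list of property/value pairs) and every URL are made up for illustration; only the property names come from the slides.

```python
# A simple DC record modelled as a flat list of (property, value) pairs.
# All values below are hypothetical.
simple_dc_record = [
    ("dc:title", "An example eprint"),
    ("dc:identifier", "http://repository.example.org/eprint/42"),
    # Following the ePrints UK guidelines, this points at the full text...
    ("dc:relation", "http://repository.example.org/eprint/42/paper.pdf"),
    # ...but this one points at a work that the eprint cites.
    ("dc:relation", "http://publisher.example.com/article/987"),
]


def candidate_full_text_links(record):
    """Return every dc:relation value.

    Simple DC carries no information that would let us narrow this list
    down to the values that actually identify a manifestation/item.
    """
    return [value for prop, value in record if prop == "dc:relation"]


candidates = candidate_full_text_links(simple_dc_record)
print(candidates)
```

A harvester receives two candidates and has no machine-readable way to tell which one leads to the full text, which is the difficulty the last sub-point describes.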

Current issues (2)
not possible to determine whether subject terms are taken from a controlled vocabulary or not (e.g. is ‘Physics’ a free-text keyword or a term taken from Dewey?)
–therefore difficult to base subject-browse interfaces on controlled vocabulary hierarchy
not possible to disambiguate authors with same name or reconcile instances of the same author being given different form of name
–therefore difficult to build browse-by-author type interfaces
dates are ambiguous (either because of formatting and/or because type of date is not known)

functional requirements
support search based on title, author, description, keyword, full text index
support browse by keyword and author
support rich subject browse based on knowledge of controlled vocabulary
support filtering of search results and browse tree by type, publisher, date range, status and version(?)
display title, author, publisher, keyword, full-text match in search results and browse tree
move reliably from search results and browse tree to available copies, filtered by format
move from search results and browse tree to OpenURL ‘link server’
support citation analysis (between works/expressions)

functional requirements (2)
enable capture of metadata about and relationships between different ‘versions’ of the same eprint
be suitable for use in the context of OpenURLs and OpenURL resolvers
i.e. support navigation/discovery of particular version of an eprint (e.g. most recent version of Author’s Original) and navigation/discovery of most appropriate copy of discovered ‘version’
be compatible with dc-citation WG recommendations
be compatible with preservation metadata approaches
be compatible with library cataloguing approaches

Functional assumptions
citations are made between eprint ‘expressions’ (in FRBR terms)
hypertext links tend to be made between eprint ‘items’ (in FRBR terms)
adopting a simple underlying model now may be expedient in the short term but costly to interoperability in the long term
the underlying model needs to be as complex as it needs to be, but not more so!
a complex underlying model may be manifest in relatively simple metadata and/or end-user interfaces

FRBR (1)
FRBR models the bibliographic world using 4 key entities - 'Work', 'Expression', 'Manifestation' and 'Item'.
–A work is a distinct intellectual or artistic creation. A work is an abstract entity
–An expression is the intellectual or artistic realization of a work in the form of alpha-numeric, musical, or choreographic notation, sound, image, object, movement, etc., or any combination of such forms. An expression is the specific intellectual or artistic form that a work takes each time it is "realized."
–A manifestation is the physical embodiment of an expression of a work. The entity defined as manifestation encompasses a wide range of materials, including manuscripts, books, periodicals, maps, posters, sound recordings, films, video recordings, CD-ROMs, multimedia kits, etc.
–An item is a single exemplar of a manifestation. The entity defined as item is a concrete entity.

FRBR (2)
FRBR also defines a set of additional entities that are related to the four entities above - 'Person', 'Corporate body', 'Concept', 'Object', 'Event' and 'Place' - and a set of relationships between each of the entities.
the key entity-relations appear to be:
–Work -- is realized through --> Expression
–Expression -- is embodied in --> Manifestation
–Manifestation -- is exemplified by --> Item
–Work -- is created by --> Person or Corporate Body
–Manifestation -- is produced by --> Person or Corporate Body
–Expression -- has a translation --> Expression
–Expression -- has a revision --> Expression
–Manifestation -- has an alternative --> Manifestation
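
The entity-relations listed above can be written down as plain triples. This is only a sketch: the triple representation and the helper names (`relations_from`, `vertical_chain`) are illustrative assumptions, not part of FRBR or of the application profile.

```python
# The key FRBR entity-relations from the slide, as (source, relation, target).
FRBR_RELATIONS = [
    ("Work", "is realized through", "Expression"),
    ("Expression", "is embodied in", "Manifestation"),
    ("Manifestation", "is exemplified by", "Item"),
    ("Work", "is created by", "Person or Corporate Body"),
    ("Manifestation", "is produced by", "Person or Corporate Body"),
    ("Expression", "has a translation", "Expression"),
    ("Expression", "has a revision", "Expression"),
    ("Manifestation", "has an alternative", "Manifestation"),
]

# The three relations that step down one level of abstraction.
VERTICAL = {"is realized through", "is embodied in", "is exemplified by"}


def relations_from(entity):
    """List the (relation, target) pairs that start at the given entity."""
    return [(rel, target) for source, rel, target in FRBR_RELATIONS if source == entity]


def vertical_chain(start="Work"):
    """Follow the vertical relations from an entity down to the most concrete one."""
    chain = [start]
    current = start
    while True:
        targets = [t for s, r, t in FRBR_RELATIONS if s == current and r in VERTICAL]
        if not targets:
            return chain
        current = targets[0]
        chain.append(current)


print(vertical_chain())
print(relations_from("Expression"))
```

Separating the three level-to-level relations from the same-level ones (translation, revision, alternative) anticipates the vertical/horizontal distinction made later in the presentation.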

FRBR (3)
Simple metadata standards like Dublin Core have traditionally tended to model the resources being described in a rather flat way - for example, as a set of relatively unrelated 'document-like objects'
while this approach may be sufficient in the context of describing Web pages, it is rather limited in those cases, like scholarly publications, where the things being described are more complex.
For example, a typical eprint (the publisher's PDF file that is deposited in an eprint archive) is a single item that is an exemplar of a particular manifestation (the PDF manifestation) of a particular expression (the published version) of a work (the conceptual work that is the eprint). There may be other items that are exemplars of the same manifestation (the PDF file as served from the publisher's Web site for example), other manifestations of the same expression (the HTML manifestation), and other expressions of the same work (the pre-print for example), and so on.

Model
based on FRBR
but some of the labels have been changed
intention is to make things more intuitive
but may not have succeeded!

Eprints model
–Eprint -- isExpressedAs (0..∞) --> Version
–Version -- isManifestedAs (0..∞) --> Format
–Format -- isAvailableAs (0..∞) --> Copy
–Eprint -- isAuthoredBy (0..∞) --> Agent
–Format -- isPublishedBy (0..∞) --> Agent

Eprints model and FRBR
Eprint = FRBR Work
Version = FRBR Expression
Format = FRBR Manifestation
Copy = FRBR Item

Eprints model and FRBR
Eprint: the eprint (an abstract concept)
Version: the ‘version of record’ or the ‘french version’ or ‘version 2.1’
Format: the PDF format of the version of record
Copy: the publisher’s copy of the PDF …
Agent: the author or the publisher

FRBR for eprints
Here we are using FRBR to model eprints. A work is “a distinct intellectual or artistic creation”. An expression is “the intellectual or artistic realization of a work in the form of alpha-numeric … notation …”. A manifestation is “the physical [or digital] embodiment of an expression of a work”. Finally, an item is “a single exemplar of a manifestation”.
Levels shown in the diagram:
–eprint (work): The eprint – an abstract work
–version (expression): Author’s Original 1.0, Author’s Original 1.1, Version of Record (French), Version of Record (English), …
–format (manifestation): html, pdf
–copy (item): publisher’s copy, institutional repository copy
Note that “Author’s Original” and “Version of Record” (used below) are taken from the ALPSP/NISO ‘status’ vocabulary at
Note 1: different languages modelled as versions as per FRBR sect
Note 2: orange parts used as basis for examples later…

Vertical vs. horizontal relationships
Vertical: Eprint -- isExpressedAs --> Version (two Versions shown); Version -- isManifestedAs --> Format
Horizontal: hasVersion, hasFormat

Vertical vs. horizontal relationships (2)
Vertical: Eprint -- isExpressedAs --> Version (two Versions shown); Version -- isManifestedAs --> Format
hasVersion and hasFormat relationships inferred by following vertical relations
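
One way to read "inferred by following vertical relations" is sketched below. The slide does not say which entities hasVersion and hasFormat connect, so this sketch assumes that hasVersion links two Versions of the same Eprint and hasFormat links two Formats of the same Version. The identifiers and the function name `infer_horizontal` are hypothetical.

```python
from itertools import permutations

# Vertical relations only, as (parent, relation, child). Identifiers are made up.
vertical = [
    ("eprint", "isExpressedAs", "version-1"),
    ("eprint", "isExpressedAs", "version-2"),
    ("version-1", "isManifestedAs", "format-pdf"),
    ("version-1", "isManifestedAs", "format-html"),
]

# Assumed mapping from a vertical relation to the horizontal one it implies
# between children that share the same parent.
HORIZONTAL_FOR = {
    "isExpressedAs": "hasVersion",
    "isManifestedAs": "hasFormat",
}


def infer_horizontal(triples):
    """Derive sibling-to-sibling relations from parent-to-child relations."""
    children = {}
    for parent, relation, child in triples:
        children.setdefault((parent, relation), []).append(child)

    inferred = set()
    for (_parent, relation), siblings in children.items():
        horizontal = HORIZONTAL_FOR.get(relation)
        if horizontal is None:
            continue  # no horizontal counterpart assumed for this relation
        for a, b in permutations(siblings, 2):
            inferred.add((a, horizontal, b))
    return inferred


for triple in sorted(infer_horizontal(vertical)):
    print(triple)
```

Under this reading only the vertical relations need to be recorded in the metadata; the horizontal ones can be computed by any application that needs them.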

Attributes
Eprint: title, subject, abstract, identifier (URI)
Version: date issued, status, version number, language, type, copyright, identifier (URI)
Format: format, date modified, identifier (URI)
Copy: identifier (URI)
Agent: name, type, date of birth, affiliation, mailbox, homepage, identifier (URI)
Relationships and other properties shown: creator, is expressed as, publisher, is manifested as, is available as (URI), OpenURL or citation (string)
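
The entities and a subset of the attributes above can be sketched as nested records, which also shows how the requirement to move reliably to available copies, filtered by format, might be met. The class layout, the field names, the helper `copies_in_format` and all sample values are assumptions made for illustration; they are not the application profile itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Agent:
    name: str
    type: Optional[str] = None  # e.g. "person" or "organisation"


@dataclass
class Copy:
    identifier: str  # URI of one available copy


@dataclass
class Format:
    format: str  # e.g. a MIME type
    publisher: Optional[Agent] = None
    is_available_as: List[Copy] = field(default_factory=list)


@dataclass
class Version:
    status: str
    language: Optional[str] = None
    is_manifested_as: List[Format] = field(default_factory=list)


@dataclass
class Eprint:
    title: str
    creators: List[Agent] = field(default_factory=list)
    is_expressed_as: List[Version] = field(default_factory=list)


def copies_in_format(eprint, wanted_format, status=None):
    """Walk Eprint -> Version -> Format -> Copy and collect matching copy URIs."""
    return [
        copy.identifier
        for version in eprint.is_expressed_as
        if status is None or version.status == status
        for fmt in version.is_manifested_as
        if fmt.format == wanted_format
        for copy in fmt.is_available_as
    ]


# Hypothetical example data.
eprint = Eprint(
    title="An example eprint",
    creators=[Agent(name="A. Author", type="person")],
    is_expressed_as=[
        Version(
            status="Author's Original",
            is_manifested_as=[
                Format("application/pdf", is_available_as=[
                    Copy("http://repository.example.org/42/draft.pdf"),
                ]),
            ],
        ),
        Version(
            status="Version of Record",
            language="en",
            is_manifested_as=[
                Format("application/pdf", is_available_as=[
                    Copy("http://publisher.example.com/987.pdf"),
                    Copy("http://repository.example.org/42/final.pdf"),
                ]),
                Format("text/html", is_available_as=[
                    Copy("http://publisher.example.com/987.html"),
                ]),
            ],
        ),
    ],
)

print(copies_in_format(eprint, "application/pdf", status="Version of Record"))
```

Because each copy hangs off a specific format of a specific version, the ambiguity between identifying the work and identifying a manifestation does not arise in this structure.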

Attributes
Eprint: title, subject, abstract, identifier (URI), creator, is expressed as
Version: date issued, status, version number, language, type, rights, OpenURL or citation (string), is manifested as
Format: format, date modified, publisher, is available as (URI)
Agent: name, type, date of birth, affiliation, mailbox, homepage