Workshop on the DOI System DOI SYSTEM: SYNTAX International DOI Foundation.

Presentation transcript:

Workshop on the DOI System DOI SYSTEM: SYNTAX International DOI Foundation

Outline / Key concepts in this section
- Terminology
- Format
- Assignment and uniqueness
- Scope of the DOI System
- Relation to other identifier schemes
- Directory management
- The uses of prefixes for management
- Administrative granularity

Further reading on key concepts in this section
- DOI Handbook, Chapter 2, Numbering

DOI name
- DOI name: the string that specifies a unique object (the referent) within the DOI System.
- Names may consist of alphanumeric characters in a sequence prescribed by the DOI syntax.
- The terms "identifier" and "number" are sometimes, but not always, used in the same sense and are to be avoided where ambiguity might arise.
- The unqualified use of "DOI" alone may also be ambiguous: the term should instead always be used in conjunction with a specific noun (DOI name, DOI System, etc.).

DOI syntax
- A DOI name consists of a prefix and a suffix, e.g. /4567
- DOI names are case insensitive
  - 10.123/ABC is identical to /AbC
  - This is a deliberate choice: see DOI Handbook 2.4
- Prefixes and suffixes use ASCII characters (letters and numbers)
  - In principle they can use any printable characters from the Universal Character Set (UCS-2) of ISO/IEC 10646, the character set defined by Unicode v2.0, which encompasses most characters used in every major language written today.
  - However, because some Internet technologies make specific uses of certain characters (and these vary by browser!), it is recommended to keep to simple characters (A-Z, 0-9)
  - Note the encoding requirements when a DOI name is used with HTML, URLs, and HTTP (take special care with %, # and [space], and with the use of pointed brackets in XML, etc.)
- Prefixes are allocated to DOI name assigners; assigners then add the suffix. Registration Agencies (RAs) oversee the process to ensure there is no duplication, etc.
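The prefix/suffix split and the case-insensitivity rule on this slide can be sketched in a few lines of Python. This is only an illustration, not the DOI System's own implementation: the function names are made up for the example, the DOI names in the usage note are hypothetical, and folding only ASCII letters is an assumption about how case-insensitivity is applied.

```python
from typing import Tuple


def split_doi_name(doi_name: str) -> Tuple[str, str]:
    """Split a DOI name into (prefix, suffix) at the first "/"."""
    prefix, slash, suffix = doi_name.partition("/")
    if not (prefix and slash and suffix):
        raise ValueError(f"expected prefix/suffix, got {doi_name!r}")
    return prefix, suffix


def fold_ascii_case(text: str) -> str:
    """Upper-case ASCII letters only; leave every other character alone."""
    return "".join(ch.upper() if ch.isascii() else ch for ch in text)


def same_doi_name(a: str, b: str) -> bool:
    """Treat two DOI names as identical if they differ only in ASCII case."""
    return fold_ascii_case(a) == fold_ascii_case(b)
```

For instance, `same_doi_name("10.1234/ABC", "10.1234/AbC")` is true under these assumptions, and `split_doi_name("10.1234/a/b")` keeps any later "/" inside the suffix.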

DOI syntax: prefix
- Prefix always begins 10 (by convention)
  - In practice, 10 is the Handle System prefix allocated to the IDF
  - If it doesn't begin 10, it's not a DOI name (but it may be a Handle)
- Prefix may be any length, but currently uses four digits, e.g. /456-mydoc
- Prefix may be further subdivided, e.g. /
  - Current DOI System practice is not to do so unless there is a specific requirement
  - Such subdivisions are peers ( is the same level as ), but can be specifically configured to be a hierarchy
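As a rough sketch of the prefix rules above, the check below treats a prefix as dot-separated elements and accepts it as a DOI prefix only if the first element is exactly "10". The function names and the example prefixes are hypothetical; the sketch says nothing about whether a given prefix has actually been allocated.

```python
from typing import List


def prefix_elements(prefix: str) -> List[str]:
    """Break a prefix into its dot-separated elements."""
    return prefix.split(".")


def is_doi_prefix(prefix: str) -> bool:
    """True if the prefix starts with the element "10" and has no empty element."""
    elements = prefix_elements(prefix)
    return len(elements) >= 2 and elements[0] == "10" and all(elements)
```

A subdivided prefix such as the hypothetical `10.1234.5` simply yields three elements; the code does not assume any hierarchy between them, in line with the point that subdivisions are peers unless configured otherwise.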

DOI syntax: suffix
- Suffix may be any length.
- Suffix may incorporate another identifier numbering scheme (or may be new):
  - e.g. /ISBN
  - the DOI System treats all DOI names as dumb strings
  - take care if the other identifier contains special characters (e.g. the SICI )
- If not using another identifier, the assigner needs to devise some way of allocating numbers.
- Using DOI names may obviate the need to adopt or create a new scheme, e.g. in CrossRef:
  - Publisher A uses PII: S
  - Publisher B uses SICI: (1997)42: 2.0.TX;2-B
  - Publisher C uses its own numbers: JoesPaper56
- These three schemes are not at all interoperable, but become so in the DOI System as:
  - doi: /S
  - doi: / (1997)42: 2.0.TX;2-B
  - doi: /JoesPaper56
- A particular Registration Agency may (and probably should) determine some specific rules or recommendations for its own DOI name registrants and applications.
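The idea that any existing identifier can be carried as a suffix, with the whole name treated as a dumb string, might be sketched as follows. The prefixes and the placeholder PII value are invented for the example; only "JoesPaper56" comes from the slide.

```python
def make_doi_name(prefix: str, suffix: str) -> str:
    """Join a prefix and a suffix; the suffix is kept exactly as supplied."""
    if not prefix or "/" in prefix:
        raise ValueError(f"invalid prefix: {prefix!r}")
    if not suffix:
        raise ValueError("suffix must not be empty")
    return f"{prefix}/{suffix}"


# Hypothetical publishers, each keeping its own numbering scheme as the suffix.
doi_names = [
    make_doi_name("10.1111", "S-EXAMPLE-PII"),  # a PII-style identifier (made up)
    make_doi_name("10.2222", "JoesPaper56"),    # the publisher's own numbering
]
```

Nothing in `make_doi_name` interprets the suffix, which is the point: the three schemes stay different internally but all resolve through the same mechanism.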

Visual presentation of DOI name
- When displayed on screen or in print, a DOI name is preceded by a lowercase "doi:" unless the context clearly indicates that a DOI name is implied.
  - EXAMPLE: the DOI name /jmbi is displayed as doi: /jmbi
- The use of the lowercase string "doi" follows the specification for representation as a URI (as for e.g. "ftp:" and "http:").
- When displayed in web browsers, the DOI name itself may be attached to the address for an appropriate proxy server, to enable resolution of the DOI name via a standard web hyperlink.
  - EXAMPLE: the DOI name /jmbi could be made an active link as
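A minimal sketch of the two presentations described above, using Python's standard `urllib.parse.quote` for the percent-encoding. The proxy address `https://doi.org/` is an assumption (the slide's own example link did not survive), and the DOI names in the usage note are hypothetical.

```python
from urllib.parse import quote

# Assumed proxy server address; not taken from the slide.
DEFAULT_PROXY = "https://doi.org/"


def display_form(doi_name: str) -> str:
    """Screen/print form: the DOI name preceded by lowercase "doi:"."""
    return "doi:" + doi_name


def proxy_url(doi_name: str, proxy: str = DEFAULT_PROXY) -> str:
    """Hyperlink form: proxy address plus the percent-encoded DOI name.

    "/" is left unencoded; characters such as %, #, space, < and > are encoded.
    """
    return proxy + quote(doi_name, safe="/")
```

With these definitions, a suffix containing `#` or a space still produces a usable link, because `quote` replaces those characters with `%23` and `%20`.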

Scope of the DOI System
- Digital Object Identifier = Digital [Object Identifier]
  - not [Digital Object] Identifier
- The DOI® System provides an infrastructure for persistent unique identification of entities...
- A DOI name is permanently assigned to an object, to provide a persistent link to current information about that object, including where the object, or information about it, can be found on the internet.
- This is because entities of interest may be physical, digital, or abstract.
  - e.g. CrossRef assigns a DOI name to an article irrespective of format
- Handle: Digital Object Architecture
  - Not a conflict: any entity can be abstracted into a representation as a digital object

Scope of the DOI System
- A DOI name may be assigned to any object of any form whenever there is a functional need to distinguish it as a separate entity.
- Registration Agencies may specify more constrained rules for the assignment of DOI names to objects for DOI-related services.
- The principal focus of assignment shall be to content-related entities exemplified by, but not limited to: text documents; data sets; sound carriers; books; photographs; serials; audio, video and audiovisual recordings; software; abstract works; artwork, etc., and related entities in their management, e.g. licences, parties.

Uniqueness
- Each DOI name can specify one and only one referent in the DOI System.
  - A role of Registration Agencies is to provide a service to registrants which facilitates this.
  - However, the DOI System will not accept a duplicate prefix+suffix and makes internal checks for uniqueness at the time of registration.
- A referent may be specified by more than one DOI name, though it is recommended practice that each referent has only one DOI name.
  - This can happen because it may not always be known that a DOI name already exists.
  - Where multiple DOI names are assigned to the same referent, e.g. through assignment of DOI names by two different registration agencies, the IDF encourages registration agencies to collaborate in providing a unifying record for that referent.
- It is good practice never to reissue any unique identifier that has once been issued in error.
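The uniqueness rule (one referent per DOI name, duplicates rejected at registration, several names per referent tolerated) can be illustrated with a toy in-memory registry. This is a sketch only: the class, the exception, and the sample names are hypothetical and do not reflect how the DOI System or the Handle System stores records.

```python
from typing import Dict, List


class DuplicateDoiName(Exception):
    """Raised when a DOI name has already been registered."""


class ToyRegistry:
    def __init__(self) -> None:
        # Keys are upper-cased so that names differing only in case collide.
        self._referents: Dict[str, str] = {}

    def register(self, doi_name: str, referent: str) -> None:
        key = doi_name.upper()
        if key in self._referents:
            raise DuplicateDoiName(doi_name)
        self._referents[key] = referent

    def referent_of(self, doi_name: str) -> str:
        return self._referents[doi_name.upper()]

    def names_for(self, referent: str) -> List[str]:
        """All registered names for a referent (more than one is allowed)."""
        return sorted(k for k, v in self._referents.items() if v == referent)
```

A registry like this refuses a second registration of the same name, but it cannot by itself detect that two different names point at the same referent; that is where the unifying record mentioned on the slide would come in.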

Persistence
- No time limit for the existence of a DOI name shall be assumed in any assignment, service or application.
- A DOI name and its referent are unaffected by changes in the rights associated with the referent, or changes in the management responsibility of the referent object.
- The IDF implements rules for transfer of management responsibility between Registration Agencies, requirements on Registration Agencies for maintenance of records, default resolution services, and technical infrastructure resilience.
- The DOI System is not a means of archival preservation of identified entities.
- The DOI System provides a means to continue interoperability through exchange of meaningful information about identified entities and initiated actions between different systems through, at minimum, persistence of the DOI name and description of the referent.

Intellectual property and the DOI System: current DOI name uses
[Diagram, "View 2: commerce": Party, Creation and Transaction, linked by the relations "makes", "uses", "about" and "do".]

DOI names with existing identifiers
- Identifier schemes already exist for many creations
  - ISBN, ISSN, ISRC, etc.
  - New ones: e.g. ISTC (textual abstractions, e.g. Robinson Crusoe by Daniel Defoe)
- ISO standardisation of the DOI System recognises this
- First example: Bookland DOIs from ISBNs
  - Name comes from Bookland bar codes from ISBNs
  - Pilot scheme based on the new syntax of the ISBN-13
  - ISBN:
  - DOI name to be: /45678
- Second example: ISSN
  - Defined syntax for ISSNs in DOI names:
  - doi: /issnl (linking ISSN: all media versions)
  - doi: /issn (ISSN: specific media version)
- NB: Relevant information as to the identity of the referent is included in the metadata associated with the DOI name string.

DOI names with existing identifiers
- General case: ISO standardisation of the DOI System
  - A DOI name is not intended as a replacement for other identifier schemes, but when used with them may enhance the identification functionality provided by those systems with additional functionality…
- Incorporate the other identifier into the DOI syntax and/or record the other identifier in the DOI metadata.
- Each scheme retains its autonomy, but the schemes work together

DOI names for entities other than creations
- Parties
  - Authors: for disambiguation, etc.
  - Institutions: for licensing transactions, etc.
  - ISNI: International Standard Name Identifier (was: ISPI)
    - Based on InterParty; PIDI = Public identity identifier
  - ITU Identity Management Focus Group
    - Any end point in the network (machines, users)
- Licences
  - ONIX for licensing work (with NISO/ERMI)
    - ERMI: Electronic Resource Management Initiative
  - Contextual identification

Granularity
- Granularity: the extent to which a collection of information has been subdivided for purposes of identification (e.g. a collection; a book; tables and figures)
  - Functional granularity: it should be possible to identify an entity whenever it needs to be distinguished
- Your functional granularity may not be my functional granularity:
  - A wants to distinguish this book in any format, but B wants to distinguish the PDF version from the HTML version, etc.
- "It is a fundamental of almost any statistic that, to produce it, something, somewhere has been defined and identified. Never underestimate how much nuisance that small practical detail can cause. First, it has to be agreed what to count…. In maths numbers seem hard, pristine and bright, neatly defined around the edges. In life, we do better to think of something murkier and softer"
  - The Tiger That Isn't: Seeing Through a World of Numbers (2007), Blastland & Dilnot
- You must know (say) PRECISELY WHAT is being identified

Granularity
- A DOI name may be assigned to any entity, regardless of the extent to which it may be a component part of some larger entity.
- DOI names may be assigned at arbitrary levels of granularity or abstraction.
- EXAMPLE: separate DOI names may be assigned to:
  - a novel as an abstract work;
  - a specific edition of that novel;
  - a specific chapter within that edition of the novel;
  - a single paragraph;
  - a specific image or quotation;
  - each specific manifestation in which any of those entities are published or otherwise made available;
  - or any other level of granularity which a registrant deems to be appropriate.
- Assignment of a DOI name shall require the Registrant to record metadata describing the entity to which the DOI name is being assigned.
- The metadata shall describe the entity to the degree that is necessary to distinguish it as a separate entity within the DOI System.
- In certain cases (which shall be defined in the User Manual) it shall be allowable for no metadata declaration to be made.

Specifying what is identified
- Two things in one: physical manifestation of an intangible work (which is identified?)
- Manuscript: mss #ABC123
- Paper: journal/volume/page

[Diagram: manifestations and their identifiers (MS; Vol/page, ISBN, SICI, etc.; Web page, URL) shown against the intangible Work.]
"Work" is used in an analytical sense, not a copyright sense.

Versions – separately identified?

What are we identifying?
- Document on screen
  - Abstract work?
  - Manifestation of abstract work?
  - Version?
  - This HTML file?
  - All/some of these?

Does it matter?
Yes, it can do, e.g.:
1. Practical use of data. Example: journal article
  - For the purpose of citation: count PDF, print, HTML as the same
    - Citation refers to the abstract work (hence ISI, CrossRef)
  - For the purpose of purchase: count PDF, print, HTML as different
    - Purchase refers to the manifestation
  - Suppose I encounter a purchase system and try to use it for counting citations….
  - Can I rely on a system now if I don't know what is being identified? Can others rely on the system long after I'm gone?
2. Legal implications: copyright
  - My A is the same as your B and is my copyright…
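The citation-versus-purchase point can be made concrete with a small counting sketch. The record layout, the function names and the sample data are all hypothetical; the only claim illustrated is that the same events give different totals depending on whether the work or the manifestation is what is identified.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Event(NamedTuple):
    work: str    # the abstract work, e.g. a journal article
    format: str  # the manifestation: "pdf", "print", "html", ...


def count_by_work(events: Iterable[Event]) -> Counter:
    """Citation-style counting: all formats of a work count as the same thing."""
    return Counter(e.work for e in events)


def count_by_manifestation(events: Iterable[Event]) -> Counter:
    """Purchase-style counting: each format of a work is counted separately."""
    return Counter((e.work, e.format) for e in events)
```

Feeding purchase-style records into `count_by_work`, or the reverse, runs without error but answers a different question, which is why a system needs to state precisely what its identifiers refer to.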

The framework
Principles:
- Unique Identification: every entity should be uniquely identified within an identified namespace.
- Functional Granularity: it should be possible to identify an entity whenever it needs to be distinguished.
- Designated Authority: the author of an item of metadata should be securely identified.
- Appropriate Access: everyone requires access to the metadata on which they depend, and privacy and confidentiality for their own metadata from those who are not dependent on it.
Definition of metadata: an item of metadata is a relationship that someone claims to exist between two referents (description).
More on this: see data model

First class naming
- Many of the items we manage should be treated as first-class objects.
- First class = having an identity independent of any other item.
  - A key concept of Digital Object Architecture (e.g. Handles)
[Diagram labelled "First class name": Document456; Vanity Fair; Penguin Classics: Vanity Fair; ISBN-13:]

The use of prefixes
- A DOI name consists of a prefix and a suffix, e.g. /4567
- A prefix can have unlimited suffixes
- So in theory, only one prefix is needed?
- Could a set of DOI names ever need to be managed differently, e.g. separated across DOI RAs, or different mirror servers, etc.?
- CrossRef example:
  - Prefix allocated to a publisher (imprint), not a journal
  - Would it be better to have a separate prefix for each journal? Journals can move publisher.
- Easy to manage one prefix on an everyday basis (ISBN, etc.)
  - Management of a whole customer's DOI name set by one prefix
- But easiest to group DOI names by separate prefixes if you need to change them…
- A trade-off
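The management trade-off is easier to see when DOI names are grouped by prefix, since a prefix is the natural unit to hand over, mirror or move as a whole. The helper below is a sketch with hypothetical names and sample data, not a description of any Registration Agency's tooling.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_prefix(doi_names: Iterable[str]) -> Dict[str, List[str]]:
    """Group DOI names by the part before the first "/"."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in doi_names:
        prefix, _, _ = name.partition("/")
        groups[prefix].append(name)
    return dict(groups)
```

If every journal had its own prefix, moving a journal to another publisher would mean reassigning one whole group; with a single publisher-wide prefix, the names to move would have to be picked out individually.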

Administrative granularity
- Who will need to administer the prefix?
  - IDF Directory Manager
  - RA manager
  - Individual customer of RA (e.g. a publisher)
  - Individual manager within a publisher (e.g. production manager)
- Prefixes can have a defined administrator
  - Similarly, URLs rely on one site administrator
- But also: DOI names can have any level of administrative granularity
  - Every single DOI name could have a different manager!
- Handle System has various levels of administrator, and keys
- A choice which must depend on each application's requirements

Handles resolve to typed data

Handle   Data type   Index   Handle data
/456     URL         1       http://acme.com/….
         URL         2       http://a-books.com/….
         DLS         9       acme/repository
         HS_ADMIN    100     acme.admin/jsmith
         XYZ

Rules for data type construction:
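The typed values shown on this slide can be modelled as simple (index, type, data) records, with resolution reduced to filtering by type. This is a sketch of the idea only; the record class and function are hypothetical and are not the Handle System's API, the truncated URLs are kept exactly as they appear on the slide, and the XYZ row is omitted because its index and data are not given.

```python
from typing import List, NamedTuple


class HandleValue(NamedTuple):
    index: int
    data_type: str
    data: str


# Values as listed on the slide for a single handle.
VALUES = [
    HandleValue(1, "URL", "http://acme.com/…."),
    HandleValue(2, "URL", "http://a-books.com/…."),
    HandleValue(9, "DLS", "acme/repository"),
    HandleValue(100, "HS_ADMIN", "acme.admin/jsmith"),
]


def values_of_type(values: List[HandleValue], data_type: str) -> List[HandleValue]:
    """Return only the values of the requested type, in their original order."""
    return [v for v in values if v.data_type == data_type]
```

A client interested only in locations would ask for the "URL" values and ignore the rest, while an administrative tool would look at "HS_ADMIN".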
