Workshop on the DOI System DOI SYSTEM: SYNTAX International DOI Foundation.


1 Workshop on the DOI System DOI SYSTEM: SYNTAX International DOI Foundation

2 Terminology Format Assignment and uniqueness Scope of the DOI System Relation to other identifier schemes Directory management The uses of prefixes for management Administrative granularity Outline / Key concepts in this section doi>

3 DOI Handbook Chapter 2, Numbering Further reading on key concepts in this section doi>

4 DOI name: the string that specifies a unique object (the referent) within the DOI System. Names may consist of alphanumeric characters in a sequence prescribed by the DOI syntax. The terms identifier and number are sometimes but not always used in the same sense and are to be avoided where ambiguity might arise. The unqualified use of DOI alone may also be ambiguous: the term should instead always be used in conjunction with a specific noun (DOI name, DOI system, etc). DOI name doi>

5 A DOI name consists of a prefix and a suffix e.g /4567 DOI names are case insensitive –10.123/ABC is identical to /AbC –This is a deliberate choice: see DOI Handbook 2.4 Prefixes and suffixes use ASCII characters (letters and numbers) –in principle they can use any printable characters from the Universal Character Set (UCS-2) of ISO/IEC 10646, which is the character set defined by Unicode v2.0 and encompasses most characters used in every major language written today. –However, because of specific uses made of certain characters by some Internet technologies (these vary by browser!), it is recommended to keep to simple characters (A-Z, 0-9) –Note encoding requirements when a DOI name is used with HTML, URLs, and HTTP (special care with % # and [space], and use of pointed brackets in XML etc) – Prefixes are allocated to DOI name assigners; assigners then add the suffix. RAs oversee the process to ensure no duplication etc. DOI syntax doi>
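The prefix/suffix split and the case-insensitive comparison described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not an official parser: the function names are hypothetical, the name `10.123/abc` is an invented example, and lowercasing is assumed to be an adequate comparison for names restricted to ASCII letters and digits.

```python
def split_doi(name: str) -> tuple[str, str]:
    """Split a DOI name into (prefix, suffix) at the first '/'.

    Only the first slash separates prefix from suffix, so a suffix
    may itself contain further slashes.
    """
    prefix, sep, suffix = name.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix pair: {name!r}")
    return prefix, suffix


def same_doi(a: str, b: str) -> bool:
    """Compare two DOI names case-insensitively (ASCII names assumed)."""
    return a.lower() == b.lower()
```

With these helpers, `split_doi("10.123/ABC")` yields the prefix `10.123` and the suffix `ABC`, and `same_doi("10.123/ABC", "10.123/abc")` is true.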

6 Prefix always begins with 10 (by convention) –In practice, 10 is the Handle System prefix allocated to the IDF –If it doesn't begin with 10, it's not a DOI name (but it may be a Handle) Prefix may be any length, but currently using four digits. e.g /456-mydoc Prefix may be further subdivided e.g / –Current DOI System practice is not to do so unless there is a specific requirement –Such subdivisions are peers ( is the same level as ), but can be specifically configured to be a hierarchy DOI syntax: prefix doi>
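The prefix rule on this slide (begins with 10, optionally subdivided by further dots) can be checked as follows. This is a sketch under stated assumptions: the function name `is_doi_prefix` is hypothetical, the sample prefixes are invented, and the check only looks at the dotted structure, not at whether a prefix has actually been allocated.

```python
def is_doi_prefix(prefix: str) -> bool:
    """Return True if `prefix` has the shape of a DOI prefix.

    The first dotted element must be the directory code "10"; at least
    one non-empty element must follow it. Further elements model
    subdivided prefixes, which are peers unless configured otherwise.
    """
    parts = prefix.split(".")
    return len(parts) >= 2 and parts[0] == "10" and all(parts[1:])
```

A string such as `20.123` fails this check: it may still be a valid Handle prefix, but it is not a DOI prefix.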

7 Suffix may be any length. Suffix may incorporate another identifier numbering scheme (or may be new): –e.g /ISBN –the DOI System treats all DOI names as dumb strings –take care if the other identifier contains special characters (e.g. the SICI ) If not using another identifier, then the assigner needs to devise some way of allocating numbers. Using DOI names may obviate the need to adopt or create a new scheme: e.g. in CrossRef: – Publisher A uses PII: S – Publisher B uses SICI: (1997)42: 2.0.TX;2-B – Publisher C uses its own numbers: JoesPaper56 These three schemes are not at all interoperable, but become so in the DOI System as: – doi: /S – doi: / (1997)42: 2.0.TX;2-B – doi: /JoesPaper56 A particular Registration Agency may (and probably should) determine some specific rules or recommendations for its own DOI name registrants and applications. DOI syntax: suffix doi>

8 When displayed on screen or in print, a DOI name is preceded by a lowercase "doi:" unless the context clearly indicates that a DOI name is implied. –EXAMPLE: the DOI name /jmbi is displayed as doi: /jmbi The use of lowercase string doi follows the specification for representation as a URI; (as for e.g. "ftp:" and "http:"). When displayed in web browsers the DOI name itself may be attached to the address for an appropriate proxy server, to enable resolution of the DOI name via a standard web hyperlink. –EXAMPLE: the DOI name /jmbi could be made an active link as Visual presentation of DOI name doi>
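The two presentation forms on this slide, and the encoding care mentioned earlier for % # and spaces, can be sketched with the standard library. The proxy address `https://doi.org/` is an assumption here (the slide's own link text did not survive), and the function names and sample DOI names are hypothetical.

```python
from urllib.parse import quote

# Assumed proxy server address; substitute the proxy your application uses.
PROXY = "https://doi.org/"


def display_form(doi_name: str) -> str:
    """Screen/print form: the DOI name preceded by lowercase 'doi:'."""
    return "doi:" + doi_name


def proxy_url(doi_name: str) -> str:
    """Build a clickable link, percent-encoding characters such as
    '%', '#' and space while leaving the '/' separator intact."""
    return PROXY + quote(doi_name, safe="/")
```

`quote` encodes more characters than strictly required (for example parentheses and semicolons), which is harmless for resolution but worth knowing if the encoded form is compared against strings produced elsewhere.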

9 Digital Object Identifier = Digital [Object Identifier] –not [Digital Object] Identifier The DOI ® System provides an infrastructure for persistent unique identification of entities... A DOI name is permanently assigned to an object, to provide a persistent link to current information about that object, including where the object, or information about it, can be found on the internet. Entities of interest may be physical, digital, or abstract. –e.g. CrossRef assigns a DOI name to an article irrespective of format Handle: Digital Object Architecture –Not a conflict: Any entity can be abstracted into a representation as a digital object Scope of the DOI System doi>

10 A DOI name may be assigned to any object of any form whenever there is a functional need to distinguish it as a separate entity. Registration Agencies may specify more constrained rules for the assignment of DOI names to objects for DOI-related services. The principal focus of assignment shall be to content-related entities exemplified by, but not limited to: text documents; data sets; sound carriers; books; photographs; serials; audio, video and audiovisual recordings; software; abstract works; artwork, etc., and related entities in their management, e.g. licences, parties. doi> Scope of the DOI System

11 Each DOI name can specify one and only one referent in the DOI System. –A role of Registration Agencies is to provide a service to registrants which facilitates this. –However, the DOI System will not accept duplicate prefix+suffix and makes internal checks for uniqueness at the time of registration. A referent may be specified by more than one DOI name, though it is recommended practice that each referent has only one DOI name. –Because it may not always be known that a DOI name already exists –Where multiple DOI names are assigned to the same referent, e.g. through assignment of DOI names by two different registration agencies, the IDF encourages registration agencies to collaborate in providing a unifying record for that referent. It is good practice never to reissue any unique identifier that has been once issued in error. doi> Uniqueness
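The asymmetry described here (one DOI name, one referent; one referent, possibly several DOI names) can be modelled with a small in-memory registry. This is a toy sketch, not how the DOI System is implemented: the class, the exception, and the referent labels are all hypothetical, and names are compared in lowercase on the assumption that they are ASCII.

```python
class DuplicateDOIName(Exception):
    """Raised when a prefix+suffix combination is already registered."""


class Registry:
    def __init__(self) -> None:
        # Maps a lowercased DOI name to exactly one referent.
        self._records: dict[str, str] = {}

    def register(self, doi_name: str, referent: str) -> None:
        """Register a name, refusing duplicates at registration time."""
        key = doi_name.lower()
        if key in self._records:
            raise DuplicateDOIName(doi_name)
        self._records[key] = referent

    def names_for(self, referent: str) -> list[str]:
        """All names for a referent; more than one is allowed but
        is the situation a unifying record is meant to address."""
        return sorted(k for k, v in self._records.items() if v == referent)
```

Because a rejected name is never stored, the registry also never reissues a name once it has been assigned, which matches the good-practice note on the slide.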

12 No time limit for the existence of a DOI name shall be assumed in any assignment, service or application. A DOI name and its referent are unaffected by changes in the rights associated with the referent, or changes in the management responsibility of the referent object. The IDF implements rules for transfer of management responsibility between Registration Agencies, requirements on Registration Agencies for maintenance of records, default resolution services, and technical infrastructure resilience. The DOI System is not a means of archival preservation of identified entities. The DOI System provides a means to continue interoperability through exchange of meaningful information about identified entities and initiated actions between different systems through at minimum persistence of the DOI name and description of the referent. doi> Persistence

13 [Diagram, "View 2: commerce": Party, Creation and Transaction, linked by the relations makes, uses, about and do] Intellectual property and the DOI System Current DOI name uses doi>

14 Identifier schemes already exist for many creations –ISBN, ISSN, ISRC, etc. –New ones: e.g. ISTC (textual abstractions e.g. Robinson Crusoe by Daniel Defoe) ISO standardisation of DOI System recognises this First example – Bookland DOIs from ISBNs –Name comes from Bookland bar codes from ISBNs Pilot scheme based on the new syntax of the ISBN-13 –ISBN: –DOI name to be: /45678 Second example - ISSN: Defined syntax for ISSNs in DOI names: –doi: /issnl (linking ISSN: all media versions) –doi: /issn (ISSN: specific media version) NB: Relevant information as to the identity of the referent is included in the metadata associated with the DOI name string. doi> DOI names with existing identifiers

15 General case ISO standardisation of DOI System –A DOI name is not intended as a replacement for other identifier schemes, but when used with them may enhance the identification functionality provided by those systems with additional functionality… Incorporate the other identifier into the DOI syntax and/or Record the other identifier in the DOI metadata. Each scheme retains its autonomy but works together doi> DOI names with existing identifiers

16 Parties –Authors: for disambiguation etc –Institutions: for licensing transactions, etc. –ISNI: International Standard Name Identifier (was: ISPI) Based on InterParty PIDI = Public identity identifier –ITU Identity management Focus group Any end point in the network (machines, users) Licences –ONIX for licensing work (with NISO/ERMI) Electronic Resource Management Initiative –Contextual identification doi> DOI names for entities other than creations

17 Granularity: the extent to which a collection of information has been subdivided for purposes of identification (e.g. a collection; a book; tables and figures) –Functional Granularity: it should be possible to identify an entity whenever it needs to be distinguished Your functional granularity may not be my functional granularity: –A wants to distinguish this book in any format, but B wants to distinguish the pdf version from the html version, etc …. "It is a fundamental of almost any statistic that, to produce it, something, somewhere has been defined and identified. Never underestimate how much nuisance that small practical detail can cause. First, it has to be agreed what to count…. In maths numbers seem hard, pristine and bright, neatly defined around the edges. In life, we do better to think of something murkier and softer" –The Tiger That Isn't: Seeing Through a World of Numbers (2007) Blastland & Dilnot You must know (say) PRECISELY WHAT is being identified Granularity doi>

18 A DOI name may be assigned to any entity, regardless of the extent to which it may be a component part of some larger entity. DOI names may be assigned at arbitrary levels of granularity or abstraction. EXAMPLE: separate DOI names may be assigned to: –a novel as an abstract work; –a specific edition of that novel; –a specific chapter within that edition of the novel; –a single paragraph; –a specific image or quotation; –each specific manifestation in which any of those entities are published or otherwise made available, –or any other level of granularity which a registrant deems to be appropriate Assignment of a DOI name shall require the Registrant to record metadata describing the entity to which the DOI name is being assigned. The metadata shall describe the entity to the degree that is necessary to distinguish it as a separate entity within the DOI System. In certain cases (which shall be defined in the User Manual) it shall be allowable for no metadata declaration to be made. Granularity doi>

19 Manuscript mss #ABC123 paper journal/volume/page Specifying what is identified Two things in one: Physical manifestation of intangible work (which is identified?) doi>

20 MS Vol/page; ISBN; SICI, etc Web page URL intangible Work intangible Work work used in analytical sense, not copyright sense


22 Versions – separately identified?

23 Document on screen Abstract work? Manifestation of abstract work? Version? This HTML file? All/some of these? What are we identifying? doi>

24 Does it matter? Yes, it can do. e.g.: 1. Practical use of data. Example – journal article –For the purpose of citation: Count pdf, print, html as same Citation refers to the abstract work (hence ISI, CrossRef) –For the purpose of purchase: Count pdf, print, html as different Purchase refers to the manifestation –Suppose I encounter a purchase system and try to use it for counting citations…. –Can I rely on a system now if I don't know what is being identified? Can others rely on the system long after I'm gone? 2. Legal implications: copyright My A is the same as your B and is my copyright…

25 Principles: Unique Identification: every entity should be uniquely identified within an identified namespace. Functional Granularity: it should be possible to identify an entity whenever it needs to be distinguished Designated Authority: the author of an item of metadata should be securely identified. Appropriate Access: everyone requires access to the metadata on which they depend, and privacy and confidentiality for their own metadata from those who are not dependent on it. Definition of metadata: An item of metadata is a relationship that someone claims to exist between two referents (description) More on this: see data model The framework doi>

26 Many of the items we manage should be treated as First-class objects. First class = having an identity independent of any other item. –A key concept of Digital object architecture (e.g. Handles) Document456 Vanity Fair Penguin Classics: Vanity Fair ISBN-13: First class naming First class name doi>

27 A DOI name consists of a prefix and a suffix e.g /4567 A prefix can have unlimited suffixes So in theory, only one prefix is needed? Could a set of DOI names ever need to be managed differently – e.g. separated across DOI RAs, or different mirror servers, etc? CrossRef example: Prefix allocated to a publisher (imprint), not a journal Would it be better to have a separate prefix for each journal? Journals can move publisher. Easy to manage one prefix on an everyday basis (ISBN, etc) –Management of a whole customer's DOI name set by one prefix But easiest to group DOI names by separate prefixes if you need to change them… A trade-off The use of prefixes doi>

28 Who will need to administer the prefix? IDF Directory Manager RA manager Individual customer of RA (e.g. a publisher) Individual manager within a publisher (e.g. production manager) Prefixes can have a defined administrator –Similarly, URLs rely on one site administrator But also: DOI names can have any level of administrative granularity Every single DOI name could have a different manager! Handle System has various levels of administrator, and keys A choice which must depend on each application's requirements Administrative granularity doi>

29 Handles resolve to typed data Handle: /456 Handle values (data type, index, handle data): URL, 1, ….; URL, 2, ….; DLS, 9, acme/repository; HS_ADMIN, 100, acme.admin/jsmith; XYZ Rules for data type construction: doi>
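The idea that a handle resolves to a set of typed, indexed values can be sketched as a simple data structure. This is an illustrative model only, not the Handle System wire format: the class and function names are hypothetical, the two URL values are invented placeholders (the slide's own URLs are elided), and the HS_ADMIN value is simplified to the string shown on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandleValue:
    index: int   # position of the value within the handle record
    type: str    # data type, e.g. "URL", "DLS", "HS_ADMIN"
    data: str    # the typed data itself


# Hypothetical record mirroring the slide's layout.
EXAMPLE_RECORD = [
    HandleValue(1, "URL", "https://example.org/item"),     # placeholder URL
    HandleValue(2, "URL", "https://example.net/item"),     # placeholder URL
    HandleValue(9, "DLS", "acme/repository"),
    HandleValue(100, "HS_ADMIN", "acme.admin/jsmith"),
]


def values_of_type(record: list[HandleValue], wanted: str) -> list[str]:
    """Return the data of every value with the requested type, in index order."""
    matches = sorted((v for v in record if v.type == wanted), key=lambda v: v.index)
    return [v.data for v in matches]
```

A client interested only in locations would call `values_of_type(EXAMPLE_RECORD, "URL")`, while an administrative tool would ask for `"HS_ADMIN"`; the same handle serves both.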

30 Terminology Format Assignment and uniqueness Scope of the DOI System Relation to other identifier schemes Directory management The uses of prefixes for management Administrative granularity Outline / Key concepts in this section doi>

31 Workshop on the DOI System DOI SYSTEM: SYNTAX International DOI Foundation
