DOI and DataCite Establishing information infrastructures Dr. Irina Sens 14. Conference „Consortia Library Systems: Technologies and Innovation“ 23. Juni.

1 DOI and DataCite: Establishing information infrastructures. Dr. Irina Sens. 14th Conference „Consortia Library Systems: Technologies and Innovation“, 23 June 2015

2 Overview
0. TIB
1. Persistent identification & DOI for research data
2. DataCite
3. DOI registration
4. How to take part

3 The TIB http://www.nationsonline.org/oneworld/europe_map.htm

4 Main Building

5 The TIB
- German National Library for Architecture, Chemistry, Computer Science, Engineering, Mathematics, Physics, and Technology
- Collection scope of a national library; special collections: grey literature, literature from East Asia and Eastern Europe
- World's largest specialist library for science and technology
- Customers in more than 60 countries
- Founded in 1959 on the basis of the existing university library (founded 1831)

6 The TIB
- Financed through national funds (30 %) and state funds (70 %)
- National mission and responsibilities
- Member of the Leibniz Association
- Quality assurance via an external evaluation procedure by the Leibniz Association (non-university research institutes) every 7 years: strengths, weaknesses, potentials
- The evaluation is a prerequisite for qualifying for the joint national and state funding

7 TIB Hannover: Additional facts
- 14.7 million euro annual acquisition budget
- 52,700 journal subscriptions (16,800 print; 35,900 digital)
- 9 million items
- Staff: ca. 400 people (librarians, researchers, IT people, etc.)

8 Global Network TechLib

9 Vision and Strategy: Publications of text, data, software code and more TIB text research data 3D-objects simulation software scientific films

10 1. Persistent identification & DOI for research data

11 Definition of research data: Anything that is the foundation of further research is research data. Data is evidence.

12 What type of data are we talking about?
- Earthquake events => doi:10.1594/GFZ.GEOFON.gfz2009kciu
- Climate models => doi:10.1594/WDCC/dphase_mpeps
- Sea bed photos => doi:10.1594/PANGAEA.757741
- Distributed samples => doi:10.1594/PANGAEA.51749
- Medical case studies => doi:10.1594/eaacinet2007/CR/5-270407
- Computational model => doi:10.4225/02/4E9F69C011BC8
- Audio record => doi:10.1594/PANGAEA.339110
- Grey literature => doi:10.2314/GBV:489185967
- Videos => doi:10.3207/2959859860

13 1. Persistent identification & DOI for data: Why? Political significance!
- Social & political responsibility
- European Commission requirements (Horizon 2020)
- Open Access strategies
- Funding body requirements
=> Science policy requirement to publish research data
=> Reusability of publicly funded research

14 1. Persistent identification & DOI for data: Why? Publishing companies!
STM Association, 2015 Report: “…The explosion of data-intensive research is challenging publishers to create new solutions to link publications to research data (…) to facilitate data mining and to manage the dataset as a potential unit of publication (…) Change continues to be rapid, with new leadership and coordination from the Research Data Alliance (…) research funders have introduced or tightened (data) policies data repositories have grown in number and type (…) and DataCite was launched (...) discovery services such as Thomson Reuters’ Data Citation Index…”

15 1. Persistent identification & DOI for data: Why? Publishing companies!
- Brussels Declaration (STM Association publishing companies): “… Sets or sub-sets of data that are submitted with a paper to a journal should wherever possible be made freely accessible to other scholars.”
- Response: data journals
- Example: Nature, Scientific Data: “Scientific Data's central mission is to help foster the sharing and re-use of the data underpinning scientific research.”

16 1. Persistent identification & DOI for data: But scientific scepticism!
“A biologist would rather share their toothbrush than their gene name”
Mike Ashburner and others; Professor in Dept of Genetics, University of Cambridge, UK

17 1. Persistent identification & DOI for data: Data landscape, the theory
Options for publishing data [diagram labels]: Processes RD / publication; Data collections and structured databases; Primary data and data sets; Articles with RD; Data in publication; Data cited in the article, deposited in data centres & repositories; Data in supplements; Data on private & institutional hard disks; Independent data publications
Modified based on STM / Smit, E: Avoiding a Digital Dark Age for Data: why data and publications belong together. ICSTI workshop Delivering Data in Science, Paris, 5 March 2012

18 1. Persistent identification & DOI for data: Data landscape, the reality
Reality of data publishing [diagram labels]: Articles; Data centres/repositories; Supplements; Data on private / institutional hard disks
Annotations: Few; Lack of archives in many subject areas!; Potential for ‘data dumping’ => overburdened!; ~ 75 % of RD is never published
Modified based on STM / Smit, E: Avoiding a Digital Dark Age for Data: why data and publications belong together. ICSTI workshop Delivering Data in Science, Paris, 5 March 2012

19 1. Persistent identification & DOI for data: Data landscape, the future?
Ideal case of data publishing [diagram labels]: RD in articles; RD in data centres and repositories; Supplements; Data on private / institutional hard disks
Annotations:
- Linking texts & data => ‘enhanced publications’
- If no other data integration is possible
- Journals request and check RD filing
- Support ‘enhanced publications’; persistent identifiers
- Generic & discipline-specific; interfaces for good connection!
Modified based on STM / Smit, E: Avoiding a Digital Dark Age for Data: why data and publications belong together. ICSTI workshop Delivering Data in Science, Paris, 5 March 2012

20 1. Persistent identification & DOI for data: Advantages
- Clear referencing and citability
- Links data to other publications
- Increased visibility & enhanced access
- Transparent research
- Avoids duplication
- Promotes scientific cooperation
- Motivation for new research

21 1. Persistent identification & DOI for data: Properties of persistent identification (PI)
- Resource can be clearly referenced & cited
- Persistent, i.e. also beyond the life span of the identified object, if necessary
- Clear separation between identification of the resource and the location reference
- PI is undertaken by registration agencies: standards for structure and syntax; resolving mechanism

22 1. Persistent identification & DOI for data: DOI system
- International DOI Foundation (IDF) founded in 1998
- Long-term persistence & accessibility to objects
- Technology based on the Handle system
- May 2012: DOI System ISO Standard 26324 was published
- Guaranteed, trustworthy responsibilities, uniform standards & work flows
- Quality control: obligatory metadata for each object
- IDF currently consists of nine registration agencies (RA)
- RA responsible for PI allocation and maintenance
DOI ®, DOI.ORG ® and shortDOI ® are brand names of International DOI Foundation

23 Registration Agencies

24 2. DataCite

25 2. DataCite: Background
- Global consortium supported by local institutions
- Goal: Publication infrastructure for data & non-textual content
- Service provider for data centres/content providers
- Non-commercial, non-profit
- Standards, work flows and best practice
- Based on the DOI system

26 2. DataCite: Development
- '03: DFG-funded project with German World Data Centres
- '05: TIB allocates the first DOIs for data sets
- '09: Paris Memorandum. DataCite is founded in London. Seven members.
- '15: 25 members, 8 associated members, 19 countries, over 5 million DOI names. Sturdy technical infrastructure.
- Annual Meetings: Hannover 2010, Berkeley 2011, Copenhagen 2012, Washington 2013, Nancy 2014, Paris 2015

27 2. DataCite: Members
Members:
- CISTI – Canada Institute for Scientific and Technical Information
- California Digital Library, USA
- Purdue University, USA
- OSTI – Office of Scientific and Technical Information, USA
- The British Library
- TIB, Germany
- ZB MED, Germany
- ZBW, Germany
- GESIS, Germany
- SUB Göttingen, Germany
- University of Tartu, Estonia
- JaLC – Japan Link Center
- DTIC – Technical Information Center of Denmark
- Library of TU Delft, The Netherlands
- Library of ETH Zürich, Switzerland
- INIST – L’Institut de l’Information Scientifique et Technique, France
- SND – Swedish National Data Service
- ANDS – Australian National Data Service
- NRCT – National Research Council of Thailand
- The Hungarian Academy of Sciences
- CRUI – Conferenza dei Rettori delle Università Italiane
- SAEON – South African Environmental Observation Network
- CERN – European Organization for Nuclear Research
- BIBSYS – Library System, Norway
Affiliated members:
- Digital Curation Center, UK
- Microsoft Research, USA
- ICPSR – Interuniversity Consortium for Political and Social Research, USA
- KISTI – Korea Institute of Science and Technology Information
- BGI – Beijing Genomic Institute, China
- IEEE, USA
- Harvard University Library, USA
- GWDG, Germany
Membership application for 2016

28 2. DataCite: Structure
[Diagram labels: Support; DataCite; Member institution; Data Centre; Member institution; Data Centre; …; Cooperation; Managing Agent (TIB); Member; Associate; Stakeholder; International DOI Foundation]

29 DataCite: Board, Director and MA
Board:
- Adam Farquhar (President), Head of Digital Scholarship, The British Library
- Paul Bracke (Treasurer), Associate Dean for Research and Assessment/Associate Professor, Purdue University Libraries
- Brigitte Hausstein, Staff division of Data Registration Agency, GESIS
- Salvatore Mele, Head of Open Access, CERN
- Karen Morgenroth (Deputy President), Manager, Content Access Services, National Research Council
- Irina Sens, Deputy Director, German National Library of Science and Technology (TIB)
- Wilma van Wezenbeek, Director, TU Delft Library
Interim Director: Patricia Cruse, former Director Digital Preservation, California Digital Library

30 DataCite: Board, Director and MA
- Registered association under German law
- Statutes: The Association is a non profit making organisation; its primary objectives are not for profit. Membership is open to all not for profit organisations who wish to allocate DOI names and use the Registration Agency of DataCite in their capacity as allocating agents.
- Yearly General Assembly
- Managing Agent/Administration Office located at TIB: member support; operating and maintaining IT infrastructure (+ Purdue)

31 Summer Meeting 2015: DataCite, in conjunction with EPIC, is planning a half-day event focusing on persistent identifiers on September 21, 2015. The event will be located in Paris on the day before the Research Data Alliance (RDA) Plenary meeting. Potential topics include citing dynamic datasets, managing versions with identifiers, enabling user-facing services with identifiers, and more.

32 2. DataCite: Services
- Members and associated members: libraries, information and data centres
- Working Groups: Metadata; Best practices
- Other services: Metadata Store, Search, Stats, OAI Provider (http://www.datacite.org/services)

33 2. DataCite: Cooperative activities - I
In cooperation with CrossRef:
- Citation Formatter (http://crosscite.org/citeproc/) makes available citations in over 100 formats
- Content Negotiation (http://crosscite.org/cn/) can be used to automatically obtain access to the (previously deposited) media formats of an object
With STM Association publishing companies:
- Improved ability to access & find research data
- Promotion of bidirectional links between data sets & publications in data archives
- Enhanced visibility of links between publications & data sets
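A minimal sketch of what a content negotiation request might look like, assuming the DOI resolver at https://doi.org honours an HTTP `Accept` header and that `application/x-bibtex` is among the supported media types (both are assumptions here; the crosscite.org/cn/ documentation is the authority). The function name `build_negotiation_request` is hypothetical, and the request is only constructed, not sent.

```python
from urllib.parse import quote
from urllib.request import Request


def build_negotiation_request(doi: str, media_type: str = "application/x-bibtex") -> Request:
    """Build (but do not send) a GET request asking the resolver for a given format.

    Assumption: the resolver selects the representation from the Accept header.
    """
    url = "https://doi.org/" + quote(doi, safe="/")
    return Request(url, headers={"Accept": media_type})


if __name__ == "__main__":
    req = build_negotiation_request("10.1594/PANGAEA.727522")
    print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing the returned object to `urllib.request.urlopen` would perform the actual lookup; that step is left out so the sketch runs without network access.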

34 CrossRef <=> DataCite: very strong cooperation
CrossRef:
- Target group: Publishers
- Scholarly and professional research content: journal articles, books, conference proceedings, etc.
- Reference linking and searchable metadata database
DataCite:
- Target group: Libraries/Information Centers with national responsibilities
- Data and grey literature
- Activities around establishing and sharing best practices, identifying and solving some of the unique issues that arise with datasets
- Working with data centres and organisations that hold data

35 2. DataCite: Cooperative activities - II
Thomson Reuters, Data Citation Index (DCI):
- Harvesting metadata via DataCite
- Advantages for customers: access to DCI statistics
ORCID, ODIN project (ORCID and DataCite Interoperability Network):
- Inclusion of data sets in publication lists
- Track the use of data sets
- Link data sets to related articles, licences and all participants
- Follow-up project: THOR from June 2015

36 2. DataCite: Cooperative activities - III
- re3data & DataBib: to merge and act under the auspices of DataCite as re3data
- MoU with RDA: DataCite will become an “organisational member”
- Endorsement of the Force11 “Joint Declaration of Data Citation Principles”

37 3. DOI registration

38 3. DOI registration: Types of content
By 6/2015, over 5,000,000 DOI names had been allocated by DataCite for:
- Research data (~45%)
- Grey literature objects (~40%)
- Images (~10%)
- Medical case studies
- Videos
- Maps
- Learning objects
Status in May 2015: 5,392,337 DOI names

39 3. DOI registration: Demands placed on data centres
- Securing persistence
- Providing metadata & landing pages
- Securing data granularity (worthy of citation?)
DOI syntax:
- Prefix is allocated by DataCite
- Suffix can be defined by the data centre
- Clear string
- Positive list: A-Z a-z 0-9 . : - _ /
- New DOIs are resolvable after around 5 minutes
- DOI update globally available after a max. of 24 hours
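The DOI syntax rules on this slide can be sketched as a small check. The suffix pattern follows the slide's positive list; the prefix pattern ("10." followed by digits, optionally with dotted sub-elements) is an assumption based on the DOIs shown elsewhere in the talk, and the function names are hypothetical.

```python
import re
from typing import Tuple

# Positive list from the slide: A-Z a-z 0-9 . : - _ /
_SUFFIX_RE = re.compile(r"[A-Za-z0-9.:\-_/]+")
# Assumption: prefixes look like "10.1594" (digits, optional dotted sub-elements).
_PREFIX_RE = re.compile(r"10\.\d+(\.\d+)*")


def split_doi(doi: str) -> Tuple[str, str]:
    """Split a DOI name into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = doi.partition("/")
    if not sep:
        raise ValueError(f"no '/' between prefix and suffix: {doi!r}")
    return prefix, suffix


def is_valid_doi(doi: str) -> bool:
    """Return True if the prefix and suffix both match the patterns above."""
    try:
        prefix, suffix = split_doi(doi)
    except ValueError:
        return False
    return bool(_PREFIX_RE.fullmatch(prefix)) and bool(_SUFFIX_RE.fullmatch(suffix))


if __name__ == "__main__":
    for candidate in ("10.1594/PANGAEA.757741", "10.1594/bad suffix"):
        print(candidate, is_valid_doi(candidate))
```

A data centre could run such a check on its own suffixes before registration; the slide's positive list is narrower than what the DOI system as a whole may accept, so this is a conservative filter rather than a general DOI parser.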

40 DOI Service: work flow. Where does my data go? [Diagram labels: Data centre Scientists Metadata & URL Data DOI Discovery Index ? ?]

41 3. DOI registration: DataCite metadata schema - mandatory fields
- Identifier (with type attribute)
- Creator (with type and name identifier attributes)
- Title (with optional type attribute)
- Publisher
- Publication year
Recommended citation: Creator (Publication Year): Title. Publisher. Identifier
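As an illustration, the recommended citation pattern can be assembled from the five mandatory fields. This is a sketch only: the `Dataset` class and `format_citation` function are hypothetical names, the sample values are made up, and rendering the identifier with a `doi:` prefix is an assumption taken from the way DOIs are written elsewhere in these slides.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """The five mandatory fields from the slide (attributes omitted)."""
    creator: str
    publication_year: int
    title: str
    publisher: str
    identifier: str  # DOI name without any "doi:" prefix


def format_citation(ds: Dataset) -> str:
    """Render: Creator (Publication Year): Title. Publisher. Identifier"""
    title = ds.title.rstrip(".")  # avoid a doubled full stop after the title
    return f"{ds.creator} ({ds.publication_year}): {title}. {ds.publisher}. doi:{ds.identifier}"


if __name__ == "__main__":
    example = Dataset(
        creator="Doe, J",                 # hypothetical sample values
        publication_year=2015,
        title="Sample measurements",
        publisher="Example Data Centre",
        identifier="10.5072/example-1",
    )
    print(format_citation(example))
```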

42 3. DOI registration: DataCite metadata schema - optional/recommended fields
- Subject (with scheme attribute)
- Contributor (with type and name identifier attributes)
- Date (with type attribute)
- Language
- Resource type (with description attribute)
- Alternate identifier (with type attribute)
- Related identifier (with type and relation type attributes)
- Size
- Format
- Version
- Rights
- Description (with type attribute)
- GeoLocation (with point, box and place)

43 3. DOI registration: Citing with DOI - I - papers & research data
For example, the data set
Kuhlmann, H et al. (2009): Age models, iron intensity, magnetic susceptibility records and dry bulk density of sediment cores from around the Canary Islands. doi:10.1594/PANGAEA.727522
is analysed in the following article:
Kuhlmann et al. (2004): Reconstruction of paleoceanography off NW Africa during the last 40,000 years: influence of local and regional factors on sediment accumulation. Marine Geology, 207(1-4), 209-224, doi:10.1016/j.margeo.2004.03.017

44 3. DOI registration: Citing with DOI - III - media fragment identifier
- Very precise citation of videos: http://dx.doi.org/10.5446/393#t=01:21,02:04 (DOI 10.5446/393, DOI MFID resolver)
- Can also be used for other media if fragmentation is supported, e.g. PDF: doi.org/10.5438/0010#page=9
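The two fragment forms shown on this slide (`#t=start,end` for a time range and `#page=n` for a PDF page) can be sketched as simple URL builders. The helper names are hypothetical and the default resolver address is an assumption; whether a fragment is honoured depends on the resolver and on the landing page or player.

```python
from typing import Optional


def time_fragment(start: str, end: Optional[str] = None) -> str:
    """Temporal fragment, e.g. t=01:21,02:04 (start and optional end time)."""
    return f"t={start},{end}" if end else f"t={start}"


def page_fragment(page: int) -> str:
    """Page fragment for paginated media such as PDF, e.g. page=9."""
    return f"page={page}"


def fragment_url(doi: str, fragment: str, resolver: str = "https://doi.org") -> str:
    """Append a media fragment to the resolver URL of a DOI name."""
    return f"{resolver}/{doi}#{fragment}"


if __name__ == "__main__":
    print(fragment_url("10.5446/393", time_fragment("01:21", "02:04")))
    print(fragment_url("10.5438/0010", page_fragment(9)))
```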

45 3. DOI registration: Data granularity
- Constantly being debated, but: no universally valid guidelines (so far) for the granularity of research data!
- Every object that is to be cited may be allocated a DOI!

46 3. DOI registration: DOI facts
- DOIs cannot be deleted
- A DOI should always persistently identify precisely one object
- A DOI refers to a landing page; this is where metadata & information about the object is noted
- Should the object identified by the DOI no longer be available, this has to be specified on the landing page

47 3. DOI registration: DataCite Metadatastore (MDS)
https://mds.datacite.org/
Individual operations, User Interface (UI):
- Register a data set
- Update a data set
- Upload a metadata file
- Find a specific DOI
“Bulk” operations, Application Programming Interface (API):
- Register several data sets
- Update several data sets
- Upload several metadata files
- Retrieve metadata
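A rough sketch of how a "bulk" registration through the API might be prepared. Only the base address https://mds.datacite.org comes from the slide; the `/doi` endpoint path, the `doi=...` / `url=...` plain-text body and the content type are assumptions about the MDS API that should be checked against its documentation, and authentication is omitted. The function names are hypothetical, and the requests are only built, not sent.

```python
from typing import Iterable, List, Tuple
from urllib.request import Request

MDS_BASE = "https://mds.datacite.org"  # base address as given on the slide


def doi_registration_request(doi: str, landing_page_url: str, base: str = MDS_BASE) -> Request:
    """Build one registration request mapping a DOI name to its landing page.

    Assumptions: POST to <base>/doi with a two-line plain-text body.
    """
    body = f"doi={doi}\nurl={landing_page_url}".encode("utf-8")
    return Request(
        f"{base}/doi",
        data=body,
        method="POST",
        headers={"Content-Type": "text/plain;charset=UTF-8"},
    )


def bulk_requests(datasets: Iterable[Tuple[str, str]]) -> List[Request]:
    """Prepare one request per (doi, landing_page_url) pair."""
    return [doi_registration_request(doi, url) for doi, url in datasets]


if __name__ == "__main__":
    batch = bulk_requests([
        ("10.5072/example-a", "https://example.org/a"),  # hypothetical values
        ("10.5072/example-b", "https://example.org/b"),
    ])
    for req in batch:
        print(req.get_method(), req.full_url, req.data.decode("utf-8").replace("\n", " | "))
```

Trying such a script against the test environment first, rather than the production service, would be the cautious route.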

48 3. DOI registration: DataCite MDS - test environment
DataCite provides its own test environment in which all services can be tested in a closed system: http://test.datacite.org
Resolver for test DOIs: http://dx.test.datacite.org

49 4. How to take part

50 4. How to take part: Possibilities
Membership:
- Collaboration with local data centres
- Registration of DOIs
- Collaboration in DataCite Working Groups
- Co-determination in DataCite
Associated membership:
- Collaboration in DataCite Working Groups
- Provision of advice for DataCite
Cooperation with a member as a data centre:
- DOI registration for your data sets

51 Thank you for your attention! Do you have any questions?

