Linking persistent identifiers at the British Library


1 Linking persistent identifiers at the British Library
Rachael Kotarski 24 April 2017

2 The British Library National Library of the United Kingdom
Receives one copy of everything published in the UK and Ireland. Collections include: sound, images, video, newspapers, maps, manuscripts, books, databases, journals… 3 million items are added to the collections every year. We don’t collect all this as a museum; we are a working research library.

3 Services to research include:
“The British Library Board was established to manage the Library as a national centre for reference, study, and bibliographic and other information services.” Sound archive; News and Media Service; Web archive; National Bibliography as open data; Digitised collections as data; EThOS. We provide services for all kinds of researchers in every kind of format. I’m going to focus today on EThOS and our datasets.

4 EThOS is for discovery and access to UK PhD theses

5 Oldest record: 1770 http://bit.ly/2oYmbuJ MDCCLXX
It covers almost all theses awarded in the UK, and has records for theses dating back to 1770. Half of these are freely available, either directly downloadable from EThOS, or linked to institutional repository copies. Half of the remaining can be digitised on demand. This is more relevant to older theses as…

6 UK thesis submissions and PIDs (2015)
At least 88% of UK thesis submissions now include an electronic copy. But no one is assigning PIDs. Need to demonstrate the benefits! In a 2015 survey we found that 88% of theses are submitted with an e-copy. Despite this swing to electronic thesis submission, the same survey found that no UK institutions were yet assigning PIDs to theses. We needed to articulate the benefits of having PIDs for theses and their authors.

7 Benefits: meeting challenges
Tracking usage: 50,000 are downloaded every month, but where are they used? Citations and links to online copies: citations are for physical copies, or have fragile links. Analysing career paths: where do authors go? Breaking data out of the appendix: enabling data sharing practice at the start of careers. Value for money of PhD funding. We wanted to update EThOS to be able to display what PIDs were available and so show how they could be used. We know that 50k theses are downloaded from EThOS each month, but are they actually being used? How do we move researchers from citing the physical copy to citing the electronic copy, which is the one they are more likely to have used? By having ORCID iDs for authors, it would be easier to see how their research shaped their career. By having PIDs for the thesis and for the underlying data as a separate entity, we can help grow good data sharing practice from the start of research careers. With PIDs for funding information we could also look at the success of PhD funding, when linked to citations and career paths. We’re not able to do this yet, but it is an area for future development. So let’s look at demonstrations of these benefits…

8 Tracking usage and citations
By updating EThOS to hold DOIs for theses, they can now be used for altmetrics….

9 And for looking at citations of theses.

10 Encouraging citation Citation can be encouraged by including the DOI in the citation metadata, but once someone has downloaded a thesis, how can we ensure they still know how to cite a thesis using the DOI?

11 Encouraging citation By ensuring it is also included in the cover page for the thesis
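As a rough illustration of carrying the DOI into the citation itself, the sketch below builds a citation string that ends in a resolvable DOI link. The function names, the citation layout, and the example values are hypothetical and are not taken from EThOS; the DOI shown is a placeholder, not a real thesis identifier. The only external fact relied on is that a DOI resolves when appended to https://doi.org/.

```python
def doi_url(doi: str) -> str:
    """Return the resolvable https://doi.org/ form of a DOI.

    Accepts a bare DOI or one already carrying a common prefix.
    """
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return "https://doi.org/" + doi


def cite_thesis(author: str, year: int, title: str,
                institution: str, doi: str) -> str:
    """Format a thesis citation (hypothetical layout) ending in the DOI link."""
    return f"{author} ({year}). {title}. PhD thesis, {institution}. {doi_url(doi)}"


# Placeholder values for illustration only.
example = cite_thesis(
    author="Researcher, A.",
    year=2016,
    title="An example thesis title",
    institution="Example University",
    doi="doi:10.5555/example-thesis",
)
print(example)
```

The same string could be printed on a generated cover page, so the DOI travels with the downloaded file.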

12 Analysing career paths
We updated EThOS to include ORCID iDs so that you can go through to…

13 Analysing career paths
ORCID records to see career paths and subsequent research where available. This was important because funders and institutions have an interest in looking at what their PhDs go on to do, and if they stay in research, how their interests evolve.
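When ORCID iDs are stored alongside thesis records, it can help to check that an iD is well formed before linking out. The sketch below is one way to do that, assuming the iD is supplied in the usual hyphenated form; it is not a description of how EThOS validates iDs. It uses the ISO 7064 MOD 11-2 check character that ORCID documents for its identifiers, and the function names are illustrative.

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the iD has 16 characters and a matching check character."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16:
        return False
    base, check = compact[:15], compact[15]
    if not all(c in "0123456789" for c in base):
        return False
    return orcid_check_character(base) == check


# 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
print(is_valid_orcid("0000-0002-1825-0097"))
print(is_valid_orcid("0000-0002-1825-0098"))
```

A check like this only confirms that the iD is well formed; it does not confirm that the iD belongs to the thesis author.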

14 Breaking data out of the appendix
Although not yet available in EThOS, theses can link to PIDs for their data…

15 Breaking data out of the appendix
And vice versa. What we would like to do moving forward is look at how you can break out all of the related items (not just the data, but also the bibliography) to get the PIDs for all items and place theses in the wider context of the research landscape.

16 Remaining challenges
With all our collections, legacy is the main issue. ISSUE: DOIs for legacy theses. ISSUE: Name IDs for legacy theses. FIX: Claims through EThOS. FIX: use ISNI. So we’ve worked with UK institutions to demonstrate how PIDs can support PhD research, but challenges still remain. Our biggest challenges relate to legacy. Active researchers may only recently have created an ORCID iD, so they will not currently be associated with their thesis. In fact, with theses dating back to 1770, their authors will not be able to create ORCID iDs at all! For legacy theses, we need to look at what PIDs are already available: have they already been added to ResearchGate or Figshare, and so do they already have a DOI? Should the institution create the DOI, but what if the institution no longer holds a copy? What if the institution no longer exists? The first set of problems is easy to manage…

17 Allow claims Is this your thesis? Add it to your ORCID profile
Allow authors to claim their thesis to their ORCID record. We are working on this! There are requirements we need to define around how we push claims information back to institutions, or whether that is necessary at all. But this still only works for researchers who are still active. For even older theses…

18 Using ISNIs We have started to implement ISNIs.
ISNI is the International Standard Name Identifier. ISNIs are generated from bibliographic metadata and automatic disambiguation. As they are generated automatically, they don’t rely on the researcher creating them. As an ISNI Registration Agency (RA), we can provide metadata to ISNI and get the IDs back.

19 We can also use this process to create new ISNIs
For new theses, this will be one of the author’s first outputs. By providing EThOS metadata to ISNI, they can generate new identifiers which can later be linked to ORCID profiles. National Library of Poland is an ISNI Registration Agency.

20 Data.bl.uk The final area I quickly wanted to highlight is how we’re using PIDs for data that we are creating based on our national collections. At data.bl.uk we provide digitised content and metadata as datasets. Each gets a DOI, but again, there is an ambition to link it to more identifiers that define the content of the dataset.

21 Benefits of linking PIDs in text corpora
Clarity on content: PIDs for linking in metadata. Lowered burden of metadata: PIDs link to information on each item. For instance, if we link each dataset to ISBNs for each of the related books, or the ISNIs for authors, there is clarity on the content of the corpus, and there is a reduced burden in providing a large amount of metadata to provide that clarity otherwise. It means we can link out to it, rather than holding it separately as a part of the dataset. This also supports the reproducibility of text and data mining performed on the corpus. But again, there are still challenges.

22 Challenges of linking PIDs in corpora
PIDs not as widely used in humanities: citation of physical items preferred over digital. PIDs for all authors within a corpus: is this for acknowledgement or for provenance? Does it make them a ‘contributor’ to resulting work? For us, the biggest challenge has been the fact that PIDs are not being well used by humanities researchers. When citing or referencing material, they still prefer to use references to the physical items, even if what they consulted was a digital representation. Through the THOR project, we are looking at how we can better engage humanities researchers, and get them to take advantage of identifiers in their work. Also, while it reduces metadata burden, linking to the PIDs of all authors may create confusion as to the reasons for that: is it to acknowledge a contribution or for provenance? Does using an ORCID iD automatically mean they are considered a contributor, and if so, should another kind of ID be used?
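The idea of linking a corpus to identifiers rather than embedding full metadata can be sketched as a small data structure. Everything here is hypothetical: the class names, the field names, and the `role` values are illustrative, not the data.bl.uk schema, and the identifier values are placeholders. The `role` field is one possible way to record whether a link is there for provenance or for acknowledgement.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class LinkedIdentifier:
    scheme: str  # identifier type, e.g. "ISBN" or "ISNI"
    value: str   # the identifier itself
    role: str    # why the link exists, e.g. "source-item" or "author-provenance"


@dataclass
class DatasetRecord:
    doi: str
    title: str
    links: List[LinkedIdentifier] = field(default_factory=list)

    def identifiers(self, scheme: str) -> List[str]:
        """Return all linked identifier values of one scheme, in insertion order."""
        return [link.value for link in self.links if link.scheme == scheme]


# Placeholder values for illustration only.
corpus = DatasetRecord(
    doi="10.5555/example-corpus",
    title="Example digitised text corpus",
    links=[
        LinkedIdentifier("ISBN", "ISBN-PLACEHOLDER-1", "source-item"),
        LinkedIdentifier("ISBN", "ISBN-PLACEHOLDER-2", "source-item"),
        LinkedIdentifier("ISNI", "ISNI-PLACEHOLDER-1", "author-provenance"),
    ],
)
print(corpus.identifiers("ISBN"))
```

Keeping the role explicit means an author identifier recorded for provenance need not be read as a claim that the author contributed to work derived from the corpus.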

23 Thank you

