Regardless of the state of your data’s health, it can be improved by the addition of unique identifiers
What are standard identifiers?
- Numeric or alpha-numeric persistent designations associated with a single entity
- Entities can be an institution, person, or piece of content
…and what do they do, exactly?
1. Disambiguate, aka enforce uniqueness
2. Enable linking, aka data integration
In other words, they provide a simple basis for data governance
Enforcing Uniqueness Means: Disambiguating things that have the same name, but are actually different:
UCL:
- University College London (UK)
- Université Catholique de Louvain (Belgium)
- Universidad Cristiana Latinoamericana (Ecuador)
- University College Lillebælt (Denmark)
- Centro Universitario Celso Lisboa (Brazil)
- Union County Library (USA)
NPL:
- National Physical Laboratory (UK)
- National Physical Laboratory (India)
York University:
- University of York (UK)
- York University (Canada)
Northeastern University:
- Northeastern University (Boston, USA)
- Northeastern University (Shenyang, China)
…and consolidating the things that have different names but are actually the same:
University of Oxford:
- Univ. Oxford
- Oxford University
- Library, Oxford Univ.
- Radcliffe Science Library
- Bodleian Library
- Bodleian, Oxford
- Oxford, University of
University of Northampton:
- Northampton Business School
- School of Education
- School of Health
- School of Science and Technology
- Division of Computing
- Division of Engineering
- Environmental & Geographical Sciences
- Institute for Creative Leather Technologies
- School of Social Sciences
- School of The Arts
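Both directions of disambiguation can be sketched as a lookup against an authority file. The example below is only an illustration: the "INST-…" identifiers, the AUTHORITY structure, and the helper names are invented for this sketch and do not correspond to any real identifier scheme.

```python
import re
import unicodedata

# Hypothetical authority file: identifier -> canonical name plus known variants.
# The "INST-..." identifiers are invented for illustration only.
AUTHORITY = {
    "INST-0001": {
        "name": "University of Oxford",
        "variants": ["Univ. Oxford", "Oxford University", "Oxford, University of"],
    },
    "INST-0002": {"name": "University of York", "variants": []},
    "INST-0003": {"name": "York University", "variants": []},
}


def normalise(name):
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def build_index(authority):
    """Map every normalised name or variant to its identifier."""
    index = {}
    for inst_id, record in authority.items():
        for label in [record["name"], *record["variants"]]:
            index[normalise(label)] = inst_id
    return index


INDEX = build_index(AUTHORITY)


def resolve(name):
    """Return the identifier for a name, or None if it is not in the file."""
    return INDEX.get(normalise(name))
```

Here resolve("Univ. Oxford") and resolve("OXFORD UNIVERSITY") both return "INST-0001", while "University of York" and "York University" resolve to different identifiers. Exact matching on normalised strings is the simplest possible approach; real data usually also needs location information and manual review.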
Why is disambiguation important?
- Uniquely identify institutions within records
- Eradicate duplication of data
- Ensure correct delivery, entitlements and access rights
- Better understand your customer base and relationships with institutions
- Improve “trust” in data
- Map institutions into their hierarchy
Data integration, or linking
An identifier is a single data element that provides an unambiguous “hook” into a record
What can you do with linked data?
Using Institutional Identifiers to link internal systems:
- Break down silos
- Keep data up-to-date and systems synchronised
- Enable staff to use data more effectively
- Simplify data transmission
- Improve overall data quality
Systems linked through Institutional Identifiers: CRM, Electronic document storage, Usage statistics, Author Database, Fulfilment system, Membership system, Authentication, Financial System
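As a rough illustration of linking two internal systems through a shared institutional identifier, the sketch below joins a CRM extract to usage statistics. The field names, the "INST-…" identifiers, and all values are invented for this example.

```python
# Hypothetical extracts from two internal systems. The shared "inst_id" field
# is the institutional identifier; every value here is invented.
crm_accounts = [
    {"inst_id": "INST-0001", "account_name": "Oxford Univ.", "owner": "A. Rep"},
    {"inst_id": "INST-0002", "account_name": "Univ of York", "owner": "B. Rep"},
]
usage_stats = [
    {"inst_id": "INST-0001", "institution": "University of Oxford", "downloads": 1200},
    {"inst_id": "INST-0001", "institution": "Univ. Oxford", "downloads": 300},
    {"inst_id": "INST-0003", "institution": "York University", "downloads": 50},
]


def link_by_id(accounts, usage):
    """Attach total downloads to each CRM account via the shared identifier.

    Returns the enriched accounts and the identifiers seen in usage data
    that have no CRM account.
    """
    totals = {}
    for row in usage:
        totals[row["inst_id"]] = totals.get(row["inst_id"], 0) + row["downloads"]
    linked = [{**acct, "downloads": totals.get(acct["inst_id"], 0)} for acct in accounts]
    known_ids = {acct["inst_id"] for acct in accounts}
    unmatched = sorted(set(totals) - known_ids)
    return linked, unmatched


linked_accounts, usage_without_account = link_by_id(crm_accounts, usage_stats)
```

Because the join key is the identifier rather than the name, the two differently spelled Oxford rows are combined, and the identifier that appears only in usage data is surfaced as a possible prospect.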
Linking author and institution IDs
When authors and their affiliations are linked correctly, publishers gain:
- Market intelligence about authors and institutions
- Author and subscriber information mapped together
- Knowledge of where research funding is concentrated
- Reduction in time taken calculating open access charges (APCs)
Institutions gain information about their overall research output
Funders gain information about where authors reside and publish
What do we need to identify?
People:
- Authors
- Members
- Editors & other contributors
- Customers / subscribers
Content:
- Books & ebooks
- Journals
- Articles
Institutions:
- Subscribers / customers
- Funders
- Publishers / licensors
- Aggregators
- Sales & subscription agents
Personal Identifiers
- International Standard Name Identifier (ISNI): www.isni.org
- Open Researcher and Contributor ID (ORCID): www.orcid.org
- Scopus Author ID: www.elsevier.com/online-tools/scopus
- ResearcherID: http://wokinfo.com/researcherid/
- And many other proprietary system IDs: Mendeley, Microsoft Academic, Google Scholar, etc.
ISNI
- ISO Standard 27729
- ISNI is designed to be a “bridge identifier”
- Covers any type of entity
[Diagram: an ISNI Number links Party ID 1 and Party ID 2; each party keeps its own Proprietary Information and/or Metadata]
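The “bridge” idea can be sketched as follows: each party maps its own proprietary IDs to ISNIs, and the shared ISNI is then enough to translate between the two systems without exchanging any proprietary metadata. All identifiers below are placeholders rather than real ISNIs, and the function name is invented for this sketch.

```python
# Each party maintains its own mapping from proprietary ID to ISNI.
# The values are placeholders, not real ISNI numbers.
party1_to_isni = {"P1-778": "ISNI-PLACEHOLDER-A", "P1-912": "ISNI-PLACEHOLDER-B"}
party2_to_isni = {"P2-0042": "ISNI-PLACEHOLDER-A", "P2-0099": "ISNI-PLACEHOLDER-C"}


def bridge(source_to_isni, target_to_isni):
    """Translate source-party IDs to target-party IDs via the shared ISNI."""
    isni_to_target = {isni: party_id for party_id, isni in target_to_isni.items()}
    return {
        party_id: isni_to_target[isni]
        for party_id, isni in source_to_isni.items()
        if isni in isni_to_target
    }


crosswalk = bridge(party1_to_isni, party2_to_isni)
```

Only entities that both parties have assigned to the same ISNI appear in the crosswalk; here that is the single pair "P1-778" and "P2-0042".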
Institutional Identifiers
JISC and CASRAI (Consortia Advancing Standards in Research Administration Information) report on Organisation IDs:
http://repository.jisc.ac.uk/5381/1/CC549D001-1.0_org_ID_landscape_study.pdf
- Examined the landscape of organizational identifiers in the UK and identified 23 different IDs
- Lots of detail on use cases for publishing, funders, and institutions
CASRAI report findings
- Disambiguating organizational information from multiple sources typically described as “a nightmare”
- Benefits from effective unique identifiers are truly realized when data is shared
- Key aspects of identifiers that support the widest range of uses:
  - Governance
  - Trust
  - Transparency
  - Temporal
  - Appropriate metadata
Where & When to Include IDs
- Adding them to existing records
- Embedding IDs as new records are created: make them a required data field
- Priority record sets?
- Existing workflows?
- Which IDs do you need?
- Create dedicated fields for selected IDs
In-House Options
- Use internal resources & personnel to join existing records to IDs or an authority file
- Build customized solutions mapping systems together; i.e. data loaders and transformation tools
- Improve data capture to require an ID upon record creation
- Manual vs. programmatic
- ORCID tools: http://support.orcid.org/
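Requiring an ID at record creation works best when the ID is also validated at capture. ORCID iDs end in a check character computed with ISO 7064 MOD 11-2, so a malformed or mistyped iD can be rejected before it reaches the database. The sketch below assumes a simple dictionary-based author record; create_author_record and its fields are invented for illustration.

```python
import re

# Four hyphen-separated groups; the final character may be a digit or "X".
ORCID_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$")


def orcid_check_character(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Check both the format and the check character of an ORCID iD."""
    if not ORCID_PATTERN.match(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_character(compact[:15]) == compact[15]


def create_author_record(name, orcid):
    """Hypothetical record constructor that treats the ORCID iD as required."""
    if not orcid or not is_valid_orcid(orcid):
        raise ValueError("a valid ORCID iD is required to create an author record")
    return {"name": name, "orcid": orcid}
```

ORCID's documented sample iD 0000-0002-1825-0097 passes this check, while changing its last character to 8 fails. A passing checksum only shows the iD is well formed; confirming that it belongs to the author needs an authenticated lookup against ORCID itself.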
Outsourcing Considerations
- Mapping data elements in your records to standard identifiers vs. data normalization services
- Normalizing against a standard taxonomy
- Computer mapping vs. manual process
How to build a linked supply chain
- Urge your vendors and partners to adopt identifiers
- Request dedicated data fields in any systems implementations
- Embed IDs in data exchange processes with your vendors and partners (e.g. subscription agents)
- Encourage authors and contributors to register with ORCID
Use Cases
- Identify can act as an authority file of institutions in any number of systems: editorial, MSS submissions, CRMs, financial, fulfilment, etc.
- Understand & analyze your customer base
- Analyze the wider market for opportunities
- Disambiguate institutions & find duplicate accounts
- Reveal institutional relationships with hierarchies
- Enhance customer records with Identify metadata
- Support pricing decisions & policies
The world of institutions from a publisher’s point of view
Identify Database: Catalogs & classifies institutions in the scholarly publishing supply chain…..
…organizes them into hierarchies (aka “family trees”)…
…and spans all industries, market segments, and regions.
- Academia
- Medical
- Not-for-profit
- Public libraries
- Corporate
- Government
- Publishers
- Funding bodies
- Intermediaries
More than 370,000 institutions and growing
Delivery & Access
- Access is enterprise wide: all divisions may use the complete array of Identify features and data
- Weekly data feed: direct feed of the complete Identify database for incorporation into your own data warehouse or systems
- Identify Online: Ringgold’s own web interface; may be accessed via username/password and IP addresses
- API: web service that accepts calls to Identify and returns selected data elements
Licensing terms
- Annual subscription: provides ongoing access to the Identify database. Upon cancellation, Ringgold Numbers and Ringgold Names may be retained; Ringgold will require deletion of all other Ringgold data from the customer’s systems.
- Perpetual-use licence: provides ownership of all of the data provided by Ringgold in the Identify database at time of purchase and archival rights to the data supplied. The annual maintenance fee covers the supply of a continuing data feed and ownership of the data held within. Upon cancellation, Ringgold will cease to provide the data feed.
Audit Service Turn your customer records from this….. …..into this.
Auditing is…
- A manual process, ideal for high-value records such as institutional subscribers
- Conducted by our team of 40 researchers, speaking more than 30 languages and expert in their assigned regions
- Delivers the following for each unique institution:
  - Unique Ringgold Identifier
  - Institutional hierarchy
  - Additional metadata
Audit Process
1. Receive files from client
2. Normalise data (de-duplication and auto-matching)
3. Data split into countries
4. Data assigned to appropriate country expert
5. Researcher checks and matches to Ringgold IDs, hierarchy etc.
6. Researcher creates new IDs for unidentified organizations
7. Data uploaded to Identify system
8. Client sent encrypted file via FTP with IDs and metadata
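The normalisation and country-split steps could look something like the sketch below. It is not a description of Ringgold's actual tooling; the record fields, the normalisation rules, and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def normalise_name(name):
    """Lower-case, drop full stops and commas, collapse whitespace."""
    return " ".join(name.lower().replace(".", " ").replace(",", " ").split())


def prepare_for_audit(records):
    """Group customer IDs by country, then by normalised institution name.

    Records that share a normalised name within a country are collapsed into
    one entry, so each distinct name is matched only once.
    """
    batches = defaultdict(lambda: defaultdict(list))
    for rec in records:
        country = rec["country"].strip().upper()
        batches[country][normalise_name(rec["name"])].append(rec["customer_id"])
    return {country: dict(names) for country, names in batches.items()}


# Invented sample input: two spellings of one customer plus one other record.
client_file = [
    {"customer_id": "C1", "name": "Univ. of Tours", "country": "FR"},
    {"customer_id": "C2", "name": "univ of tours", "country": "fr"},
    {"customer_id": "C3", "name": "York University", "country": "CA"},
]
audit_batches = prepare_for_audit(client_file)
```

With this input, the two French records fall under a single normalised name and the Canadian record forms its own batch, so each country batch can go to the relevant regional expert. Keeping the customer IDs alongside each name lets the matched identifier be written back to every original record.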
Deliverables & Fees
- Audit Files for Systems: intended for sequential upload into multiple data systems
- Audit Files for Humans: Excel files for direct analysis by any member of staff
- Identify Online incorporation: with an Identify subscription, you can see your accounts in a custom, secure view of Identify Online. View your accounts vs. the wider market for prospecting, penetration analysis, etc.
- Per-record fees apply
Audit Data

| Consortia Member Parent RIN | Consortia Member Parent Inst Name | Ringgold ID | Ringgold Inst Name | Customer ID | Customer Name | Product | Price | Format |
| 27003 | Universite de Caen Basse-Normandie | 27003 | Universite de Caen Basse-Normandie | 1008564 | Bibliotheque Univ. de Caen | Advances in Warp Speed Engine Efficiency | $1,230 | Print + Online |
| 27003 | Universite de Caen Basse-Normandie | 56820 | Universite de Caen Faculte de Medecine | 1151389 | UFR De Medecine De Caen | Enterprise-Wide Alls Package | $10,100 | Print + Online |
| 27015 | Universite Joseph Fourier | 27015 | Universite Joseph Fourier | 58596 | U Joe Fourier Bibliotheque | Enterprise-Wide Alls Package | $10,100 | Print + Online |
| 27015 | Universite Joseph Fourier | 72758 | Universite Joseph Fourier Faculte de Medecine de Grenoble | 1216879 | BU Gren1 Med Lot 2 | Journal of Interspecies Bioengineering | $4,500 | Print + Online |
| 27092 | Universite Francois-Rabelais de Tours | 27092 | Universite Francois-Rabelais de Tours | 332568 | Tour Univ Library | Enterprise-Wide Alls Package | $10,100 | Print + Online |
| 27092 | Universite Francois-Rabelais de Tours | 56555 | Universite de Tours Faculte de Medecine de Tours | 1303611 | Service De Documentation (I894) | Annals of Mind Meld Research | $1,600 | Print + Online |
| 27092 | Universite Francois-Rabelais de Tours | 56555 | Universite de Tours Faculte de Medecine de Tours | 484855 | Medical School, Univ of Tours | Journal of Interspecies Bioengineering | $3,995 | Online |
| 128791 | Aix-Marseille Universite | 128791 | Aix-Marseille Universite | 1037952 | Bibl. Univ Med Odontolgie | LeaderSHIP Quarterly | $970 | Print + Online |
| 128791 | Aix-Marseille Universite | 128791 | Aix-Marseille Universite | 889965 | Med Biblio - Aix | Journal of Interspecies Bioengineering | $3,995 | Online |
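Once customer records carry a Ringgold ID and a parent ID, questions such as total spend per parent institution, or which customer accounts point at the same institution, become simple group-by operations. The sketch below uses the IDs and prices from the sample audit data above; the tuple layout and function names are assumptions made for illustration.

```python
from collections import defaultdict

# (parent RIN, Ringgold ID, customer ID, price in USD), copied from the
# sample audit data.
AUDIT_ROWS = [
    (27003, 27003, 1008564, 1230),
    (27003, 56820, 1151389, 10100),
    (27015, 27015, 58596, 10100),
    (27015, 72758, 1216879, 4500),
    (27092, 27092, 332568, 10100),
    (27092, 56555, 1303611, 1600),
    (27092, 56555, 484855, 3995),
    (128791, 128791, 1037952, 970),
    (128791, 128791, 889965, 3995),
]


def spend_by_parent(rows):
    """Total price per consortia-member parent institution."""
    totals = defaultdict(int)
    for parent_rin, _ringgold_id, _customer_id, price in rows:
        totals[parent_rin] += price
    return dict(totals)


def accounts_sharing_an_id(rows):
    """Ringgold IDs that more than one customer account has been matched to."""
    by_id = defaultdict(list)
    for _parent_rin, ringgold_id, customer_id, _price in rows:
        by_id[ringgold_id].append(customer_id)
    return {rid: custs for rid, custs in by_id.items() if len(custs) > 1}


parent_totals = spend_by_parent(AUDIT_ROWS)
shared_ids = accounts_sharing_an_id(AUDIT_ROWS)
```

For the sample rows, the parent totals come to $11,330 for 27003, $14,600 for 27015, $15,695 for 27092, and $4,965 for 128791. Ringgold IDs 56555 and 128791 each have two customer accounts matched to them, which is the kind of overlap the customer names alone would not reveal.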
Beta Affiliation Matching Service
- Matches institutional affiliations in personal records to Identify
- Combines machine matching with manual processes; ideal for datasets such as members, authors, reviewers, etc.
- Fees are levied on a per-record basis
Validate
Validate enables Ringgold’s Identify customers to obtain Ringgold IDs immediately for institutions that are not yet held in the Identify database. Users search for an institution; if it does not appear to be in Identify, they can add it and obtain the Ringgold number immediately. Ringgold’s staff and researchers manually check all entries made in the Validate system.
How Validate works
1. User searches for an institution in Identify
2. Cannot find institution
3. Adds institution in Validate with required location information
4. Obtains new and unique Ringgold ID instantly
5. Researcher checks entry for duplication or mistakes
6. Researcher adds metadata for new records
7. Report sent back to publisher next day
8. Duplicate Ringgold IDs deleted
ProtoView
- A service that creates and disseminates book and e-book metadata on behalf of scholarly publishers
- Developed from a successful model as the next generation of services to meet the needs of an evolving market
- Guided by industry best practices and standards
- Built on the Book News, Inc. foundation and its 35 years of experience in providing promotional services for publishers
Upcoming Webinars
- Session 3: Lean and Mean: Publication Metadata to Enhance Discovery, Purchase and Use of Your Content. Wednesday, February 12. 60 minutes.
- Session 4: 30-Minute Workout: Quick Tips for Better Customer Data Health. Wednesday, February 26. 30 minutes.
Visit www.ringgold.com to see full descriptions & to register.
Jay Henry, Chief Marketing Officer: jay.henry@ringgold.com
Christine Orr, Sales Director: christine.Orr@ringgold.com
www.ringgold.com