Biodiversity literature mark-up Compelling use cases for Natural History Collections Dr Dimitris Koureas Natural History Museum London Workshop on mark-up.

Presentation transcript:

Biodiversity literature mark-up: Compelling use cases for Natural History Collections
Dr Dimitris Koureas, Natural History Museum London
Workshop on mark-up of biodiversity literature, Berlin, February 2014
DimitrisKoureas

Introduction
-Significant research effort has been invested
-Literature markup could have industry-wide applications with significant impact
-Who are the current stakeholders? Support from societal actors? What are the direct societal benefits?
-Natural History Museums can be key players (> 260 million specimens) but…
-So… we need to demonstrate compelling use cases that will engage stakeholders
Use case 1: Assisted label transcription
Use case 2: Measuring the impact of collections

Use case 1: Assisted label transcription
Legacy literature markup of specimen records can facilitate the label transcription process.
Digital NH Museums
-Digital is a strategic decision for NH museums (Challenge 1 in the Science strategy of NHM)
-Collection digitisation is prioritised in all major museums; NHM allocated c. £750k for the next three years (not including capital expenditure)
-Label transcription is important but challenging

Use case 1: Assisted label transcription
Different approaches for label transcription
-Manual transcription of label elements (curators, crowdsourcing)
-Manual transcription of semantic units in the label
-OCR/markup, (semi-)automatic
Hybrid models are currently in use.

Use case 1: Assisted label transcription
Current approaches for label transcription
-Suitable label for OCR and markup: high resolution, typewritten, well defined structure and semantic units
-Not suitable label for OCR: low resolution, handwritten, no proper structure

Use case 1: Assisted label transcription
Current approaches for label transcription: in-house and crowdsourcing
-Manual or semi-automated
-Slow and cost ineffective
-Not suitable for large collections
-Unpredictable outcome
-Data cleaning needed
We can enhance current approaches by introducing literature assisted transcription: use literature markup to identify specimen records and match them against the physical object.

Use case 1: Assisted label transcription
Label transcription: Don’t do the job twice! Most labels have already been transcribed in taxonomic literature.
Basic OCR output: a 1/Li I ) vi5 5, {L I‘O SPXFS \9.E " ‘: 3P~‘’‘fl\ % A HERB. ORPHANIDEUM. 3‘_‘w:a 3 PummI “lift u’ f9 ‘ A ‘-*’ /1i. _ I -}Z_,,_‘;_’:£€ Cg‘?! ~ <‘:.g‘{x ATHU 3638
Create a link between specimen and literature: catalogue number; published in 2012.
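A minimal Python sketch of how a noisy OCR result might be linked to a record already transcribed in the literature, using the catalogue number as the key. The index structure, the record fields, and the assumed shape of a catalogue number (an upper-case collection code followed by digits) are illustrative assumptions, not part of the presentation.

```python
import re

# Hypothetical index built from marked-up literature:
# catalogue number -> previously transcribed record (placeholder values).
LITERATURE_INDEX = {
    "ATHU 3638": {
        "source": "hypothetical marked-up article",
        "label_text": "placeholder transcription",
    },
}

# Assumption: a catalogue number is an upper-case collection code plus digits.
CATALOGUE_PATTERN = re.compile(r"\b([A-Z]{2,6})\s*(\d{1,7})\b")


def find_catalogue_numbers(ocr_text):
    """Return normalised catalogue numbers found in noisy OCR text."""
    return [f"{code} {number}" for code, number in CATALOGUE_PATTERN.findall(ocr_text)]


def match_label_to_literature(ocr_text, index):
    """Return (catalogue number, record) for the first number found in the index."""
    for catalogue_number in find_catalogue_numbers(ocr_text):
        record = index.get(catalogue_number)
        if record is not None:
            return catalogue_number, record
    return None


# Simplified excerpt of the OCR output above; most of the text is unusable,
# but the catalogue number survives and is enough to find the record.
noisy_ocr = "a 1/Li I ) vi5 5, HERB. ORPHANIDEUM. 3 PummI /1i. ATHU 3638"
match = match_label_to_literature(noisy_ocr, LITERATURE_INDEX)
```

The point of the sketch is that only the catalogue number has to be read reliably from the label; the rest of the transcription can be taken from the literature record.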

Use case 1: Assisted label transcription
Label transcription: Don’t do the job twice! Most labels have already been transcribed in taxonomic literature.
Literature assisted transcription
-Transcription of specimen labels has been crowdsourced for the last 250 years
-Minimal need for data cleaning
-Specimen data from small collections around the world
-Specimen labels transcribed several times

Use case 2: measuring NH collections impact
Natural History Collections
-Value in itself
-Value through utilisation
-Traditional activities of repositories: preservation, curation
-Data extraction, digitisation, openness
-Establishing through measuring the scientific and societal impact of collections
McAlpine (1986): 12.7% of papers used collections & 44.4% made collections

Use case 2: measuring NH collections impact
-Legacy literature markup
-Born digital literature
-Specimen identifiers
-Specimen metadata
-Specimen citation metrics webservice
-Collection assessment

Use case 2: measuring NH collections impact
Tracking specimen citations in literature can
1. Highlight important collections
2. Promote the value of smaller repositories
3. Steer digitisation efforts
4. Help in collection gap analysis
5. Attract more funding
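A rough Python sketch of how specimen citation counts might be aggregated once specimen identifiers have been extracted from marked-up literature. The input format, the article ids, the "DEMO" collection code, and the rule that the collection code is the first token of the identifier are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical markup output: one (article id, specimen identifier) pair per
# citation found in the literature. "DEMO 1" and the article ids are invented.
citations = [
    ("article-1", "ATHU 3638"),
    ("article-2", "ATHU 3638"),
    ("article-2", "ATHU 3638"),  # repeated mention within the same article
    ("article-3", "DEMO 1"),
]


def citation_metrics(pairs):
    """Count citing articles per specimen and per collection.

    A specimen mentioned several times in one article is counted once.
    Assumption: the collection code is the first token of the identifier.
    """
    unique_pairs = set(pairs)
    per_specimen = Counter(specimen for _, specimen in unique_pairs)
    per_collection = Counter(specimen.split()[0] for _, specimen in unique_pairs)
    return per_specimen, per_collection


per_specimen, per_collection = citation_metrics(citations)
most_cited_collection = per_collection.most_common(1)[0]
```

Counts of this kind, grouped by collection, are the sort of figure that could feed a collection assessment or help prioritise digitisation.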

Use case 2: measuring NH collections impact
The use of persistent identifiers would help NH collection curators to track the scientific impact of their collections, but there are some concerns:
-Tracking specimen records in literature means tracking references to physical objects
-DOIs could be the easiest way, BUT we cannot assign DOIs to physical objects unless museums quickly proceed to create comprehensive collection data portals and assign unique identifiers (UIs) to all records

Biodiversity literature mark-up: Beyond taxonomic names
Compelling use cases for Natural History Collections
Workshop on mark-up of biodiversity literature, Berlin, February 2014
Thank you