Supporting high-throughput digitisation workflows in EMu


1 Supporting high-throughput digitisation workflows in EMu
Rapid Data Entry: Supporting high-throughput digitisation workflows in EMu

Abstract: Over the next five years the Natural History Museum is embarking on a programme to digitise significant numbers of specimens with the aim of making the resulting data freely available online in the museum's Data Portal. In order to support this, a new highly customisable web-based platform, Rapid Data Entry (RDE), has been developed for the museum’s collections management system, EMu. RDE is composed of three parts: 1) a project creation and administration interface that allows data managers to create customisable web applications for specific digitisation projects that directly reference back-end fields in EMu; 2) forms for rapid data entry and editing that can be used on both desktop and mobile clients; 3) editors that support the normalisation of data. The general advantages and disadvantages of an alternative web interface to a traditional desktop client are discussed, along with potential new workflows such as specimen relocation via barcodes, condition checking, collections audit and other targeted data capture activities.

Laurence Livermore (1), Alex Fell (2), Muhammad Nadat (2), Andrew Brown (2) and Ben Sullivan (2)
(1) The Natural History Museum, London
(2) KE Software, an Axiell Group Company

2 The Digitisation Challenge
Increased government and public expectation
Aim to digitise 20 million specimens in 5 years
Current CMS has little provision for rapid data entry
Need new tools to support digitisation

Publicly funded museums face increased expectations from governments, research councils and the public to democratise science by being more open and transparent. Data gathering, its analysis and subsequent communication are going to become more important “business as usual” activities for natural history museums. So over the next five years the Natural History Museum is embarking on a programme to digitise significant numbers of specimens with the aim of making the resulting data freely available online in the museum's Data Portal. The NHM’s current collections management system, EMu, has a highly customised Windows-only desktop client with thousands of fields and over 100 tabs. This makes it quite complex to use and unsuitable for digitisers to rapidly enter data directly into the database.

3 Solving the problem – Rapid Data Entry (RDE)
Browser-based interface for KE EMu
Customisable “apps”
Support rapid data entry
Bulk record creation
Field validation
Normalise and atomise data
Project-based approach

In order to support rapid data entry we undertook a project to build web-based applications for EMu that were cross-platform and had the following functionality:
The ability to create our own customisable “apps”.
Apps would support rapid record creation and the ability to edit core metadata (i.e. a limited number of fields), supporting a “broad and thin” approach to data capture.
The ability to create new records in bulk.
The ability to reference and validate against existing database fields.
The ability to normalise and atomise data captured from rapid digitisation.
Support for a project-based approach, with user permissions and the ability to manage and monitor the progress of digitisation workflows that make use of the apps.

4 Project-based Digitisation
Managed by one or more “leads”
People may be members of more than one project
Project information stored in the collections database
Most projects will have multiple project-specific “apps”

Digitisation at the NHM will be project-based and this is reflected in RDE. Each project would be constructed by a data manager or equivalent. The apps in a project will then be used by core staff, volunteers (and potentially members of the public) who may be involved in multiple projects at the same time. Project information and the configuration of their apps will be stored in the collections database. Most projects will have multiple apps to perform specific functions, like transcribing label data or normalising particular types of verbatim data.

5 Project Dashboard
Permission dependent
Three “app” categories: Forms, Editors, Statistics
Multiple apps support various stages/components of digitisation

When a user logs in and selects a project, the first thing they see is the project dashboard. The dashboard lists the constituent apps that form the project. Which apps are displayed to a user is based on individual app permissions. Apps are categorised as one of:
Forms (which support manual data entry, bulk editing and scripted operations)
Editors (which allow users to normalise data, for example by merging verbatim transcribed data with master atomised records)
Statistics (providing project progress summaries)
One of the design specifications was to allow multiple apps in order to break projects into specific tasks. Some digitisation activities might focus on transcribing very selective data, or apps could be used by different people with different skills, e.g. volunteers, curators and georeferencers.
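The permission-dependent dashboard described above can be sketched roughly as follows. This is only an illustration of the idea, not RDE's implementation: the `App` class, the category names as strings, the user names and the `dashboard` function are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class App:
    """Hypothetical stand-in for one app within a digitisation project."""
    name: str
    category: str             # assumed values: "form", "editor", "statistics"
    allowed_users: frozenset   # users permitted to see this app

def dashboard(apps, user):
    """Group the apps this user may see under the three app categories."""
    grouped = {"form": [], "editor": [], "statistics": []}
    for app in apps:
        if user in app.allowed_users:
            grouped[app.category].append(app.name)
    return grouped

# Illustrative project: a volunteer sees fewer apps than the project lead.
project_apps = [
    App("Label transcription", "form", frozenset({"lead", "volunteer"})),
    App("Locality normalisation", "editor", frozenset({"lead"})),
    App("Progress", "statistics", frozenset({"lead", "volunteer"})),
]

volunteer_view = dashboard(project_apps, "volunteer")
lead_view = dashboard(project_apps, "lead")
```

Keeping permissions on each app, rather than only on the project, is what lets one project serve volunteers, curators and georeferencers with different task lists.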

6 Forms
Creates new records, including label transcription
Record sets can be filtered
Filtered records are offered to editors/transcribers randomly
Bulk editing and customised operations through scripts

Forms allow users to create new records, including the transcription of labels or registers, or add data to “stub” records. Sets of records can be filtered using criteria specified by the program lead. Filtered records are randomly offered to users for editing/transcription. Scripts can be used to build more specialised behaviour: each “Form app” can potentially be associated with a customisable backend script, which can be developed by in-house technical staff.
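As a rough sketch of the filter-then-offer behaviour described above: the record layout, field names and both functions below are assumptions made for illustration and do not reflect EMu's schema or RDE's actual code.

```python
import random

# Hypothetical records; field names are illustrative only.
records = [
    {"record_id": 1, "status": "stub", "collection": "botany"},
    {"record_id": 2, "status": "transcribed", "collection": "botany"},
    {"record_id": 3, "status": "stub", "collection": "entomology"},
    {"record_id": 4, "status": "stub", "collection": "botany"},
]

def filter_records(records, criteria):
    """Keep records whose fields match every key/value pair in criteria."""
    return [r for r in records
            if all(r.get(key) == value for key, value in criteria.items())]

def offer_record(records, criteria, rng=random):
    """Pick one matching record at random, or None if nothing matches."""
    matches = filter_records(records, criteria)
    return rng.choice(matches) if matches else None

# Criteria as a lead might specify them: untranscribed botany stubs.
lead_criteria = {"status": "stub", "collection": "botany"}
offered = offer_record(records, lead_criteria)
```

Offering records at random, rather than in a fixed order, makes it less likely that several transcribers working at once are handed the same record.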

7 Editors
Global updater
Resolve attachments
Apply consistency
More targeted than EMu global editor
Also created by project lead

8 Statistics
Simple reporting mechanism
Based on record status
Visualisation tool: bar chart, pie chart

Currently the report function is fairly simple and allows users to view progress based on a record status field. Each record can have a single record status (e.g. stub record, transcribed, normalised, label unreadable). This is something we are likely to change because, depending on the complexity of a project, a record may require multiple statuses. Currently the visualisation tools are either bar charts or pie charts. Future additions to the reporting tool may include automated notifications based on record status (e.g. flagging an error or problem that requires the attention of an expert).
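A progress report of this kind reduces to counting records per status. The following is a minimal sketch under the single-status assumption stated above; the `status` field name and the `status_summary` function are hypothetical.

```python
from collections import Counter

def status_summary(records):
    """Return {status: (count, percentage of all records)}."""
    counts = Counter(r["status"] for r in records)
    total = sum(counts.values())
    return {status: (n, 100.0 * n / total) for status, n in counts.items()}

# Illustrative project with four records.
sample = [
    {"status": "stub"},
    {"status": "stub"},
    {"status": "transcribed"},
    {"status": "normalised"},
]
summary = status_summary(sample)
```

The counts would feed a bar chart and the percentages a pie chart. If records later carry several statuses, the counter would need to iterate over a list of statuses per record instead.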

9 Project Creation & Administration
Browser-based configuration
Can reference any backend field
Permissions can be set per user on both projects and apps

Configuration of all apps is done within the browser by super users/data managers. Data entry forms are configured by the program lead with many field types (Single/Multi-valued, Validation, Data types, Mandatory, Suggested, Lookup lists, Auto suggest, Drop-down list, Attachments, Image display and capture), the ability to set display (labels, columns), and the ability to filter which records are displayed if, for example, the user will be transcribing data from images that have been bulk imported. Permissions for each project and its apps can be set on a user-by-user basis.
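To make a few of these field options concrete (mandatory, single/multi-valued, lookup lists), here is a small sketch of how a form definition might drive validation. The `FieldConfig` class, the back-end field names and the `validate` function are hypothetical and are not RDE's configuration format.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FieldConfig:
    """Hypothetical description of one field on a data entry form."""
    backend_field: str                   # assumed back-end field name
    label: str                           # label shown to the user
    mandatory: bool = False
    multi_valued: bool = False
    lookup: Optional[frozenset] = None   # allowed values, if a lookup list

def validate(form, values):
    """Check submitted values against the form definition; return errors."""
    errors = []
    for f in form:
        raw = values.get(f.backend_field)
        if isinstance(raw, list):
            items = raw
        else:
            items = [] if raw in (None, "") else [raw]
        if f.mandatory and not items:
            errors.append(f"{f.label}: required")
        if len(items) > 1 and not f.multi_valued:
            errors.append(f"{f.label}: single value expected")
        if f.lookup is not None:
            errors.extend(f"{f.label}: '{v}' not in lookup list"
                          for v in items if v not in f.lookup)
    return errors

form = [
    FieldConfig("Barcode", "Barcode", mandatory=True),
    FieldConfig("Collector", "Collector", multi_valued=True),
    FieldConfig("Country", "Country", lookup=frozenset({"Kenya", "Peru"})),
]

ok = validate(form, {"Barcode": "X1", "Collector": ["A", "B"], "Country": "Peru"})
bad = validate(form, {"Country": "Atlantis"})
```

Describing fields as data rather than code is what would allow a data manager to assemble a form in the browser without changes to the core client.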

10 Example Project Workflow – Botanical Sheets
Form 1 - Stub record creation from barcoded sheets
Form 2 - Transcription of localities and collectors
Editor 1 - Normalisation of localities
Editor 2 - Normalisation of collectors

11 Future RDE Development
UX/UI improvements (desktop/tablets)
Record navigation and management
Ongoing improvements for NHM’s digital collections programme
Support for non-digitisation activities
Statistics and reporting

12 Advantages & Disadvantages
+ Apps are very flexible
+ No client-side installation required
+ Display and customisation do not (necessarily) require core client modifications
+ Streamlined field selection allows for rapid data entry
+ Digitisation directly into the collections database means all data are in one place from creation through to
+ Normalisation tools within collections database
+ Support for mobile/tablet devices allows novel/unanticipated workflows
+/- Apps and record sets need to be configured by a Data Manager/super user
+/- Complex normalisation (of complex data) requires desktop client
- Requires WiFi in collections areas
- Mobile/tablets less suitable for typing
- Another system and interface to support and maintain

Digitisation directly into the collections database means records are immediately searchable and are in one place, viewable by curators and researchers, which is an advantage over using external tools/projects. Apps are very flexible: they can be built and skinned for different purposes (internal, external use). When do you use an app rather than the desktop client? Many museums are in old listed buildings, where installing new infrastructure is problematic and costly.

13 New Workflows
Applications outside of rapid digitisation
Specimen relocation & loans
Condition checking & collections audit
Data capture from visiting scientists
Crowdsourcing

Natural history collections have an array of tasks that collections staff and researchers perform on a daily basis. Thinking of novel applications of mobile technology and web-based interfaces could save time on multiple activities, for example: specimen relocation & loans, condition checking & collections audit, data capture from visiting scientists, and crowdsourcing.

Original photograph taken by John Cummings

14 Acknowledgements
Management and testing: Darrell Siebert, Annette Ure and testing staff (curators and data managers)
Software development: Alex Fell, Muhammad Nadat, Andrew Brown and Ben Sullivan (KE Software)

