Scratchpads: Virtual Research Environments for taxonomic and biodiversity related data. Dr. Vince Smith, Informatics Research Leader, The Natural History Museum.

Presentation transcript:

Scratchpads: Virtual Research Environments for taxonomic and biodiversity related data. Dr. Vince Smith, Informatics Research Leader, The Natural History Museum, London.

Where to find and how to cite this presentation: Smith, V.S., Koureas, D. & Livermore, L. Scratchpads introductory presentation. Slideshare.

Current taxonomic data production: publications based on countless specimens, images, maps, keys and datasets, typically generated by small communities for “local” research projects. Figure from Costello M.J. et al., doi: /science

However… Vast amounts of unpublished taxonomic “knowledge”: difficulty in publishing valuable datasets (i.a. local or regional Floras, Faunas). Published knowledge cannot easily be mobilised: not publicly accessible; lacks sufficient contextual metadata; published in formats that require time-consuming manual extraction.

On the other hand: estimates of 7.5 million species still undescribed [1]. [1] Mora C. et al. How Many Species Are There on Earth and in the Ocean? doi: /journal.pbio

Expected volume of taxonomic and biodiversity data: the need to extract, aggregate and link data on a global level.

The four nodes of the data cycle: 1. We collect and generate data. 2. We curate, link and structure data. 3. We analyse data. 4. We publish data.

The four nodes of the data cycle: Data collection & generation, Data curation, Data analysis, Data publishing. What are the bottlenecks in the workflow?

What we need is… a seamless workflow: Data collection & generation, Data curation, Data analysis, Data publishing.

To achieve this… “Link together evolutionary data … by developing analytical tools and proper documentation and then use this framework to conduct comparative analyses, studies of evolutionary process and biodiversity analyses” (Cyndy Parr, Rob Guralnick, Nico Cellinese and Rod Page. TREE. doi: /j.tree). This requires data, information & knowledge to be… Digital: not printed paper. Openly accessible: not behind barriers (e.g. paywalls). Linked-up: not in silos.

Scratchpads Virtual Research Environments: making taxonomy digital, open & linked

So… what are Scratchpads?

What are Scratchpads? Hosted websites for biodiversity data; a virtual research & publication platform; completely open access & open source; modular & flexible.

What are Scratchpads? A standardized environment for entering and curating data, which allows dissemination of research products and facilitates the development of online research communities through sharing and interlinking.

The Scratchpads concept: a Scratchpad is a website that holds data for you and your community. Your data; external data & services.

The Scratchpads concept

Examples of use: Taxa (classifications, taxon profiles, specimens, literature, images, maps, phenotypic, genotypic & morphometric datasets, keys, phylogenies); Conservation; Projects; Regions; Societies.

Examples of use: Red List conservation assessments

Bulbous monocot genera listed in CITES

Examples of use: Global Invasive Alien Species Information Partnership

Examples of use: Belgian Network for DNA Barcoding

Major integrated projects: online resource for monocot plants. Collaboration between Kew, Oxford University and NHM. Data to be open and usable by other scientists.

Major integrated projects: 21+ open community sites and growing. Over 45 internationally collaborating scientists. Site data feeds into a “Portal”. Site List:

Major integrated projects: retrieve information on any monocot plant. Rich downloadable data. Identification keys. Model example of linked attributed data. eMonocot Portal:

Are Scratchpads sustainable? 665 Scratchpads communities by 7,334 active registered users, covering 162,432 taxa in 735,660 pages. 65,000 unique visitors/month (chart: per month unique visitors to Scratchpads sites). In total more than 1,300,000 visitors. 81 paper citations in 2012.

Are Scratchpads sustainable? ViBRANT Virtual Biodiversity Research & & Other grants in the pipeline New Proposals

The main features

The main features: Classification. Term oriented system. Biological classifications; non-biological classifications; taxonomies; hierarchical controlled vocabularies.

The main features: Dynamic biological classifications. Manually entered or imported; auto generated.

The main features: Taxon pages. Overview of data related to a taxon; generated from tagged content.
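The idea of generating a taxon page from tagged content can be sketched roughly as below. This is only an illustrative sketch, not the Scratchpads implementation: the `CONTENT` records, their field names and the `build_taxon_pages` helper are all hypothetical, and the taxon names are placeholders.

```python
from collections import defaultdict

# Hypothetical content items; each lists the taxon names it is tagged with.
CONTENT = [
    {"type": "image", "title": "Habitus, dorsal view", "taxa": ["Pediculus humanus"]},
    {"type": "reference", "title": "Revision of the genus", "taxa": ["Pediculus", "Pediculus humanus"]},
    {"type": "specimen", "title": "Specimen record 001", "taxa": ["Pediculus humanus"]},
]


def build_taxon_pages(items):
    """Group content titles by taxon tag, then by content type."""
    pages = defaultdict(lambda: defaultdict(list))
    for item in items:
        for taxon in item["taxa"]:
            pages[taxon][item["type"]].append(item["title"])
    # Convert the nested defaultdicts into plain dicts for easier inspection.
    return {taxon: dict(by_type) for taxon, by_type in pages.items()}


pages = build_taxon_pages(CONTENT)
```

Here `pages["Pediculus humanus"]` collects the image, reference and specimen tagged with that name, which is the kind of overview a taxon page presents.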

The main features: Bibliography management. An inbuilt bibliography manager; faceted browsing; taxon tagging and free keywords; import from and export to all major formats.

The main features: Specimen/Observation data. Annotated full specimen/observation records; linked to images and georeferenced; linked to GenBank accession numbers.

The main features: Distribution maps. Google Maps based. Data layers: occurrence data, distribution data, TDWG regions, GBIF data.

The main features: Example regional distribution.

Create phylogenetic trees. Based on Newick/NeXML. Different views.
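For readers unfamiliar with Newick, the sketch below shows how such a string encodes a tree. It is a minimal parser written for illustration only (it assumes no whitespace, quoted labels or comments in the input) and says nothing about how Scratchpads itself reads trees; `parse_newick` and `leaf_names` are hypothetical helper names.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested dicts with name, length, children."""
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # consume "("
            children.append(parse_node())
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(parse_node())
            if text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ")"
        # Node label: everything up to the next structural character.
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        name = text[start:pos]
        # Optional branch length after ":".
        length = None
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    root = parse_node()
    if text[pos] != ";":
        raise ValueError(f"unexpected character at position {pos}")
    return root


def leaf_names(node):
    """Return the names of all terminal nodes, left to right."""
    if not node["children"]:
        return [node["name"]]
    names = []
    for child in node["children"]:
        names.extend(leaf_names(child))
    return names


tree = parse_newick("((A:0.1,B:0.2)AB:0.3,C:0.4);")
```

The nested structure returned here is enough to drive different views of the same tree, for example a list of tips via `leaf_names(tree)`.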

The main features: Character matrices and key construction. Quantitative or qualitative characters; auto generation of keys; taxon based matrices [specimen based character matrices].
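One way to picture the auto generation of a key from a taxon based character matrix is the sketch below. It is an assumption-laden illustration rather than the algorithm Scratchpads uses: the `MATRIX` data, the `build_key` helper and the greedy rule (split on the character with the most states) are hypothetical, and the matrix is assumed to be complete and qualitative.

```python
# Hypothetical taxon-by-character matrix with qualitative states.
MATRIX = {
    "Taxon A": {"wings": "present", "antennae": "long"},
    "Taxon B": {"wings": "present", "antennae": "short"},
    "Taxon C": {"wings": "absent", "antennae": "long"},
}


def build_key(matrix, taxa=None):
    """Build a nested key: {character: {state: sub-key or taxon name}}."""
    taxa = sorted(matrix) if taxa is None else taxa
    if len(taxa) == 1:
        return taxa[0]
    characters = sorted({c for taxon in taxa for c in matrix[taxon]})
    best = None
    for character in characters:
        # Partition the remaining taxa by their state for this character.
        groups = {}
        for taxon in taxa:
            groups.setdefault(matrix[taxon][character], []).append(taxon)
        # Keep the character that separates the taxa into the most groups.
        if len(groups) > 1 and (best is None or len(groups) > len(best[1])):
            best = (character, groups)
    if best is None:
        return taxa  # no character separates these taxa
    character, groups = best
    return {
        character: {
            state: build_key(matrix, members)
            for state, members in sorted(groups.items())
        }
    }


key = build_key(MATRIX)
```

With this data the key first asks about antennae, then about wings for the two long-antennae taxa.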

The main features: Media handling. Bulk upload; metadata (EXIF & Audubon Core); media galleries.

The main features: Generation of custom pages. Tagged or not; external RSS; Twitter feeds; media files.

The main features: Enhanced communication tools. Working groups; forums; blog entries; webforms; newsletters; RSS syndication; inbuilt comments.

The main features: Analytical tools. OBOE service, i.a. ecological informatics, phylogenetics, sequence alignment.

MCMC methods to estimate the posterior distribution of model parameters; phylogenies; sequence alignment; multiple sequence alignment; microsatellite repeats finder.

External services integration: data mobilisation; more on the way…

IUCN data integration

GBIF data integration

Help & Support: in-site support; wiki; training courses (12 in 2012); Ambassadors Programme; embedded issues queue; sandbox site.

Data publishing. A seamless workflow: Data collection & generation, Data curation, Data analysis, Data publishing.

The vision: helping researchers take credit for all research products.

Publication module

The Publication module Open-access journal The main features

What does the BDJ publish? Single taxon treatments and nomenclatural acts; local or regional checklists; sampling reports and occasional inventories; habitat-based checklists and inventories; ecological and biological observations of species and communities; single identification keys; biodiversity-related databases, including genomic, ecological and environmental data (data papers); biodiversity-related software tools.

How do Scratchpads and the BDJ interact?

Allow submission of datasets for publication without reformatting and restructuring. Working in a single environment based on a standardised XML schema.

Assembling a manuscript: work on multiple manuscripts; allocate different people to different manuscripts; handle permissions.

Assembling a manuscript: author names and affiliations. Data included in the manuscript in a structured annotated format.

Assembling a manuscript: taxon descriptions.

Assembling a manuscript: specimen data.

Figures and Tables

Supplementary files: select from existing or upload new.

Assembling a manuscript: references. Easily cite bibliography; auto compile list of references.

Assembling a manuscript: texts.

The publication module: author names and affiliations, taxon descriptions, specimen data, figures and tables, keys, supplementary files, references and texts are combined into XML.
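As a rough picture of how separate manuscript parts might end up in a single structured XML document, consider the sketch below. The element names (`manuscript`, `treatment`, `reference` and so on) and the `build_manuscript` helper are invented for illustration; they are not the schema used by the publication module or by the BDJ, and the sample values are placeholders.

```python
import xml.etree.ElementTree as ET


def build_manuscript(title, authors, taxon_descriptions, references):
    """Assemble manuscript parts into one XML tree (hypothetical element names)."""
    root = ET.Element("manuscript")
    ET.SubElement(root, "title").text = title

    # Author names and affiliations.
    authors_el = ET.SubElement(root, "authors")
    for name, affiliation in authors:
        author_el = ET.SubElement(authors_el, "author")
        ET.SubElement(author_el, "name").text = name
        ET.SubElement(author_el, "affiliation").text = affiliation

    # Taxon descriptions, each annotated with the taxon it refers to.
    treatments_el = ET.SubElement(root, "taxon_descriptions")
    for taxon, description in taxon_descriptions:
        treatment_el = ET.SubElement(treatments_el, "treatment", taxon=taxon)
        treatment_el.text = description

    # Reference list compiled from the cited items.
    refs_el = ET.SubElement(root, "references")
    for citation in references:
        ET.SubElement(refs_el, "reference").text = citation
    return root


manuscript = build_manuscript(
    title="Example checklist",
    authors=[("A. Author", "Example Institute")],
    taxon_descriptions=[("Example species", "Placeholder description.")],
    references=["Author A. Placeholder reference."],
)
xml_text = ET.tostring(manuscript, encoding="unicode")
```

Because every part stays in a labelled element, a receiving system could validate or extract the data without manual reformatting.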

Previewing your manuscript

Submission & enhanced peer review: manuscript data validation; one-click submission to BDJ; traditional peer review and optional panel/public review.

The workflow: SCRATCHPADS; XML submission; PENSOFT JOURNAL SYSTEM (PJS 2.0); MANUSCRIPT PUBLISHED (XML, PDF). Diagram labels: Community; Taxon names; Occurrence data; Datasets; Archive; Taxon treatments; Plazi; Wiki.

Scratchpads are an integrated system to enter, curate, mark up, link and publish data: the taxonomic workflow in a single virtual environment.

Acknowledgements. Scratchpads technical development: Vince Smith, Simon Rycroft, Ben Scott, Ed Baker, Alice Heaton, Katherine Boutton. Scratchpads outreach: Laurence Livermore, Isa van de Velde & Dimitris Koureas. e-Monocot: Paul Wilkin & the Kew team, Charles Godfray & the Oxford team. ViBRANT: Vince Smith, Dave Roberts & Lucy Reeve. Pensoft: Lyubomir Penev and the Pensoft team. Our users.

Thank you. Data collection & generation, Data curation, Data analysis, Data publishing.
