The challenge of biodiversity: Plot, organism and taxonomic databases Robert K. Peet University of North Carolina The National Plots Database Committee.

Presentation transcript:

The challenge of biodiversity: Plot, organism and taxonomic databases
Robert K. Peet, University of North Carolina
The National Plots Database Committee
John Harris, NCEAS

A case study: VegBank - The ESA Vegetation Plot Archive
Project supported by:
- National Center for Ecological Analysis & Synthesis
- U.S. National Science Foundation
- USGS-BRD Gap Analysis Program
- ABI / The Nature Conservancy
Project organized and directed by:
- Robert K. Peet, University of North Carolina
- Marilyn Walker, USDA Forest Service & U. Alaska
- Dennis Grossman, The Nature Conservancy / ABI
- Michael Jennings, USGS-BRD & UCSB

Biodiversity data structure
[Diagram. Elements: Observation/Collection Event, Object or specimen, Taxon, Locality. Database types: Taxonomic databases, Plot/Inventory databases, Specimen databases.]

Information flow in the US National Vegetation Classification
[Diagram. Labels: Web interface, Veg Classification Database, VegBank, Proposal, Raw Plot Data, Vegetation/Biodiversity, Taxonomic Database, Proposal.]

Taxonomic database challenge
The problem: Integration of data potentially representing different times, places, investigators and taxonomic standards.
The traditional solution: A standard list of kinds of organisms.

There exist numerous compilations of organism names. For example:
- Species 2000 http:// (composed of 18 participant databases)
- All Species http://
- ITIS http:// (the US government standard list, plus Canada & Mexico)
- Index to organism names

Taxon-specific standard lists are available. Representative examples for higher plants include:
North America / US
- USDA Plants http://plants.usda.gov/
- ITIS http://
- NatureServe http://
World
- IPNI (International Plant Names Index)
- IOPI Global Plant Checklist

Most standardized plant lists fail to allow effective integration of datasets. The reasons include:
- The user cannot reconstruct the database as viewed at an arbitrary time in the past.
- Taxonomic concepts are not defined (just lists).
- Multiple party perspectives on taxonomic concepts and names cannot be supported or reconciled.

Current standards
- Biological organisms are named following international rules of nomenclature.
- Database standards are being developed by TDWG, GBIF, IOPI, etc.
- Metadata standards have been developed. For example, the Darwin Core is a profile describing the minimum set of standards for search and retrieval of natural history collections and observation databases.

Three concepts of shagbark hickory
[Diagram. sec. Gleason 1952: Carya ovata (Miller) K. Koch. sec. Radford et al: Carya ovata (Miller) K. Koch and Carya carolinae-sept. (Ashe) Engler & Graebner.]
Splitting one species into two illustrates the ambiguity often associated with scientific names. If you encounter the name “Carya ovata (Miller) K. Koch” in a database, you cannot be sure which of two meanings applies.

Multiple concepts of Rhynchospora plumosa s.l.
[Diagram. Treatments compared: Elliot 1816, Gray 1834, Chapman 1860, Kral 1998, Peet 2002?. Taxon labels, in the order they appear: R. plumosa; R. plumosa v. intermedia; R. plumosa v. plumosa; R. intermedia; R. plumosa v. interrupta; R. pineticola; R. plumosa; R. sp. 1; R. plumosa v. plumosa; R. plumosa v. pineticola.]

[Diagram: Name, Reference, Assertion]
An assertion represents a unique combination of a name and a reference.
“Assertion” is equivalent to “potential taxon” and “taxonomic concept”.

Six shagbark hickory assertions
Names:
- Carya ovata
- Carya carolinae-septentrionalis
- Carya ovata v. australis
References:
- Gleason 1952 Britton & Brown
- Radford et al Flora Carolinas
- Stone 1997 Flora North America
Assertions (possible taxonomic synonyms are listed together):
(One shagbark)
- C. ovata sec Gleason ’52
- C. ovata (sl) sec FNA ’97
(Southern shagbark)
- C. carolinae-s. sec Radford ’68
- C. ovata v. australis sec FNA ’97
(Northern shagbark)
- C. ovata sec Radford ’68
- C. ovata (v. ovata) sec FNA ’97
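The six assertions above can be sketched as plain name-plus-reference pairs. This is a minimal illustration, not the VegBank implementation: the class name `Assertion`, the helper `groups_for_name`, and the reading of "C. ovata (sl)" as the name Carya ovata and "C. ovata (v. ovata)" as Carya ovata var. ovata are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Assertion:
    """A taxon concept: one name as used in (sec.) one reference."""
    name: str
    reference: str

    def label(self) -> str:
        return f"{self.name} sec. {self.reference}"


# Groups of possible taxonomic synonyms, following the slide.
# The full spellings of the names are an assumption (see lead-in).
SHAGBARK_GROUPS: Dict[str, List[Assertion]] = {
    "One shagbark": [
        Assertion("Carya ovata", "Gleason 1952"),
        Assertion("Carya ovata", "FNA 1997"),
    ],
    "Southern shagbark": [
        Assertion("Carya carolinae-septentrionalis", "Radford 1968"),
        Assertion("Carya ovata var. australis", "FNA 1997"),
    ],
    "Northern shagbark": [
        Assertion("Carya ovata", "Radford 1968"),
        Assertion("Carya ovata var. ovata", "FNA 1997"),
    ],
}


def groups_for_name(name: str) -> Dict[str, List[str]]:
    """Return every synonym group in which a bare name appears."""
    found: Dict[str, List[str]] = {}
    for group, assertions in SHAGBARK_GROUPS.items():
        labels = [a.label() for a in assertions if a.name == name]
        if labels:
            found[group] = labels
    return found
```

Calling `groups_for_name("Carya ovata")` returns entries in two different groups, which is the ambiguity the slides describe: the bare name alone does not say whether the broad or the narrow (northern) concept is meant.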

[Diagram: Name, Assertion, Usage]
A usage represents a unique combination of an assertion and a name.
Usages can be used to track nomenclatural synonyms.

Names:
1. Carya ovata
2. C. carolinae
3. C. ovata var. australis
Assertions:
A. ovata sec. Gleason
B. ovata sl sec. FNA
C. carolinae sec. Radford
D. ovata australis sec. FNA
E. ovata sec. Radford
F. ovata ovata sec. FNA
ITIS usage:
1-F OK
2-D OK
3-D Syn
ITIS views the linkage of the assertion “Carya ovata var. australis sec. FNA 1997” with the name “Carya ovata var. australis” as a nomenclatural synonym.
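A usage can be sketched as a small record linking a name to an assertion with a status. The sketch below encodes the three ITIS usages listed on the slide; the class name `Usage`, the helper functions, and the reading of "OK" as "accepted name for that assertion" are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Usage:
    """Links one name to one assertion, with a status ("OK" or "Syn")."""
    name: str
    assertion: str
    status: str


NAMES = {1: "Carya ovata", 2: "C. carolinae", 3: "C. ovata var. australis"}
ASSERTIONS = {
    "A": "ovata sec. Gleason",
    "B": "ovata sl sec. FNA",
    "C": "carolinae sec. Radford",
    "D": "ovata australis sec. FNA",
    "E": "ovata sec. Radford",
    "F": "ovata ovata sec. FNA",
}

# The ITIS usages from the slide: 1-F OK, 2-D OK, 3-D Syn.
ITIS_USAGES: List[Usage] = [
    Usage(NAMES[1], ASSERTIONS["F"], "OK"),
    Usage(NAMES[2], ASSERTIONS["D"], "OK"),
    Usage(NAMES[3], ASSERTIONS["D"], "Syn"),
]


def accepted_name(assertion: str, usages: List[Usage]) -> Optional[str]:
    """Name marked OK for the assertion, or None if there is none."""
    for usage in usages:
        if usage.assertion == assertion and usage.status == "OK":
            return usage.name
    return None


def synonyms(assertion: str, usages: List[Usage]) -> List[str]:
    """Names recorded as nomenclatural synonyms for the assertion."""
    return [u.name for u in usages
            if u.assertion == assertion and u.status == "Syn"]
```

With these records, assertion D resolves to the name "C. carolinae", while "C. ovata var. australis" is reported as its synonym, matching the slide.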

[Diagram: Name, Assertion, Usage, Reference]
A usage (name assignment) and assertion (taxon concept) can be combined in a single model.

Party Perspective
The Party Perspective on an Assertion includes:
- Status: Standard, Nonstandard, Undetermined.
- Correlation with other assertions: Equal, Greater, Lesser, Overlap, Undetermined.
- Lineage: Predecessor and Successor assertions.
- Start & Stop dates.

[Diagram. Parties: ITIS, FNA Committee, ABI. Assertions: Carya ovata sec Gleason 1952; Carya ovata (sl) sec FNA 1997; Carya ovata sec Radford 1968; Carya carolinae sec Radford 1968; Carya ovata (ovata) sec FNA 1997; Carya ovata australis sec FNA 1997.]

Party  Assertion           Status  Start  Name
ITIS   ovata – G52         NS      1996
ITIS   ovata – R68         St      1996   ovata
ITIS   carolinae – R68     St      1996   carolinae
ITIS   carolinae – R68     NS      2000
ITIS   ovata aust – FNA    St      2000   carolinae
ITIS   ovata – R68         NS      2000
ITIS   ovata ovata – FNA   St      2000   ovata
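The table of party perspectives lends itself to a time-specific query. The sketch below is a rough illustration under stated assumptions: "St" and "NS" are read as Standard and Nonstandard, a record is treated as in effect from its start year until the same party records a later one for the same assertion (the slide gives no stop dates), and the names `Perspective`, `status_as_of` and `standard_assertions` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Perspective:
    party: str
    assertion: str
    status: str            # "St" (Standard) or "NS" (Nonstandard), assumed
    start: int             # year the perspective took effect
    name: Optional[str] = None


# Rows transcribed from the slide (ASCII hyphens used in the keys).
ROWS: List[Perspective] = [
    Perspective("ITIS", "ovata - G52", "NS", 1996),
    Perspective("ITIS", "ovata - R68", "St", 1996, "ovata"),
    Perspective("ITIS", "carolinae - R68", "St", 1996, "carolinae"),
    Perspective("ITIS", "carolinae - R68", "NS", 2000),
    Perspective("ITIS", "ovata aust - FNA", "St", 2000, "carolinae"),
    Perspective("ITIS", "ovata - R68", "NS", 2000),
    Perspective("ITIS", "ovata ovata - FNA", "St", 2000, "ovata"),
]


def status_as_of(party: str, assertion: str, year: int) -> Optional[str]:
    """Status the party held for the assertion in the given year."""
    candidates = [r for r in ROWS
                  if r.party == party and r.assertion == assertion
                  and r.start <= year]
    if not candidates:
        return None  # the party had no recorded view yet
    return max(candidates, key=lambda r: r.start).status


def standard_assertions(party: str, year: int) -> List[str]:
    """Assertions the party treated as standard in the given year."""
    assertions = {r.assertion for r in ROWS if r.party == party}
    return sorted(a for a in assertions
                  if status_as_of(party, a, year) == "St")
```

Under these assumptions, `standard_assertions("ITIS", 1998)` yields the two Radford 1968 concepts and `standard_assertions("ITIS", 2001)` yields the two FNA varieties, which is the kind of "database as viewed at an arbitrary time in the past" that plain name lists cannot reconstruct.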

VegBank taxonomic data model

Concept-based taxonomy is coming!
- All organisms/specimens in databases should be identified by linkage to an assertion (= name and reference)!
- Various standards are being developed by FGDC, TDWG, IOPI, GBIF, etc.
- Most major databases are working toward inclusion of assertions (e.g. ITIS, IOPI, HDMS).
- Until standard assertion lists are available, databases that track organisms should include couplets containing both a scientific name and a reference.

(Inter)National Taxonomic Database?
- Concept-based
- Party-neutral
- Synonymy and lineage tracking
- Perfectly archived
An upgrade for ITIS & Species 2000?

Specimen/object databases
Information on specimens/objects should be tracked by reference to:
- Place (place or collection)
- Unique identifier (accession number)
- Time
A museum is a place.
Annotation should be by assertion (concept)!

Database systems for tracking specimens
The following are a few of the many available:
- BioLink
- Specify
- Biota
- Taxis
TDWG maintains links to multiple software systems.

Plots Database Systems
Several plot database systems are available. Among the best known and most widely used are:
- TurboVeg (over 1,000,000 plots stored using TurboVeg)
- Plots (ABI NPS Mapping Project)

A vegetation plot archive?
There is currently no standard repository for plot data. A repository is needed for:
- Plot storage
- Plot access and identification
- Plot documentation in literature/databases
This would be equivalent to GenBank for vegetation science.

Core elements of the VegBank
[Diagram. Labels: Project, Plot, Observation, Taxon Observation, Taxon Interpretation, Plot Interpretation.]

Support multiple interpretations of which concept applies to an organism or community.
Various observers will associate different taxonomic concepts with records in a database. Provision must be made for inclusion of these taxonomic interpretations. Minimal attributes include:
- Concept applied
- Date applied
- Who made the interpretation
- Links to supporting information
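The minimal attributes listed above map naturally onto a record type that is appended to, never overwritten. The following is a sketch only: the class and field names are hypothetical, and the observers, dates and concept labels in the usage note are invented for illustration rather than taken from VegBank.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Interpretation:
    concept: str                  # concept applied, e.g. "name sec. reference"
    applied_on: date              # date applied
    interpreter: str              # who made the interpretation
    links: Tuple[str, ...] = ()   # links to supporting information


@dataclass
class TaxonObservation:
    """What was recorded in the field, plus every later interpretation."""
    original_label: str
    interpretations: List[Interpretation] = field(default_factory=list)

    def annotate(self, interpretation: Interpretation) -> None:
        # Earlier interpretations are kept; new ones are only appended.
        self.interpretations.append(interpretation)

    def latest(self) -> Optional[Interpretation]:
        if not self.interpretations:
            return None
        return max(self.interpretations, key=lambda i: i.applied_on)

    def by(self, interpreter: str) -> List[Interpretation]:
        return [i for i in self.interpretations
                if i.interpreter == interpreter]
```

For example, an observation labeled "Carya ovata" could be annotated by a hypothetical "Observer A" with "Carya ovata sec. Gleason 1952" and later by "Observer B" with "Carya ovata sec. Radford 1968"; both records survive, and `latest()` returns whichever carries the more recent date.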

Interface tools
- Desktop client for data preparation and local use.
- Loaders for legacy data.
- Flexible data import.
- Tools for linking to taxonomic and community concepts.
- Standard query, flexible query, SQL query.
- Flexible data export.
- Local data refresh.
- Easy web access with consistent interface.

Conclusions for database designers
1. Records of organisms should always contain (or point to) couplets consisting of a scientific name and a reference where the name was used.
2. Design for future annotation of organism concepts.
3. Track specimens/objects by location, unique identifier & time.
4. Design for reobservation. Separate permanent from transient attributes.
5. Archival databases should provide multiple or continuous time-specific views.
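Several of these conclusions can be illustrated together in one small sketch: name-plus-reference couplets (point 1), a plot whose permanent attributes are stored once while each visit is a separate observation (point 4), and a simple time-specific view (point 5). All class names, field names and sample values below are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TaxonRecord:
    """Couplet: a scientific name plus the reference where it was used."""
    scientific_name: str
    reference: str


@dataclass(frozen=True)
class PlotObservation:
    """Transient attributes: what was seen on one visit."""
    observed_on: date
    records: Tuple[TaxonRecord, ...]


@dataclass
class Plot:
    """Permanent attributes: identity and location do not change."""
    plot_id: str
    latitude: float
    longitude: float
    observations: List[PlotObservation] = field(default_factory=list)

    def add_observation(self, observation: PlotObservation) -> None:
        self.observations.append(observation)
        self.observations.sort(key=lambda o: o.observed_on)

    def view_as_of(self, when: date) -> Optional[PlotObservation]:
        """Most recent observation made on or before the given date."""
        earlier = [o for o in self.observations if o.observed_on <= when]
        return earlier[-1] if earlier else None
```

A plot resampled in, say, 1990 and 2000 then keeps both visits, and `view_as_of(date(1995, 1, 1))` returns the 1990 observation rather than the latest one.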