Global Biodiversity Information Facility GLOBAL BIODIVERSITY INFORMATION FACILITY Larry Speers Global Biodiversity Information Facility Arthur Chapman.


1 Developing Uncertainty Measures Related to Taxonomic Determinations. Larry Speers, Global Biodiversity Information Facility; Arthur Chapman, Australian Biodiversity Information Services. WWW.GBIF.ORG

2 Disclaimers
• I intend to draw attention to a problem for users with some GBIF data
• I do not intend to present any finalized recommendations as to how to deal with this issue
• I hope to initiate a broader discussion as to possible solutions, and I will present an example solution to initiate this discussion

3 http://www.gbif.org/prog/digit/data_quality/data_quality

4 http://www.gbif.org/prog/digit/data_quality/data_cleaning

5 Issues with QA/QC
• Legacy data
  – Need to deal with what we have
  – Data cleaning tools
• New data
  – Do everything in our power to avoid the problems we find with today’s legacy data

6 Data Quality: Quality, as applied to data, has various definitions, but in the geographic world one definition is now largely accepted: that of “fitness for use” (Chrisman 1983).

7 Fitness for Use: In a database, the data have no actual quality or value; they only have potential value. That value is realized only when someone uses the data to do something useful (English 1999). The quality of data cannot be assessed independently of the users of that data (Strong et al. 1997).

8 What do we mean by “fitness for use”?
– Does species ‘x’ occur in Tasmania?
– Does species ‘x’ occur in National Park ‘y’?
(Diagram compliments of Arthur Chapman)

9 Fitness for use: Data are of high quality if they are fit for their intended use in operations, decision-making, and planning (Juran 1964).

10 [Figure: Exploring biodiversity data. Axes: Taxonomy, Geography, Time.] Organisation of biodiversity data: 1. By taxonomy 2. By geography 3. By time

11

12 J. Wieczorek et al. INT. J. GEOGRAPHICAL INFORMATION SCIENCE VOL. 18, NO. 8, DECEMBER 2004, 745–767

13 Arthur D. Chapman et al. 2006

14 [Figure: Exploring biodiversity data. Axes: Taxonomy, Geography, Time.] Organisation of biodiversity data: 1. By taxonomy 2. By geography 3. By time

15 Documenting Fitness for Use
• In general, error must not be treated as a potentially embarrassing inconvenience, because error or uncertainty provides a critical component in judging fitness for use.

16

17 Problem: Misidentification. “During the revision of Euscelidia, a frightening proportion of the borrowed “determined” material was found to be misidentified (62–73%), and a literature search in a BIOSIS Previews revealed that the problem is widespread.” Meier & Dikow, Conservation Biology, Volume 18, No. 2, April 2004, Pages 478–488

18 Problem: Misidentification. “For example, of the 1522 rove beetle specimens (Staphylinidae: Coleoptera) in the Struve collection 262 (17%) were misidentified (Rose 2000), and Papp (1978) reports that for a collection of Hungarian Lauxaniidae (Diptera) 28 of the 74 species determined and labeled by Szilády were consistently misidentified.” Meier & Dikow, Conservation Biology, Volume 18, No. 2, April 2004, Pages 478–488

19 Problem: Use of Invalid Names. “In Euscelidia 13% of all borrowed specimens were classified under an incorrect name, and for a recent inventory of palm collections in botanical gardens, 260 (22%) of the submitted 1208 names were synonyms and 46 (4%) were invalid (Maunder et al. 2001).” Meier & Dikow, Conservation Biology, Volume 18, No. 2, April 2004, Pages 478–488

20 [Figure: Exploring biodiversity data. Axes: Taxonomy, Geography, Time.] Organisation of biodiversity data: 1. By taxonomy 2. By geography 3. By time

21 Documenting Taxonomic Determinations
• Several methods exist for documenting taxonomic determinations; none are completely satisfactory
  – Herbarium Information Standards and Protocols for the Interchange of Data (HISPID)
  – Australian National Fish Collection (1993)
  – Several others restricted to one or two institutions
• Proposal: four levels
  – Who determined the specimen and when
  – What the determination was based on (type specimen, local flora, monograph, etc.)
  – Level of expertise of the determiner
  – What confidence the determiner had in the determination
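
As a minimal sketch, the four proposed levels above could be stored alongside each specimen record roughly as follows. The class name, field names, and the `is_fully_documented` helper are hypothetical; they are not taken from HISPID, the Australian National Fish Collection scheme, or any GBIF schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Determination:
    """One taxonomic determination of a specimen (hypothetical field names)."""
    scientific_name: str
    determiner: str                # level 1: who determined the specimen
    determined_on: Optional[date]  # level 1: and when (often unknown for legacy data)
    basis: str                     # level 2: e.g. "type specimen", "local flora", "monograph"
    expertise: str                 # level 3: level of expertise of the determiner
    confidence: str                # level 4: determiner's confidence in the determination


def is_fully_documented(d: Determination) -> bool:
    """Return True only when all four proposed levels are filled in."""
    return all([d.determiner, d.determined_on, d.basis, d.expertise, d.confidence])
```

A legacy record with an empty determiner or a missing date would return `False` here, which is one simple way to flag determinations whose fitness for use cannot yet be judged.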

22 Taxon Verification Status (proposed). From: Chapman (2005) Principles of Data Quality. GBIF. Name of determiner:

23 Issues with QA/QC
• Legacy data
  – Need to deal with what we have
  – Data cleaning tools
• New data
  – Do everything in our power to avoid the problems we find with today’s legacy data

24 Taxon Verification Status (proposed)
• identified by World expert in the taxon with high certainty
• identified by World expert in the taxon with reasonable certainty
• identified by World expert in the taxon with some doubt
• identified by regional expert in the taxon with high certainty
• identified by regional expert in the taxon with reasonable certainty
• identified by regional expert in the taxon with some doubt
• identified by non-expert in the taxon with high certainty
• identified by non-expert in the taxon with reasonable certainty
• identified by non-expert in the taxon with some doubt
• identified by the collector with high certainty
• identified by the collector with reasonable certainty
• identified by the collector with some doubt
From: Chapman (2005) Principles of Data Quality. GBIF.
Name of determiner:
Date of determination:
Basis of determination: (e.g. compared with holotype, used national flora)
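
The twelve proposed status values are the combination of four determiner categories and three certainty levels. A short sketch of generating that controlled vocabulary, rather than typing each value by hand, is given here. The constant and function names are illustrative assumptions; only the wording of the categories comes from the slide.

```python
from itertools import product
from typing import List

# Determiner categories and certainty levels, worded as on the slide.
DETERMINERS = (
    "World expert in the taxon",
    "regional expert in the taxon",
    "non-expert in the taxon",
    "the collector",
)
CERTAINTIES = ("high certainty", "reasonable certainty", "some doubt")


def verification_statuses() -> List[str]:
    """Build every determiner/certainty combination in the slide's order."""
    return [
        f"identified by {who} with {how}"
        for who, how in product(DETERMINERS, CERTAINTIES)
    ]


def is_valid_status(value: str) -> bool:
    """Check a recorded status against the generated vocabulary."""
    return value in verification_statuses()
```

Keeping the two axes separate also means a database could store expertise and certainty as two fields and derive the combined label only for display.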

25 Where does this discussion fit within the TDWG process?

