iDigBio is funded by a grant from the National Science Foundation’s Advancing Digitization of Biodiversity Collections Program (Cooperative Agreement EF-1115210).


1 iDigBio is funded by a grant from the National Science Foundation’s Advancing Digitization of Biodiversity Collections Program (Cooperative Agreement EF-1115210). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Standards and sharing complex primary biodiversity data; and what is an extension anyway?
Example extensions to DwC: Audubon Media Description (AC), Identification History, and, briefly, the Global Genome Biodiversity Network (GGBN) extensions
Deb Paul, Laura Russell, Derek Masaki, (David Shorthouse)
Data Sharing, Data Standards, and Demystifying the IPT Workshop – Day 1

2 Overview
– The data landscape (silos) (highlighted)
– standards (highlighted)
– What are extensions, and why does Darwin Core need extensions?
– One-to-many relationships
– Identifiers are the key
– Audubon Media Description
– Darwin Core Identification History
– Global Genome Biodiversity Network (GGBN)

3 [Figure: US National Science Foundation Dimensions of Biodiversity Program. Interaction at the intersection of taxonomic, genetic, functional domains. Labels: Genetic; Functional; Taxonomic/phylogenetic; Molecular -> Ecosystem; Tree of Life, phylogenomics; Organisms -> species; Phenotypic expression; Bioactive compounds/chemistry; Trophic interactions]

4 standards

5 1E: Theory: Complex primary biodiversity data
DwC does not provide fields for every possible type of data. But you have lots of other types of data, right?
Introducing the extension
– http://tools.gbif.org/dwca-validator/extensions.do
– There are many! And (no doubt) more to come: 22 registered, 23 under development.
Examples
– Audubon Media Description (aka Audubon Core)
– Darwin Core Identification History
– Global Genome Biodiversity Network (GGBN) extensions
iDigBio Uses

6 Overview
– The data landscape (silos)
– standards
– What are extensions, and why does Darwin Core need extensions? (highlighted)
– One-to-many relationships (highlighted)
– Identifiers are the key (highlighted)
– Audubon Media Description
– Darwin Core Identification History
– Global Genome Biodiversity Network (GGBN)

7 One specimen, so many kinds of data
Determination 1, Determination 2, Determination 3
One-to-many relationships
Identifiers are the key
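A small sketch may make the one-to-many idea concrete: one specimen row, several determination rows, all tied together by the specimen's identifier. The records, identifier values, and the helper name group_by_core_id below are hypothetical and only illustrate the pattern.

```python
from collections import defaultdict

# Hypothetical core rows: one row per specimen, each with a unique identifier.
occurrences = [
    {"occurrenceID": "urn:example:spec-1", "catalogNumber": "A-001"},
    {"occurrenceID": "urn:example:spec-2", "catalogNumber": "A-002"},
]

# Hypothetical extension rows: many determinations can point at one specimen.
determinations = [
    {"coreid": "urn:example:spec-1", "scientificName": "Rosa canina"},
    {"coreid": "urn:example:spec-1", "scientificName": "Rosa rubiginosa"},
    {"coreid": "urn:example:spec-2", "scientificName": "Quercus alba"},
]


def group_by_core_id(rows):
    """Collect extension rows under the identifier of the core row they describe."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["coreid"]].append(row)
    return dict(grouped)


by_specimen = group_by_core_id(determinations)
for occ in occurrences:
    names = [d["scientificName"] for d in by_specimen.get(occ["occurrenceID"], [])]
    print(occ["catalogNumber"], names)
```

Without a stable identifier on the specimen row, there is nothing for the determination rows to point at, which is why identifiers are the key.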

8 Using the IPT software: inside the DwC-A you are creating…
– sampleoccurrence.txt (core)
– samplemultimedia.txt (extension): extends the core
– sampledeterminations.txt (extension): Determination 1, Determination 2, … Determination n; extends the core
– meta.xml: XML that describes the data files
– eml.xml: XML metadata about the dataset, collection and contact information
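To see how meta.xml "describes" the other files, here is a minimal sketch that reads a hand-written descriptor and lists the core and extension files. The file names come from the slide; the descriptor content, the column layout, and the rowType URIs are illustrative assumptions, not the output of any particular IPT installation.

```python
import xml.etree.ElementTree as ET

# Namespace used by Darwin Core Archive descriptors.
NS = {"dwca": "http://rs.tdwg.org/dwc/text/"}

# Hypothetical meta.xml; rowType values and field columns are assumptions.
META_XML = """<archive xmlns="http://rs.tdwg.org/dwc/text/" metadata="eml.xml">
  <core rowType="http://rs.tdwg.org/dwc/terms/Occurrence" ignoreHeaderLines="1">
    <files><location>sampleoccurrence.txt</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/catalogNumber"/>
  </core>
  <extension rowType="http://rs.tdwg.org/ac/terms/Multimedia" ignoreHeaderLines="1">
    <files><location>samplemultimedia.txt</location></files>
    <coreid index="0"/>
    <field index="1" term="http://rs.tdwg.org/ac/terms/accessURI"/>
  </extension>
  <extension rowType="http://rs.tdwg.org/dwc/terms/Identification" ignoreHeaderLines="1">
    <files><location>sampledeterminations.txt</location></files>
    <coreid index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
  </extension>
</archive>"""


def describe_archive(meta_xml):
    """Summarize which file is the core, which are extensions, and their id columns."""
    root = ET.fromstring(meta_xml)

    def entry(element, id_tag):
        return {
            "file": element.find("dwca:files/dwca:location", NS).text,
            "rowType": element.get("rowType"),
            "idColumn": int(element.find(f"dwca:{id_tag}", NS).get("index")),
        }

    return {
        "metadata": root.get("metadata"),
        "core": entry(root.find("dwca:core", NS), "id"),
        "extensions": [entry(e, "coreid") for e in root.findall("dwca:extension", NS)],
    }


layout = describe_archive(META_XML)
print(layout["core"]["file"], [e["file"] for e in layout["extensions"]])
```

The core declares an id column and each extension declares a coreid column; that pairing is what lets an extension row "extend" a core row.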

9 Overview
– The data landscape (silos)
– standards
– What are extensions, and why does Darwin Core need extensions?
– One-to-many relationships
– Identifiers are the key
– Audubon Media Description (highlighted)
– Darwin Core Identification History
– Global Genome Biodiversity Network (GGBN)

10 Audubon Media Description
Sharing media
– What’s in the image, recording, video?
– Who took the photo, made the recording, created the SEM, CT scan?
– Is the media under copyright? Or is it public domain?
– Where can more / different formats of the media resource be found?
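The four questions above can be sketched as a simple completeness check on a media record. The pairing of each question with particular Audubon Core terms is an assumption made for illustration (check it against the Audubon Core term list), and the sample record and the function name unanswered_questions are hypothetical.

```python
# Assumed pairing of the slide's questions with Audubon Core terms.
QUESTION_TERMS = {
    "What is in the media?": ["dcterms:title", "dcterms:description", "ac:caption"],
    "Who created it?": ["dc:creator", "dcterms:creator"],
    "What are the rights?": ["dc:rights", "dcterms:rights", "xmpRights:UsageTerms"],
    "Where can it be found?": ["ac:accessURI"],
}

# Hypothetical media record; note that no rights information is supplied.
media_record = {
    "dcterms:identifier": "urn:example:media-17",
    "dc:type": "StillImage",
    "dcterms:title": "Herbarium sheet, specimen A-001",
    "dc:creator": "J. Doe",
    "ac:accessURI": "https://example.org/media/17.jpg",
}


def unanswered_questions(record):
    """Return the questions for which the record has no non-empty term."""
    return [
        question
        for question, terms in QUESTION_TERMS.items()
        if not any(record.get(term) for term in terms)
    ]


print(unanswered_questions(media_record))
```

For this sample record the only question left unanswered is the one about rights, which is exactly the kind of gap a data user would want to know about before acquiring the media.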

11

12 Audubon Media Vocabularies
– Management
– Attribution
– Agents
– Content Coverage
– Geography
– Temporal Coverage
– Taxonomic Coverage
– Resource Creation
– Related Resources
– Service Access Point

13 Audubon Media Description, aka the Audubon Core
– Vocabularies to represent metadata for biodiversity multimedia resources and collections.
– Purpose is to represent information that will help to determine whether a particular resource or collection is fit for some particular biodiversity science application before acquiring the media.
– Vocabularies address such concerns as the management of the media and collections, descriptions of their content, their taxonomic, geographic, and temporal coverage, and the appropriate ways to retrieve, attribute and reproduce them.
Link: http://terms.tdwg.org/wiki/Audubon_Core_Term_List

14 Audubon Media Description
An example of an extension inside the IPT
Got media for your specimens?

15 Overview
– The data landscape (silos)
– standards
– What are extensions, and why does Darwin Core need extensions?
– One-to-many relationships
– Identifiers are the key
– Audubon Media Description
– Darwin Core Identification History (highlighted)
– Global Genome Biodiversity Network (GGBN)
"What’s in a name? That which we call a rose / By any other word would smell as sweet." Romeo and Juliet, Act 2, Scene 2

16 Darwin Core Identification History
Identification histories: sharing the names applied to a given specimen through time
– Who applied the name? (identifiedBy)
– When? (dateIdentified)
– With what evidence, resource, or comments? (identificationRemarks, identificationReferences)
– Doubt expressed? (identificationQualifier: cf., near, ?)
– Exact name applied (scientificName)

17

18 Darwin Core Identification History
Support for multiple identifications/determinations of species occurrences such as specimens. All identifications including the current one should be listed, while the current should also be repeated in the occurrence core for simple access.
Identification: identificationID, identifiedBy, dateIdentified, identificationReferences, identificationRemarks, identificationQualifier, identificationVerificationStatus, typeStatus
Taxon: taxonID, taxonConceptID, scientificName, scientificNameID, namePublishedIn, namePublishedInYear, …, higherClassification, kingdom, …, vernacularName, taxonRemarks, …
Determination 1, Determination 2, … Determination n
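The rule "list every identification, and repeat the current one in the occurrence core" can be sketched as follows. Treating the identification with the latest dateIdentified as the current one is an assumption made for this example; a collection may track the current determination in some other way. The records, the CORE_FIELDS selection, and both function names are hypothetical.

```python
# Fields copied from the current identification into the core row (an assumed selection).
CORE_FIELDS = ("scientificName", "identifiedBy", "dateIdentified", "identificationQualifier")


def current_identification(identifications):
    """Pick the identification with the latest dateIdentified (ISO 8601 date strings)."""
    dated = [i for i in identifications if i.get("dateIdentified")]
    if not dated:
        return None
    # ISO 8601 dates of equal precision sort correctly as plain strings.
    return max(dated, key=lambda i: i["dateIdentified"])


def repeat_current_in_core(occurrence, identifications):
    """Return a copy of the core row with the current identification's fields filled in."""
    updated = dict(occurrence)
    current = current_identification(identifications)
    if current is not None:
        for field in CORE_FIELDS:
            if field in current:
                updated[field] = current[field]
    return updated


history = [
    {"identifiedBy": "A. Smith", "dateIdentified": "1998-05-02", "scientificName": "Rosa canina"},
    {"identifiedBy": "B. Jones", "dateIdentified": "2011-09-17", "scientificName": "Rosa rubiginosa",
     "identificationQualifier": "cf."},
]
core_row = repeat_current_in_core({"occurrenceID": "urn:example:spec-1"}, history)
print(core_row["scientificName"], core_row["identifiedBy"])
```

The full history stays in the extension file; the core row only carries a convenience copy of the current name.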

19 Darwin Core Identification History

20 Overview
– The data landscape (silos)
– standards
– What are extensions, and why does Darwin Core need extensions?
– One-to-many relationships
– Identifiers are the key
– Audubon Media Description
– Darwin Core Identification History
– Global Genome Biodiversity Network (GGBN) (highlighted)

21 Global Genome Biodiversity Network (GGBN) extensions: require a Material Sample Core
Material Sample: The category of information pertaining to the physical results of a sampling (or subsampling) event. In biological collections, the material sample is typically collected, and either preserved or destructively processed. Created 2 Apr 2014 with all Simple Darwin Core ratified terms.
– GGBN Amplification Extension
– GGBN DNA Cloning Extension
– GGBN Gel Image Extension
– GGBN Loan Extension
– GGBN Material Sample Extension
– GGBN Permit Extension
– GGBN Preparation Extension
– GGBN Preservation Extension

22 Now, back to the DwC-A you are creating…
– sampleoccurrence.txt (core)
– samplemultimedia.txt (extension): extends the core
– sampledeterminations.txt (extension): Determination 1, Determination 2, … Determination n; extends the core
– meta.xml: XML that describes the data files
– eml.xml: XML metadata about the dataset, collection and contact information

23 Global Standards
Why standards?

24 Why Darwin Core / Why Standards?
[Image: http://www.britishmuseum.org/images/rosettawriting384.jpg]
My database, your database: map to a standard!
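"Map to a standard" can be illustrated with two databases that name the same things differently. The local column names and sample values below are invented for this sketch; the target names (catalogNumber, scientificName, recordedBy, eventDate) are Darwin Core terms.

```python
# Hypothetical local column names from two different collection databases.
MY_DB_TO_DWC = {
    "cat_no": "catalogNumber",
    "sci_name": "scientificName",
    "collector": "recordedBy",
    "coll_date": "eventDate",
}
YOUR_DB_TO_DWC = {
    "CatalogNum": "catalogNumber",
    "Taxon": "scientificName",
    "CollectedBy": "recordedBy",
    "DateCollected": "eventDate",
}


def to_darwin_core(record, mapping):
    """Rename local columns to Darwin Core terms; unmapped columns are left out."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}


mine = {"cat_no": "A-001", "sci_name": "Rosa canina", "collector": "J. Doe",
        "coll_date": "1998-05-02", "shelf": "12B"}
yours = {"CatalogNum": "A-001", "Taxon": "Rosa canina", "CollectedBy": "J. Doe",
         "DateCollected": "1998-05-02"}

# Once both are mapped, the two records become directly comparable.
print(to_darwin_core(mine, MY_DB_TO_DWC) == to_darwin_core(yours, YOUR_DB_TO_DWC))
```

Each database keeps its own internal names; only the shared export uses the standard terms, so nobody has to learn anyone else's schema.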

25 1E: Theory: DC Extensions
– Defined and registered with the GBIF registry
– Allow extension while retaining compatibility
– Extensions are optional data files linked to the core
– A row in an extension file always references the core id corresponding to a taxon or taxon occurrence
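The last point, that every extension row references a core id, suggests a simple check before publishing: look for extension rows whose id matches no core row. This is a minimal sketch with made-up tab-delimited content and hypothetical column names (id, coreid); it is not a description of how the IPT or any validator performs the check.

```python
import csv
import io

# Hypothetical tab-delimited file contents standing in for the core and one extension.
CORE_TXT = "id\tcatalogNumber\nocc-1\tA-001\nocc-2\tA-002\n"
EXT_TXT = (
    "coreid\tscientificName\n"
    "occ-1\tRosa canina\n"
    "occ-1\tRosa rubiginosa\n"
    "occ-9\tQuercus alba\n"
)


def read_rows(text):
    """Parse tab-delimited text with a header line into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def find_orphans(core_rows, extension_rows, core_id_field="id", ext_id_field="coreid"):
    """Return extension rows whose core id does not appear in the core file."""
    core_ids = {row[core_id_field] for row in core_rows}
    return [row for row in extension_rows if row[ext_id_field] not in core_ids]


orphans = find_orphans(read_rows(CORE_TXT), read_rows(EXT_TXT))
print([row["coreid"] for row in orphans])
```

Here the row pointing at occ-9 has no matching core record, so it would be flagged. A core row with no extension rows (occ-2) is fine, because extensions are optional.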

