Dan Masiga Molecular Biology and Biotechnology Department International Centre of Insect Physiology and Ecology, Nairobi, Kenya BARCODE Data Standard The.


1 Dan Masiga, Molecular Biology and Biotechnology Department, International Centre of Insect Physiology and Ecology, Nairobi, Kenya. The BARCODE Data Standard: Enabling Molecular Diagnostics for Biodiversity. Western and Central Africa DNA Barcoding Meeting. One-day course on DNA barcoding: Practical advice. 23rd October 2008

2 New partners

3 The Infrastructure of Taxonomy: collections and databases of specimens; codes of taxonomic nomenclature; compilations of taxonomic names; data repositories (characters, gene sequences, images, trees); monographs; floristic and faunistic surveys/inventories; revisions; the (undigitized) taxonomic literature

4 International Nucleotide Sequence Database Collaboration http://www.insdc.org/

5 Roles of INSDC: an archival database/repository for nucleotide sequences. Inputs: output of Project A; output of Project B; output of Project C. Roles: common access interface; standardization of data structure, including data items and values; assignment of a unique identifier (an accession number) to a sequence. Output to: users

6 New tools for taxonomy: DNA Barcoding. The ability to compare genotype information across a huge range of organisms is a powerful tool.

7 “Only [27%] of papers had a legitimate specimens examined section, with museum numbers for each voucher, and names of the museums where the specimens used in the study could be examined.”

8 Couplets consisting of “Species Name - DNA Sequence”: the basis of a “look-up table” enabling molecular diagnostic applications. However, both elements need validation. Underlying specimens and associated raw sequence data are not typically available for secondary inspection.

9 Problem Areas: TRANSPARENCY AND TRACEABILITY. Genetic data quality; specimen data quality; taxonomy; access to information

10 Paradigm Shift: Barcoders began calling for a paradigm shift. Depositing barcode sequences in a public database, along with primer sequences, trace files and associated quality scores, makes this species identification technique widely accessible. Reference DNA barcode sequences should be derived from, and linked to, specimens of known provenance in web-accessible collections in order to validate this system of molecular diagnostics.

11 Rationale for Defining “BARCODE” keyword in GenBank
Provides the community with reference records with verifiable and retrievable data:
– Associated with retrievable voucher specimens (liberally defined: tissue, DNA, etc.)
– Linked to on-line metadata
– Meet an agreed-upon standard of taxonomic identification
– Provide an assured level of data completeness
– On an agreed-upon gene region
– Recommended for use in identifying unknowns

12 The Barcode Data Standard
Establishing a new data standard for “BARCODE” keyword records in DDBJ/EMBL/GenBank:
1. Minimum 500 bp, <1% ambiguous base calls
2. Double-stranded sequence
3. Trace files and associated quality scores
4. Primers used to generate sequence
5. Linkages to:
   1. A morphological voucher specimen
   2. Structured reference to collections
   3. Geospatial reference information
   4. Valid species name
   5. Who performed the identification
   6. Literature citations
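
The first criterion on this slide (minimum 500 bp, <1% ambiguous base calls) can be sketched as a simple check. This is only an illustration, not the databases' actual validation: the function name is hypothetical, "ambiguous" is assumed to mean any symbol other than A, C, G or T, and "<1%" is read as a strict inequality.

```python
def meets_sequence_criteria(seq: str,
                            min_length: int = 500,
                            max_ambiguous: float = 0.01) -> bool:
    """Check length and ambiguity thresholds for a candidate barcode sequence.

    Assumption: every symbol other than A, C, G, T counts as ambiguous.
    """
    seq = seq.strip().upper()
    # Reject empty or too-short sequences before computing a fraction.
    if not seq or len(seq) < min_length:
        return False
    ambiguous = sum(1 for base in seq if base not in "ACGT")
    # Strict "<" mirrors the "<1%" wording on the slide.
    return ambiguous / len(seq) < max_ambiguous


print(meets_sequence_criteria("ACGT" * 150))            # 600 bp, no ambiguous calls
print(meets_sequence_criteria("ACGT" * 100))            # only 400 bp
print(meets_sequence_criteria("N" * 6 + "A" * 594))     # exactly 1% ambiguous
```

The remaining criteria (trace files, primers, voucher linkages) concern the presence of associated records rather than the sequence string, so they are not covered by this check.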

13 Features, Qualifiers and Values: The Feature Table is updated based on discussions at the International Collaborators Meeting of INSDC.

14 NCBI Trace Archive accepts BARCODE as a keyword that identifies “a DNA sequence analysis of a uniform target gene to enable species identification”

15 Triplet structure for specimen identifiers
/specimen_voucher=“ | | ”
The three fields are, in order:
- abbreviation of the archiving institution
- collection within the institution (*)
- specimen identifier within the collection
The above approach is used in Darwin Core/GBIF and parallels the Life Science Identifier (LSID), an Object Management Group (OMG) standard.
(*) museums & herbaria; culture collections; stock centers; germplasm repositories (seed banks); frozen tissue banks; zoos/aquaria/botanical gardens; DNA banks; personal collections; e-voucher archives
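
A minimal sketch of how such a triplet might be split into its three parts. It assumes the pipe-delimited layout shown on this slide. The class name, field names, and sample value are hypothetical, and the separator is left as a parameter in case a deposited record uses a different delimiter.

```python
from typing import NamedTuple


class SpecimenVoucher(NamedTuple):
    institution: str   # abbreviation of the archiving institution
    collection: str    # collection within the institution
    specimen_id: str   # specimen identifier within the collection


def parse_voucher(value: str, sep: str = "|") -> SpecimenVoucher:
    """Split a voucher triplet into its three fields, trimming whitespace."""
    parts = [part.strip() for part in value.split(sep)]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected three non-empty fields, got {value!r}")
    return SpecimenVoucher(*parts)


# Made-up example value, for illustration only.
voucher = parse_voucher("XYZ | Ento | 12345")
print(voucher.institution, voucher.collection, voucher.specimen_id)
```

Keeping the three parts as separate named fields is what allows a sequence record to be traced back to a specific specimen in a specific collection.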

16

17 Link from GenBank to Museums www.biorepositories.org

18 Process Record

19 Acknowledgments: Lee Weigt, Smithsonian Institution; Scott Miller, PI, CBOL; David Schindel, Executive Secretary, CBOL; Sujeevan Ratnasingham, Biodiversity Institute of Ontario (BIO)/BOLD; Robert Hanner (BIO); Organizers: Western and Central Africa DNA Barcoding Meeting (NABDA & CBOL Secretariat)

