Sequence Databases What are they and why do we need them.


1 Sequence Databases What are they and why do we need them

2 What is sequence data? DNA, RNA and Protein (Amino Acids). Why do I need it? Evolution, Mutation, Natural Selection, Intra- and Inter-species relationships, Niche exploitation, Ecosystems. REALLY?

3 Phenotypes come from the proteins. Proteins come from the DNA via RNA. Changes in DNA cause changes in proteins. Changes in proteins cause changes in phenotypes. YES! Evolution, Mutation, Natural Selection, Intra- and Inter-species relationships, Niche exploitation, Ecosystems, Phenotypes. How do we find those changes? Sequencing.
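
The chain on this slide (a change in DNA leads to a change in protein) can be illustrated with a minimal Python sketch. It assumes the standard genetic code and translates directly from the coding strand of the DNA, skipping the explicit RNA step. The function names and the two example point mutations are illustrative assumptions, not part of the original slides.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate a coding-strand DNA string into one-letter amino acids."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def point_mutation(dna, position, new_base):
    """Return a copy of dna with the base at a 0-based position replaced."""
    return dna[:position] + new_base + dna[position + 1:]

# First 12 bases of the sorghum sequence shown on slide 17.
original = "ATGGACTTACCG"
changed = point_mutation(original, 5, "A")  # GAC -> GAA (hypothetical mutation)
silent = point_mutation(original, 8, "G")   # TTA -> TTG (hypothetical mutation)

print(translate(original))  # MDLP
print(translate(changed))   # MELP: the protein changed
print(translate(silent))    # MDLP: the protein did not change
```

Not every DNA change alters the protein: the second mutation swaps one leucine codon for another, which is why sequences have to be compared at both the nucleotide and the protein level.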

4 What do Databases let you do? Explore and investigate sequence data: classify organisms; assign a possible function to a gene; verify a sequence's identity; annotate a genome; design primers for PCR and probe experiments. Is the sequence everything? The sequence itself is not informative; it must be analyzed by comparative methods against existing databases to develop hypotheses concerning relatives and function.

5 What is a Database? Databases allow us to more easily find what we need

6 What Databases are there? Ten Important Bioinformatics Databases
Name               Address               Description
GenBank/DDBJ/EMBL  www.ncbi.nlm.nih.gov  Nucleotide sequences
Ensembl            www.ensembl.org       Human/Mouse genome
PubMed             www.ncbi.nlm.nih.gov  Literature references
NR                 www.ncbi.nlm.nih.gov  Protein sequences
SWISS-PROT         www.expasy.ch         Protein sequences
InterPro           www.ebi.ac.uk         Protein domains
OMIM               www.ncbi.nlm.nih.gov  Genetic diseases
Enzymes            www.chem.qmul.ac.uk   Enzymes
PDB                www.rcsb.org/pdb/     Protein structures
KEGG               www.genome.ad.jp      Metabolic pathways
Many other specialized databases are available.
Source: Bioinformatics for Dummies, 2003

7 What Database should I use? A.K.A. GenBank

8 How big is GenBank? Milestones: 1977 DNA sequencing; 1985 PCR; 1987 automated sequencing; 1997 capillary sequencing.

9 Who can put data into GenBank? Sequence data are submitted to GenBank by scientists from around the world. Warning: GenBank does not check the validity or accuracy of submitted sequences. Verification is left to the scientific community, as with all published scientific data.

10 How do I use GenBank? www.ncbi.nlm.nih.gov Problem 1. You are constructing a phylogeny of Euglenoids and you have determined from the literature that the Beta-tubulin gene is a good gene for this purpose. How do I start???

11 How do I use GenBank? www.ncbi.nlm.nih.gov Euglenozoa AND tubulin NOT kinetoplastida AF182759
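
The Boolean search on this slide can also be expressed as a URL for NCBI's E-utilities `esearch` service. The sketch below only builds the URL and does not contact NCBI. The choice of the `nucleotide` database, the `retmax` value, and the helper name `build_search_url` are assumptions for illustration; the slides themselves use the web search form.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities search service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_search_url(term, db="nucleotide", retmax=20):
    """Build an esearch URL for a Boolean Entrez query (no network access)."""
    params = {"db": db, "term": term, "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)

# The query from the slide: Euglenozoa tubulin records, excluding kinetoplastids.
query = "Euglenozoa AND tubulin NOT kinetoplastida"
url = build_search_url(query)
print(url)
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return an XML list of matching record identifiers. The operators AND and NOT must stay in upper case to be treated as Boolean operators.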

12 How do I use GenBank? Problem 2. You are studying the domestication of Sorghum vulgare. From reading about sorghum you find out that it is closely related to Zea mays. You also find out that maize has a wild relative, teosinte, that forms multiple stalks, whereas domesticated maize forms a single stalk. Likewise, domesticated sorghum has a single stalk while wild sorghum (Johnsongrass) has multiple stalks.

13 Wild: Sorghum halepense (Johnsongrass). Domesticated: Sorghum vulgare (broomcorn, sorghum).

14 How do I use GenBank? Problem 2, continued. Moreover, the paper states that this trait is controlled by a single gene, teosinte branched 1 (tb1). You wonder, “Does sorghum have this gene?” The paper does provide a set of PCR primers (forward and reverse) that were used to isolate and sequence the tb1 gene. Will they work for sorghum?
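
One way to think about "will the primers work?" is to ask whether both primer binding sites occur in the target sequence. The following Python sketch checks for exact matches only, which is a simplification, since real primers can tolerate some mismatches. The slides do not give the actual tb1 primers, so the two primers and the short template below are hypothetical, and the function names are assumptions.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def find_amplicon(template, forward, reverse):
    """Return (start, end) of the region the primer pair would span, or None.

    The forward primer must match the template strand exactly; the reverse
    primer must match the opposite strand, so its reverse complement is
    searched for downstream of the forward primer.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    site = reverse_complement(reverse)
    rev_start = template.find(site, start + len(forward))
    if rev_start == -1:
        return None
    return start, rev_start + len(site)

# Hypothetical 45-base template and primer pair, for illustration only.
template = "ATGGACTTACCGCTTTACCAACAACTGCAGCTCAGCCCGCCTTCC"
forward_primer = "ATGGACTTACCG"
reverse_primer = "GGAAGGCGGGCTG"

print(find_amplicon(template, forward_primer, reverse_primer))  # (0, 45)
```

If either primer site is missing, the function returns `None`, which corresponds to the case where primers designed for maize would need to be redesigned for sorghum.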

15 Sequencing Sorghum

16

17 www.ncbi.nlm.nih.gov/BLAST/
>Sorghum_vulgare_sequence
ATGGACTTACCGCTTTACCAACAACTGCAGCTCAGCCCGCCTTCCCCAAAGCCGGACCAATCAAGCAGCT
TCTACTGCTGCTACCCATGCTCCCCTCCCTTCGCCGCCGCCGCCGCCGACGCCAGCTTTCACCTGAGCTA
CCAGATCGGTAGTGCCGCCGCCGCCATCCCTCCACAAGCCGTGATCAACTCGCCGGAGGACCTGCCGGTG
CAGCCGCTGATGGAGCAGGCGCCGGCGCCGCCTACAGAGCTTGTCGCCTGCGCCAGTGGTGGTGCACAAG
GCGCCGGCGTCAGCGTCAGCCTGGACAGGGCGGCGGCCGCGGCCGCCGCGAGGAAAGACCGGCACAGCAA
GATATGCACCGCCGGCGGGATGAGGGACCGCCGGATGCGGCTGTCCCTTGACGTCGCCCGCAAGTTCTTC
GCGCTCCAGGACATGCTTGGCTTCGACAAGGCCAGCAAGACGGTACAATGGCTCCTCAACACGTCCAAGG
CCGCCATCCAGGAGATCATGGCCGACGACGTCGACGCGTCGTCGGAGTGCGTGGAGGATGGCTCCAGCAG
CCTCTCCGTCGACGGCAAGCACAACCCGGCGGAGCAGCTGGGAGATCAGAAGCCCAAGGGTAATGGCCGC
AGCGAGGGGAAGAAGCCGGCCAAGTCAAGGAAGGCGGCGACCACCCCAAAGCCGCCAAGAAAATCGGGGA
ATAATGCGCACCCGGTCCCCGACAAGGAGACGAGGGCGAAGGCGAGGGAGAGGGCGAGGGAGCGAACCAA
GGAGAAGCACCGGATGCGTTGGGTAAAGCTTGCATCAGCAATTGACGTGGAGGCGGCGGCTGCCTCGGTG
GCTAGCGACAGGCCGAGCTCGAACCATTTGAACCACCACCACCACTCATCGTCGTCCATGAACATGCCGC
GTGCTGCGGAGGCTGAATTGGAGGAGAGGGAGAGGTGCTCATCAACTCTCAACAATAGAGGAAGGATGCA
AGAAATCACAGGGGCGAGCGAGGTGGTCCTAGGCTTTGGCAACGGAGGAGGATACGGCGGCGGCAACTAC
TACTGCCAAGAACAATGGGAACTCGGTGGAGTCGTCTTTCAGCAGAACTCACGCTTCTACTGA
Does sorghum have the tb1 gene?
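
The query on this slide is in FASTA format: a header line starting with `>` followed by lines of sequence. Before pasting a sequence into BLAST it can help to read it programmatically and check basic properties such as its length. The Python sketch below is a minimal reader, assuming well-formed input; the function names and the toy record are illustrative and are not the slide's sequence.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping each header to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    # Join the wrapped sequence lines of each record into one string.
    return {name: "".join(parts) for name, parts in records.items()}

def gc_fraction(seq):
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

# Toy record for illustration, wrapped over two lines like a real FASTA file.
toy_fasta = """>toy_sequence
ATGGACTTAC
CGCTTTACCA
"""

records = parse_fasta(toy_fasta)
seq = records["toy_sequence"]
print(len(seq), gc_fraction(seq))  # 20 0.45
```

The same function can be applied to a file containing the sorghum record; BLAST itself accepts the FASTA text unchanged.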

18 Resources at NCBI GenBank – Molecular Databases Nucleotides, Proteins, Structures, Expression (ESTs) and Taxonomy. Literature Databases PubMed, Journals, OMIM, Book, and Citation Matcher. Genomes and Maps – Entrez Map Viewer, UniGene, COGs, Organism-specific, Organelle, Virus, and Plasmids. Tools – Software Engineering BLAST, Sequence Analysis, 3-D Structures, Gene Expression, Literature and Genome Analysis. Education Books, Courses, Public Information. Research Biology, Computers.

19 Objectives 1. Explain what you can do with sequence data. 2. Explain what a database is. 3. Describe what kinds of data and resources are available. 4. Describe some of the uses of databases.

20 Other Specialty Databases

