The Sense of Sequense. Chris Evelo, BiGCaT Bioinformatics, Universiteit Maastricht.


1 The Sense of Sequense. Chris Evelo, BiGCaT Bioinformatics, Universiteit Maastricht

2 The Sense of Sequense. Databases: what have we got to compare our sequences with? (Chris Evelo) Is that gene on my array? A simple question with a complicated answer. (Gontran Zepeda) Annotating a full array: evading the EST trap and the Affymetrix challenge. (Stan Gaj)

3 First questions to answer. How to sequence an entire genome? Typical errors? Why not start with chromosome 1? Is it useful?

4 How to sequence an entire genome

5 Example trace file. DNA sequence trace showing a portion of the nucleotide sequence of the gene encoding the envelope protein of the Human Immunodeficiency Virus, HIV-1.

6 Typical errors. Not all base/dye combinations have the same mobility (typically corrected by software). Bad quality at the start and end of sequences. Bad separation in the front runners. Typical low, broad peaks at the end. As a result, multiple equal bases overlap.
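The bad quality at the start and end of a read is usually handled by trimming. The slide does not say how, so the following is only a minimal sketch. It assumes each base call comes with a Phred-like quality score, and the function name `trim_low_quality_ends` and the threshold `min_q=20` are illustrative choices, not taken from the presentation.

```python
def trim_low_quality_ends(bases, quals, min_q=20):
    """Keep the span between the first and the last base whose quality is >= min_q.

    bases: string of base calls; quals: one numeric score per base (assumed Phred-like).
    Returns (trimmed_bases, start, end) with end exclusive.
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")
    good = [i for i, q in enumerate(quals) if q >= min_q]
    if not good:
        # No base passes the threshold: nothing usable is left.
        return "", 0, 0
    start, end = good[0], good[-1] + 1
    return bases[start:end], start, end


# Hypothetical read with poor calls at both ends.
read = "NNACGTTAGCNN"
quals = [3, 5, 30, 35, 40, 40, 38, 36, 33, 25, 8, 4]
trimmed, start, end = trim_low_quality_ends(read, quals)
print(trimmed, start, end)
```

Trimming only the ends leaves any low-quality calls in the middle of the read in place, so overlapping runs of equal bases still need to be checked against the trace.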

7 Why not start with chromosome 1? …

8 Is it useful?

9 Are genome databases useful? We have copied DNA to computer disks. Computers can read bits more easily than bases. But why read them? Or better: how to read them? We need more information.
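As an illustration of "bits instead of bases": four bases fit in two bits each. The sketch below packs a sequence into an integer and unpacks it again. The particular code assignment (A=00, C=01, G=10, T=11) and the names `pack` and `unpack` are assumptions made for this example. The round trip shows that storing the sequence is easy, and that the bits by themselves carry no more meaning than the letters did.

```python
# Assumed 2-bit code per base; any fixed assignment would work.
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASE = {bits: base for base, bits in CODE.items()}


def pack(seq):
    """Pack a DNA string into one integer, two bits per base, first base in the highest bits."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value


def unpack(value, length):
    """Recover a DNA string of the given length from a packed integer."""
    bases = []
    for _ in range(length):
        bases.append(BASE[value & 0b11])
        value >>= 2
    return "".join(reversed(bases))


packed = pack("GATTACA")
print(bin(packed), unpack(packed, 7))
```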

10 Gene expression. Figure 3-15. The transfer of information from DNA to protein. The transfer proceeds by means of an RNA intermediate called messenger RNA (mRNA). In procaryotic cells the process is simpler than in eucaryotic cells. In eucaryotes the coding regions of the DNA (in the exons, shown in color) are separated by noncoding regions (the introns). As indicated, these introns must be removed by an enzymatically catalyzed RNA-splicing reaction to form the mRNA. Alberts et al., Molecular Biology of the Cell, 3rd edn.
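The flow in the figure (DNA, then RNA, then splicing, then protein) can be sketched in a few lines. This is a toy model under stated assumptions: the input is the coding strand, exon positions are supplied by hand as hypothetical (start, end) pairs, translation starts at the first base of the spliced mRNA, and the standard genetic code is used. The example gene is invented for illustration.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR" "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def transcribe(coding_strand):
    """DNA coding strand -> pre-mRNA (same sequence with U instead of T)."""
    return coding_strand.upper().replace("T", "U")


def splice(pre_mrna, exons):
    """Join the exon intervals (start, end exclusive); everything else is intron."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def translate(mrna):
    """Translate from position 0 until the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


# Invented eucaryote-style gene: exon, intron, exon.
gene = "ATGGCC" + "GTAAGTTTTCAG" + "AAATGA"
exons = [(0, 6), (18, 24)]  # hypothetical annotation

mrna = splice(transcribe(gene), exons)
print(mrna, translate(mrna))
```

For the procaryotic case the same functions apply with a single exon covering the whole gene, which is the sense in which the process is simpler there.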

11 Three levels. And we need them all… DNA, mRNA and protein. Protein information comes from biochemistry and physiology. The main database is Swiss-Prot (high quality, highly curated); the US has PIR. Hypothetical proteins: the main database is TrEMBL. These databases are now combined in UniProt.

12

13 Swiss-Prot

14 Swiss-Prot

15 Three levels. DNA: genome data. mRNA: ?? Protein: Swiss-Prot + TrEMBL = UniProt.

16 mRNA. Measuring mRNA is easy. Use the poly-A tail to isolate it. PCR and blot (use a primer if known). Clone and sequence. And what do you know then? “It’s an expressed sequence tag…”
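Because the mRNA was isolated by its poly-A tail, a sequenced clone may still end in a run of A's that is not part of the transcript's informative sequence. A minimal sketch of trimming such a tail follows. The function name `trim_poly_a` and the minimum run length `min_len=10` are assumptions for this example, and real ESTs may also carry vector sequence or a poly-T start on the opposite strand, which this sketch ignores.

```python
import re


def trim_poly_a(seq, min_len=10):
    """Remove a trailing run of at least min_len A's.

    Returns (trimmed_sequence, tail_length); tail_length is 0 when no tail is found.
    """
    match = re.search(r"A{%d,}$" % min_len, seq.upper())
    if match is None:
        return seq, 0
    return seq[:match.start()], len(seq) - match.start()


# Invented EST read ending in a 15-base poly-A tail.
est = "ATGGCCTTG" + "A" * 15
print(trim_poly_a(est))
```

Note that A's belonging to the transcript itself are indistinguishable from the tail when they sit directly before it, so the trimmed end is approximate by a few bases.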

17 Three levels. DNA: genome data. mRNA: ESTs (EMBL). Protein: Swiss-Prot + TrEMBL = UniProt.

18 Annotate! DNA: genome data. mRNA: ESTs (EMBL), clustered (UniGene). Protein: Swiss-Prot + TrEMBL = UniProt.
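To give a feel for what "clustered" means for ESTs, here is a toy sketch that groups reads sharing at least one exact k-mer, using single-linkage clustering with a union-find structure. This is not the UniGene procedure, which the slides do not describe. The function names, `k=8`, `min_shared=1`, and the example reads are all assumptions chosen for illustration.

```python
def kmers(seq, k):
    """Set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_ests(ests, k=8, min_shared=1):
    """Group ESTs that share >= min_shared exact k-mers, directly or via other ESTs.

    Returns a sorted list of clusters, each a list of indices into ests.
    """
    parent = list(range(len(ests)))

    def find(i):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    kmer_sets = [kmers(seq, k) for seq in ests]
    for i in range(len(ests)):
        for j in range(i + 1, len(ests)):
            if len(kmer_sets[i] & kmer_sets[j]) >= min_shared:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(ests)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Invented reads: the first three overlap end to end, the fourth is unrelated.
ests = [
    "ATGGCCAAATTTGGG",
    "AAATTTGGGCCCTTT",
    "GGGCCCTTTACGACG",
    "CACACACAGAGAGAG",
]
print(cluster_ests(ests))
```

The first and third reads share no k-mer directly and end up together only through the second read. That chaining is what lets many short tags add up to one transcript, and it is also how an unrelated repeat could wrongly merge two genes.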

