Gene Finding. Biological Background The Central Dogma Transcription RNA Translation Protein DNA.

Presentation transcript:

Gene Finding

Biological Background

The Central Dogma: DNA → (Transcription) → RNA → (Translation) → Protein

Background (Essential Cell Biology, p. 268)
Non-coding regions → gene regulation
- Vicinity of the TSS (transcription start site): direct interactions with the Pol II complex
- Larger vicinity: indirect interactions (chromatin remodelling)

The Genetic Code (codon table indexed by first, second, and third letter)
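
The table lookup on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard genetic code written in DNA letters (T instead of U); the names BASES, AMINO, CODON_TABLE and translate are hypothetical and not part of the lecture.

```python
from itertools import product

# Standard genetic code, ordered by first, second, third letter over T, C, A, G.
# '*' marks a STOP codon. (Illustrative names, not from the slides.)
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string codon by codon, starting at position 0."""
    dna = dna.upper()
    # Step through non-overlapping triplets; a trailing partial codon is ignored.
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


print(translate("ATGGCCTGA"))  # MA*
```

The input is assumed to contain only the letters A, C, G and T; anything else raises a KeyError.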

tRNA – Responsible for Translation (adapted from Genetic Analysis V, p. 388)

Frame Shifts
- Code triplets (“codons”) do not overlap
- → 3×2 = 6 possible ways of reading, depending on the strand and the relative position where reading starts
- This is not just our concern when looking for genes; it is also the cell’s concern in terms of mutations:
  Original: THE FAT CAT ATE THE BIG RAT
  Delete C: THE FAT ATA TET HEB IGR AT
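
A short Python sketch of the two points on this slide: the six reading frames of a sequence, and how a single deletion regroups every downstream triplet. The function names are hypothetical, and the input is assumed to be upper-case A/C/G/T.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def six_frames(dna):
    """Map (strand, offset) to the list of codons read in that frame."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[(strand, offset)] = [
                seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)
            ]
    return frames


def triplets(text):
    """Regroup the letters of a sentence into consecutive triplets."""
    letters = text.replace(" ", "")
    return " ".join(letters[i:i + 3] for i in range(0, len(letters), 3))


sentence = "THE FAT CAT ATE THE BIG RAT"
print(triplets(sentence.replace("C", "", 1)))  # THE FAT ATA TET HEB IGR AT
```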

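Prokaryotes Gene Finding
- No nucleus
- Most DNA is coding (e.g. 70% in H. influenzae)
- Each gene is one continuous DNA sequence (no introns)
- Pol I: rRNA, Pol II: mRNA, Pol III: tRNA (these are the three eukaryotic RNA polymerases; prokaryotes transcribe all RNA classes with a single RNA polymerase)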

Detecting ORFs
Simple idea:
- If no gene is encoded, the expected frequency of STOP codons is 3/64, i.e. about one STOP every 21 codons
- ORF (open reading frame): a sequence of codons with no STOP codon
Simple algorithm:
1. Scan until you find a stop codon, in all reading frames.
2. Scan back to find a start codon.
3. If it’s long enough, report this ORF as a putative gene.
Cons:
- Can’t detect short genes
- High false-positive rate (E. coli has 6500 ORFs but only 1100 genes)
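
The simple algorithm can be sketched as follows. This is an illustrative version under stated assumptions, not the lecture's own code: it scans only the given strand (run it again on the reverse complement for the other three frames), it takes ATG as the only start codon, "scan back" is implemented by picking the most upstream ATG in the stop-free stretch, and the threshold min_codons is a hypothetical parameter.

```python
STOPS = {"TAA", "TAG", "TGA"}
START = "ATG"  # assumption: alternative start codons are ignored


def find_orfs(dna, min_codons=100):
    """Return (frame, start, end) triples; end is exclusive and includes the stop."""
    orfs = []
    for frame in range(3):
        stretch_start = frame  # first codon position after the previous stop
        for i in range(frame, len(dna) - 2, 3):
            if dna[i:i + 3] not in STOPS:
                continue
            # Look through the stop-free stretch for the most upstream start codon.
            for j in range(stretch_start, i, 3):
                if dna[j:j + 3] == START:
                    if (i - j) // 3 >= min_codons:  # length excludes the stop codon
                        orfs.append((frame, j, i + 3))
                    break
            stretch_start = i + 3
    return orfs


toy = "CC" + "ATG" + "AAA" * 5 + "TAA" + "GG"
print(find_orfs(toy, min_codons=5))  # [(2, 2, 23)]
```

With the default threshold the toy sequence yields nothing, which is the "can't detect short genes" drawback in miniature.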

Coding vs. Non-coding Regions: Codon Frequencies
- Codon usage in coding regions is different
- Leucine, Alanine, and Tryptophan are coded by 6, 4, and 1 different codons, respectively
- → Expect to see a ratio of 6:4:1 in random sequence
- In proteins they appear in a 6.9:6.5:1 ratio
- Another example: A or T appears as the last letter of a codon in 90% of cases in protein-coding regions
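
To make the 6:4:1 expectation concrete, the sketch below lists the synonymous codons of the three amino acids in the standard genetic code and counts them in a sequence read in frame 0. The helper names and the toy input are hypothetical; a real comparison would use annotated coding regions.

```python
# Synonymous codons in the standard genetic code (DNA letters).
SYNONYMOUS = {
    "Leu": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "Ala": ["GCT", "GCC", "GCA", "GCG"],
    "Trp": ["TGG"],
}

# Number of codons per amino acid: the ratio expected in random sequence.
print({aa: len(codons) for aa, codons in SYNONYMOUS.items()})


def amino_acid_counts(dna):
    """Count in-frame codons (frame 0) that encode Leu, Ala or Trp."""
    counts = {aa: 0 for aa in SYNONYMOUS}
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        for aa, codons in SYNONYMOUS.items():
            if codon in codons:
                counts[aa] += 1
    return counts


def third_position_at_fraction(dna):
    """Fraction of in-frame codons (frame 0) whose last letter is A or T."""
    codons = [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]
    return sum(c[2] in "AT" for c in codons) / len(codons)
```

Comparing these counts between known genes and intergenic sequence is one way to see the bias the slide describes.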

Nucleotide MM (Markov Model) for Gene Detection
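
One common way to use a nucleotide Markov model for this task is to train one model on coding sequence and one on non-coding sequence, then score a candidate by the log-odds of the two. The sketch below assumes that setup (the slide does not spell it out); the function names, the pseudocount and the toy training strings are all hypothetical.

```python
import math


def train_first_order(sequences, alphabet="ACGT", pseudocount=1.0):
    """Estimate P(next base | current base) from training sequences."""
    counts = {a: {b: pseudocount for b in alphabet} for a in alphabet}
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {
        a: {b: counts[a][b] / sum(counts[a].values()) for b in alphabet}
        for a in alphabet
    }


def log_odds(seq, coding, noncoding):
    """Sum of log P_coding / P_noncoding over all transitions in seq."""
    return sum(
        math.log(coding[a][b] / noncoding[a][b]) for a, b in zip(seq, seq[1:])
    )


# Toy training data, purely for illustration.
coding_model = train_first_order(["GCGCGCGCGC"])
noncoding_model = train_first_order(["ATATATATAT"])
print(log_odds("GCGC", coding_model, noncoding_model) > 0)  # True
```

A positive score means the sequence looks more like the coding training set; sequences are assumed to contain only A, C, G and T.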

2nd-Order MM
Idea: extend the model to capture codons
Results: poor. Codons overlap in this model.

MM over Codons
Idea: transform the sequence into codons, then use a 1st-order MM
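
A minimal sketch of the transformation, assuming the reading frame is known: the sequence is cut into non-overlapping codons, and first-order transitions are then counted between consecutive codons rather than between bases. The names to_codons and codon_transition_counts are hypothetical.

```python
from collections import Counter


def to_codons(dna, frame=0):
    """Cut the sequence into non-overlapping codons starting at `frame`."""
    return [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]


def codon_transition_counts(dna, frame=0):
    """Count codon -> next-codon transitions (the statistics of a 1st-order MM)."""
    codons = to_codons(dna, frame)
    return Counter(zip(codons, codons[1:]))


print(to_codons("ATGGCCGCCTAA"))  # ['ATG', 'GCC', 'GCC', 'TAA']
```

Because each state is a whole codon, the overlap problem of the 2nd-order nucleotide model does not arise, at the cost of a 64-state model with far more transition parameters to estimate.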

Why not use codon frequencies directly? “Codon Preferences” program:

“Codon Preferences” program
Uses a window of 25 codons around each point
Score:

Using the Promoter’s Signal
- We are still far from perfect…
- → Idea: try to detect signals in the promoter regions, to help discriminate real genes among ORFs
- Prokaryotes: around -35 from the TSS: TTGACA; around -10 from the TSS: TATAAT (“TATA box” signal)
- No single promoter has the exact consensus
- Nearly all promoters have 2-3 of the conserved positions in TAxyzT
- 80-90% have all 3
- In 50%, xyz = TAA
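
Since no promoter matches the consensus exactly, a simple way to look for these signals is to slide the consensus along the upstream region and count matching positions. The sketch below does only that; it is an assumption-laden illustration (hypothetical function names, no position weighting, no spacing constraint between the -35 and -10 boxes).

```python
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def matches(site, consensus):
    """Number of positions at which site agrees with the consensus."""
    return sum(a == b for a, b in zip(site, consensus))


def best_hit(region, consensus):
    """Return (position, match_count) of the best-matching window in region.

    Ties are resolved in favour of the leftmost window. The region must be
    at least as long as the consensus.
    """
    width = len(consensus)
    hits = [
        (matches(region[i:i + width], consensus), i)
        for i in range(len(region) - width + 1)
    ]
    score, pos = max(hits, key=lambda h: (h[0], -h[1]))
    return pos, score


print(best_hit("GGTATGATCC", MINUS_10))  # (2, 5)
```

An ORF whose upstream region contains good hits near -35 and -10 would be a stronger gene candidate than one without.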

Summary So Far
- We have seen the problems in trying to find genes in a genome-wide scan, and that was only in prokaryotes!
- The bottom line is that the problem is not really solved, but most research in gene finding focuses on eukaryotes, where the main interest lies…
- Next lecture: much more sophisticated models, to handle the much more complex situation in eukaryotes in general, and in human in particular