Chapter 17. From Gene to Protein 2005-2006
Metabolism teaches us about genes
- Metabolic defects
  - studying metabolic diseases suggested that genes specify proteins
  - alkaptonuria (black urine from alkapton)
  - PKU (phenylketonuria)
  - each disease is caused by a non-functional enzyme
- Genes create phenotype
- [Figure: metabolic pathway A → B → C → D → E]
1 gene – 1 enzyme hypothesis
- Beadle & Tatum
  - compared mutants of the bread mold Neurospora (a fungus)
  - created mutations by X-ray treatments
    - X-rays break DNA
    - inactivate a gene
- Wild type grows on “minimal” media
  - sugars + required precursor nutrients to synthesize essential amino acids
- Mutants require added amino acids
  - each type of mutant lacks a certain enzyme needed to produce a certain amino acid
  - non-functional enzyme = broken gene
Beadle & Tatum (1941 | 1958)
- [Photos: George Beadle, Edward Tatum]
Beadle & Tatum’s Neurospora experiment
So… What is a gene?
- One gene – one enzyme
  - but not all proteins are enzymes
  - but all proteins are coded by genes
- One gene – one protein
  - but many proteins are composed of several polypeptides
  - but each polypeptide has its own gene
- One gene – one polypeptide
  - but many genes only code for RNA
- One gene – one product
  - but many genes code for more than one product…
- Where does that leave us?!
Defining a gene…
- “Defining a gene is problematic because… one gene can code for several protein products, some genes code only for RNA, two genes can overlap, and there are many other complications.” – Elizabeth Pennisi, Science 2003
- It’s hard to hunt for wabbits if you don’t know what a wabbit looks like.
- Estimates of the number of human genes
  - 1990s: thought humans had 100,000 genes
  - 2000: 40,000 was considered a good estimate
  - 2004: 30,000
  - 2006: 25,000 is our best estimate
- [Figure: one gene → RNA; one gene → polypeptide 1, polypeptide 2, polypeptide 3]
The “Central Dogma”
- How do we move information from DNA to proteins?
  - DNA → (transcription) → RNA → (translation) → protein
  - DNA is also copied to DNA (replication)
- For simplicity’s sake, let’s go back to genes that code for proteins…
From nucleus to cytoplasm…
- Where are the genes?
  - genes are on chromosomes in the nucleus
- Where are proteins synthesized?
  - proteins are made in the cytoplasm by ribosomes
- How does the information get from nucleus to cytoplasm?
  - messenger RNA
RNA
- ribose sugar
- N-bases
  - uracil instead of thymine
  - U : A
  - C : G
- single stranded
- mRNA, rRNA, tRNA, siRNA…
- Getting from the chemical language of DNA to the chemical language of proteins requires 2 major stages: transcription and translation
- [Figure: transcription, DNA → RNA]
Transcription
- Transcribed DNA strand = template strand
  - untranscribed DNA strand = coding strand
- Synthesis of complementary RNA strand
  - transcription bubble
- Enzyme
  - RNA polymerase
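The template/complement relationship can be sketched in a few lines of Python. This is only an illustration of base pairing, not a model of RNA polymerase. The names `PAIRING` and `transcribe` are hypothetical, and the template strand is assumed to be written 3'→5' so that the returned RNA reads 5'→3'. The example sequence is the one these slides use for the genetic code.

```python
# Base-pairing rules for building RNA on a DNA template
# (RNA uses U where DNA would use T).
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') complementary to a DNA template
    strand that is written 3'->5'."""
    return "".join(PAIRING[base] for base in template_3_to_5)


template = "TACGCACATTTACGTACGCGG"
print(transcribe(template))  # AUGCGUGUAAAUGCAUGCGCC
```

Apart from U replacing T, the RNA has the same sequence as the untranscribed strand, which is why that strand is called the coding strand.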
Transcription in Prokaryotes
- Initiation
  - RNA polymerase binds to promoter sequence on DNA
- Role of promoter
  1. Where to start reading = starting point
  2. Which strand to read = template strand
  3. Direction on DNA = always reads DNA 3'→5'
Transcription in Prokaryotes
- Promoter sequences
- [Figure: RNA polymerase molecules bound to bacterial DNA]
Transcription in Prokaryotes
- Elongation
  - RNA polymerase unwinds DNA ~20 base pairs at a time
  - reads DNA 3'→5'
  - builds RNA 5'→3' (the energy of each incoming nucleoside triphosphate drives the synthesis!)
- No proofreading
  - 1 error per 10^5 bases
  - many copies
  - short life
  - not worth it!
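To see why proofreading is "not worth it", the error rate can be turned into numbers. The sketch below uses the slide's rate of 1 error per 10^5 bases. The transcript length of 1,000 bases is a hypothetical value chosen for illustration, and errors are assumed to be independent from base to base.

```python
ERROR_RATE = 1 / 10**5  # errors per base (figure from the slide)


def expected_errors(transcript_length: int) -> float:
    """Average number of wrong bases in one transcript."""
    return transcript_length * ERROR_RATE


def fraction_error_free(transcript_length: int) -> float:
    """Fraction of transcripts with no errors at all,
    assuming each base is copied independently."""
    return (1 - ERROR_RATE) ** transcript_length


length = 1_000  # hypothetical transcript length in bases
print(f"{expected_errors(length):.2f} errors per transcript on average")
print(f"{fraction_error_free(length):.1%} of transcripts are error-free")
```

Under these assumptions about 99% of the copies are perfect, and since each mRNA is made in many copies and is short-lived, an occasional faulty copy does little harm.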
Transcription
- [Figure: RNA transcript]
Transcription in Prokaryotes
- Termination
  - RNA polymerase stops at termination sequence
  - GC hairpin turn forms in the RNA
  - (in eukaryotes, mRNA leaves the nucleus through pores; prokaryotes have no nucleus)
Transcription in Eukaryotes
Prokaryote vs. Eukaryote genes
- Prokaryotes
  - DNA in cytoplasm
  - circular chromosome
  - naked DNA
  - no introns
- Eukaryotes
  - DNA in nucleus
  - linear chromosomes
  - DNA wound on histone proteins
  - introns vs. exons
    - intron = noncoding (in-between) sequence
    - exon = coding (expressed) sequence
- [Figure: eukaryotic DNA with introns and exons]
Transcription in Eukaryotes
- 3 RNA polymerase enzymes
  - RNA polymerase I
    - only transcribes rRNA genes
  - RNA polymerase II
    - transcribes genes into mRNA
  - RNA polymerase III
    - transcribes tRNA genes (and other small RNAs)
  - each has a specific promoter sequence it recognizes
Transcription in Eukaryotes
- Initiation complex
  - transcription factors bind to promoter region upstream of gene
    - proteins which bind to DNA & turn transcription on or off
    - TATA box binding site
  - only then does RNA polymerase bind to DNA
Post-transcriptional processing
- Primary transcript
  - eukaryotic mRNA needs work after transcription
- Protect mRNA
  - from RNase enzymes in cytoplasm
    - add 5' cap
    - add poly-A tail (50-250 A’s)
- Edit out introns
  - intron = noncoding (in-between) sequence
  - exon = coding (expressed) sequence
- [Figure: eukaryotic DNA → pre-mRNA (primary mRNA transcript) → spliced, mature mRNA transcript with 5' cap and 3' poly-A tail]
Transcription to translation
- Differences between prokaryotes & eukaryotes
  - time & physical separation between processes
  - RNA processing
Translation in Prokaryotes
- Transcription & translation are simultaneous in bacteria
  - DNA is in cytoplasm
  - no mRNA editing needed
From gene to protein
- DNA → (transcription) → mRNA → (translation) → protein
- mRNA leaves nucleus through nuclear pores
- proteins synthesized by ribosomes using instructions on mRNA
- [Figure: transcription in the nucleus; translation at a ribosome in the cytoplasm]
How does mRNA code for proteins?
- DNA:     TACGCACATTTACGTACGCGG
- mRNA:    AUGCGUGUAAAUGCAUGCGCC
- ?
- protein: Met Arg Val Asn Ala Cys Ala
- How can you code for 20 amino acids with only 4 nucleotide bases (A, U, G, C)?
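One way to reason about the question is to count how many distinct "words" can be spelled with 4 bases at each word length. The short sketch below does that arithmetic; the function name `smallest_codon_length` is hypothetical.

```python
def smallest_codon_length(n_bases: int, n_amino_acids: int) -> int:
    """Shortest word length whose number of possible words
    (n_bases ** length) covers every amino acid."""
    length = 1
    while n_bases ** length < n_amino_acids:
        length += 1
    return length


for length in (1, 2, 3):
    print(length, "base(s) per word:", 4 ** length, "possible words")

print(smallest_codon_length(4, 20))  # 3
```

One base gives 4 words and two bases give 16, both fewer than 20. Three bases give 64, which is more than enough and leaves room for several words per amino acid.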
Cracking the code (1960 | 1968)
- Nirenberg & Matthaei
  - determined 1st codon–amino acid match
    - UUU coded for phenylalanine
  - created artificial poly(U) mRNA
  - added mRNA to test tube of ribosomes, tRNA & amino acids
  - the poly(U) mRNA directed synthesis of a polypeptide chain made of a single amino acid
    - phe–phe–phe–phe–phe–phe
- [Photos: Heinrich Matthaei, Marshall Nirenberg]
Translation
- Codons
  - blocks of 3 nucleotides decoded into the sequence of amino acids
mRNA codes for proteins in triplets
- DNA:     TACGCACATTTACGTACGCGG
- mRNA:    AUGCGUGUAAAUGCAUGCGCC
- codons:  AUG CGU GUA AAU GCA UGC GCC
- protein: Met Arg Val Asn Ala Cys Ala
The code
- For ALL life!
  - strongest support for a common origin for all life
- Code is redundant
  - several codons for most amino acids
  - Why is this a good thing?
- Start codon
  - AUG
  - methionine
- Stop codons
  - UGA, UAA, UAG
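Reading a message in triplets can be sketched as a table lookup. The block below builds the standard codon table compactly and translates the example mRNA from these slides. The names `CODON_TABLE` and `translate` are hypothetical, amino acids are written in one-letter codes (M = Met, R = Arg, V = Val, N = Asn, A = Ala, C = Cys), and `*` marks a stop codon.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG, three bases at a time,
    stopping at a stop codon or at the end of the message."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGCGUGUAAAUGCAUGCGCC"))  # MRVNACA

# Redundancy: count the codons that all mean leucine (L).
print(sum(1 for aa in CODON_TABLE.values() if aa == "L"))  # 6
```

The same table also yields the three stop codons, since they are exactly the entries marked `*`.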
How are the codons matched to amino acids?
- DNA:  3' TACGCACATTTACGTACGCGG 5'
- mRNA: 5' AUGCGUGUAAAUGCAUGCGCC 3'
- tRNA anticodons (written 3'→5') pair with mRNA codons; each tRNA carries one amino acid
  - anticodon UAC: Met
  - anticodon GCA: Arg
  - anticodon CAU: Val
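The codon/anticodon match is plain RNA base pairing, which the following sketch illustrates. The function name `anticodon_for` is hypothetical. The codon is assumed to be written 5'→3' and the anticodon is returned 3'→5', so the two line up base by base; wobble pairing is ignored.

```python
# RNA-RNA base pairing between an mRNA codon and a tRNA anticodon.
RNA_PAIRING = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon_5_to_3: str) -> str:
    """Return the anticodon (written 3'->5') that pairs with a codon
    written 5'->3'."""
    return "".join(RNA_PAIRING[base] for base in codon_5_to_3)


for codon in ("AUG", "CGU", "GUA"):
    print(codon, "pairs with", anticodon_for(codon))
# AUG pairs with UAC
# CGU pairs with GCA
# GUA pairs with CAU
```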
[Figure: transcription in the nucleus, translation in the cytoplasm, protein]
tRNA structure
- “Clover leaf” structure
  - anticodon on “clover leaf” end
  - amino acid attached on 3' end
Loading tRNA
- Aminoacyl tRNA synthetase
  - enzyme which bonds amino acid to tRNA
  - endergonic reaction
    - ATP → AMP
  - energy stored in tRNA-amino acid bond
    - the bond is unstable, which makes it easy for the tRNA to later give up the amino acid to a growing polypeptide chain in a ribosome
Ribosomes
- Facilitate coupling of tRNA anticodon to mRNA codon
  - organelle or enzyme?
- Structure
  - ribosomal RNA (rRNA) & proteins
  - 2 subunits
    - large
    - small
Ribosomes
- P site (peptidyl-tRNA site)
  - holds tRNA carrying growing polypeptide chain
- A site (aminoacyl-tRNA site)
  - holds tRNA carrying next amino acid to be added to chain
- E site (exit site)
  - empty tRNA leaves ribosome from exit site
Building a polypeptide
- Initiation
  - brings together mRNA, ribosome subunits, proteins & initiator tRNA
- Elongation
- Termination
Elongation: growing a polypeptide
Termination: release polypeptide
- Release factor
  - “release protein” binds to A site
  - adds a water molecule to the end of the polypeptide chain, which frees the polypeptide
- Now what happens to the polypeptide?
Protein targeting
- Signal peptide
  - address label
  - start of a secretory pathway
- Destinations:
  - secretion
  - nucleus
  - mitochondria
  - chloroplasts
  - cell membrane
  - cytoplasm
Can you tell the story?
- [Figure labels: DNA, RNA polymerase, pre-mRNA (exon, intron), 5' cap, poly-A tail, mature mRNA, amino acids, tRNA, aminoacyl tRNA synthetase, ribosome (large subunit, small subunit; E, P, A sites), polypeptide]
Put it all together…
Any Questions??
Chapter 17. Mutations
Universal code
- Code is redundant
  - several codons for most amino acids
  - “wobble” in the tRNA
  - “wobble” in the aminoacyl-tRNA synthetase enzyme that loads the tRNA
- Strong evidence for a single origin in evolutionary theory.
Mutations
- Point mutations
  - single base change
  - base-pair substitution
    - silent mutation: no amino acid change (redundancy in code)
    - missense: change amino acid
    - nonsense: change to stop codon
- When do mutations affect the next generation?
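The three kinds of base-pair substitution can be told apart by comparing what the codon means before and after the change. The sketch below does this with the standard codon table. The function name `classify_point_mutation` and the example codons are chosen for illustration, and the original codon is assumed to code for an amino acid rather than a stop.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_point_mutation(original: str, mutant: str) -> str:
    """Classify a single-codon substitution as silent, missense or nonsense."""
    before = CODON_TABLE[original]
    after = CODON_TABLE[mutant]
    if after == before:
        return "silent"      # same amino acid, thanks to redundancy
    if after == "*":
        return "nonsense"    # codon became a stop signal
    return "missense"        # a different amino acid


print(classify_point_mutation("CGU", "CGC"))  # silent   (Arg -> Arg)
print(classify_point_mutation("GAG", "GUG"))  # missense (Glu -> Val)
print(classify_point_mutation("UGC", "UGA"))  # nonsense (Cys -> stop)
```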
Point mutation leads to Sickle cell anemia
- What kind of mutation?
Sickle cell anemia
Mutations
- Frameshift
  - shift in the reading frame
    - changes everything “downstream”
  - insertions
    - adding base(s)
  - deletions
    - losing base(s)
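A frameshift is easy to demonstrate by adding or removing one base and re-reading the message in triplets. The sketch below reuses the example mRNA from these slides. The helper `translate` is hypothetical, reads from the first base, and reports amino acids in one-letter codes; the choice of which base to add or remove is arbitrary.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


original = "AUGCGUGUAAAUGCAUGCGCC"
inserted = original[:3] + "A" + original[3:]  # one base added after AUG
deleted = original[:3] + original[4:]         # one base lost after AUG

print(translate(original))  # MRVNACA
print(translate(inserted))  # MTCKCMR
print(translate(deleted))   # MV
```

Every codon after the change is read differently, and in the deletion case the shifted frame runs into a stop codon almost immediately.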
What’s the value of mutations?
Chapter 17. RNA Processing
Transcription -- another look
- The process of transcription includes many points of control
  - when to start reading DNA
  - where to start reading DNA
  - where to stop reading DNA
  - editing the mRNA
  - protecting mRNA as it travels through cell
Primary transcript
- Processing mRNA
  - protecting RNA from RNase in cytoplasm
    - add 5' cap
    - add poly-A tail
  - remove introns
- [Figure: mRNA with start codon AUG and stop codon UGA]
Protecting RNA
- 5' cap added
  - modified G nucleotide joined through a triphosphate bridge (G-P-P-P)
  - protects mRNA from RNase (hydrolytic enzymes)
- 3' poly-A tail added
  - 50-250 A’s
  - helps export of RNA from nucleus
- [Figure: mRNA with a UTR at each end]
Dicing & splicing mRNA
- Pre-mRNA → mRNA
  - edit out introns
    - intervening sequences
  - splice together exons
    - expressed sequences
- In higher eukaryotes 90% or more of a gene can be intron
  - no one knows why… yet
  - there’s a Nobel prize waiting…
- “AVERAGE”…
  - average size gene (transcription unit) = 8000 bases
  - average size primary transcript (pre-mRNA) = 8000 bases
  - average size mature mRNA = 1200 bases
  - average size protein = 400 amino acids
  - lots of “junk DNA”
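The "average" figures fit together with a little arithmetic, sketched below using only the numbers on the slide. The variable names are hypothetical, and the calculation treats the whole mature mRNA as coding sequence, which is a simplification (it ignores untranslated regions and the stop codon).

```python
primary_transcript_bases = 8000  # average pre-mRNA, from the slide
mature_mrna_bases = 1200         # average mature mRNA, from the slide

# Three bases per codon, one codon per amino acid.
amino_acids = mature_mrna_bases // 3
print(amino_acids)  # 400

# Share of the primary transcript that is removed during processing.
removed_fraction = 1 - mature_mrna_bases / primary_transcript_bases
print(f"{removed_fraction:.0%} of the primary transcript is edited out")
```

So for the "average" gene roughly 85% of the primary transcript is cut away before translation.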
Discovery of Split genes (1977 | 1993)
- Richard Roberts (NE BioLabs)
- Philip Sharp (MIT)
- adenovirus
  - common cold
Splicing enzymes
- snRNPs
  - small nuclear RNA
  - RNA + proteins
- Spliceosome
  - several snRNPs
  - recognize splice site sequence
    - cut & paste
- RNA as ribozyme
  - some mRNA can splice itself
  - RNA as enzyme
Ribozyme (1982 | 1989)
- RNA as enzyme
- Sidney Altman (Yale)
- Thomas Cech (U of Colorado)
Splicing details
- No room for mistakes!
  - editing & splicing must be exactly accurate
  - a single base added or lost throws off the reading frame
- Accurate splice
  - pre-mRNA: AUGCGGCUAUGGGUCCGAUAAGGGCCAU
  - mRNA:     AUGCGGUCCGAUAAGGGCCAU
  - codons:   AUG|CGG|UCC|GAU|AAG|GGC|CAU
  - protein:  Met|Arg|Ser|Asp|Lys|Gly|His
- Splice off by one base
  - pre-mRNA: AUGCGGCUAUGGGUCCGAUAAGGGCCAU
  - mRNA:     AUGCGGGUCCGAUAAGGGCCAU
  - codons:   AUG|CGG|GUC|CGA|UAA|GGG|CCA|U
  - protein:  Met|Arg|Val|Arg|STOP|
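The two outcomes on this slide can be reproduced by cutting the intron out of the pre-mRNA and translating what is left. In the sketch below the pre-mRNA is written with U throughout, the intron boundaries (positions 6 to 13) are inferred by comparing the slide's pre-mRNA and mRNA sequences, and the helper names `splice` and `translate` are hypothetical. Amino acids are shown in one-letter codes (MRSDKGH = Met Arg Ser Asp Lys Gly His).

```python
from itertools import product

# Standard genetic code in U, C, A, G order; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def splice(pre_mrna: str, intron_start: int, intron_end: int) -> str:
    """Remove the bases from intron_start up to (not including) intron_end."""
    return pre_mrna[:intron_start] + pre_mrna[intron_end:]


def translate(mrna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


pre_mrna = "AUGCGGCUAUGGGUCCGAUAAGGGCCAU"

accurate = splice(pre_mrna, 6, 13)    # removes the 7-base intron CUAUGGG
off_by_one = splice(pre_mrna, 6, 12)  # leaves one intron base (G) behind

print(accurate, translate(accurate))      # AUGCGGUCCGAUAAGGGCCAU MRSDKGH
print(off_by_one, translate(off_by_one))  # AUGCGGGUCCGAUAAGGGCCAU MRVR
```

Leaving a single extra base behind shifts the reading frame, and the shifted frame hits the stop codon UAA after only four amino acids.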
Alternative splicing
- Alternative mRNAs produced from same gene
  - when is an intron not an intron…
  - different segments treated as exons
- Hard to define a gene!
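How one gene can yield several mRNAs can be sketched by listing which segments end up being treated as exons. The gene below is entirely hypothetical: it is assumed to have four candidate exons (named E1 to E4 for illustration), with E1 and E4 always kept and E2 and E3 each either kept or skipped. Real splicing choices are regulated and are not simply all combinations.

```python
from itertools import combinations

# Hypothetical gene layout, for illustration only.
FIRST_EXON = "E1"               # always included
OPTIONAL_EXONS = ["E2", "E3"]   # each may be kept or skipped
LAST_EXON = "E4"                # always included


def possible_mrnas() -> list[list[str]]:
    """List every exon combination, keeping the exons in gene order."""
    mrnas = []
    for n_kept in range(len(OPTIONAL_EXONS) + 1):
        for kept in combinations(OPTIONAL_EXONS, n_kept):
            mrnas.append([FIRST_EXON, *kept, LAST_EXON])
    return mrnas


for mrna in possible_mrnas():
    print("-".join(mrna))
# E1-E4
# E1-E2-E4
# E1-E3-E4
# E1-E2-E3-E4
```

Under these assumptions two optional segments already give four different mRNAs from the same stretch of DNA, which is part of why a gene is hard to define.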
Domains
- Modular architecture of many proteins
  - separate functional & structural regions
  - coded by different exons in same “gene”
The Transcriptional unit (gene?)
- [Figure: transcriptional unit on DNA, shown 3'→5'. Labels: enhancer, promoter with TATA box, RNA polymerase, transcription start, UTR, translation start (TAC), exons, introns, translation stop (ACT), UTR, transcription stop; distance markers 1000+ b and 20-30 b; pre-mRNA 5'→3'; mature mRNA 5'→3' with GTP cap and poly-A tail (AAAAAAAA)]
Any Questions??