1
Basic concepts in molecular evolution
Gene: Sequence of DNA or RNA that either
- has a potentially functional transcript: protein-coding, tRNA, rRNA, snRNA, miRNA, ... (sometimes referred to as a "productive gene"), or
- serves as a regulatory or structural element: enhancer hub, centromeres, telomeres, ... (sometimes referred to as an "untranscribed gene" in the literature)
Pseudogene: A non-functional DNA element with a high degree of similarity to a functional (typically productive) gene
U.S. Dept of Energy Human Genome Program,
How would I know if a piece of DNA is functional or not?
2
Homologous genes: genes that share a common evolutionary origin
1. Orthologous genes - descendants of an ancestral gene that was present in the last common ancestor of two or more species, so resulting from a speciation event
e.g. α-globin in mouse & α-globin in human
2. Paralogous genes - arose by gene duplication within a lineage
e.g. α-globin in mouse & β-globin in mouse
Memory aid: if similar genes are present in the same genome, they must be paralogues.
But that does not necessarily mean that similar genes in different organisms are orthologues.
e.g. α-globin in mouse & β-globin in human are paralogues, because the α-globin and β-globin genes arose by duplication (long ago, in an ancestor of mouse and human)
3
“Typical” eukaryotic protein-coding gene
mRNA coding sequence
Where is the promoter? 5’ UTR? 3’ UTR?
What regions will be present in the mature mRNA?
Is there an error in this figure?
Fig.1.4
4
Fig.1.4 Cis-acting elements and trans-acting factors
[Fig.1.4: promoter bound by “RNA polymerase”; pre-mRNA (5’ to 3’, AUG ... UAA) processed by the “splicing machinery”; mRNA (5’ to 3’, AUG ... UAA) paired with a regulatory small RNA (antisense)]
e.g. RNA cis-element (5’ splice site)
e.g. cis-element for RNA stability in 3’ UTR
Cis-acting element: DNA (or RNA) sequences near a gene that are important for its expression
Trans-acting factor: protein (or RNA) that binds to a cis-element to control gene expression
5
“Typical” bacterial gene organization
How many promoters are in the region shown in this figure? 2
How many proteins are encoded? 3
Operon = cluster of co-transcribed genes
Evolutionary advantages of operon organization?
- efficiency - co-ordination of gene expression
- economy - less space in genome
Fig.1.6
6
Typical prokaryotic gene: lacI in E. coli
     ----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
lacI CGAAGCGGCAUGCAUUUACGUUGACACCAUCGAAUGGCGCAAAACCUUUCGCGGUAUGGCAUGAUAGCGCCCGGAAGAGAGUCAAUUCAG
lacI GGUGGUGAAUGUGAAACCAGUAACGUUAUACGAUGUCGCAGAGUAUGCCGGUGUCUCUUAUCAGACCGUUUCCCGCGUGGUGAACCAGGC
......
lacI GCAGCUGGCACGACAGGUUUCCCGACUGGAAAGCGGGCAGUGAGCGCAACGCAAUUAAUGUGAGUUAGCUCACUCAUUAGGCACCCCAGG
     ----|----|----|----|----|----|----|----|----|----|---
lacI CUUUACACUUUAUGCUUCCGGCUCGUAUGUUGUGUGGAAUUGUGAGCGGAUAA
Features marked on the sequence: stop codon of the upstream mhpR, transcription start site, SD, start codon, stop codon
ssu rRNA --GAUCACCUCCUUA 3'
mRNA GUGGUGGGA---- 5'
Frequent overlap between stop codon and start codon (of the downstream gene): ---UAAUG---, ---AUGA---
7
Gene family: an inclusive set of functionally diverged paralogous genes. (Gene duplication is typically followed by subfunctionalization, neofunctionalization or degradation/deletion.) It may include pseudogenes.
Examples:
- Human immunoglobulin genes: IgA, IgD, IgE, IgG, IgM
- Alpha globin gene family
- Beta globin gene family
- Ubiquitin-specific protease gene family
Xuhua Xia
8
Human HBA on Chr 16
9
Human HBB family at Chr 11
10
Human HBB family at Chr 11 G A
11
Protein-coding genes
DNA
5’ …. ATG GGA TTG CCC GCC …. 3’ “coding strand”
3’ …. TAC CCT AAC GGG CGG …. 5’ “template strand”
mRNA
5’ …. AUG GGA UUG CCC GCC …. 3’
DNA is usually shown as single-stranded, with the coding strand in 5’ to 3’ orientation, so the genetic code table can be used directly.
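The relationship between the coding strand, the template strand and the mRNA can be sketched in a few lines of Python. This is only an illustration using the sequence from the slide; the helper names `template_strand` and `transcribe` are made up for the example.

```python
# Hypothetical helper names; the sequence is the one shown on the slide.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def template_strand(coding_5to3: str) -> str:
    """Template strand written 3'->5', so it lines up base by base
    under the coding strand written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in coding_5to3)

def transcribe(coding_5to3: str) -> str:
    """The mRNA carries the coding-strand sequence with U in place of T."""
    return coding_5to3.replace("T", "U")

coding = "ATGGGATTGCCCGCC"
print("coding   5'", coding, "3'")
print("template 3'", template_strand(coding), "5'")
print("mRNA     5'", transcribe(coding), "3'")
```

Because the mRNA differs from the coding strand only by U for T, codons can be read straight off the coding strand.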
12
Transcription and Translation
[Figure: genes transcribed by RNA polymerase into a polycistronic mRNA, which ribosomes translate into protein; charged tRNAs shown: GCC~tRNAGly, UCC~tRNAGly]
Initiation: Met-Gly-...
Elongation: M(n) + M → M(n+1)
13
Ribonucleotide concentration
Ribonucleotide   Concentration (10^-12 moles per 10^6 cells)
rATP             1890
rCTP             53
rGTP             190
rUTP             130
Measured in exponentially proliferating chick embryo fibroblasts, 2 hrs.
The difference is expected to be more extreme in mitochondria.
NNA would seem to be a more efficient codon than NNC.
XIA, X., Genetics 144:
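To make the size of the imbalance concrete, the pool fractions and the rATP:rCTP ratio can be computed from the four values on the slide. This is a small arithmetic sketch; the variable names are arbitrary.

```python
# Values copied from the slide (10^-12 moles per 10^6 cells).
rntp = {"rATP": 1890, "rCTP": 53, "rGTP": 190, "rUTP": 130}

total = sum(rntp.values())
fractions = {name: amount / total for name, amount in rntp.items()}
atp_to_ctp = rntp["rATP"] / rntp["rCTP"]

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%} of the ribonucleotide pool")
print(f"rATP is roughly {atp_to_ctp:.0f} times as abundant as rCTP")
```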
14
Standard Genetic Code
Codon families have 1 – 6 members
Synonymous and nonsynonymous substitutions
0-fold, 2-fold, 3-fold, 4-fold degenerate sites
0-fold degenerate = non-degenerate
5’ …. AUG GGA UUG CCC CAC …. 3’
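One way to see what 0-fold, 2-fold, 3-fold and 4-fold degenerate sites mean is to classify every site of the example codons. The sketch below assumes the standard genetic code and one common convention: a site is n-fold degenerate when n of the four possible nucleotides at that site give the same amino acid, and 0-fold when every change alters it. The function name `degeneracy` is made up for the example.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def degeneracy(codon: str, pos: int) -> int:
    """Fold-degeneracy of one codon site (pos = 0, 1 or 2).

    Returns 0 if all three alternative nucleotides change the meaning,
    otherwise 1 + the number of synonymous alternatives (2, 3 or 4).
    """
    synonymous = sum(
        CODE[codon[:pos] + base + codon[pos + 1:]] == CODE[codon]
        for base in BASES
        if base != codon[pos]
    )
    return synonymous + 1 if synonymous else 0

for codon in "AUG GGA UUG CCC CAC".split():
    print(codon, CODE[codon], [degeneracy(codon, i) for i in range(3)])
```

A 3-fold site does not occur in the slide's example sequence; the third position of the Ile codon AUU is one instance.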
15
Standard code
4^3 = 64 possible codons
Codon families have 1 – 6 members
Initiation codon
When translating a nt sequence, always be sure to read it in the 5’ to 3’ direction!
5’ …. AUG GGA UUG CCC CAC …. 3’
N-terminus … Met Gly Leu Pro His … C-terminus
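The translation of the example sequence, and the reason the reading direction matters, can be checked with a short Python sketch. It assumes the standard genetic code and reads codons from the first nucleotide given; the function name `translate` is made up for the example.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
THREE = dict(zip(
    "ACDEFGHIKLMNPQRSTVWY",
    "Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr".split(),
))

def translate(mrna_5to3: str) -> list[str]:
    """Translate codon by codon from the 5' end, stopping at a stop codon."""
    peptide = []
    for i in range(0, len(mrna_5to3) - 2, 3):
        aa = CODE[mrna_5to3[i:i + 3]]
        if aa == "*":
            break
        peptide.append(THREE[aa])
    return peptide

mrna = "AUGGGAUUGCCCCAC"
print("5'->3':", "-".join(translate(mrna)))
# Reading the same string backwards (3'->5') gives a different, meaningless peptide.
print("3'->5':", "-".join(translate(mrna[::-1])))
```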
16
Genetic code is not “universal”
Some mitochondria, a few bacteria, and a few protists use a non-standard code.
Vertebrate mitochondrial code (compared with the “standard” genetic code table):
- UGA = Trp (instead of stop codon)
- AUA, AUG = Met
- AGA, AGG = stop codons
Possible implications of different codes in nature?
“Defense” against foreign DNA invading the genome?
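The effect of a non-standard code can be illustrated by translating one sequence under both tables. The sketch below builds the vertebrate mitochondrial table from the standard one using only the reassignments listed on the slide; the test sequence and the function name `translate` are invented for the example.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Reassignments listed on the slide for the vertebrate mitochondrial code.
VERT_MITO = {**STANDARD, "UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"}

def translate(mrna_5to3: str, code: dict) -> str:
    """One-letter translation from the 5' end, stopping at a stop codon."""
    peptide = ""
    for i in range(0, len(mrna_5to3) - 2, 3):
        aa = code[mrna_5to3[i:i + 3]]
        if aa == "*":
            break
        peptide += aa
    return peptide

differences = {c: (STANDARD[c], VERT_MITO[c]) for c in STANDARD if STANDARD[c] != VERT_MITO[c]}
print(differences)

made_up_mrna = "AUGAUAUGAAGA"  # invented sequence, chosen to hit the reassigned codons
print("standard:", translate(made_up_mrna, STANDARD))
print("vertebrate mitochondrial:", translate(made_up_mrna, VERT_MITO))
```

A gene moved between the two systems would be read differently, which is one way to think about the "defense against foreign DNA" question.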
17
AMINO ACIDS – Venn diagram showing properties
Acidic: Asp = GAU, GAC; Glu = GAA, GAG
Basic: His, Lys, Arg
Fig. 1.9
18
Why study amino acid properties?
Protein properties often depend on the properties of their amino acids:
- Effect of mutation
- Diagnosis, e.g., protein electrophoresis
Normal polypeptide (Hb-A): Val-His-Leu-Thr-Pro-Glu-Glu…… GAA
Sickle-cell polypeptide (Hb-S): Val-His-Leu-Thr-Pro-Val-Glu…… GUA
19
Amino acid substitutions:
(polarity, molecular volume...)
Grantham’s distance: F(V, P, C)
Miyata’s distance: F(V, P)
Amino acid substitutions:
- Conservative: Ile ↔ Leu
- Radical: Cys ↔ Trp
Table 4.7
20
Amino acid substitution matrices
   ----|----|----|----|----|----|----|----|----|----|----|----|--
S1 RWFFSTNHKDIGTLYLVFGAWAGMVGTALSLLIRAELSQPGALLGDDQIYNVIVTAHAFVMI
S2 RWLFSTNHKDIGTLYLLFGAWAGVLGTALSLLIRAELGQPGNLLGNDHIYNVIVTAHAFVMI
BLOSUM = BLOcks SUbstitution Matrix: a substitution matrix used for sequence alignment of proteins (to score alignments between evolutionarily divergent protein sequences).
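Before any matrix is applied, it helps to see which aligned positions actually differ between S1 and S2, since those are the pairs a substitution matrix has to score. The following sketch only lists the mismatches and the percent identity; it does not use BLOSUM values, and the variable names are arbitrary.

```python
# The two aligned sequences from the slide (no gaps).
s1 = "RWFFSTNHKDIGTLYLVFGAWAGMVGTALSLLIRAELSQPGALLGDDQIYNVIVTAHAFVMI"
s2 = "RWLFSTNHKDIGTLYLLFGAWAGVLGTALSLLIRAELGQPGNLLGNDHIYNVIVTAHAFVMI"

# (1-based position, residue in S1, residue in S2) for every mismatch.
differences = [
    (i + 1, a, b)
    for i, (a, b) in enumerate(zip(s1, s2))
    if a != b
]
identity = 1 - len(differences) / len(s1)

for position, a, b in differences:
    print(f"position {position}: {a} -> {b}")
print(f"{len(differences)} differences, {identity:.1%} identity")
```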
21
BLOSUM62 matrix
- based on observed frequencies of amino acids replacing other amino acids during protein evolution, particularly within conserved regions
- Positive value for chemically similar substitutions: Leu to Ile = 2
- Negative value for dissimilar substitutions: Cys to Trp = -2
- Large value for rare amino acids with no change (diagonal), usually correlated with important function: Cys unchanged = 9
22
For the 61 sense codons, how many substitution mutations are possible?
4^3 = 64 possible codons
3 are stop codons: UAA, UAG, UGA
Change of 1st position of codon to each of 3 other nucleotides…
2nd position of codon…
3rd position of codon…
(3 + 3 + 3) x 61 = 549
How many lead to amino acid changes (i.e. non-synonymous substitutions)? (Table 1.5)
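The count of 549 and the breakdown by outcome can be reproduced by brute force. The sketch below assumes the standard genetic code, enumerates every single-nucleotide change in each of the 61 sense codons, and sorts the results into synonymous, missense (amino acid change) and nonsense (change to a stop codon); whether nonsense changes are grouped with non-synonymous ones depends on the convention used.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SENSE = [codon for codon, aa in CODE.items() if aa != "*"]

counts = Counter()       # totals per outcome
by_position = Counter()  # totals per (codon position, outcome)
for codon in SENSE:
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODE[mutant] == "*":
                kind = "nonsense"
            elif CODE[mutant] == CODE[codon]:
                kind = "synonymous"
            else:
                kind = "missense"
            counts[kind] += 1
            by_position[pos + 1, kind] += 1

print(len(SENSE), "sense codons,", sum(counts.values()), "possible substitutions")
print(dict(counts))
for pos in (1, 2, 3):
    print("position", pos, {k: by_position[pos, k] for k in ("synonymous", "missense", "nonsense")})
```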
23
Change in 2nd position of codon always alters the amino acid encoded
[Table: synonymous vs. nonsynonymous outcomes of nt substitutions at the 1st, 2nd and 3rd codon positions]
Note: This table summarizes the outcome of nt substitutions at each site within a codon, not the frequency of change seen in nature.
2nd position: a change in the 2nd position of a codon always alters the amino acid encoded
3rd position: a 3rd position change often is “silent” (encoding the same aa)
24
… but in nature, nt subs do not all occur with equal frequency:
From Table 1.3, we can see that most types of nt substitutions result in amino acid changes…
… but in nature, nt subs do not all occur with equal frequency, and synonymous subs occur much more frequently than non-synonymous ones. That’s expected because most amino acid changes would be detrimental: syn subs usually >> non-syn subs.
Also, some amino acids are more common than others in proteins, e.g. Cys is typically rare (but often has an important function in protein folding).
For protein Y of 200 amino acids, approximately how many non-synonymous sites are expected in its mRNA?