1
Changes in DNA
2
Mutations
Any change in the DNA sequence of an organism is a mutation.
Mutation is a decay force whose ultimate roots are in the second law of thermodynamics (entropy). Living things survive inevitable mutations by a combination of being tolerant of a certain level of mutation, repairing mutational damage, killing cells that are mutated beyond repair, and relying on natural selection to remove individuals with unfavorable mutations. Mutations are the source of the altered versions of genes that provide the raw material for evolution. A central tenet of biology is that the flow of information from DNA to protein is one way. DNA cannot be altered in a directed way by changing the environment. Only random DNA changes occur. Some terminology: the genotype is the organism’s genetic constitution; at bottom, it is the sequence of its DNA. The phenotype is the physical characteristics of the organism: its appearance, biochemistry, reactions to the environment, etc. Before DNA sequencing, the genotype was deduced from the phenotypes of parents and offspring. The point of genome annotation is to deduce the phenotype that will result from a given genotype.
3
More Mutation Generalities
Most mutations have no effect on the organism, especially among the eukaryotes, because a large portion of the DNA is not in genes and thus does not affect the organism’s phenotype. Even within genes, mutations can have little or no effect, for two reasons: the genetic code is degenerate, so some mutated codons are translated into the same amino acid; and many amino acid changes have little or no effect on protein function. Of the mutations that do affect the phenotype, the most common effect is lethality, because most genes are necessary for life. From a bioinformatics point of view, the three simplest types of mutation, namely base substitutions, small insertions and deletions (indels), and simple sequence repeats, affect sequence alignment programs. Larger mutations, such as transposable element movements, recombination-induced mutations, and general chromosome rearrangements, affect large-scale issues such as genomic maps.
4
Base Change Mutations
[Figure: substitution table among the bases A, C, G, T with entries 0.6, 0.1, 0.2; layout not recoverable]
The simplest mutations are base changes, where one base is converted to another. (Also called “substitutions” or “point mutations”.) These can be classified as either “transitions”, where one purine is changed to another purine (A -> G, for example) or one pyrimidine is changed to another pyrimidine (T -> C, for example), or “transversions”, where a purine is substituted for a pyrimidine or a pyrimidine is substituted for a purine (A -> C, for example). Transitions are more common than transversions, because they are easier to create and because transitions often have less drastic effects than transversions. Base change mutations are the cause of single nucleotide polymorphisms (SNPs). Mapping SNPs is the current best way to locate human disease genes. Base change mutations are the most common mutations, and they are the easiest to handle for statistics and evolutionary studies.
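The transition/transversion rule above can be sketched in a few lines of Python. This is only an illustration: the function names `classify_substitution` and `ts_tv_counts` are made up for this example, and the two sequences are assumed to be already aligned and of equal length, with no gaps.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be A, C, G or T")
    if ref == alt:
        raise ValueError("not a substitution: bases are identical")
    # Purine <-> purine or pyrimidine <-> pyrimidine is a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ts_tv_counts(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Count (transitions, transversions) between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    ts = tv = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b:
            continue  # identical positions are not mutations
        if classify_substitution(a, b) == "transition":
            ts += 1
        else:
            tv += 1
    return ts, tv
```

For example, `ts_tv_counts("ACGT", "GCTT")` sees one transition (A -> G) and one transversion (G -> T).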
5
Base Change Causes
Base changes occur naturally as errors in replication: the wrong base gets inserted. DNA polymerase has an editing function that detects most errors, then backs up, removes the wrong base and puts in the proper base. Enzymes that replicate RNA don’t have the editing function, so their error rate is about 100 times that of DNA polymerase, causing the high mutation rate of RNA viruses. Various chemical changes in a base can cause mutation. For instance, the spontaneous loss of the amino group on cytosine converts it to uracil (which will pair with A, not G). Environmental chemicals that attach bulky groups onto bases (alkylating agents) can cause the bases to be misread by DNA polymerase.
6
Phenotypic Effects of Base Changes
Mutations can be classified according to their effects on the protein (or mRNA) produced by the gene that is mutated. 1. Silent mutations (synonymous mutations). Since the genetic code is degenerate, several codons produce the same amino acid. In particular, third-base changes often have no effect on the amino acid sequence of the protein. These mutations affect the DNA but not the protein. Therefore they are called neutral mutations: mutations which should have no effect on the organism’s phenotype. 2. Missense mutations. Missense mutations substitute one amino acid for another. Some missense mutations have very large effects, while others have minimal or no effect. It depends on where the mutation occurs in the protein’s structure, and how big a change in the type of amino acid it is. 3. Nonsense mutations convert an amino acid codon into a stop codon. The effect is to shorten the resulting protein. Sometimes this has only a little effect, as the ends of proteins are often relatively unimportant to function. However, nonsense mutations often result in completely non-functional proteins. 4. Sense mutations are the opposite of nonsense mutations. Here, a stop codon is converted into an amino acid codon. Since DNA outside of protein-coding regions contains an average of 3 stop codons per 64, the translation process usually stops after producing a slightly longer protein. Base changes can also affect RNA initiation, splicing and termination.
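The four classes above can be illustrated with a small Python sketch that compares the amino acids encoded by a codon before and after a single-base change. It assumes the standard genetic code; the names `CODON_TABLE` and `classify_codon_change` are invented for this example and are not from any particular library.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; '*' marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-base change within one codon of a coding sequence."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must be 3 bases long")
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("expected exactly one base difference")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "silent"    # same amino acid (or stop stays stop)
    if alt_aa == "*":
        return "nonsense"  # amino acid codon -> stop codon
    if ref_aa == "*":
        return "sense"     # stop codon -> amino acid codon
    return "missense"      # one amino acid replaced by another
```

With this table, GAA -> GAG is silent (both glutamate), GAG -> GTG is missense (glutamate to valine), TGG -> TGA is nonsense, and TAA -> CAA is a sense mutation. The table also contains exactly 3 stop codons out of 64, the figure used in the text.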
8
More on Substitution
In addition to synonymous mutations, some amino acid changes are “conservative” in that they have little or no effect on the protein’s function. For example, isoleucine and valine are both hydrophobic and readily substitute for each other. Other amino acid substitutions are very unlikely, such as leucine (hydrophobic) for aspartic acid (hydrophilic and charged). This would be a non-conservative substitution. Some amino acids play unique roles: cysteines form disulfide bridges, prolines induce kinks in the chain, etc. Also, some amino acids are critical for active sites and cannot be substituted. Tables of substitution frequencies for all pairs of amino acids have been generated. BLOSUM62 Table: numbers on the diagonal indicate the likelihood of the amino acid staying the same. The off-diagonal numbers are relative substitution frequencies.
9
Indels
Another simple type of mutation is the gain or loss of one or a few bases. These mutations are called indels, which is short for “insertion/deletion”. When comparing two species it isn’t easy to tell whether an insertion occurred in one species or a deletion occurred in the other. Indels are thought to be generated when the DNA polymerase slips forward or backward on the template DNA it is copying. This occurs most easily in repeated sequences, but can occur anywhere. A second cause of short indels is chemical- or radiation-induced loss of the base portion of the nucleotide. The DNA polymerase often skips right over these sugar/phosphate stumps, leaving a missing base in the resulting DNA chain.
10
Frameshifts and Reversions
Translation occurs codon by codon, examining nucleotides in groups of 3. If a nucleotide or two is added or removed, the grouping of the codons is altered. This is a frameshift mutation, where the reading frame of the ribosome is altered. Frameshift mutations result in all amino acids downstream from the mutation site being completely different from wild type. These proteins are generally non-functional. A reversion is a second mutation that reverses the effects of an initial mutation, bringing the phenotype back to wild type (or almost). Frameshift mutations sometimes have “second site reversions”, where a second frameshift downstream from the first frameshift restores the original reading frame and so reverses the effect.
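A toy Python example may make the frameshift and second-site reversion concrete. The sequence below is invented for illustration, the standard genetic code is assumed, and the helper names (`translate`, `insert_base`, `delete_base`) are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; '*' marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first base, codon by codon, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):  # a trailing partial codon is ignored
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def insert_base(dna: str, pos: int, base: str) -> str:
    """Insert one base before index pos."""
    return dna[:pos] + base + dna[pos:]


def delete_base(dna: str, pos: int) -> str:
    """Delete the base at index pos."""
    return dna[:pos] + dna[pos + 1:]


wild_type = "ATGGCCAAAGGTCTGGAATTTTGA"       # made-up coding sequence
shifted = insert_base(wild_type, 6, "C")    # +1 frameshift after the 2nd codon
reverted = delete_base(shifted, 12)         # -1 frameshift further downstream

print(translate(wild_type))  # MAKGLEF
print(translate(shifted))    # MAQRSGIL
print(translate(reverted))   # MAQRLEF
```

After the insertion every downstream amino acid changes and the original stop codon is no longer read in frame. The downstream deletion restores the frame, so only the two residues between the two mutation sites still differ from wild type, which is why such a reversion can bring the phenotype back to wild type "or almost".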
11
Microsatellites/Simple Sequence Repeats
Two names for the same phenomenon. During replication, DNA polymerase can “stutter” when it replicates several tandem copies of a short sequence, say 2-5 bp. For example, CAGCAGCAGCAG, 4 copies of CAG, will occasionally be converted to 3 copies or 5 copies by DNA polymerase stuttering. Outside of genes, this effect produces useful genetic markers called SSRs (simple sequence repeats). They are heavily used in genetic mapping, for several reasons. They are easy to detect. They are fairly stable across generations, yet have a high enough mutation rate that many alleles exist in the population. They are found in many locations in the genome of all organisms. Within a gene, this effect can cause certain amino acids to be repeated many times within the protein. In some cases this causes disease.
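As a rough illustration of why SSRs are easy to detect, the following Python sketch finds tandem repeats with a regular expression. The function name `find_ssrs` and its default thresholds (units of 2-5 bp, at least 3 copies) are assumptions made for this example; real SSR-finding tools use more careful rules.

```python
import re


def find_ssrs(seq: str, min_unit: int = 2, max_unit: int = 5, min_copies: int = 3):
    """Return (start, unit, copies) for each tandem repeat found in seq.

    The unit is matched lazily, so the shortest repeating unit is preferred,
    and the back-reference \\1 requires identical copies with no mismatches.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits
```

On the example from the text embedded in flanking sequence, `find_ssrs("TTCAGCAGCAGCAGTT")` reports one repeat: the unit CAG, starting at index 2, present in 4 copies. A stutter to 3 or 5 copies would change only the last number, which is what makes the copy count usable as an allele.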
12
Huntington Disease
A dominant autosomal disease, with most affected people heterozygotes. Onset usually in middle age. Neurological: starts with irritability and depression, includes fidgety behavior and involuntary movement (chorea), followed by psychosis and death. Caused by CAG repeats within the coding region, giving a tract of glutamines. Below 28 copies is normal; between 28 and 34 copies is the premutation allele: normal phenotype but unstable copy number that puts the next generation at risk. Above 34 copies gives the disease. HD shows “anticipation”: the age of onset gets earlier with every generation. This is because higher copy numbers go with earlier onset, and the copy number tends to increase from one generation to the next. There is a genetic test for the disease, but in the absence of effective treatment few actually take the test. The function of the protein remains unknown; the excess glutamines may cause it to aggregate and lose function.
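The copy-number ranges on this slide translate directly into a small classifier. The Python sketch below uses the slide's cut-offs (below 28, 28 to 34, above 34); clinical guidelines may draw the boundaries somewhat differently, and the function names are invented for this example.

```python
import re


def longest_cag_run(seq: str) -> int:
    """Return the number of copies in the longest uninterrupted CAG tract."""
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_hd_allele(copies: int) -> str:
    """Classify an allele by CAG copy number, using the slide's thresholds."""
    if copies < 28:
        return "normal"
    if copies <= 34:
        return "premutation"  # normal phenotype, unstable copy number
    return "disease"
```

For instance, a sequence containing 30 uninterrupted CAG copies would be reported as a premutation allele by `classify_hd_allele(longest_cag_run(seq))`.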
13
Larger Scale Mutations
Larger mutations include insertion of whole new sequences, often due to movements of transposable elements in the DNA or to chromosome changes such as inversions or translocations. Deletions of large segments of DNA also occur. These phenomena affect the order of genes on the chromosome. In classical genetics, synteny means that two genes are on the same chromosome. This term has a slightly different meaning in genomics and bioinformatics: that a group of genes is in the same order on the chromosome in different species. Synteny tends to be conserved in closely related species, but breaks down in more distantly related species. Also, the genes at the breakpoints of a large-scale mutation are often broken in half or otherwise disrupted.
14
Transposable Elements
Transposable elements are DNA sequences that move from place to place in the genome. Unlike genes, transposable elements don’t have a fixed location on the chromosome. Transposable elements are essentially parasites. In general they don’t contribute to the evolutionary fitness of the organism. Most of the genes in an organism are necessary, at least under some circumstances, for the organism’s survival. Genes avoid being destroyed by random mutations because individuals with mutated genes are less fit: don’t survive or reproduce as well as unmutated individuals. Transposable elements avoid being destroyed by increasing their numbers by enough to keep some functional copies present even if some are destroyed. However, too much increase in numbers will kill the organism because sometimes transposable elements insert within a gene, inactivating it.
15
More Transposable Elements
Two basic types: those that are strictly DNA, and those that replicate through an RNA intermediate. These are sometimes called type 1 and type 2, but I have a hard time keeping those arbitrary numbers straight. The most important nomenclature issue is that the prefix “retro-” implies the use of reverse transcriptase, which copies RNA into DNA, the defining characteristic of RNA-intermediate transposable elements. Eukaryotes often contain very short ( bp) elements that contain the ends of a longer DNA transposon and miscellaneous junk inside. They move to new locations using the transposase enzyme from a full-length element. Most bacterial TEs are DNA only. In eukaryotes, DNA transposable elements occur, but are less common than retrotransposons. Transposable elements were first studied by Barbara McClintock in corn. They are an important source of the variation seen in ornamental flowers. The most common type in bacteria is the Insertion Sequence (IS): roughly 1-3 kbp long, containing a transposase gene, and bounded by short (10-40 bp) inverted repeats. There are many different families, not well conserved across species. Transposons are longer TEs, usually composed of 2 IS elements and one or more genes in between, often an antibiotic resistance gene.
16
Retro Elements
RNA transposable elements are called retrotransposons in eukaryotes. They are characterized by the use of reverse transcriptase in their life cycle. They are related to retroviruses, such as HIV, feline leukemia virus, etc. Retrotransposons lack the gene necessary to move outside the cell. There are a variety of retro element types, some of which contain long terminal repeats (LTRs) and some of which don’t. Also, there are many non-functional, degenerate sequences in eukaryotic genomes that started out as retrotransposons; these make up as much as 25% of the human genome. In bacteria, the common RNA TE is a “mobile group II intron”. When transcribed into messenger RNA, these introns can splice themselves out without the need for proteins. Group II introns contain a gene for reverse transcriptase, which copies the RNA back into DNA at a new location in the genome.
17
Recombination-Induced Mutations
Most recombination occurs between homologous sites: two chromosomes line up in meiosis and have a break-and-rejoin event at the same location, resulting in daughter chromosomes that contain a mixture of alleles from both parents. However, any two sites that contain similar DNA sequences can pair up and have a crossover. These events can significantly rearrange the genome.
18
Hemophilia A: Inversion Problems
The clotting factor VIII gene, F8, is on the X chromosome, and mutations in it are the major cause of hemophilia. F8 is a large gene, and completely contained within intron 22 are two small genes transcribed from the opposite strand. One of these genes, F8A, has another copy several hundred kb away, on the opposite strand. Thus, these two very similar genes are in opposite orientation. Sometimes these two regions pair during meiosis and recombination occurs between them. This results in an inversion. The inversion completely disrupts the main F8 gene, because its 5’ half is now inverted and far away from its 3’ half. This accounts for about 45% of hemophilia A cases. Almost all new cases arise during male meiosis: in females, the two homologous X chromosomes are paired, which seems to inhibit this inversion.
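The effect of recombination between two inverted repeats can be sketched by treating the chromosome as an ordered list of (element, strand) pairs. This is a cartoon, not a model of the real locus: the element names, the "+" and "-" labels, and the function `invert` are assumptions made for illustration, and the real region contains much more than five elements.

```python
def invert(chromosome, i, j):
    """Invert elements i..j (inclusive): reverse their order and flip strands."""
    flip = {"+": "-", "-": "+"}
    segment = [(name, flip[strand]) for name, strand in reversed(chromosome[i:j + 1])]
    return chromosome[:i] + segment + chromosome[j + 1:]


# Simplified layout: the two F8A copies are in opposite orientation,
# and the intron 22 copy sits between the two halves of F8.
before = [
    ("F8A_distal_copy", "+"),
    ("intervening_DNA", "+"),
    ("F8_5prime_half", "+"),
    ("F8A_intron22_copy", "-"),
    ("F8_3prime_half", "+"),
]

# A crossover between the two F8A copies inverts everything between them.
after = invert(before, 1, 2)
for element in after:
    print(element)
```

In `after`, the 5' half of F8 is on the opposite strand and is separated from the 3' half by the intervening DNA, so no intact F8 transcript can be made. Applying the same inversion a second time restores the original order.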
19
Tandem Duplications
Genes are duplicated if there is more than one copy present in the haploid genome. Some duplications are “dispersed”, found in very different locations from each other. Other duplications are “tandem”, found next to each other. Tandem duplications play a major role in evolution, because it is easy to generate extra copies of the duplicated genes through the process of unequal crossing over. These extra copies can then mutate to take on altered roles in the cell, or they can become pseudogenes, inactive forms of the gene, by mutation. Most commonly, tandem duplications affect only one gene, resulting in an array of very similar genes. Sometimes duplicated regions exist within a gene, which can cause havoc in trying to align the sequences.
20
Unequal Crossing Over
Unequal crossing over happens during prophase of meiosis 1. Homologous chromosomes pair at this stage, and sometimes pairing occurs between the similar but not identical copies of a tandem duplication. If a crossover occurs within the mispaired copies, one of the resulting gametes will have an extra copy of the duplication and the other will be missing a copy. As an example, the beta-globin gene cluster in humans contains 6 genes, called epsilon (an embryonic form), gamma-G, gamma-A (the gammas are fetal forms), pseudo-beta-one (an inactive pseudogene), delta (1% of adult beta-type globin), and beta (99% of adult beta-type globin). Gamma-G and gamma-A are very similar, differing by only 1 amino acid. If mispairing in meiosis occurs, followed by a crossover between delta and beta, the hemoglobin variant Hb-Lepore is formed. This is a gene that starts out delta and ends as beta. Since the gene is controlled by DNA sequences upstream from the gene, Hb-Lepore is expressed as if it were a delta. That is, it is expressed at about 1% of the level that beta is expressed. Since normal beta globin is absent in Hb-Lepore, the person has severe anemia.
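The delta/beta mispairing can be sketched with the gene cluster written as a Python list. This is a simplified picture under stated assumptions: each gene is treated as a single unit, the crossover is assumed to fall inside the mispaired genes, and the function name `unequal_crossover` and the "fusion" labels are invented for this example.

```python
def unequal_crossover(chrom_a, chrom_b, gene_a, gene_b):
    """Cross over inside gene_a on chrom_a mispaired with gene_b on chrom_b.

    Returns the two recombinant chromosomes. Each fusion gene is named
    'start/end', where 'start' supplies the upstream (5') part.
    """
    i = chrom_a.index(gene_a)
    j = chrom_b.index(gene_b)
    product_1 = chrom_a[:i] + [f"{gene_a}/{gene_b} fusion"] + chrom_b[j + 1:]
    product_2 = chrom_b[:j] + [f"{gene_b}/{gene_a} fusion"] + chrom_a[i + 1:]
    return product_1, product_2


cluster = ["epsilon", "gamma-G", "gamma-A", "pseudo-beta-1", "delta", "beta"]
lepore, reciprocal = unequal_crossover(cluster, cluster, "delta", "beta")
print(lepore)      # 5 genes, ending in the delta/beta fusion
print(reciprocal)  # 7 genes: delta, a beta/delta fusion, and beta
```

The first product has lost the separate delta and beta genes and carries only the fusion that starts as delta and ends as beta, which is the Hb-Lepore arrangement described above. The reciprocal product has gained a gene, illustrating the "extra copy" gamete.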
21
Chromosome Breaks
DNA sometimes breaks due to mechanical stress, ionizing radiation, or chemical attack. Most organisms contain enzymes that rejoin broken DNA molecules, a process called non-homologous end joining. If there is more than one break, ends are joined randomly, which can lead to a rearranged genome. This breaks up blocks of genes over evolutionary time.
22
Chromosome Rearrangements
When comparing mammalian genomes, it is clear that synteny is common: when two genes are neighbors in one species, they are usually neighbors in other species. However, comparing the genomes of two species shows the results of multiple translocations and inversions. Blocks of syntenic genes are seen, but often spread across multiple chromosomes. The average size of synteny blocks between mouse and human is 10 Mbp. This is partly a consequence of the fact that genes on a chromosome mostly don’t interact with their neighbors. New centromeres often form in what was previously euchromatin. Centromere sequences evolve rapidly. The difference in chromosome number between humans and chimps (23 vs 24 pairs) is due to a translocation that connected the long arms of two ape chromosomes into a single human chromosome. A notable exception is the X chromosome: most X genes stay on the X over long evolutionary time. The likely reason is problems with dosage compensation.
24
Genome Changes in Evolution
There are very few genes found in humans and nowhere else. Most of the differences between us and our closest relatives are changes in gene families, altered functions of existing genes, and changes in regulatory sequences. Human vs. chimpanzee, for sequences that can be aligned: 1.2% base substitutions, plus 3% differences in insertions and deletions (indels). There are fewer indels than base substitutions, but indels can cover many more bases. 1500 inversions, from very small (23 bp) to 62 Mbp. 23 bp is at the detection limit for BLAST searches, and there are probably plenty of smaller inversions. Several hundred changes in gene family copy number. Lots of changes in repeat sequences (3 times as many Alu elements in humans as in chimps). Loss of function in about 80 genes (half of which are olfactory receptors). About 29% of all proteins with clear orthologs are identical between humans and chimps, and most of the rest differ by only 1 or 2 amino acids.
25
Whole Genome Duplication
As the name implies, a whole genome duplication is an event where the genome size doubles, going from diploid to tetraploid. These events also require the chromosomes to pair up as if they were diploids during meiosis. Otherwise the organism would not produce offspring. Common in plants, but very rare in animals. Plants can undergo many generations of clonal (non-sexual) propagation. There were two duplications in the vertebrate lineage, between when tunicates (urochordates) split from the rest of the chordates and when the cephalochordates (like Amphioxus) split off. There was a third duplication in the bony fish lineage, after it split from the tetrapod lineage. Maintaining a polyploid state occurs frequently in amphibians and reptiles, but it is thought that X chromosome inactivation and the problems of maintaining gene balance with 2 different sex chromosomes make this very difficult in the mammals. The problem can be seen in the abnormalities associated with XXY and XO individuals: Klinefelter and Turner syndromes.
26
Diploidization
After a genome duplication, most of the genes are duplicated. What follows is a period of diploidization, a return toward the stable diploid state, during which many genes lose one or the other copy. The result is that most genes end up with just one copy. Some genes retain both copies, and often there will be a functional divergence: they take on different roles. Notably, the Hox genes have retained all 4 copies: there are 4 clusters on different chromosomes that are recognizably similar all the way from the coelacanths (lobe-finned fishes on the tetrapod side of the fish/tetrapod split) to humans.
27
Hox Genes
Hox genes specify segment identity: different members of the cluster are expressed in different segments as you move from anterior to posterior. Hox genes make transcription factors. The order of the genes on the chromosome is the same as the order of their expression along the body. The same mechanism is used in all bilaterian animals. It was first described and understood in Drosophila. Conservation is enough that a Drosophila Hox gene works correctly when put into chickens. Hox genes contain a homeobox domain, which is also found in plants and serves a similar role in development.
28
Horizontal Gene Transfer
In eukaryotes, there is little doubt that almost all genes are transmitted from parent to offspring, with each species having a separate line of descent. Large exceptions: the endosymbionts, that is, the mitochondria and chloroplasts. Many genes from these formerly free-living organisms have migrated into the nucleus. There are other cases of single genes being transferred horizontally. Strictly vertical descent is much less true in the prokaryotes, where a great deal of DNA is transferred across species lines. I have seen an estimate that 15% of all prokaryotic genes are derived from horizontal transfers. Horizontal gene transfer is usually identified by performing phylogenetic lineage studies on individual genes, and seeing that some gene has more in common with genes in distant species than with genes in closely related species.
29
Sources of New DNA
Bacteria reproduce by binary fission: replicating their DNA, then splitting in half. Each cell has only 1 parent, and there is no regular sexual process. Bacteria have 3 main ways of bringing in new DNA. Conjugation: direct transfer of DNA between 2 cells (although not necessarily of the same species). Transduction: transfer of DNA between cells using a bacteriophage (virus) as an intermediate. Transformation: the cell takes up DNA molecules from the environment.
30
Lysogenic Bacteriophage
Bacteriophage (phage) are bacterial viruses: DNA (or RNA) surrounded by a protein coat, but with no internal metabolic activity. Most bacteriophage enter the cell, hijack its machinery to reproduce themselves, and then kill the cell by lysing it (breaking it open). This is called the lytic cycle. Some phage have the ability to insert themselves into the bacterial genome and remain there, inactive, for many generations: the lysogenic cycle. This was first described in phage lambda. The inserted phage chromosome is called the prophage. When conditions get harsh, the phage DNA comes out of the chromosome and enters the normal lytic pathway. It reproduces and kills the host cell. Sometimes the prophage is inactivated by mutation and becomes a permanent part of the chromosome.