DNA Polymerase and the Replication Fork


1 DNA Polymerase and the Replication Fork
Lecture 1: DNA Polymerase and the Replication Fork. Use of biochemistry (assays) and genetics (mutant phenotypes) to define function. Fidelity/Specificity: bioregulation through substrate control of molecular choice.
The goal of this lecture is to introduce you to some of the concepts and approaches that will be used throughout the course. It will also serve as a bridge to some of the concepts and approaches that you learned in Macromolecules and that you will learn in Genetics.

2 Using both Biochemistry and Genetics to understand function
reverse genetics (targeted gene disruption by HR, ZFN, TALEN, CRISPR/Cas9)
We are trying to understand how biological systems function. These functions are operationally defined by biochemical assays, which specify protein activities, or by mutant phenotypes, which specify gene functions. Hence, developing ways to assay activities and characterize phenotypes is critical to understanding function. The most straightforward biochemical assays to develop, and the ones you see most frequently, are assays that simply detect binding interactions. More challenging are assays that detect chemical changes, such as biomolecular synthesis or protein modification. For many other activities, however, developing a biochemical assay can be quite difficult. These include: (1) activities that change the conformation of a protein, nucleic acid, or complex; (2) activities that modulate the fidelity, speed, reliability, specificity, timing, or dependency of another activity; (3) activities that are only released when specific proteins or nucleic acid components are present in just the right context. Nonetheless, once you have an assay in hand, you can purify the activity, identify the proteins and/or nucleic acids responsible for that activity, and study the function biochemically. The advantages of using biochemistry to study function are: (1) one can demonstrate that a specific set of proteins and/or nucleic acids act directly to carry out the function; (2) one can show that they are all that you need to carry out the function (i.e. they are sufficient); (3) one can dissect the function quantitatively at an enzymological, chemical, even atomic level. The disadvantage of studying functions biochemically is that there is always concern that the conditions you create in the test tube for your assay are not fully relevant to the biological setting in the cell or organism. As you will see in the course, protein activity can be greatly modulated and controlled by interactions with other proteins. And proteins that interact together can carry out more complex regulated processes than the individual proteins alone. So finding interacting proteins is an important part of the biochemical analysis of function.
expression interference (RNAi, siRNA, CRISPRi)

3 Using both Biochemistry and Genetics to understand function
reverse genetics (targeted gene disruption by HR, ZFN, TALEN, CRISPR/Cas9)
To study function in the context of a cell or organism requires a genetic analysis of function. This is most reliably done using “loss of function” mutations, i.e. mutations that disrupt the function of a gene. The mutant phenotype one looks for is a function or behavior that one can normally detect in the wild type but is missing or perturbed in the mutant. From that phenotype one can infer that the function or behavior requires the gene you have mutated. The advantage of studying function genetically is that you are looking at a gene’s function in its natural biological setting, i.e. in the context of the whole cell or organism. The disadvantages are: (1) you don’t know whether the gene is directly or indirectly carrying out that function; (2) nature often has backup mechanisms that will mask mutant phenotypes (what people often refer to as functional redundancy); (3) it is harder to carry out precise quantitative analysis of molecular function in whole cells or organisms. Starting with a mutation, there are various ways to identify and clone the gene that is mutated. And with both genes and mutations there are ways to identify interacting genes that also participate in the function of interest. Thus, genetic and biochemical approaches have complementary advantages and disadvantages, and ideally one would like to use both approaches to study a molecular process.
expression interference (RNAi, siRNA, CRISPRi)

4 Using both Biochemistry and Genetics to understand function
reverse genetics (targeted gene disruption by HR, ZFN, TALEN, CRISPR/Cas9)
Molecular biology has revolutionized our ability to identify and work with proteins and genes. With a gene we can readily make large amounts of protein, and with a protein we can easily identify the gene. The huge number of proteins and genes characterized by molecular biology has also shown us that similarity in protein primary structure (i.e. protein homology) is often associated with similarity in protein function. Hence, finding homologous genes (and proteins) that already have known functions (or activities) can provide hints to the function of your gene or protein. Nonetheless, demonstrating that a protein has a specific function (whether or not it shares homology with another protein of known function) still requires developing biochemical assays or performing genetic analysis of mutant phenotypes. Finally, molecular biology has allowed us to identify functionally related genes and proteins on a comprehensive scale through genomics and proteomics, making it easier to identify most if not all of the genes and proteins involved in a process. Despite this explosion of information about genes and proteins, real functional understanding still requires us to take things to the top level of biochemical assays and genetic phenotypes. So how do we get to this level from genes and proteins?
expression interference (RNAi, siRNA, CRISPRi)

5 Using both Biochemistry and Genetics to understand function
reverse genetics (targeted gene disruption by HR, ZFN, TALEN, CRISPR/Cas9)
On the genetic side, the key has been the development of molecular biological tools that enhance our ability to do reverse genetics or reverse “pseudogenetics” (i.e. expression interference): starting with a gene, then mutating that gene or disrupting its action in a cell or organism so we can assess its function. In yeasts, powerful endogenous homologous recombination machineries have long allowed virtually any type of mutation you can generate through molecular biology to be introduced into the yeast genome. The malleability of its genome has made yeast a premier system for understanding eukaryotic function and mechanism. HR-mediated gene disruptions were later developed for mice using embryonic stem cells, but this tool still requires considerably more time and effort than in yeast. Ways of interfering with the transcription or translation of genes using double-stranded RNA were developed around 15 years ago. These “pseudogenetic” expression interference methods provided the first ability to perform functional analysis in many cells and organisms, but were often slow, leaky, and not completely specific. To perform targeted disruptions of the genes themselves, techniques have been developed to fuse site-specific DNA binding proteins to nonspecific endonucleases. These include Zinc Finger Nucleases (ZFN) and Transcription Activator-Like Effector Nucleases (TALEN). However, the most powerful method was only developed about 4 years ago: Clustered Regularly Interspaced Short Palindromic Repeats with the Cas9 endonuclease (CRISPR-Cas9). This last method has been revolutionary because it has brought the ease of yeast genetics closer to most other cells and organisms.
expression interference (RNAi, siRNA, CRISPRi)

6 Using both Biochemistry and Genetics to understand function
reverse genetics (targeted gene disruption by HR, ZFN, TALEN, CRISPR/Cas9)
On the biochemical side, molecular biology has not been as successful in facilitating the development of new biochemical assays for new proteins. That is because recreating the specific conditions and settings needed to uncover each protein’s true biochemical functions still requires a lot of guesswork and trial and error. Thus, the development of biochemical assays remains a major bottleneck for the biochemical analysis of protein activities. In this lecture I will provide an example of how such a biochemical assay was developed and how studies with that assay led to new biochemical functions. Finally, the presentation of functional analysis as a dichotomy between biochemical and genetic approaches is a bit of an oversimplification. For example, mutations are often used to perturb the structure of proteins to understand how that structure contributes to function, e.g. what is the active site of an enzyme. And the analysis of protein behavior can provide a way of analyzing mutant phenotypes. Nonetheless, this dichotomy provides a useful framework for thinking about two major ways of analyzing biological function.
expression interference (RNAi, siRNA, CRISPRi)

7 DNA Replication: The Task and Challenge
Semiconservative Duplication
Another source of insight into biological function comes from structural studies. These studies can range from the identification of the proteins in a complex to atomic-level analysis of protein or nucleic acid structures. Francis Crick has been quoted as saying “If you want to understand function, study structure.” It is important to remember, however, that structural insights are only clues, and that the functions they suggest need to be verified by experimental test. A classic case where structural clues initially led in the wrong direction was in figuring out the active site of the ribosome, as will be discussed by Raul Andino in the translation segment of this course. The structure of DNA first proposed by Watson and Crick proved to be a rich source of functional clues, which may have motivated the Crick quote. It provided the first hint of how DNA might be replicated, namely semiconservatively, with each parental strand providing the instructions for how to build a complementary daughter strand. This semiconservative mechanism was demonstrated by Meselson and Stahl in a paper you will read for discussion. This paper provides a prime example of how to address one type of frequently asked scientific question (what is the fate of molecules?), as they needed to follow both the fate of the parental DNA strands and the fate of the newly incorporated nucleotides in order to demonstrate that replication was indeed semiconservative. Other observations describing the nature of replication are summarized in this slide, and they required assays to define and measure these various characteristics. We would like to obtain a molecular understanding of how these characteristics of replication are achieved.
Speed: very rapid duplication of every nucleotide (ex: 6 x 10^9 bp in 8 hrs in humans)
Fidelity: extremely low error rate (~1/10^9 nucleotide error rate)
Count: exactly two copies of every sequence per cell cycle
Regulation: coordination with other chromosomal events (e.g. mitosis, repair, recombination, transcription, chromatin packaging)
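To get a feel for the Speed and Fidelity figures on this slide, the arithmetic can be sketched as below. The genome size, duration, and error rate are taken from the slide; the per-fork speed is a hypothetical placeholder that is not from the lecture and is only there to show why many forks must work in parallel.

```python
# Figures quoted on the slide.
GENOME_BP = 6e9          # base pairs duplicated per human cell
DURATION_HOURS = 8       # time available for duplication
ERROR_RATE = 1e-9        # errors per nucleotide incorporated

# ASSUMPTION: hypothetical speed of a single fork, for illustration only.
ASSUMED_FORK_BP_PER_SEC = 50.0


def required_rate_bp_per_sec(genome_bp: float, hours: float) -> float:
    """Aggregate synthesis rate needed to copy the genome in the given time."""
    return genome_bp / (hours * 3600)


def expected_errors(genome_bp: float, error_rate: float) -> float:
    """Expected number of errors in one round of duplication."""
    return genome_bp * error_rate


rate = required_rate_bp_per_sec(GENOME_BP, DURATION_HOURS)
errors = expected_errors(GENOME_BP, ERROR_RATE)
forks_needed = rate / ASSUMED_FORK_BP_PER_SEC

print(f"aggregate rate needed: {rate:,.0f} bp/s")
print(f"expected errors per duplication: {errors:.1f}")
print(f"forks needed at the assumed fork speed: {forks_needed:,.0f}")
```

The point of the sketch is only the order of magnitude: the aggregate rate is far beyond what one fork at the assumed speed could deliver, and the quoted error rate leaves only a handful of errors per duplication.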

8 Enzymology of DNA Synthesis: DNA Polymerases
dNTP precursor - pyrophosphate release provides energy
Instructed by single-stranded template - senses complementarity of new nucleotide
Primer requirement* - senses complementarity of primer
5’ > 3’ polymerization off primer* - extension off 3’ hydroxyl - moving 3’ > 5’ on template
* enhances fidelity by allowing error correction
At the core of how DNA replicates is an enzyme, DNA polymerase, which was discovered by Arthur Kornberg. Shown here is a textbook description of this enzyme listing some of its key properties. Two of them (asterisked) are important for the fidelity of the polymerase, and the first, the primer requirement, will be discussed in more detail below. Most of you learned about these properties in high school or college. In Bioreg we are interested in HOW these properties were determined. Without knowing whether a DNA polymerase actually exists or how it might work, how did Kornberg look for an activity that synthesizes DNA?

9 Assaying DNA Polymerase Activity
In principle: Monitor incorporation of radioactive nucleotide precursors into acid-insoluble form (physically separate product from precursor)
The key was to first design an assay for a DNA polymerizing activity. The principle was very simple: look for an activity that converts labeled nucleotides, which are acid soluble, into polynucleotides, which are acid insoluble and thus can be trapped on a filter. Applying the principle in practice, however, can be quite complex. Importantly, the details of the assay determine whether you can actually see an activity and exactly what type of activity you are assaying. This creates a Catch-22, because without the assay you can’t study and identify the specific requirements of an activity, yet without knowing these specific requirements it is difficult to devise a workable assay. This is why biochemical assay development cannot be standardized and often requires bootstrapping your way up. It illustrates how experimental science, even when driven by first principles and logic, can require a lot of artistry and serendipity. In Kornberg’s case, his lab tried to detect polymerizing activity in extracts of E. coli cells using 14C-labeled thymidine, a nucleotide precursor he could scavenge from a colleague’s experiments. DNA was added in the hope that it would act as a nuclease decoy so that the desired products would not be immediately digested. Given what we now know to be the requirements for DNA polymerase activity, these reagents should not have supported this activity, but serendipity saved the day. The E. coli extract was not just a source of polymerase activity but was also a source of the kinase activity needed to phosphorylate thymidine (T) to the full triphosphate form. The DNA, when attacked by the nucleases in the extract, provided a source of A, G, and C mononucleotides, which were also converted to triphosphates by kinases in the extract. The DNA was also nicked and gapped by nucleases in the extract to provide the primer-template. In other words, the initial assay was actually detecting a complex mixture of activities, so it is not surprising in retrospect that the first positive results showed only 50 out of 1 million cpm of 14C thymidine converted to acid-insoluble form. Presumably this weak signal was believable because it was better than control reactions (leaving out a component? or heat inactivating the extract?). And as long as there is some believable activity, a biochemist can gain confidence that it is real by seeing if it can be enhanced by purification. The idea is that, as you fractionate the protein mixture and identify the fractions containing your activity, the ratio of activity per protein, i.e. the specific activity, of the active fractions should increase. In this case, however, the purification was fraught with difficulty because their initial activity was actually a composite of many different activities. Hence, under their original assay conditions, their composite activity would have disappeared as the protein mixture was fractionated. By figuring out how to change their assay conditions (particularly their assay reagents) to restore incorporation activity, they also began to figure out the true requirements and properties of their polymerase activity. For example, one can imagine that the incorporation activity was lost on columns that separate the thymidine kinase activity from the DNA polymerase activity. When they figured out that they could restore activity by using dTTP instead of thymidine, they also figured out that the true monomer substrates for the polymerase activity were nucleoside triphosphates. With such challenges at every stage of purification, it took almost 10 years before they could claim to have purified their DNA polymerase activity. Finally, in experiments I do not have time to discuss, careful characterization of the reaction product showed that the order of nucleotide polymerization was “instructed” by the sequence of the template.
In practice: Can be difficult to devise the right assay conditions when you do not know the precise nature of the activity
E. coli extract - source of polymerase activity but also kinase and nuclease activity
14C Thymidine - converted to thymidine triphosphate by kinases in extract
DNA - intended as nuclease decoy, but nucleases convert it to primer-template and a source of A, G, C nucleotides
Initial conditions used were really assaying a complex mixture of activities:
First Experiment: 50 out of 1 million cpm insoluble
Ten Years Later: purify DNA Polymerase I, show it is template-directed and figure out enzyme requirements
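The bookkeeping behind "specific activity should increase as you purify" can be sketched as follows. Only the 50 out of 1 million cpm figure comes from the lecture; the fraction names, counts, and protein amounts are made-up illustrative values, not data from Kornberg's purification.

```python
from dataclasses import dataclass


@dataclass
class Fraction:
    name: str
    activity_cpm: float  # acid-insoluble counts produced in the assay
    protein_mg: float    # total protein in the fraction


def specific_activity(f: Fraction) -> float:
    """Activity per unit protein (cpm per mg)."""
    return f.activity_cpm / f.protein_mg


def fold_purification(f: Fraction, start: Fraction) -> float:
    """How much the specific activity rose relative to the starting material."""
    return specific_activity(f) / specific_activity(start)


def percent_yield(f: Fraction, start: Fraction) -> float:
    """Share of the starting activity that survived this step."""
    return 100.0 * f.activity_cpm / start.activity_cpm


# From the lecture: the first positive result.
fraction_incorporated = 50 / 1_000_000

# HYPOTHETICAL numbers, for illustration only.
crude = Fraction("crude extract", activity_cpm=1000.0, protein_mg=100.0)
column = Fraction("column peak", activity_cpm=600.0, protein_mg=3.0)

print(f"first experiment: {fraction_incorporated:.3%} of input counts incorporated")
print(f"fold purification: {fold_purification(column, crude):.0f}x")
print(f"yield: {percent_yield(column, crude):.0f}%")
```

If a required component (for example a kinase) is separated away during a step, the measured activity and therefore the yield collapse even though the polymerase is still present, which is the trap described in the notes.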

10 DNA Polymerase Structure and Catalysis
Crystal structure of bacteriophage T7 DNA Polymerase complexed with primer-template and dNTP
Two Mg++ ions positioned by conserved acidic residues catalyze reaction
So how does DNA polymerase catalyze polymerization of nucleotides in a manner instructed by a single-stranded template? This slide illustrates the use of structure and evolution to provide hints for biochemical function. For enzymes one would like structures of the protein bound to substrates, ideally with substrate analogs that can trap the enzyme in the middle of a reaction. From this we can hypothesize a precise series of chemical steps based on the proximity of various reactive groups in the enzyme and substrate. But as mentioned earlier, structure can only provide functional hints. Genetic disruption of residues followed by further structural and biochemical studies is often needed to establish the importance of amino acids in the active site and reveal what they do. Another important source of clues about function comes from evolution, which is essentially genetics performed by nature using a very simple phenotype: competitive survival. Evolutionary conservation of residues provides hints of the functional importance of key residues, but it is up to the scientist to figure out the precise nature of that function. In all DNA polymerases, for example, there are two conserved acidic residues in the active site that appear from structural studies to position two metal ions so that they can catalyze the nucleophilic attack of the 3’ hydroxyl of the primer end onto the 5’ α-phosphate of the incoming nucleotide. Mutation of these acidic residues has confirmed that they are indeed critical for this chemical step.
[Figure labels: Primer, Template]
Rest of enzyme positions primer-template and dNTP and ensures catalysis only occurs with proper “fit”
Structure resembles a right hand

11 DNA Pol I has 3’ > 5’ Exonuclease Activity
Careful quantitative analysis of biochemical activity can suggest biological function
[Figure: Exo Assay. A labeled (*) T primer paired with an AAAAAAAA template is incubated with DNA Pol I and no dTTP.]
Exo activity is slow relative to pol activity
Exo activity is enhanced by stalling pol activity or making 3’ end single-stranded
3’ mismatch generates both conditions
Sometimes serendipity can play a role in defining new assays and functions, although as Louis Pasteur noted, “Chance favors the prepared mind.” When a labeled DNA substrate was incubated with DNA Pol I in the absence of nucleotide precursors, acid-precipitable counts became acid soluble. Thus an exonuclease activity was discovered that was eventually demonstrated to work in a 3’ to 5’ direction on the primer strand, i.e. in direct opposition to the 5’ to 3’ polymerase activity. The first thought a biochemist might have is: maybe my “pure” polymerase activity is not so pure after all. Could this exonuclease activity belong to a contaminating protein? Formally you can never say that an enzyme prep is absolutely pure, as you can always detect an impurity if your detection methods are sensitive enough. But you can subject a prep to further purification, using the specific activity to ensure that you are increasing the purity of the prep. And if, during further purification, two activities always cosegregate with overlapping activity peaks, it becomes less and less likely that there is a contaminant. The second thought you might have is that the exonuclease activity is simply the reverse of the polymerase activity. After all, in Macromolecules you learned that enzymes enhance the rate of both forward and reverse reactions by lowering the free energy of the common transition state of both reactions. However, the products of the exonuclease reaction were nucleoside monophosphates, not triphosphates as one would expect with a reverse reaction. So it became apparent that DNA polymerase I does indeed have a 3’ to 5’ exonuclease activity in addition to its 5’ to 3’ polymerization activity. How does one make biological sense of these opposing activities that in principle can cancel each other out? Important hints came from careful quantification of these activities and identification of their dependencies. Exo activity was very slow relative to the polymerase activity but could be enhanced by stalling the polymerase or making the 3’ end of the primer more single-stranded. Both conditions are generated by a 3’ mismatch. So the idea arose that the exo activity acts primarily when there is a mismatch and not as much when there is correct base pairing; i.e. it allows polymerase to proofread and correct mistakes by providing a backspace or delete button. This led to the design of a “proofreading” assay to show that the exo activity can indeed work to correct mismatches at the ends of primers. These experiments showed that both exo and polymerase activities are sensing the primer-template pairing, and they provide a reasonable explanation for why polymerase evolved to depend on a primer. As we will discuss in more detail later, the primer requirement allows an enhancement of fidelity through exonuclease error correction.
[Figure: Proofread Assay. A primer ending in a labeled (*) C mismatch on an AAAAAAAA template is incubated with DNA Pol I + dTTP.]
Mismatch-specific exo activity under normal pol conditions
Both pol and exo activities are sensing primer-template pairing
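One way to picture why a slow exonuclease becomes effective at a mismatch is as a competition between extending and excising the 3’ end. The sketch below uses a simple two-outcome partitioning model; the rate constants are hypothetical values chosen only to illustrate the trend described above (pol fast and exo slow on a matched end, pol stalled and exo enhanced on a mismatched end), and are not measurements from the lecture.

```python
def excision_probability(k_pol: float, k_exo: float) -> float:
    """Chance the 3' end is excised before the next nucleotide is added,
    treating extension and excision as two competing first-order steps."""
    return k_exo / (k_pol + k_exo)


# HYPOTHETICAL rate constants (per second), for illustration only.
MATCHED = {"k_pol": 100.0, "k_exo": 1.0}      # correct base pair: pol outruns exo
MISMATCHED = {"k_pol": 0.1, "k_exo": 10.0}    # 3' mismatch: pol stalls, exo favored

p_matched = excision_probability(**MATCHED)
p_mismatched = excision_probability(**MISMATCHED)

print(f"matched end excised:    {p_matched:.3f}")
print(f"mismatched end excised: {p_mismatched:.3f}")
```

Under these assumed numbers a correctly paired end is almost always extended while a mismatched end is almost always removed, which is the behavior a backspace key needs.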

12 The Polymerase and Exonuclease Activities of Replicative DNA Polymerases Reside in Distinct Domains
2-Mode Model for Polymerase Function: Polymerizing, Editing
Movement between P and E sites requires primer-template unwinding and translocation of 3’ end
The presence of two distinct activities in DNA polymerase I illustrates another important point about proteins: most proteins have multiple distinguishable functions. Structure-function analysis of a protein helps us understand the different activities and allows us to specifically disrupt one activity to understand the function of that one activity. The exonuclease activity mapped to a different site than the polymerization activity; key residues were identified that chelate a magnesium used to catalyze the exonuclease activity. The distance between the two active sites is about 40 angstroms, which explains why primer-template unwinding (about 3 bp of unwinding), which is needed to get the 3’ end of the primer into the exo active site, facilitates exo activity. How do you establish that this exo activity really does promote incorporation fidelity? Here molecular genetics is again needed to establish function in cells. You first need an assay to monitor the fidelity of incorporation into the cell’s DNA. Then you mutate the key exonuclease residues and measure the effect on incorporation fidelity. Later Carol will show you that RNA polymerase can also hydrolyze and remove incorporation errors, but it requires a second protein to modify the polymerizing active site and does not use a distinct site.
[Figure labels: Polymerase Active Site, ~ 40 Å, Exonuclease Active Site]

13 Using both Biochemistry and Genetics to understand function
reverse genetics (targeted gene disruption by HR, ZFN, TALEN, CRISPR/Cas9)
So let’s review our discussion of DNA polymerase in the context of our discussion on how to understand function through biochemistry and genetics. Arthur Kornberg wanted to look for a nucleotide polymerization activity, developed an assay for this activity, then used the assay to purify DNA polymerase. During the purification he defined the requirements for this activity and, once it was purified, quantified its properties. While studying the polymerization activity of this protein he stumbled upon a second assay that revealed an exonuclease activity for the protein. From careful quantitative analysis of this activity, he suspected it provided a proofreading function to remove misincorporated nucleotides, and he developed a third assay for this function. Kornberg received the Nobel Prize for his discovery of DNA polymerase I. Initially it was thought to be the replicative polymerase responsible for duplicating the E. coli genome. But whether this was indeed the case could not be addressed by all his elegant biochemical studies. Genetic studies that mutated the gene for DNA polymerase were needed. Unfortunately, the tools that we now have to shift to the left side of this diagram were not available. It was not possible to identify the gene based on the protein, much less obtain a targeted mutation of that gene by reverse genetics. Moreover, Kornberg was a dyed-in-the-wool biochemist with little interest in doing genetics. Thus it took another scientist, John Cairns, to investigate the function of DNA polymerase I by genetics.
expression interference (RNAi, siRNA, CRISPRi)

14 DNA Pol I is not the replicative DNA polymerase in E. coli
Illustrates importance of genetics for establishing functional relevance in cell
Use biochemical assay to screen for mutants lacking DNA polymerase activity
[Figure: E. coli, mutagenize, plate, mutant E. coli; phenotype: assay dNTP incorporation into DNA]
When Cairns started to address this question, doubts were beginning to be raised about whether DNA polymerase I was indeed the replicative polymerase for E. coli. First, its incorporation rate in the test tube was way too slow to account for the speed at which the E. coli genome was replicated. Second, DNA polymerase I had a third independent activity that one could imagine would make more sense if it were a repair polymerase rather than a replicative polymerase. This activity was a 5’ to 3’ exonuclease activity that could digest DNA in front of the path of the 5’ to 3’ polymerizing activity. (When you hear the term Klenow fragment, it is referring to the DNA polymerase I protein with this 5’ to 3’ exonuclease domain proteolytically cleaved off.) Hence, Cairns started his investigation with the bias that DNA Pol I might not be the replicative polymerase and thus its gene might not be essential. He had his technician Paula De Luca randomly mutagenize E. coli and used the biochemical assay for DNA polymerizing activity in crude cell extracts to screen for mutant isolates that lacked DNA polymerase I activity. This was a brute force screen, but given the available tools it was arguably the most direct and efficient way to search for a DNA polymerase I mutant. They found their holy grail with the 3473rd extract, which showed < 1% of wild-type polymerizing activity. De Luca and Cairns inferred that this mutant had a mutation in the gene for DNA polymerase I, or at least had greatly reduced DNA Pol I activity in the cell.
In honor of his technician Paula, Cairns named the gene polA and the mutant polA1, but the inclusion of the letter A also anticipated the possibility that other DNA polymerase genes would be discovered in the future. The fact that this mutant was normal for growth provided the first solid evidence that DNA Pol I might not be the replicative polymerase. While acknowledging this likelihood, Cairns was careful NOT to rule out DNA Pol I as the replicative polymerase or as having a role in DNA replication. In fact, although it was later confirmed that DNA Pol I is NOT the replicative polymerase, we now know it is important for DNA replication, specifically in processing Okazaki fragments. Cairns’s discussion is an illustration of how cautious one should be about inferring function from genetics. And to get practice thinking about the limitations of genetic analysis for your Bioreg final exam, I would strongly recommend reading the Appendix slide “The Awesome Challenges of Genetics”, which describes the caveats to the genetic analysis of polA mutants. Finally, we note that if DNA Pol I had provided an essential function, the mutants he wanted would have been dead and he would have come up empty-handed. In that case, Cairns would have had to redesign the screen so that he could isolate a temperature-sensitive conditional mutation in DNA Pol I. The mutagenesis and initial plating would be done at room temperature (instead of 37° C, the normal growth temperature for E. coli), followed by replica plating at an elevated temperature of 42° C to first identify those colonies that were temperature sensitive. Then only the temperature-sensitive colonies would be screened, with the cells being shifted to 42° C several hours before making the extracts.
Extracts from single mutant colonies
Mutant 3473 (polA1) has <1% wt activity: normal growth, repair deficient
Purification of residual polymerase activity from polA1 yields DNA Pol II and Pol III
Genetics and biochemistry later show:
- DNA Pol III is the replicative polymerase
- DNA Pol I is important for Okazaki fragment maturation
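The logic of the brute force screen (assay each extract, keep isolates with less than 1% of wild-type activity) can be sketched as a simple filter. The 1% cutoff and the isolate number 3473 come from the slide; the isolate labels, cpm values, and the function name are hypothetical and only illustrate the bookkeeping.

```python
def find_polymerase_deficient(extracts: dict[str, float],
                              wild_type_cpm: float,
                              threshold: float = 0.01) -> list[str]:
    """Return isolates whose incorporation is below `threshold` of wild type."""
    return [name for name, cpm in extracts.items()
            if cpm / wild_type_cpm < threshold]


# HYPOTHETICAL assay readings (cpm incorporated), for illustration only.
WILD_TYPE_CPM = 10_000.0
extract_readings = {
    "isolate_0001": 9_800.0,
    "isolate_0002": 10_100.0,
    "isolate_3473": 60.0,   # stands in for the polA1 isolate
}

candidates = find_polymerase_deficient(extract_readings, WILD_TYPE_CPM)
print(candidates)
```

Note that such a filter can only return isolates that survived to be assayed, which is why an essential gene would have required the temperature-sensitive redesign described in the notes.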

15 Purification of DNA Pol III: Different Template, Different Assay, Different Activity
Introducing the concept of holoenzymes and modular enzyme subassemblies
The isolation of the polA1 mutant also provided a very useful biochemical tool. Under the assay conditions initially used to detect DNA polymerase, DNA polymerase I was the predominant polymerase activity detected. This is partly because there is so much DNA Pol I activity in the cell, but also because the nicked, gapped DNA used in the assay is an optimal primer-template for this polymerase. In the mutant it was now easier to detect and purify other polymerase activities, which eventually led to Tom Kornberg purifying the true replicative DNA polymerase in E. coli, DNA Pol III. DNA Pol III is a 3-subunit protein, with one subunit containing the polymerase activity and another the 3’-5’ proofreading exonuclease activity. Previous screens for temperature-sensitive mutants that could not replicate DNA at the restrictive temperature had generated batteries of “replication” mutants which couldn’t be characterized further for want of better replication phenotypes. When these mutants were screened for loss of Pol III activity (with a polA1 mutation in the background to eliminate DNA Pol I activity), it was found that one of these mutants, dnaE, had little DNA Pol III activity. This indicated that DNA Pol III was somehow important for DNA replication but didn’t establish that it was the replicative polymerase. And the fact that it incorporated nucleotides at a very slow rate did not help the argument. What made DNA Pol III a more compelling candidate was the purification of a related but different incorporation activity. It is important to remember that exactly how an assay is performed defines exactly what activity you are assaying. So slight changes in an assay can significantly alter the activity one is detecting.
Using a more “realistic” replicative template, which allows the uninterrupted synthesis of thousands of nucleotides of DNA, a new polymerase activity was purified and found to consist of a large multiprotein complex that included two DNA Pol III cores loosely associated with other multisubunit subcomplexes. This superactive form of DNA Pol III had the rapid incorporation rate needed to replicate the entire E. coli genome in a timely manner. As we will discuss in a later lecture, its activity was dependent on rATP, and not just driven by the hydrolysis of phosphates from dNTPs. To distinguish between these fundamentally similar but functionally distinct complexes, yet acknowledge their underlying similarity, the original complex was called the DNA Pol III core and the new superactive complex was called the DNA Pol III holoenzyme. This holoenzyme organizational strategy is used repeatedly by nature and will be seen in other parts of Bioreg, such as in the transcriptional machinery. Fundamentally it involves modifying the activity of a core enzyme complex by incorporating it into a higher-order complex with other protein subunits and/or complexes. Exactly how stable these higher-order complexes have to be to be called a holoenzyme varies among different systems, but the basic concept of modifying a core activity remains the same. Why does nature bother with core enzymes and more loosely associated holoenzymes? Why not just build the optimal holoenzyme as a single tight complex? One can imagine at least two reasons: (1) Versatility: allows the core to interact with different subunits to form different holoenzymes that perform different tasks; (2) Specificity: allows precise regulation of enzyme activities through the controlled assembly and disassembly of protein complexes.

16 Using both Biochemistry and Genetics to understand function
reverse genetics (targeted gene disruption by HR, ZFN, TALEN, CRISPR/Cas9)
expression interference (RNAi, siRNA, CRISPRi)
So once again, to place things into the context of biochemical and genetic analysis of function: Cairns does a brute force genetic screen using a biochemical assay to identify a mutant that has disrupted DNA polymerase I activity. He identifies polA1 and argues that this mutant has likely mutated the gene for DNA polymerase I. Further phenotypic characterization of this mutant indicates that it grows well and synthesizes DNA well, but has a DNA repair defect. The genetics thus supports the notion that DNA polymerase I may not be the replicative DNA polymerase.
The polA1 mutant is then used to detect and purify other DNA polymerase activities using the same original assay, and this leads to purification of DNA polymerase III. Modifying the assay to use a more “relevant” replication template identifies a larger form of DNA polymerase III, the DNA Pol III holoenzyme, which now exhibits polymerization rates expected of a replicative DNA polymerase.
But how do you provide evidence that it could actually be the replicative DNA polymerase in cells? You need to isolate mutants, in this case conditional ones, that severely disrupt DNA replication in the cell under the restrictive conditions. It turns out these mutants had already been isolated: two labs had already performed genetic screens for temperature sensitive mutants with defective DNA synthesis in vivo. Further characterization of these mutants had been stalled because of the lack of tools and approaches to study DNA replication further in cells. But with the ability to make extracts from them and assay them for DNA Polymerase III activity (after first combining them with polA1 mutants so that DNA Polymerase I activity would not interfere in the assays), the dnaE mutant was shown to lack DNA Polymerase III activity.
This established that DNA Polymerase III is required for DNA replication in cells and was consistent with the notion that DNA Pol III is the replicative DNA polymerase. Note that because the genetics cannot tell you whether the requirement for DNA Pol III was direct or indirect, this result does not establish that DNA Pol III is the replicative DNA polymerase. That had to await the reconstitution of the replication fork from purified proteins, where one can biochemically isolate and assay the replicative polymerase activity. Nonetheless, the biochemical properties of the DNA Pol III holoenzyme plus the temperature sensitive mutant data convinced most people that DNA Pol III was the best candidate for the replicative DNA polymerase in the cell.

17 Controlling Molecular Choice
Bioregulation
I now turn to what I consider to be the core of bioregulation: the control of molecular choice. One of the simplest forms of bioregulation is establishing fidelity in a biomolecular process. You want the correct molecules to be transformed by the process and you want to prevent incorrect molecules from participating in the process. In the rest of this lecture I use the fidelity of DNA polymerase to illustrate how nature uses equilibria and rate constants to make discriminating molecular choices.

18 Contributions to E coli DNA Replication Fidelity
Fidelity Overview
Fidelity Comparisons (process: speed; product size in bp; error rate)
DNA Replication: 500 bp/sec (prokaryotes), 50 bp/sec (eukaryotes); product size 5 x 10^6 (E. coli), 6 x 10^9 (humans), 1 x 10^11 (lily)
RNA Transcription: 30 bp/sec; error rate 10^-4
Protein Translation: 20 aa/sec; error rate 10^-4
Contributions to E coli DNA Replication Fidelity (mechanism: what is sensed; error rate)
Intrinsic Fidelity (polym): sensing dNTP complementarity to template
Exonuclease Proofreading (polym): sensing primer complementarity to template
Mismatch Repair (post polym): sensing complementarity of two strands, distinguishing parental and daughter strands; 10^-2
Overall Replication Fidelity
Among nucleic acid driven processes, DNA replication has the highest demand for fidelity:
-- DNA is the largest polymer
-- Each copy of DNA is critical and cannot simply be discarded if it is incorrectly synthesized (unlike RNA or proteins)
For DNA replication, the correct nucleotide is the one that forms a Watson-Crick base pair with the template nucleotide at the incorporation position. Any other nucleotide would be incorrect. The error rate for incorporating incorrect nucleotides during DNA replication is 10^-9 to 10^-10.
Three mechanisms contribute to this extremely high fidelity:
-- making sure the correct nucleotide is incorporated by the replicative polymerase into the growing DNA chain
-- having the replicative polymerase remove incorrect nucleotides that have just been incorporated
-- removing incorrect nucleotides soon after the replicative polymerase has moved down the DNA molecule
For a mistaken incorporation to occur and persist in the DNA, it must elude all three mechanisms. Thus, the probability that it will do so is the product of the probabilities for eluding each individual mechanism.
Later you will see that protein translation can be divided into two basic steps:
-- the covalent attachment of an amino acid to the correct tRNA
-- the incorporation of the amino acid on the tRNA into a protein
In this case a mistake in the first step is NOT filtered out by the second step. For a mistaken incorporation, an amino acid only has to elude the fidelity mechanisms of either the first OR the second step. Hence the probability of a translational error is NOT the product of the error rates of the two steps, but closer to the sum.
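The contrast between filters in series (replication) and independent error-prone steps (translation) can be sketched numerically. This is only an illustration: the per-mechanism numbers below are assumed round values, not measurements taken from the slide, and the function names are made up for this example.

```python
import math

def serial_filter_error(escape_probabilities):
    """An error persists only if it eludes EVERY filter, so probabilities multiply."""
    return math.prod(escape_probabilities)

def independent_step_error(step_error_rates):
    """An error at ANY step spoils the product: 1 - prod(1 - e_i), roughly the sum."""
    p_all_correct = 1.0
    for e in step_error_rates:
        p_all_correct *= (1.0 - e)
    return 1.0 - p_all_correct

# ASSUMED illustrative values (intrinsic fidelity, proofreading, mismatch repair)
replication_error = serial_filter_error([1e-5, 1e-2, 1e-2])

# ASSUMED illustrative values (tRNA charging, amino acid incorporation)
translation_error = independent_step_error([1e-4, 1e-4])

print(f"replication (product of escapes): {replication_error:.1e}")
print(f"translation (close to the sum):   {translation_error:.1e}")
```

With these assumed inputs the serial filters multiply down to about 10^-9, while the two translation steps add up to about 2 x 10^-4.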

19 Geometry From Crystal Structure
How to Distinguish Mismatch versus Correct Base Pair
Sources of Polymerase Discrimination
H-bonding (binding energetics): Outside the active site, unpaired nucleotides are H-bonded to H2O. Inside the active site these H-bonds can be replaced by WC base pairing but only incompletely replaced by mismatch pairing. Mismatch H-bonding can also exacerbate steric and stacking clashes (see below).
Steric Constraints (structure/geometry): Imposed by the enzyme’s “induced fit”, which can test for precise base pair geometry, proper base stacking, and correct primer-template fit.
Geometry From Crystal Structure [Figure panels: one WC bp and three mismatches]: Global structure of the helix is not greatly perturbed, but there are: differences in C1’-C1’ distance and C1’ bond angles; protrusions of bases into the major groove; loss of universal H acceptor positions in the minor groove.
What are the principles by which correct Watson-Crick base pairs can be distinguished from incorrect NON-Watson-Crick base pairs? The two basic principles are energetics and geometry.
When incoming and template nucleotides are unpaired, their hydrogen bond donor and acceptor moieties are hydrogen bonded with water molecules. When these nucleotides form a Watson-Crick base pair in the polymerase active site, these hydrogen bonds with water molecules are replaced by hydrogen bonds with each other, an energetically favorable reaction. Further promoting the reaction is the base-stacking interaction between the newly formed base pair and the adjacent primer-template base pair.
Mismatched nucleotides still form some hydrogen bonds with each other, but a hydrogen bond donor or acceptor will be left unpaired. And because the polymerase active site excludes water molecules when a free nucleotide enters, there will be a net loss of a hydrogen bond when a mismatch occurs. Thus, correct Watson-Crick base pairing is energetically more favorable than mismatched base pairing, providing one source of discrimination. This difference in binding energy manifests itself in the occupancy of the active site: the nucleotide that forms a correct Watson-Crick base pair will occupy the active site for much longer than incorrect nucleotides. However, this energetic difference can only account for a 10 to 100-fold preference for the cognate (correct) over the noncognate (incorrect) nucleotides.
Another, perhaps more important, source of discrimination is the geometry of the base pairs. Mismatches do not greatly perturb the global structure of the helix; there is no bulge in the helix. However, Watson-Crick base pairs share internucleotide distances and base-deoxyribose bond angles that distinguish them from all mismatched base pairs. These are subtle differences, but DNA polymerases can sense them by undergoing an “induced fit” that causes the active site to fit snugly around the base pair. The Watson-Crick base pair allows this induced fit; mismatched base pairs prevent it due to steric clashes. Because the proper alignment of amino acid residues needed for the chemistry of polymerization requires the conformational change associated with the induced fit, mismatches don’t allow the active site to become fully active. Thus, the slight geometric differences between Watson-Crick base pairs and mismatched base pairs translate into a large difference in the rate of polymerization.

20 Slow Conformational Change
Intrinsic Fidelity: Potential Base Pair Discrimination for dNTP at Three Stages Of the DNA Polymerization Reaction Cycle
Example: T7 DNA polymerase (arrow thickness roughly corresponds to rate CONSTANTS). Other polymerases discriminate differently at each stage.
Reaction pathway for correct nucleotide: E·DNA(N) + dNTP(C) <=> [K_D^C] E·DNA(N)·dNTP(C) -> [k_conf^C] E*·DNA(N)·dNTP(C) -> [k_pol^C] E·DNA(N+1,C) + PPi
Reaction pathway for incorrect nucleotide: E·DNA(N) + dNTP(I) <=> [K_D^I] E·DNA(N)·dNTP(I) -> [k_conf^I] E*·DNA(N)·dNTP(I) -> [k_pol^I] E·DNA(N+1,I) + PPi
Stage 1, Rapid dNTP Binding Pseudo-equilibrium: K_D = 20 µM (C), > 8 mM (I), ~400x
Stage 2, Slow Conformational Change “Induced Fit”: k_conf = 300 s^-1 (C), 0.3 s^-1 (I), 1000x
Stage 3, Polymerization Reaction: Rapid and Not Measured
Enzymologists can measure rates or equilibria for various steps in the polymerization reaction and quantify the differences when correct versus incorrect nucleotides participate. This allows us to understand the basis for the intrinsic fidelity of DNA polymerases, the fidelity exercised when a polymerase incorporates a free nucleotide. Shown here is a simplified schematic in which the multiple steps that can be detected by enzymologists have been boiled down to three major stages:
-- a rapid pseudoequilibrium of free nucleotide entering or leaving the active site to base pair with the template strand
-- a kinetically slow step prior to actual polymerization that is presumed to correspond to a conformational change
-- a faster polymerization step
In these reaction pathways, the arrow thicknesses roughly correspond to the size of the rate constants for each stage. Recall from Macromolecules that the rate or flux of a reaction is determined by the product of the rate constant and the concentration of the substrates for that reaction. The top pathway illustrates the kinetic parameters for the reaction with the correct nucleotide. The bottom pathway corresponds to the reaction with an incorrect nucleotide. The overall specificity is determined by the overall rate difference between these two pathways, which for most replicative polymerases corresponds to an error rate on the order of 1 x 10^-5.
Comparing the rate or pseudoequilibrium constants for correct and incorrect nucleotides reveals how much specificity is provided at each stage, which differs from polymerase to polymerase. Shown here are the rate and pseudoequilibrium constants measured for the bacteriophage T7 DNA polymerase. For this polymerase the first two stages provide the bulk of the specificity:
-- the correct nucleotide binds the active site 400x better than the incorrect nucleotide, so if the two nucleotides have identical concentrations in solution, the correct nucleotide is 400-fold more likely than the incorrect nucleotide to be occupying the active site
-- the rate constant of the slow conformational change is 1000x higher for the correct nucleotide than for the incorrect nucleotide
Together they provide a 400,000-fold advantage to the correct nucleotide.
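The arithmetic behind the 400,000-fold figure can be written out as a short sketch using the T7 values quoted on this slide. It assumes equal concentrations of correct and incorrect dNTP and no discrimination at the unmeasured polymerization stage; the variable and function names are illustrative only.

```python
# T7 DNA polymerase values as quoted on the slide
KD_CORRECT_UM = 20.0       # K_D for the correct dNTP, in micromolar
KD_INCORRECT_UM = 8000.0   # slide gives "> 8 mM", so this is a lower bound
KCONF_CORRECT = 300.0      # conformational change rate constant, s^-1
KCONF_INCORRECT = 0.3      # s^-1

def fold_discrimination(kd_c, kd_i, kconf_c, kconf_i):
    """Return (binding fold, conformational fold, combined fold) favoring the correct dNTP."""
    binding_fold = kd_i / kd_c                  # weaker binding = larger K_D
    conformational_fold = kconf_c / kconf_i     # faster induced fit for the correct dNTP
    return binding_fold, conformational_fold, binding_fold * conformational_fold

binding, conformational, combined = fold_discrimination(
    KD_CORRECT_UM, KD_INCORRECT_UM, KCONF_CORRECT, KCONF_INCORRECT
)
print(f"binding: ~{binding:.0f}x, induced fit: ~{conformational:.0f}x, "
      f"combined: ~{combined:.0f}x")
```

Because the incorrect K_D is only bounded from below, the combined figure should be read as "at least about 400,000-fold" under these assumptions.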

21 [Diagram: products of the first cycle, E·DNA(N+1) ending in a correct (C) or incorrect (I) nucleotide, each with PPi]
But this is not the only opportunity to discriminate between correct and incorrect nucleotides. Let’s now see what happens to the correct and incorrect nucleotide when they undergo the next polymerization cycle. After the pyrophosphate diffuses out of the active site and the polymerase translocates down the primer-template to open up the active site to a new incoming nucleotide…

22 [Diagram: E·DNA(N+1) with a correct (C) 3’ nucleotide and E·DNA(N+1) with an incorrect (I) 3’ nucleotide, active sites open]
…the enzyme is now poised to add another nucleotide. In this case we are going to presume that the incoming nucleotide is correct for both, and that they only differ in the nature of the nucleotide that was just incorporated and is now at the 3’ end of the primer.

23 Slow Conformational Change
[Reaction schemes: the second incorporation cycle in three stages: 1 Rapid dNTP Binding Pseudo-equilibrium (K_D), 2 Slow Conformational Change “Induced Fit” (k_conf), 3 Polymerization Reaction (k_pol)]
Fast reaction pathways for correct primer with correct nucleotide: E·DNA(N+1,C) + dNTP(C) <=> E·DNA(N+1,C)·dNTP(C) -> E*·DNA(N+1,C)·dNTP(C) -> E·DNA(N+2) + PPi
Slow reaction pathways for incorrect primer with correct nucleotide: E·DNA(N+1,I) + dNTP(C) <=> E·DNA(N+1,I)·dNTP(C) -> E*·DNA(N+1,I)·dNTP(C) -> E·DNA(N+2) + PPi
Like the previous slide, the addition of a new nucleotide can be divided into at least three major stages. But unlike the previous slide, we are presuming that the incoming nucleotide is correct for both reaction pathways, so there is little or no discrimination at the first step. However, a big difference is still possible at the second and third steps, because there is a primer mismatch in the bottom pathway due to misincorporation of an incorrect nucleotide. This primer mismatch can interfere with the induced fit or the polymerization stage. In either case it will greatly slow the lower pathway relative to the upper.

24 Slow Conformational Change
Error Correction: Primer requirement allows kinetic discrimination sensitive to base pairing of recently incorporated nucleotides
[Reaction schemes as on the previous slide: fast reaction pathways for correct primer with correct nucleotide (top) and slow reaction pathways for incorrect primer with correct nucleotide (bottom), each in three stages: 1 Rapid dNTP Binding Pseudo-equilibrium, 2 Slow Conformational Change “Induced Fit”, 3 Polymerization Reaction]
In essence, the DNA polymerase requirement for a properly base-paired primer makes the second round of polymerization sensitive to mismatch incorporation errors made in the first round. However, unlike the first round, where the two pathways are in direct competition and differences in competing rates lead to differences in incorporation probability, the two pathways shown here are not in direct competition. Hence the enzyme cannot take advantage of the slower kinetics of the lower pathway unless it provides a corrective pathway as an alternative and competing step. Otherwise, although incorporation in the lower pathway is very slow, it will eventually happen, leaving the incorrect nucleotide in place. The alternative and competing corrective pathway is provided by the exonuclease.

25 Arrow thickness roughly corresponds to rate constant
Error Correction: Exonuclease activity allows the polymerase’s kinetic discrimination to lead to different primer fates
Arrow thickness roughly corresponds to rate constant
[Reaction schemes: the three-stage second incorporation cycle (1 K_D, 2 k_conf, 3 k_pol, inside dotted boxes) for the correct primer (top, fast) and the incorrect primer (bottom, slow), each now with a competing exonuclease step (k_exo^C or k_exo^I) that returns E·DNA(N+1) to E·DNA(N)]
Here we show the competing exonuclease step for both the correct and the incorrect primers. Arrow thicknesses again roughly represent relative rate constants (which have been measured by enzymologists for various polymerases). As an alternative to adding a nucleotide in a multistep polymerization reaction, the primer can undergo the removal of the most recently added nucleotide. In other words, the exonuclease provides a molecular choice to the DNA polymerase. So how is that choice controlled? To explain this I will first consolidate the separate stages of the second incorporation reaction (shown within the dotted boxes)….

26 Arrow thickness roughly corresponds to rate CONSTANT
Error Correction: Kinetic manipulation of molecular choice based on complementarity of primer
Arrow thickness roughly corresponds to rate CONSTANT
[Diagram, correct primer: E·DNA(N) <- exo - E·DNA(N+1,C) - pol -> E·DNA(N+2) + PPi] When a correct nucleotide is incorporated, 3’>5’ exonuclease activity is much slower than 5’>3’ polymerase activity. Addition of the next nucleotide is kinetically favored.
[Diagram, incorrect primer: E·DNA(N) <- exo - E·DNA(N+1,I) - pol -> E·DNA(N+2) + PPi] When an incorrect nucleotide is incorporated, disruption of the primer greatly slows 5’>3’ polymerase activity for the next nucleotide (and slightly increases 3’>5’ exonuclease activity). Excision of the incorrect nucleotide is kinetically favored.
…by reducing it to a single arrow whose thickness roughly represents the overall rate constant of the composite reaction. This consolidation allows us to focus on the kinetic competition between the polymerization reaction, which adds the next nucleotide, and the exonuclease reaction, which removes the most recently incorporated nucleotide. It is this competition which determines which path is taken most often, i.e. which molecular choice is made. That competition can be quantified by the ratio of the rate constants of the competing reactions. Visually, we can see that the choice made most often will be down the competing path with the thicker, darker arrow.
Importantly, this choice changes depending on the substrate. When a correct nucleotide has been incorporated and there is perfect primer-template base pairing, polymerization of the next nucleotide is greatly favored over removal of the last incorporated nucleotide. However, when an incorrect nucleotide has been incorporated and there is a mismatch at the 3’ end of the primer-template, the competition flips to favoring the exonuclease reaction over the polymerization reaction.
Importantly, it is not the absolute rate constants that matter but the relative values of competing rate constants, because this relative value determines the actual flux of molecules, i.e. which pathway is chosen. Hence, even though the exonuclease rate constant is still relatively slow when an incorrect nucleotide has been incorporated, the polymerase rate constant has been so drastically reduced that the exonuclease reaction is greatly favored.
In summary:
-- The exonuclease reaction gives the polymerase a molecular CHOICE whether to retain or discard the most recent nucleotide
-- The COMPETITION between exonuclease and polymerase reaction rate CONSTANTS determines what choice is made
-- By changing the competition depending on whether the most recent nucleotide was correct or incorrect, the polymerase can adjust the choice so as to enhance its fidelity

27 Error Correction: Kinetic manipulation of molecular choice based on complementarity of primer
Black arrow thickness roughly corresponds to relative rate constant
Light blue arrow thickness roughly corresponds to relative flux
Example: T7 DNA Polymerase
C primer: k_T7pol = 300 s^-1, k_T7exo = 0.2 s^-1, pol : exo ~ 1500 : 1
I primer: k_T7pol = 0.01 s^-1, k_T7exo = 2.3 s^-1, pol : exo ~ 1 : 230
Correct primer (5’-TAGCTTCG over 3’-ATCGAAGCTCATG): exo gives 5’-TAGCTTC; pol gives 5’-TAGCTTCGA
Incorrect primer (5’-TAGCTTCA over 3’-ATCGAAGCTCATG, terminal A mismatched): exo gives 5’-TAGCTTC; pol gives 5’-TAGCTTCAA
This slide shows the substrates and products for the polymerization and exonuclease reactions in detail and, as an example, provides the rate constants that were measured for the T7 DNA polymerase. On the top half, a correctly incorporated G can either be excised or have the next nucleotide added to it. Because of proper primer-template complementarity, the T7 polymerization rate constant is 1500-fold higher than the T7 exonuclease rate constant. Hence, the flux of molecules, represented by the light blue arrow, will predominantly go down the polymerization path.
In contrast, an incorrectly incorporated A will disrupt the primer-template base pairing, causing a dramatic reduction in the rate constant for the next polymerization step. This mismatch also increases the exonuclease rate constant moderately, by about 10-fold. Together these changes make the exonuclease rate constant 230-fold higher than the polymerization rate constant. Consequently, the flux of molecules will predominantly go down the exonuclease path.
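The ratios on this slide follow directly from the four T7 rate constants. The sketch below recomputes them and also converts each ratio into the fraction of primers taking each path, under the simplifying assumption that polymerization and excision are two competing first-order steps from the same state (the names `partition` and `T7_RATE_CONSTANTS` are illustrative).

```python
def partition(k_pol, k_exo):
    """For two competing first-order steps, return
    (pol/exo ratio, fraction extended, fraction excised)."""
    total = k_pol + k_exo
    return k_pol / k_exo, k_pol / total, k_exo / total

# T7 DNA polymerase rate constants from the slide, in s^-1: (k_pol, k_exo)
T7_RATE_CONSTANTS = {
    "correct primer": (300.0, 0.2),
    "incorrect primer": (0.01, 2.3),
}

for primer, (k_pol, k_exo) in T7_RATE_CONSTANTS.items():
    ratio, extended, excised = partition(k_pol, k_exo)
    print(f"{primer}: pol/exo = {ratio:.4g}, "
          f"extended = {extended:.4f}, excised = {excised:.4f}")
```

The same function shows the "flip": the pol/exo ratio is far above 1 for the correct primer and far below 1 for the mismatched primer, even though the exonuclease rate constant itself changes only about 10-fold.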

28 A Discard Strategy for Fidelity: Kinetic manipulation of molecular choice between irreversible forward and discard pathways
The choice is ultimately determined by the relative flux of molecules that proceed down the two competing pathways (light blue arrow).
Pathway irreversibility usually requires some chemical energy expenditure (e.g. dNTP hydrolysis), which could be coupled to either pathway or to a reaction step preceding these pathways.
[Diagram: Cognate Substrate: forward to Correct Product, discard to Elimination. Noncognate Substrate: forward to Incorrect Product, discard to Elimination.]
For each substrate, the molecular flux (and hence molecular choice) is determined by the ratio of the forward to discard rate constants (black arrows) for that substrate. For cognate substrates this ratio should favor the forward reaction. For noncognate substrates, the ratio should “flip” to favor the discard pathway.
In principle, just one or both pathways could discriminate between cognate and noncognate substrates, i.e. change rate constants with substrate. In practice, nature often discriminates with both.
One can think of the polymerization reaction as a forward pathway that allows a biomolecular process to continue toward a desired product and the exonuclease reaction as a competing discard pathway. The fundamental strategy used by DNA polymerases to maintain fidelity will be seen in different guises throughout the rest of Bioreg, and there are several salient features of this strategy that you should look for in these other systems.
The first is the presence of a competition between a forward and a discard pathway. Clearly identifying these competing pathways is an important first step in understanding how these other systems maintain fidelity.
The second is the irreversibility of both pathways. Imposing a choice between competing pathways is not that useful if one can easily return from a pathway and have another chance to make a choice. The immediate first step does not have to be irreversible, but some step down that pathway has to be irreversible. For example, the first step down the polymerization pathway is a rapidly reversible dNTP binding step, but the downstream polymerization reaction is effectively irreversible. This thermodynamic irreversibility requires some chemical energy expenditure, which is why all fidelity mechanisms require the hydrolysis of high energy bonds (like ATP or dNTP). How this chemical energy is incorporated can vary from system to system. For example, in DNA polymerization the hydrolysis of dNTPs from the previous incorporation step places the exonuclease substrate in a higher energy state than the exonuclease product, making this discard pathway irreversible. In other systems ATP hydrolysis might be coupled to one or both competing pathways to make them irreversible. But the bottom line is that many biological processes use chemical energy not to drive the chemistry underlying the process, but to ensure its fidelity.
The third salient feature is the ability of the system to sense a difference between correct (or cognate) and incorrect (or noncognate) substrates and to flip the ratio of the competing rate constants depending on which substrate it senses. In principle, to flip this ratio, the system could alter the rate constants of just one reaction or the other depending on whether the substrate is cognate or noncognate, but in nature both reactions often experience substrate dependent changes. In the case of DNA polymerases, the major reason mismatched primers preferentially go down the exonuclease pathway is that the rate constant of polymerization drops so dramatically. However, the mismatch also increases the rate constant of the exonuclease reaction, further contributing to this preference.
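To see why the ratio of competing rate constants sets the molecular choice, one can simulate many individual molecules, each of which takes whichever irreversible pathway "fires" first. The following is a minimal sketch, not a model of any particular enzyme: the rate constants are hypothetical values picked only to show the flip, and the function names are invented for this example.

```python
import random

def forward_fraction(k_forward, k_discard):
    """Analytic fraction of molecules taking the forward path."""
    return k_forward / (k_forward + k_discard)

def simulate_forward_fraction(k_forward, k_discard, n_molecules=50_000, seed=1):
    """Each molecule draws an exponential waiting time for each competing
    first-order pathway and follows whichever one happens first."""
    rng = random.Random(seed)
    went_forward = 0
    for _ in range(n_molecules):
        if rng.expovariate(k_forward) < rng.expovariate(k_discard):
            went_forward += 1
    return went_forward / n_molecules

# HYPOTHETICAL rate constants in s^-1: (k_forward, k_discard)
SUBSTRATES = {"cognate": (100.0, 1.0), "noncognate": (0.1, 10.0)}

for name, (k_fwd, k_dis) in SUBSTRATES.items():
    print(f"{name}: simulated forward fraction = "
          f"{simulate_forward_fraction(k_fwd, k_dis):.4f}, "
          f"analytic = {forward_fraction(k_fwd, k_dis):.4f}")
```

In this sketch only the ratio k_forward / k_discard matters: scaling both rate constants by the same factor changes how fast the choice is made but not which choice is made.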

29 Intrinsic Fidelity + Exo Proofreading
[Diagram: DNA polymerase on the primer-template 5’-TTC over 3’-AAGCTCA. Correct branch: binding of dNTP G (K_D^C dNTP), incorporation (k_incorp^C dNTP) to give 5’-TTCG, then either exonuclease removal (k_exo^C primer) or binding and incorporation of the next dNTP A (K_D^C primer, k_incorp^C primer) to give 5’-TTCGA. Incorrect branch: binding of dNTP A (K_D^I dNTP), incorporation (k_incorp^I dNTP) to give 5’-TTCA, then either exonuclease removal (k_exo^I primer) or binding and incorporation of the next dNTP A (K_D^I primer, k_incorp^I primer).]
K_D: pseudoequilibrium constant for nucleotide binding
k_incorp: composite rate constant for incorporation of bound dNTP
k_exo: composite rate constant for exonuclease reaction
To summarize, DNA polymerase has two systems for ensuring high fidelity of nucleotide incorporation, and in both cases it controls molecular choices by controlling and altering relative rate constants or pseudoequilibria.
1) Intrinsic Fidelity, based on monitoring the correct base pairing of the incoming nucleotide BEFORE it is catalytically incorporated by the enzyme. Here the choice is determined by direct competition of reactions involving correct versus incorrect nucleotides. The pseudoequilibria for the nucleotide binding step and the rate constant for the incorporation step greatly favor the branch with the correct nucleotide.
2) Proofreading or Error Correction, based on monitoring the correct base pairing of the nucleotide AFTER it is catalytically incorporated by the enzyme and is now part of the primer. Here the choice is determined independently for each branch by the relative rate constants k_incorp versus k_exo. That ratio greatly favors further nucleotide incorporation when the prior incorporated nucleotide was correct and exonuclease removal when the prior incorporated nucleotide was incorrect.
Finally, I note that the exonuclease pathway is not a complete discard pathway that aborts the growth of the nascent chain. It discards the incorrect nucleotide but preserves the primer-template and sends it back to the previous cycle with the polymerase to start over again. Although I don’t have time to discuss it in the lecture, it is the 5’ to 3’ direction of the polymerase that ensures that the primer-template is not discarded and can be reused. This is explained in the Appendix slide on Head versus Tail growth of polymers.
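As a rough worked example of how the two systems multiply, the sketch below combines the T7 numbers quoted earlier in the lecture: the binding and conformational constants for intrinsic fidelity, and the pol and exo rate constants on a mismatched primer for proofreading. It is a back-of-the-envelope estimate under several assumptions that are mine, not the lecture's: one competing incorrect dNTP at the same concentration as the correct one, dNTP concentrations well below K_D, no discrimination at the chemistry step, and a single pass through proofreading.

```python
def misincorporation_probability(kd_c, kd_i, kconf_c, kconf_i):
    """Chance the incorrect dNTP wins the direct competition for incorporation
    (assumes equal dNTP concentrations, both well below K_D)."""
    relative_flux = (kd_c / kd_i) * (kconf_i / kconf_c)   # incorrect vs correct
    return relative_flux / (1.0 + relative_flux)

def proofreading_escape_probability(k_pol_mismatch, k_exo_mismatch):
    """Chance a mismatched primer is extended before the exonuclease removes it."""
    return k_pol_mismatch / (k_pol_mismatch + k_exo_mismatch)

# T7 values from the slides; K_D for the incorrect dNTP is a lower bound (> 8 mM)
p_intrinsic = misincorporation_probability(20.0, 8000.0, 300.0, 0.3)
p_escape = proofreading_escape_probability(0.01, 2.3)
p_overall = p_intrinsic * p_escape   # must elude BOTH systems

print(f"intrinsic error ~ {p_intrinsic:.2e}")
print(f"escape from proofreading ~ {p_escape:.2e}")
print(f"combined error per incorporation ~ {p_overall:.2e}")
```

Under these assumptions proofreading contributes roughly a further 200-fold on top of intrinsic fidelity; the exact figures should not be read as measured error rates.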

30 How DNA Polymerases Check for Proper Base Pairing Geometry
Crystal structure suggestion for “Induced Fit” [Figure: Polymerase + Primer-Template versus Polymerase + Primer-Template + dNTP, showing the stacking interaction between primer and template]
-- Base pair fit is “tested” before polymerization
-- Base pair fit is still “tested” after polymerization
-- Only W-C base pairs allow proper stacking
-- Induced fit positions nucleotide, primer 3’, metal ions
DNA Polymerase contacts minor groove of primer-template [Figure key: Purple: interaction surface with DNA polymerase. Green: universal H-bond acceptors H-bonding with DNA polymerase]
So how does the DNA polymerase sense the difference between cognate and noncognate substrates, whether at the level of the incoming nucleotide or the primer? As mentioned before, one important component is induced fit, which is thought to correspond to the slow pre-catalytic step in the polymerase cycle. This “fitting” allows precise checking of base pair geometry and couples this to the proper alignment of catalytic residues needed to carry out the polymerization reaction. If the incoming nucleotide is correct AND the previously incorporated nucleotide, now part of the primer, is correct, the geometry of the complementary base pairs will allow the induced fit needed to catalytically activate the enzyme. Evidence for such an induced fit was obtained for one DNA polymerase by x-ray crystallography, as schematized on the left.
Presumably, for this fit to sense the proper geometry of the base pairing of the primer-template and the incoming nucleotide, the polymerase must interact with these substrates. Crystal structures show that the T7 DNA polymerase makes numerous contacts with the minor groove of a primer-template and forms H-bonds with universal H-bond acceptor groups that are properly positioned only for Watson-Crick base pairs.

31 Most DNA polymerases in the cell have NON-replicative roles
Prokaryotic DNA Polymerases (E. coli) Pol I Pol II (Din A) Pol III holoenzyme Pol IV (Din B) Pol V (UmuC, UmuD’2C) DNA Replication (RNA primer removal); DNA repair DNA repair DNA Replication DNA repair; TLS; adaptive mutagenesis TLS (translesion synthesis) Eukaryotic DNA Polymerases Pol a Pol b Pol g Pol d Pol e Pol q Pol z Pol l Pol m Pol k Pol h Pol i Rev1 DNA Replication (Primer Synthesis) Base excision repair Mitochondrial DNA replication/repair DNA Replication; nucleotide and base excision repair DNA crosslink repair TLS Meiosis-associated DNA repair Somatic hypermutation Error-free TLS past cyclobutane dimers TLS, somatic hypermutation So far our discussion has focused on the replicative polymerases (listed in purple), whose fidelity must be extremely high to ensure faithful duplication of the entire genome. All these polymerases have high intrinsic fidelity and exonuclease proofreading However, over the past years many DNA polymerases have been discovered that do NOT have a high intrinsic fidelity nor a functional exonuclease for proofreading These error-prone polymerases are NON-replicative polymerases They play an important role in compensating for the Achilles heel of replicative DNA polymerases, their high fidelity High fidelity can be a problem, for example, when there is a chemically damaged nucleotide in the template. Such damaged nucleotide cannot form a proper Watson-Crick base pair with any of the four dNTPs. Thus with no “correct” incoming nucleotide possible, the replicative polymerase will stall indefinitely at such lesions. And even if they manage to incorporate some nucleotide, their proofreading function would rapidly remove it. 
There are recombinational tricks the cell has to get around the lesion and ultimately incorporate the correct nucleotide, but if these fail or take to long, the cell is faced with the choice of either dying because it can’t complete replication OR forcing the incorporation of some nucleotide across the damaged nucleotide and risking a mutation. The latter process bypasses the lesion and is called translesion synthesis (TLS), obviously a better alternative than dying immediately Hence many of these low fidelity DNA polymerases have evolved to carry out TLS in the face of different types of damaged nucleotides. There are also cases where you might want a low fidelity polymerase to generate mutations on purpose to increase genetic variation. For example, during the V-D-J recombination that occurs to assemble mature immunoblobulin genes, low fidelity DNA polymerases are used that introduce mutations (somatic hypermutation) in the antigen recognition pockets to increase the variety of antigens that can be recognized. Also, in times of environmental stress, increasing genetic variation might increase the chance that some variants might have a better chance to survive the stress. This “adaptive mutagenesis” has been documented in E. coli, and occurs because stress increases the expression of an error prone DNA polymerase called Pol IV or Din B In short, there are important biological reasons why nature might have evolved DNA polymerases with lower fidelity Finally, the Loeb lab has been able to generate mutant replicative DNA polymerases in E. coli that have normal speed but increased fidelity. This suggests that nature could in principle make a more faithful replicative polymerase but doesn’t. It raises the possibility that nature limits fidelity even in replicative polymerases to promote the genetic variation that might be important for evolutionary change. 
This idea that some “intentional” replication infidelity may be important for evolution will be raised again when we discuss the fidelity of controlling replication initiation. Many of the nonreplicative polymerases have low fidelity and are error-prone because they tolerate non-WC bp and lack 3’>5’ exo activity. Low fidelity is needed to bypass template lesions that are stalling replication. Low fidelity may be used to increase genetic variation in special circumstances.

32 How is TLS synthesis coordinated with replicative synthesis?
Exactly how TLS synthesis is coordinated with replicative synthesis is not clear. Initially it was thought that both replicative and TLS polymerases could be incorporated in the replisome associated with a moving replication fork. Then, at damaged nucleotides on the template, the TLS polymerase could transiently replace the replicative polymerase at the primer-template junction to synthesize past the lesion before the replicative polymerase is swapped back to continue on its way (as shown on the left). Biochemical assays consistent with such polymerase swapping have been developed, suggesting that this swapping could indeed be occurring in cells. Recent evidence, however, suggests that the replicative polymerase might be able to skip past a template lesion and be reprimed to restart downstream of the lesion. This finding raises the possibility that TLS polymerases can fill in the resulting gaps in the daughter DNA strand after the bulk of replication is completed (as shown on the right).

33 Contributions to E. coli DNA Replication Fidelity
Fidelity Overview
Contributions to E. coli DNA Replication Fidelity
Intrinsic Fidelity (polym): sensing dNTP complementarity to template
Exonuclease Proofreading (polym): sensing primer complementarity to template
Mismatch Repair (post polym): sensing complementarity of two strands; distinguishing parental and daughter strands; error rate 10^-2
Overall Replication Fidelity
Fidelity Comparisons (Speed, Product Size, Error rate)
DNA Replication: 500 bp/sec (Prokaryotes), 50 bp/sec (Eukaryotes); product size 5 x 10^6 (E. coli), 6 x 10^9 (humans), 1 x 10^11 (lily)
RNA Transcription: 30 bp/sec; error rate 10^-4
Protein Translation: 20 aa/sec; error rate 10^-4
Finally, let me remind you that there is a third level of replication fidelity, called mismatch repair, that can correct a mismatched nucleotide that has escaped the first two levels of fidelity. Mismatch repair can occur within a short window of time after DNA polymerase and the replication fork have passed. This system makes the presumption that any mismatch detected shortly after a replication fork has passed is due to misincorporation of a nucleotide in the daughter strand. Hence, the system needs to be able to (1) detect a mismatch and (2) distinguish daughter from parental strands in newly replicated DNA. Once it detects such a mismatch, it directs an excision repair system to replace a patch of the daughter strand containing the misincorporated nucleotide.
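To make the speed and product-size comparison concrete, here is a minimal Python sketch that turns the slide's numbers into rough orders of magnitude. The function names are illustrative, and the single-fork assumption and the 1000-nucleotide transcript length are assumptions for the sake of arithmetic, not values from the lecture.

```python
# Speeds and product sizes are the figures quoted on the slide.
GENOME_SIZE_BP = {"E. coli": 5e6, "human": 6e9}
FORK_SPEED_BP_PER_SEC = {"E. coli": 500, "human": 50}

def single_fork_time_sec(product_size, speed_per_sec):
    """Time for ONE polymerase to copy the whole product at the given speed.

    Assumption: a single fork working alone; real genomes use more than one.
    """
    return product_size / speed_per_sec

def expected_errors(product_size, error_rate):
    """Expected number of mistakes = product size x error rate per unit."""
    return product_size * error_rate

for organism, size in GENOME_SIZE_BP.items():
    seconds = single_fork_time_sec(size, FORK_SPEED_BP_PER_SEC[organism])
    print(f"{organism}: {seconds:.3g} s for a single fork")

# Transcription error rate of 10^-4 is from the slide; the 1000-nt
# transcript length is a hypothetical value chosen for illustration.
print(expected_errors(1000, 1e-4), "expected errors per 1000-nt transcript")
```

The point of the sketch is only that a large product copied slowly by one fork would take a very long time, and that the tolerable error rate has to shrink as the product size grows.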

34 Mismatch Repair: Correction of Replication Errors
E. coli mismatch repair. DNA: both parental strands methylated at GATC sites; daughter strand transiently unmethylated after replication. MutS: recognizes mispaired bp by susceptibility to kinking. MutH: recognizes nearby GATC. MutL: association with MutS and MutH stimulates MutH to nick unmethylated daughter strand (basis of strand bias). Exonuclease and helicase II, directed by MutS and MutL, excise daughter strand from nick to mispaired bp. DNA Pol III, clamp, clamp-loader, and SSB synthesize replacement DNA. Figure: MutS bound to mispaired DNA. Shown here is the mismatch repair system for E. coli, which was worked out by Paul Modrich and earned him a Nobel prize. MutS provides the mismatch recognition. As shown in the crystal structure, it does this by binding and bending the DNA, since mismatches are easier to kink. MutH provides the daughter strand recognition. It can distinguish the parental strand, which is methylated at GATC sites, from the daughter strand, which is transiently unmethylated at these sites shortly after it is synthesized. This distinction is lost once the DNA methylase detects the newly replicated GATC sites and methylates them, closing the window of opportunity for this fidelity mechanism to work. MutH permanently marks the daughter strand by nicking that strand at GATC sites close to mismatches. This nick initiates a process called excision repair, which replaces a patch of daughter DNA strand surrounding the mismatch. Eukaryotes do not have the methylation system used by E. coli to distinguish daughter from parental strand, so it is not clear how they carry out this critical task in mismatch repair. Modrich showed that nicks could be used to initiate mismatch excision repair in eukaryotic extract systems. Hence, the presumption is that nicks are also used in eukaryotes to distinguish daughter strands from parental strands.
Where those nicks come from is still a source of speculation. One hypothesis is that daughter-strand-specific nicks arise from misincorporation of rNTPs, which trigger nicks to initiate their replacement. DNA Dam methylase eventually fully methylates GATC sites, so both strands are marked as parental for the next round of replication.
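The E. coli strand-discrimination logic described above can be caricatured in a few lines of Python. This is a toy sketch, not a model of the real enzymes: it treats the daughter strand as a plain string, assumes every GATC on it is still unmethylated, and picks the nearest site by simple index distance. All function names are hypothetical.

```python
def gatc_sites(daughter_strand):
    """Return the start index of every GATC site on the daughter strand."""
    return [i for i in range(len(daughter_strand) - 3)
            if daughter_strand[i:i + 4] == "GATC"]

def nearest_gatc(daughter_strand, mismatch_pos):
    """Pick the GATC site closest to the mismatch (toy stand-in for MutH)."""
    sites = gatc_sites(daughter_strand)
    if not sites:
        return None  # no site available to mark the daughter strand
    return min(sites, key=lambda site: abs(site - mismatch_pos))

def excision_span(daughter_strand, mismatch_pos):
    """Span excised from the nick to the mispaired position, as (start, end)."""
    site = nearest_gatc(daughter_strand, mismatch_pos)
    if site is None:
        return None
    return (min(site, mismatch_pos), max(site, mismatch_pos))

# Hypothetical daughter strand with GATC sites at indices 2 and 16
# and a misincorporated nucleotide at index 8.
strand = "AAGATCTTTTTTTTTTGATCAA"
print(gatc_sites(strand))
print(excision_span(strand, 8))
```

The sketch captures only the two requirements named in the notes: a way to locate the mismatch and a strand-specific mark (here, the GATC site) that tells the repair system which strand to excise.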

35 DNA Polymerase and the Replication Fork
Lecture 1: DNA Polymerase and the Replication Fork. Use of biochemistry (assays) and genetics (mutant phenotypes) to define function. Fidelity/Specificity: bioregulation through substrate control of molecular choice. Let me end by reminding you of the two key take-home points. First, biological function is defined by either biochemical assays or mutant phenotypes. Exactly what function you define will depend on exactly how you set up the assay or analyze the phenotypes. Each approach has its advantages and limitations, so ideally one would like to apply both approaches to any problem one attacks. Second, bioregulation involves the control of molecular choice. This choice is determined by the relative rate and equilibrium constants for competing reactions. Biological systems can alter this choice for different molecular circumstances by coupling these circumstances to different rate and equilibrium constants. One of the simplest examples of bioregulation is ensuring the fidelity of biomolecular processes. Here one wants the correct substrate to move forward in the process and the incorrect substrate to be excluded from the process. One way Nature does this is to set up competing forward and discard pathways and to give the kinetic advantage to the forward pathway for correct substrates and to the discard pathway for incorrect substrates. This strategy will be seen repeatedly throughout the course, but in different guises.
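The competing forward and discard pathways can be illustrated with a small Python sketch. It assumes the simplest possible case, two first-order pathways competing for the same intermediate, so the fraction moving forward is k_forward / (k_forward + k_discard). The rate constants below are hypothetical round numbers chosen only to show the effect; they are not measurements from the lecture.

```python
def forward_fraction(k_forward, k_discard):
    """Fraction of substrate that takes the forward pathway when two
    first-order pathways compete for the same intermediate."""
    return k_forward / (k_forward + k_discard)

def selectivity(correct, incorrect):
    """Ratio of forward fractions for correct versus incorrect substrate.

    Each argument is a (k_forward, k_discard) pair.
    """
    return forward_fraction(*correct) / forward_fraction(*incorrect)

# Hypothetical rate constants (arbitrary units): the correct substrate is
# favored in the forward pathway, the incorrect one in the discard pathway.
correct_rates = (100.0, 1.0)
incorrect_rates = (1.0, 100.0)

print(forward_fraction(*correct_rates))    # close to 1: mostly moves forward
print(forward_fraction(*incorrect_rates))  # close to 0: mostly discarded
print(selectivity(correct_rates, incorrect_rates))
```

Changing either pair of rate constants shifts the molecular choice, which is the sense in which coupling circumstances to rate constants regulates the outcome.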

36 Bioreg 2017 Lecture 1 Replication
APPENDIX

37 Using Biochemical Assays to Define Biochemical Functions
Assay must distinguish or physically separate products from substrates. Polymerization: conversion of radioactive nucleotide from acid soluble to acid precipitable. Nuclease: conversion of incorporated radioactive nucleotide from acid precipitable to acid soluble. Quantification is important for inferring biological relevance. The 3’ > 5’ exonuclease negates the polymerization reaction, but is generally much slower. DNA Pol I’s poor polymerization raised the possibility that it was not the replicative polymerase. Small differences in assay conditions can define different activities: a gapped/nicked template defines Pol III core activity; a primed single-stranded template defines Pol III holoenzyme activity. Complications of assaying crude extracts (beware of wasting clean thoughts on dirty enzymes): the assay can be detecting multiple types of activities and be affected by multiple competing activities; it can be detecting multiple similar activities.

38 The Awesome Challenges of Genetics
polA mutant revisited. Lecture 1: the polA1 mutant, with <1% assayable DNA Pol I activity, has relatively normal replication; Cairns carefully suggests DNA Pol I is not critical for DNA replication. Lecture 2: DNA Pol I plays a role in Okazaki fragment maturation, an important part of replication. What happened to the awesome power of genetics? Caveats about gene analysis, and caveats applied to polA1 mutants. Limitations in Phenotypic Analysis: Pol I activity in living polA1 cells may be greater than that measured in vitro (in extracts), as the mutant protein may be more labile or inhibitable in the harsher in vitro setting than in vivo. Bottom line: Genetics can more easily demonstrate that a gene is required for a certain function than it can demonstrate that a gene is NOT required or NOT involved in a certain function. Another limitation not addressed in the slide is that even if the genetics makes a strong case that a gene is required for a certain function, it is harder, especially if the phenotypic analysis is performed in a whole cell or organism, to determine whether it has a direct or only indirect role in that function. Excess Activity/Leaky Allele: E. coli has an estimated 300 molecules of DNA Pol I, most used in DNA repair. Fewer molecules are needed for the 2 replication forks, so the residual activity in a polA1 mutant may be sufficient. Note that although polA1 has an early nonsense mutation, read-through of the nonsense codon is suspected of generating the residual Pol I activity. Redundancy: We can eliminate the first two caveats with a null mutant, but the polA∆ mutant is still viable in minimal media (although not in rich media, where the demands for rapid DNA replication are greater). In this mutant Pol II or Pol III is thought to substitute (poorly but sufficiently) for Pol I in OF maturation. With all these caveats, what is the evidence that DNA Pol I is important for OF maturation and DNA replication?
The polA12 ts mutant accumulates increased OFs at restrictive temp (similar to the ts lig4 mutant). The polA12 lig4 double mutant not only accumulates OFs but rapidly ceases DNA synthesis at restrictive temp.

39 Polymerization via Head Growth vs Tail Growth
Head Growth: activated high energy bond end of polymer drives polymerization. Tail Growth: activated high energy bond end of monomer drives polymerization. Tail growth is used when you want to be able to reversibly remove and add back monomers from the polymer. Head growth is used when you want the completed polymer length to be final, with no possibility of accidental addition to the polymer. The 5’ to 3’ direction of DNA polymerization forces it to occur by tail growth. The high energy bond for polymerization is provided by the monomer, so when the proofreading 3’ to 5’ exonuclease removes the last nucleotide added, the remaining polymer is still capable of chain growth. If DNA polymerization had evolved to occur in the 3’ to 5’ direction, the triphosphate at the 5’ end of the polymer would be needed for each polymerization step. This would preclude excision of the most recently added nucleotide by a presumably 5’ to 3’ proofreading exonuclease, because such an excision would terminate chain growth. Thus, the 5’ to 3’ direction of DNA polymerization is important for replication fidelity.

