Overview of Gene Expression
An organism may contain many types of somatic cells, each with a distinct shape and function, yet they all carry the same genome. The genes in a genome have no effect on cellular functions until they are "expressed". Different types of cells express different sets of genes, and this is what gives them their different shapes and functions.
Essential steps involved in the expression of protein genes. The Central Dogma of Molecular Biology:
"Gene expression" means the production of a protein or a functional RNA from its gene. Several steps are required:
· Transcription: A DNA strand is used as the template to synthesize an RNA strand, which is called the primary transcript.
· RNA processing: The primary transcript is modified to generate a mature mRNA (for protein genes) or a functional tRNA or rRNA.
For RNA genes (tRNA and rRNA), expression is complete once a functional tRNA or rRNA is generated. Protein genes require additional steps:
· Nuclear transport: mRNA has to be transported from the nucleus to the cytoplasm for protein synthesis.
· Protein synthesis: In the cytoplasm, mRNA binds to ribosomes, which synthesize a polypeptide based on the sequence of the mRNA.
Transcription
Transcription is the process of copying DNA to produce an RNA transcript. This is the first step in the expression of any gene. If the resulting RNA codes for a protein, it is spliced, polyadenylated, and transported to the cytoplasm, where translation produces the protein molecule.
Overview of Transcription
Transcription is a process in which one DNA strand is used as a template to synthesize a complementary RNA. The following is an example:
Note that uracil (U) of RNA is paired with adenine (A) of DNA. The DNA strand which serves as the template may be called the "template strand", "minus strand", or "antisense strand". The other DNA strand may be termed the "non-template strand", "coding strand", "plus strand", or "sense strand".
Since both the DNA coding strand and the RNA strand are complementary to the template strand, they have the same sequence, except that each T in the DNA coding strand is replaced by U in the RNA strand.
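The relationship between the template strand, the coding strand, and the RNA can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original slides: the sequence `TACGGCATT` and the function names `transcribe` and `coding_strand` are hypothetical, and the template is assumed to be written 3' to 5' so that both outputs read 5' to 3'.

```python
# Base pairing used when RNA is built on a DNA template (A in DNA pairs with U in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base pairing between the two DNA strands.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') complementary to a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def coding_strand(template_3_to_5: str) -> str:
    """Return the DNA coding strand (5'->3') paired with the template strand."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5.upper())


template = "TACGGCATT"  # hypothetical template strand, written 3'->5'
rna = transcribe(template)
coding = coding_strand(template)
print(rna)     # AUGCCGUAA
print(coding)  # ATGCCGTAA
```

Replacing every T in `coding` with U gives `rna`, which is the point made in the text.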
Schematic illustration of transcription. (a) DNA before transcription. (b) During transcription, the DNA must unwind so that one of its strands can be used as the template to synthesize a complementary RNA.
Essential steps of transcription
(i) Binding of polymerases to the initiation site. The DNA sequence which signals the initiation of transcription is called the promoter.
(ii) Unwinding of the DNA double helix. An enzyme that can unwind the double helix is called a helicase. Prokaryotic polymerases have helicase activity, but eukaryotic polymerases do not. Unwinding of eukaryotic DNA is carried out by a specific transcription factor.
(iii) Synthesis of RNA based on the sequence of the DNA template strand. RNA polymerases use nucleoside triphosphates (NTPs) to construct an RNA strand.
(iv) Termination of synthesis. Prokaryotes and eukaryotes use different signals to terminate transcription.
Transcription in eukaryotes is much more complicated than in prokaryotes, partly because eukaryotic DNA is associated with histones, which can hinder the access of polymerases to the promoter.
The Relationship between Genes and Proteins
Most genes encode the information for the synthesis of a protein.
The sequence of bases in DNA codes for the sequence of amino acids in proteins.
An illustration of the flow of information from DNA to RNA to protein, which forms the backbone of molecular biology.
DNA codes for the production of RNA. RNA codes for the production of protein. Protein does not code for the production of protein, RNA or DNA.
The function of RNA polymerases
Both RNA and DNA polymerases can add nucleotides to an existing strand, extending its length. However, there is a major difference between the two classes of enzymes: RNA polymerases can initiate a new strand but DNA polymerases cannot.
The chemical reaction catalyzed by RNA polymerases
The nucleotides used to extend a growing RNA chain are ribonucleoside triphosphates (NTPs). Two phosphate groups are released as pyrophosphate (PPi) during the reaction. Strand growth is always in the 5' to 3' direction. The first nucleotide at the 5' end retains its triphosphate group.
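The chain-extension step described here can be written compactly as a reaction scheme. This is a schematic sketch only: the notation (RNA)_n for a chain of n nucleotides is introduced here for illustration and is not the slide's own notation.

```latex
\[
(\mathrm{RNA})_{n} + \mathrm{NTP} \;\longrightarrow\; (\mathrm{RNA})_{n+1} + \mathrm{PP_i}
\]
```

Each cycle adds one nucleotide to the 3' end of the chain and releases one pyrophosphate, which is why growth runs 5' to 3'.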
Regulatory Elements of a Gene
Transcriptional regulation is mediated by the interaction between transcription factors and their DNA binding sites. The binding sites are cis-acting elements, whereas the sequences encoding transcription factors are trans-acting elements. The cis-acting elements may be divided into the following four types:
· Promoters
· Enhancers
· Silencers
· Response elements
Gene organization. The transcribed region consists of exons and introns. The regulatory elements include the promoter, response element, enhancer and silencer (not shown). Downstream refers to the direction of transcription, and upstream is opposite to the direction of transcription. Position numbers increase along the direction of transcription, with "+1" assigned to the initiation site. There is no "0" position: the base pair just upstream of +1 is numbered "-1".
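Because the numbering skips 0, converting between array offsets and position labels is an easy place to make an off-by-one error. The following is a small sketch of one way to handle it; the function names `position_label` and `steps_between` are hypothetical helpers written for this illustration.

```python
def position_label(offset: int) -> str:
    """Convert an offset from the initiation site into a position label.

    Offset 0 is the initiation site itself (+1); offset -1 is the base pair
    just upstream of it (-1). No offset maps to a "0" label.
    """
    return f"+{offset + 1}" if offset >= 0 else str(offset)


def steps_between(pos_a: int, pos_b: int) -> int:
    """Number of base-pair steps from position a to position b.

    Positions use the gene numbering convention, so 0 is not a valid input
    and one step is subtracted whenever the interval crosses the -1/+1 gap.
    """
    if pos_a == 0 or pos_b == 0:
        raise ValueError("there is no position 0")
    steps = pos_b - pos_a
    if pos_a < 0 < pos_b:
        steps -= 1
    elif pos_b < 0 < pos_a:
        steps += 1
    return steps


print(position_label(0), position_label(-1))  # +1 -1
print(steps_between(-1, 1))                   # 1
print(steps_between(-35, 1))                  # 35
```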
1. Promoter
The promoter is the DNA region where transcription initiation takes place. In prokaryotes, the sequence of a promoter is recognized by the sigma (σ) factor of the RNA polymerase. In eukaryotes, it is recognized by specific transcription factors.
In E. coli
E. coli has several sigma factors, including:
· Sigma 70: regulates expression of most genes.
· Sigma 32: regulates expression of heat shock proteins.
· Sigma 28: regulates expression of the flagellar operon (involved in cell motion).
· Sigma 38: regulates gene expression in response to external stresses.
· Sigma 54: regulates gene expression for nitrogen metabolism.
In Eukaryotes
In eukaryotes, there is a significant difference between the transcription of protein genes and RNA genes. The most common promoter element in eukaryotic protein genes is the TATA box, located at -35 to -20. Another promoter element is called the initiator (Inr). It has the consensus sequence PyPyAN(T/A)PyPy, where Py denotes a pyrimidine (C or T), N is any base, and (T/A) means T or A. The base A at the third position is located at +1 (the transcriptional start site).
The TATA box and the initiator are the core promoter elements. Other elements are often located within 200 bp of the transcriptional start site, such as the CAAT box and the GC box, which may be referred to as promoter-proximal elements.
The protein which interacts with the initiator and TATA box is known as the TATA-box binding protein (TBP), because the TATA box was discovered earlier than the initiator.
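The Inr consensus PyPyAN(T/A)PyPy translates directly into a regular expression. The sketch below scans a DNA sequence for candidate initiators and reports where the +1 adenine would fall. The test sequence and the name `find_inr` are hypothetical, and a consensus match only marks a candidate; it does not show that transcription actually starts there.

```python
import re

# PyPyAN(T/A)PyPy with Py = C or T and N = any base.
# The lookahead lets overlapping candidates be reported as well.
INR_PATTERN = re.compile(r"(?=([CT][CT]A[ACGT][TA][CT][CT]))")


def find_inr(sequence: str) -> list[tuple[int, str]]:
    """Return (index of the +1 adenine, matched motif) for each candidate Inr."""
    hits = []
    for match in INR_PATTERN.finditer(sequence.upper()):
        # The A at the third position of the motif is the start site (+1).
        hits.append((match.start() + 2, match.group(1)))
    return hits


sequence = "GGGCTCAGTCTGGG"  # hypothetical coding-strand sequence, 5'->3'
print(find_inr(sequence))    # [(6, 'TCAGTCT')]
```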
2. Enhancers
An enhancer is a nucleotide sequence to which transcription factor(s) bind, and which increases the transcription of a gene.
Enhancers are positive regulatory elements located either upstream or downstream of the transcriptional initiation site. However, most of them are located upstream. In prokaryotes, enhancers are quite close to the promoter, but eukaryotic enhancers can be far from the promoter.
An enhancer region may contain one or more elements recognized by transcriptional activators. Enhancers are "conditional" - in other words, they enhance transcription only under certain conditions, for example in the presence of a hormone.
3. Silencers
Silencers are elements very similar to enhancers, except that the proteins they bind inhibit transcription.
4. Response elements
Response elements are the recognition sites of certain transcription factors. Most of them are located within 1 kb of the transcriptional start site.
1. Initiation of Transcription
RNA polymerase must be able to recognize the beginning of a gene so that it knows where to start synthesizing an mRNA. It is directed to the start site of transcription by the affinity of one of its subunits for a particular DNA sequence that appears at the beginning of genes. This sequence is called a promoter. It is a unidirectional sequence on one strand of the DNA that tells the RNA polymerase both where to start and in which direction (that is, on which strand) to continue synthesis.
2. Elongation of Transcription
The RNA polymerase then stretches open the double helix at that point in the DNA and begins synthesis of an RNA strand complementary to one of the strands of DNA. The RNA polymerase recruits rNTPs (ribonucleoside triphosphates) in the same way that DNA polymerase recruits dNTPs. However, since only one strand is copied and synthesis proceeds only in the 5' to 3' direction, there is no need for Okazaki fragments. It is important to note that synthesis proceeds in one direction only.
3. Termination of Transcription
How does RNA polymerase know when to stop transcribing a gene? This system has been elucidated in prokaryotes. Because there is no nucleus in prokaryotes, ribosomes can begin making protein from an mRNA while it is still being synthesized. At the end of many genes, the sequence of the transcript allows it to form a hairpin loop, typically followed by a run of U residues. The hairpin stalls the RNA polymerase, and the weakly paired U-rich stretch lets the transcript separate from the template, so the RNA polymerase falls off the DNA and transcription ceases. Other genes depend on a protein termination factor (Rho) to release the transcript.
RNA Processing: pre-mRNA --> mRNA
All the primary transcripts produced in the nucleus must undergo processing steps to produce functional RNA molecules for export to the cytosol.
RNA processing generates a mature mRNA (for protein genes) or a functional tRNA or rRNA from the primary transcript. Processing of pre-mRNA involves the following steps:
· Capping - adding 7-methylguanylate (m7G) to the 5' end.
· Polyadenylation - adding a poly-A tail to the 3' end.
· Splicing - removing introns and joining exons.
In some cases, RNA editing is also involved.
The procedure of RNA processing for protein genes.
5'-Capping
Cap site: The term has two usages. In eukaryotes, the cap site is the position in the gene at which transcription starts, and really should be called the "transcription initiation site". The first nucleotide is transcribed from this site to start the nascent RNA chain. That nucleotide becomes the 5' end of the chain, and thus the nucleotide to which the cap structure is attached (see "Cap"). In bacteria, the CAP site (note the capital letters) is a site on the DNA to which a protein factor (the catabolite activator protein) binds.
Capping occurs shortly after transcription begins. The chemical structure of the "cap" is shown in the following figure, where m7G is linked to the first nucleotide by a special 5'-5' triphosphate linkage. In most organisms, the first nucleotide is methylated at the 2'-hydroxyl of the ribose. In vertebrates, the second nucleotide is also methylated.
3'-Polyadenylation
A stretch of adenylate residues is added to the 3' end. The poly-A tail contains ~250 A residues in mammals, and ~100 in yeasts.
Polyadenylation at the 3' end. The major signal for the 3' cleavage is the sequence AAUAAA. Cleavage occurs a number of nucleotides downstream from this sequence. A second signal is located about 50 nucleotides downstream from the cleavage site. This signal is a GU-rich or U-rich region.
RNA splicing
RNA splicing is a process that removes introns and joins exons in a primary transcript. An intron usually contains a clear signal for splicing (e.g., the beta globin gene). In some cases (e.g., the sex lethal gene of the fruit fly), a splicing signal may be masked by a regulatory protein, resulting in alternative splicing. In rare cases (e.g., HIV genes), a pre-mRNA may contain several ambiguous splicing signals, resulting in a few alternatively spliced mRNAs.
Splicing signal
Most introns start with the sequence GU and end with the sequence AG (in the 5' to 3' direction). These are referred to as the splice donor and splice acceptor sites, respectively. However, the sequences at the two sites are not sufficient to signal the presence of an intron. Another important sequence is the branch site, located a number of bases upstream of the acceptor site. The consensus sequence of the branch site is "CU(A/G)A(C/U)", where the A is conserved in all genes.
In over 60% of cases, the exon sequence is (A/C)AG at the donor site, and G at the acceptor site.
Figure 5-A-4. The consensus sequence for splicing. Pu = A or G; Py = C or U.
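The three sequence signals just described (GU at the start, AG at the end, and a branch site in between) can be combined into a simple check. The following is a rough sketch under stated assumptions: the example sequences and the name `looks_like_intron` are hypothetical, and real splice-site recognition depends on much more context than these short motifs.

```python
import re

# Branch site consensus CU(A/G)A(C/U); the second A is the conserved branch point.
BRANCH_SITE = re.compile(r"CU[AG]A[CU]")


def looks_like_intron(rna: str) -> bool:
    """Check an RNA segment for the GU...branch site...AG signals."""
    rna = rna.upper()
    if not (rna.startswith("GU") and rna.endswith("AG")):
        return False
    # The branch site must lie between the donor GU and the acceptor AG.
    return BRANCH_SITE.search(rna[2:-2]) is not None


print(looks_like_intron("GUAAGUCUAACUUUUCCAG"))  # True: GU, CUAAC, AG all present
print(looks_like_intron("GUAAGUUUUUCCAG"))       # False: no branch site
print(looks_like_intron("AUAAGUCUAACUUCCAG"))    # False: does not start with GU
```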
Splicing mechanism
The detailed splicing mechanism is quite complex. In short, it involves five snRNAs and their associated proteins. These ribonucleoproteins form a large (60S) complex, called the spliceosome. Then, after a two-step enzymatic reaction, the intron is removed and the two neighboring exons are joined together. The branch point A residue plays a critical role in the enzymatic reaction.
Schematic drawing for the formation of the spliceosome during RNA splicing. U1, U2, U4, U5 and U6 denote snRNAs and their associated proteins. The U3 snRNA is not involved in RNA splicing, but is involved in the processing of pre-rRNA.
Summary of the steps
· Several protein transcription factors bind to promoter sites, usually on the 5' side of the gene to be transcribed.
· RNA polymerase binds to the complex of transcription factors. Working together, they open the DNA double helix.
· RNA polymerase proceeds down one strand, moving in the 3' -> 5' direction. As it does so, it assembles ribonucleotides (supplied as triphosphates, e.g., ATP) into a strand of RNA.
· Each ribonucleotide is inserted into the growing RNA strand following the rules of base pairing. Thus for each C encountered on the DNA strand, a G is inserted in the RNA; for each G, a C; and for each T, an A. However, each A on the DNA guides the insertion of the pyrimidine uracil (U, from uridine triphosphate, UTP). There is no T in RNA.
· Synthesis of the RNA proceeds in the 5' -> 3' direction.
· As each nucleoside triphosphate is brought in to add to the 3' end of the growing strand, the two terminal phosphates are removed.
Types of RNA
Several types of RNA are synthesized:
· messenger RNA (mRNA). This will later be translated into a polypeptide.
· ribosomal RNA (rRNA). This will be used in the building of ribosomes: machinery for synthesizing proteins by translating mRNA.
· transfer RNA (tRNA). RNA molecules that carry amino acids to the growing polypeptide.
· small nuclear RNA (snRNA). DNA transcription of the genes for mRNA, rRNA, and tRNA produces large precursor molecules ("primary transcripts") that must be processed within the nucleus to produce the functional molecules for export to the cytosol. Some of these processing steps are mediated by snRNAs.
Ribosomal RNA (rRNA)
There are 4 kinds. In eukaryotes, these are:
· 18S rRNA. One of these molecules, along with some 30 different protein molecules, is used to make the small subunit of the ribosome.
· 28S, 5.8S, and 5S rRNA. One each of these molecules, along with some 45 different proteins, is used to make the large subunit of the ribosome.
The name given to each type of rRNA reflects the rate at which the molecules sediment in the ultracentrifuge. The larger the number, the larger the molecule (but not proportionally).
Transfer RNA (tRNA)
There are some 32 different kinds of tRNA in a typical eukaryotic cell.
· Each is the product of a separate gene.
· They are small (~4S) chains of nucleotides.
· Many of the bases in the chain pair with each other, forming sections of double helix.
· The unpaired regions form 3 loops.
· Each kind of tRNA carries (at its 3' end) one of the 20 amino acids (thus most amino acids have more than one tRNA responsible for them).
· At one loop, 3 unpaired bases form an anticodon.
· Base pairing between the anticodon and the complementary codon on an mRNA molecule brings the correct amino acid into the growing polypeptide chain.
Messenger RNA (mRNA)
Messenger RNA comes in a wide range of sizes, reflecting the size of the polypeptide it encodes. Most cells produce small amounts of thousands of different mRNA molecules, each to be translated into a peptide needed by the cell.
Many mRNAs are common to most cells, encoding "housekeeping" proteins needed by all cells (e.g., the enzymes of glycolysis). Other mRNAs are specific to certain types of cells. These encode proteins needed for the function of that particular cell (e.g., the mRNA for hemoglobin in the precursors of red blood cells).
Small Nuclear RNA (snRNA)
Approximately a dozen different genes for snRNAs, each present in multiple copies, have been identified.
The snRNAs have various roles in the processing of the other classes of RNA. For example, several snRNAs are part of the spliceosome that participates in converting pre-mRNA into mRNA by excising the introns and splicing the exons.
The RNA polymerases
The RNA polymerases are huge multi-subunit protein complexes. Three kinds are found in eukaryotes.
· RNA polymerase I (Pol I). It transcribes the rRNA genes for the precursor of the 28S, 18S, and 5.8S molecules, and is the busiest of the RNA polymerases.
· RNA polymerase II (Pol II). It transcribes the mRNA and snRNA genes.
· RNA polymerase III (Pol III). It transcribes the 5S rRNA genes and all the tRNA genes.
However, the "Central Dogma" has had to be revised a bit. It turns out that information can flow back from RNA to DNA, and that RNA can also make copies of itself. It is still not possible to go from protein back to RNA or DNA, and no mechanism has yet been demonstrated for proteins making copies of themselves.
2. Synthesizing Proteins from the Instructions of DNA
Genetic information flows in a cell from DNA -> RNA -> Protein.
In a prokaryotic cell, transcription and translation happen at the same time:
However, in a eukaryotic cell, transcription and translation occur in different places:
The Genetic Code uses three bases to specify each amino acid
4. RNA: Intermediary in Protein Synthesis
Why would the cell want to have an intermediate between DNA and the proteins it encodes?
· The DNA can then stay pristine and protected, away from the caustic chemistry of the cytoplasm.
· Gene information can be amplified by having many copies of an RNA made from one copy of DNA.
· Regulation of gene expression can be effected by having specific controls at each element of the pathway between DNA and proteins. The more elements there are in the pathway, the more opportunities there are to control it in different circumstances.
What is RNA?
RNA has the same primary structure as DNA. It consists of a sugar-phosphate backbone, with a base attached to the 1' carbon of each sugar. The differences between DNA and RNA are that:
1. RNA has a hydroxyl group on the 2' carbon of the sugar (thus, the difference between deoxyribonucleic acid and ribonucleic acid).
2. Instead of using the base thymine, RNA uses another base called uracil:
Because of the extra hydroxyl group on the sugar, RNA does not readily form the long, regular double helix typical of DNA, and it usually exists as a single-stranded molecule. However, regions of double helix can form where there is some base pair complementation (U and A, G and C), resulting in hairpin loops. The RNA molecule with its hairpin loops is said to have a secondary structure.
In addition, because the RNA molecule is not restricted to a rigid double helix, it can form many different tertiary structures. Each RNA molecule, depending on the sequence of its bases, can fold into a stable three-dimensional structure.
The Genetic Code
How does an mRNA specify an amino acid sequence? The answer lies in the genetic code. It would be impossible for each amino acid to be specified by one nucleotide, because there are only 4 nucleotides and 20 amino acids. Similarly, combinations of two nucleotides could specify only 16 amino acids. The conclusion is that each amino acid is specified by a particular combination of three nucleotides, called a codon:
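The counting argument can be checked directly: with 4 nucleotides, a word of length n has 4^n possible spellings. The short sketch below tabulates this for n = 1, 2, 3; the names `spellings` and `all_codons` are chosen for this illustration only.

```python
from itertools import product

BASES = "UCAG"     # the four RNA nucleotides
AMINO_ACIDS = 20   # number of amino acids that must be encoded


def spellings(word_length: int) -> int:
    """Number of distinct nucleotide words of the given length."""
    return len(BASES) ** word_length


for n in (1, 2, 3):
    verdict = "enough" if spellings(n) >= AMINO_ACIDS else "too few"
    print(n, spellings(n), verdict)
# 1 4 too few
# 2 16 too few
# 3 64 enough

# Enumerating the three-letter words gives the full set of codons.
all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
print(len(all_codons))  # 64
```

Since 64 codons cover only 20 amino acids plus stop signals, several codons must share an amino acid, which is the degeneracy discussed next.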
Note the degeneracy of the genetic code. Each amino acid might have up to six codons that specify it. It is also interesting to note that different organisms have different frequencies of codon usage. A giraffe might use CGC for arginine much more often than CGA, and the reverse might be true for a sperm whale. Another interesting point is that some species depart from the codon assignments described above, and use certain codons for different amino acids. In general, however, the code depicted can be relied upon. How do tRNAs recognize which codon to bring an amino acid to? The tRNA has an anticodon on its mRNA-binding end that is complementary to the codon on the mRNA. Each tRNA binds only the appropriate amino acid for its anticodon.
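Codon-anticodon complementarity and degeneracy can both be shown with a small lookup. This sketch lists the six arginine codons and the single methionine codon from the standard genetic code and computes the fully complementary anticodon for each. The names `CODONS_FOR` and `anticodon` are hypothetical, and the code assumes strict Watson-Crick pairing at all three positions, ignoring wobble pairing, which lets one tRNA read more than one codon.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Codons for two amino acids in the standard genetic code.
CODONS_FOR = {
    "Arg": ["CGU", "CGC", "CGA", "CGG", "AGA", "AGG"],  # six codons: degenerate
    "Met": ["AUG"],                                     # a single codon
}


def anticodon(codon: str) -> str:
    """Return the strictly complementary anticodon, written 5'->3'.

    The two strands pair antiparallel, so the codon is reversed before
    each base is complemented.
    """
    return "".join(RNA_PAIR[base] for base in reversed(codon.upper()))


for amino_acid, codons in CODONS_FOR.items():
    print(amino_acid, [(codon, anticodon(codon)) for codon in codons])

print(anticodon("AUG"))  # CAU
```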
Central Dogma, Part 1: Transcription
How does the sequence information from DNA get transferred to mRNA so that it can be carried to the ribosomes in the cytoplasm? This process, called transcription, is highly analogous to DNA replication. Of course, different effectors, or proteins, direct transcription. Primary among these is the RNA polymerase holoenzyme, an agglomeration of many different factors that together direct the synthesis of mRNA on a DNA template.
As mentioned above, transcription (like any polymerisation process) is divided into three parts: initiation, elongation, and termination.
The sequence logo for the -10 "TATA" box for 60 human promoters, aligned on the TATA box, is shown below:
Taken together, they make up the "central dogma" of biology: DNA -> RNA -> protein. Here is an overview.
· Synthesis of the cap. This is a stretch of three modified nucleotides attached to the 5' end of the pre-mRNA.
· Synthesis of the poly(A) tail. This is a stretch of adenine nucleotides attached to the 3' end of the pre-mRNA.
· Step-by-step removal of introns present in the pre-mRNA and splicing of the remaining exons. This step is required
Split Genes
Most eukaryotic genes are split into segments. In decoding the open reading frame of a gene for a known protein, one usually encounters periodic stretches of DNA calling for amino acids that do not occur in the actual protein product of that gene. Such stretches of DNA, which get transcribed into RNA but not translated into protein, are called introns. Those stretches of DNA that do code for amino acids in the protein are called exons. Examples:
· The gene for one type of collagen found in chickens is split into 52 separate exons.
· The gene for dystrophin, which is mutated in boys with muscular dystrophy, has 79 exons.
· Even the genes for rRNA and tRNA are split.
The cutting and splicing of mRNA must be done with great precision. If even one nucleotide is left over from an intron or one is removed from an exon, the reading frame from that point on will be shifted, producing new codons specifying a totally different sequence of amino acids from that point to the end of the molecule (which often ends prematurely anyway when the shifted reading frame generates a STOP codon).
The removal of introns and splicing of exons is done by the spliceosome. This is a complex of several snRNA molecules and several proteins. The introns in most pre-mRNAs begin with a GU and end with an AG. Presumably these short sequences are essential for guiding the spliceosome.
Alternate Splicing
The processing of pre-mRNA for many proteins proceeds along various paths in different cells or under different conditions. For example, early in the differentiation of a B cell (a lymphocyte that synthesizes an antibody) the cell first uses an exon that encodes a transmembrane domain that causes the molecule to be retained at the cell surface. Later, the B cell switches to using a different exon whose domain enables the protein to be secreted from the cell as a circulating antibody molecule.
So, whether a particular segment of RNA will be retained as an exon or excised as an intron can vary under different circumstances. Clearly the switching to an alternate splicing pathway must be closely regulated.
Why split genes?
Perhaps during evolution, eukaryotic genes have been assembled from smaller, primitive genes - today's exons. Some proteins, like the antibodies mentioned in the previous section, are organized in a set of separate sections or domains, each with a special function to perform in the complete molecule. Each domain is encoded by a separate exon. Having the different functional parts of the antibody molecule encoded by separate exons makes it possible to use these units in different combinations. Thus a set of exons in the genome may be the genetic equivalent of the various modular pieces in a box of "Lego" for children to assemble in whatever forms they wish.
But the boundaries of other exons do not seem to correspond to domain boundaries of the protein. Furthermore, rRNA and tRNA genes are also split, and these do not encode proteins. So perhaps some introns are simply "junk" DNA that was inserted into the gene at some point in evolution without causing any harm.
Summary
Gene expression occurs in two steps:
· transcription of the information encoded in DNA into a molecule of RNA (described here), and
· translation of the information encoded in the nucleotides of mRNA into a defined sequence of amino acids in a protein (discussed in Gene Translation: RNA -> Protein).
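The reading-frame point made under Split Genes (one stray nucleotide shifts every codon after it) can be illustrated with a short sketch. The sequences and the helper name `split_codons` are hypothetical; the example simply leaves one extra intron nucleotide behind after the first codon and shows how the downstream codons change.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_codons(mrna: str) -> list[str]:
    """Split an mRNA into consecutive codons, dropping any incomplete tail."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


correct = "AUGGCUGAAUGGUAA"                  # hypothetical correctly spliced mRNA
imprecise = correct[:3] + "G" + correct[3:]  # one intron nucleotide left behind

print(split_codons(correct))    # ['AUG', 'GCU', 'GAA', 'UGG', 'UAA']
print(split_codons(imprecise))  # ['AUG', 'GGC', 'UGA', 'AUG', 'GUA']

# Position of the first stop codon in each reading frame.
first_stop = [
    next(i for i, codon in enumerate(split_codons(seq)) if codon in STOP_CODONS)
    for seq in (correct, imprecise)
]
print(first_stop)  # [4, 2]
```

In this example the shifted frame meets a stop codon (UGA) two codons early, which matches the remark that a shifted reading frame often ends prematurely.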