
[BejeranoFall13/14] 1 MW 12:50-2:05pm in Beckman B302 Profs: Serafim Batzoglou & Gill Bejerano TAs: Harendra Guturu & Panos.


1 http://cs273a.stanford.edu [BejeranoFall13/14] 1 MW 12:50-2:05pm in Beckman B302 Profs: Serafim Batzoglou & Gill Bejerano TAs: Harendra Guturu & Panos Achlioptas CS273A Lecture 5: Transcription Regulation I

2 Announcements HW1 is out. Due by 11:00 AM Friday, October 18. –Check it out.

3 http://cs273a.stanford.edu [BejeranoFall13/14] 3 TTATATTGAATTTTCAAAAATTCTTACTTTTTTTTTGGATGGACGCAAAGAAGTTTAATAATCATATTACATGGCATTACCACCATATA CATATCCATATCTAATCTTACTTATATGTTGTGGAAATGTAAAGAGCCCCATTATCTTAGCCTAAAAAAACCTTCTCTTTGGAACTTTC AGTAATACGCTTAACTGCTCATTGCTATATTGAAGTACGGATTAGAAGCCGCCGAGCGGGCGACAGCCCTCCGACGGAAGACTCTCCTC CGTGCGTCCTCGTCTTCACCGGTCGCGTTCCTGAAACGCAGATGTGCCTCGCGCCGCACTGCTCCGAACAATAAAGATTCTACAATACT AGCTTTTATGGTTATGAAGAGGAAAAATTGGCAGTAACCTGGCCCCACAAACCTTCAAATTAACGAATCAAATTAACAACCATAGGATG ATAATGCGATTAGTTTTTTAGCCTTATTTCTGGGGTAATTAATCAGCGAAGCGATGATTTTTGATCTATTAACAGATATATAAATGGAA AAGCTGCATAACCACTTTAACTAATACTTTCAACATTTTCAGTTTGTATTACTTCTTATTCAAATGTCATAAAAGTATCAACAAAAAAT TGTTAATATACCTCTATACTTTAACGTCAAGGAGAAAAAACTATAATGACTAAATCTCATTCAGAAGAAGTGATTGTACCTGAGTTCAA TTCTAGCGCAAAGGAATTACCAAGACCATTGGCCGAAAAGTGCCCGAGCATAATTAAGAAATTTATAAGCGCTTATGATGCTAAACCGG ATTTTGTTGCTAGATCGCCTGGTAGAGTCAATCTAATTGGTGAACATATTGATTATTGTGACTTCTCGGTTTTACCTTTAGCTATTGAT TTTGATATGCTTTGCGCCGTCAAAGTTTTGAACGATGAGATTTCAAGTCTTAAAGCTATATCAGAGGGCTAAGCATGTGTATTCTGAAT CTTTAAGAGTCTTGAAGGCTGTGAAATTAATGACTACAGCGAGCTTTACTGCCGACGAAGACTTTTTCAAGCAATTTGGTGCCTTGATG AACGAGTCTCAAGCTTCTTGCGATAAACTTTACGAATGTTCTTGTCCAGAGATTGACAAAATTTGTTCCATTGCTTTGTCAAATGGATC ATATGGTTCCCGTTTGACCGGAGCTGGCTGGGGTGGTTGTACTGTTCACTTGGTTCCAGGGGGCCCAAATGGCAACATAGAAAAGGTAA AAGAAGCCCTTGCCAATGAGTTCTACAAGGTCAAGTACCCTAAGATCACTGATGCTGAGCTAGAAAATGCTATCATCGTCTCTAAACCA GCATTGGGCAGCTGTCTATATGAATTAGTCAAGTATACTTCTTTTTTTTACTTTGTTCAGAACAACTTCTCATTTTTTTCTACTCATAA CTTTAGCATCACAAAATACGCAATAATAACGAGTAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGA TAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTT GGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTTGCGAAGTT CTTGGCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGT TTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATAC 
CTATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCT TGGCAAGTTGCCAACTGACGAGATGCAGTTTCCTACGCATAATAAGAATAGGAGGGAATATCAAGCCAGACAATCTATCATTACATTTA AGCGGCTCTTCAAAAAGATTGAACTCTCGCCAACTTATGGAATCTTCCAATGAGACCTTTGCGCCAAATAATGTGGATTTGGAAAAAGA GTATAAGTCATCTCAGAGTAATATAACTACCGAAGTTTATGAGGCATCGAGCTTTGAAGAAAAAGTAAGCTCAGAAAAACCTCAATACA GCTCATTCTGGAAGAAAATCTATTATGAATATGTGGTCGTTGACAAATCAATCTTGGGTGTTTCTATTCTGGATTCATTTATGTACAAC CAGGACTTGAAGCCCGTCGAAAAAGAAAGGCGGGTTTGGTCCTGGTACAATTATTGTTACTTCTGGCTTGCTGAATGTTTCAATATCAA CACTTGGCAAATTGCAGCTACAGGTCTACAACTGGGTCTAAATTGGTGGCAGTGTTGGATAACAATTTGGATTGGGTACGGTTTCGTTG GTGCTTTTGTTGTTTTGGCCTCTAGAGTTGGATCTGCTTATCATTTGTCATTCCCTATATCATCTAGAGCATCATTCGGTATTTTCTTC TCTTTATGGCCCGTTATTAACAGAGTCGTCATGGCCATCGTTTGGTATAGTGTCCAAGCTTATATTGCGGCAACTCCCGTATCATTAAT GCTGAAATCTATCTTTGGAAAAGATTTACAATGATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCT TGGCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTT TCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCT ATTCTTGACATGATATGACTACCATTTTGTTATTGTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTT TCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGA GATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTA TCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTT CATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTT CAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAA TAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGT ATGATAATGTTTTCAATGTAAGAGATTTCGATTATCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATAAAG

4 Gene Products [Figure labels: reverse transcription; long non-coding RNA; microRNA; rRNA, snRNA, snoRNA]

5 Gene Regulatory Switches
Gene = genomic substring that encodes HOW to make a protein (or ncRNA).
Genomic switch = genomic substring that encodes WHEN, WHERE & HOW MUCH of a protein to make.
[Figure labels: [1,0,0,1] [1,1,0,0] [0,1,1,1]; Gene; B N B N H H]

6 If you only measure gene expression
It’s like only seeing the values change in RAM as a program is running.

7 Cis (=close) regulatory elements
CIS REGULATION
Type            # in genome    % of genome
genes           25,000         2%
ncRNA           15,000         1%
cis elements    1,000,000      >10%
Cis elements: promoters, enhancers, silencers, insulators.
Encode causality. Disease susceptibility. Driver sequences. Alter cell state. Key for evolution.

8 Transcription Activation

9 RNA Polymerase
Transcription = Copying a segment of DNA into (non/coding) RNA.
Gene transcription starts at the (aptly named) TSS, or gene transcription start site.
Transcription is done by RNA polymerase, a complex of 10-12 subunit proteins.
There are three types of RNA polymerases in human:
–RNA pol I synthesizes ribosomal RNAs
–RNA pol II synthesizes pre-mRNAs and most microRNAs
–RNA pol III synthesizes tRNAs, 5S rRNA and other small RNAs
[Figure labels: RNA Polymerase; TSS]

10 RNA Polymerase is General Purpose
RNA Polymerase is the general purpose transcriptional machinery. It generally does not recognize gene transcription start sites by itself, and requires interactions with multiple additional proteins.
[Figure labels: general purpose; context specific]

11 Terminology
Transcription Factors (TF): Proteins that return to the nucleus, bind specific DNA sequences there, and affect transcription.
–There are 1,200-2,000 TFs in the human genome (out of 20-25,000 genes).
–Only a subset of TFs may be expressed in a given cell at a given point in time.
Transcription Factor Binding Sites: 4-20bp stretches of DNA where TFs bind.
–There are millions of TF binding sites in the human genome.
–In a cell at a given point in time, a site can be either occupied or unoccupied.

12 Terminology
Promoter: The region of DNA 100-1,000bp immediately “upstream” of the TSS, which encodes binding sites for the general purpose RNA polymerase associated TFs, and at times some context specific sites.
–There are as many promoters as there are TSSs in the human genome. Many genes have more than one TSS.
Enhancer: A region of 100-1,000bp, up to 1Mb or more upstream or downstream from the TSS, that includes binding sites for multiple TFs. When bound by (the right) TFs an enhancer turns on/accelerates transcription.
–Note how an enhancer (E) very far away in sequence can in fact get very close to the promoter (P) in space.
[Figure labels: promoter; TSS; gene]

13 TFBS Position Weight Matrix (PWM)
Note the strong independence assumption between positions. It holds for most transcription factor binding profiles in the human genome.
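The independence assumption means a PWM scores a candidate site by simply adding one term per position. The sketch below illustrates that under stated assumptions: the aligned sites in `SITES` are toy data invented for this example, and `build_pwm`, `score`, `best_hit`, the pseudocount and the uniform background are hypothetical choices, not the course's implementation.

```python
import math

BASES = "ACGT"

# Toy aligned binding sites (illustrative only, not measured data).
SITES = ["TAAACA", "AAAACA", "ATAACA", "TTAACA", "TAAACA", "AAAACA"]


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Return one {base: log2-odds} dict per motif position."""
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for i in range(len(sites[0])):
        column = [s[i] for s in sites]
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def score(pwm, window):
    # Independence assumption: the score is a plain sum over positions,
    # with no term that couples one position to another.
    return sum(col[base] for col, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in sequence."""
    k = len(pwm)
    return max((score(pwm, sequence[i:i + k]), i)
               for i in range(len(sequence) - k + 1))


pwm = build_pwm(SITES)
print(best_hit(pwm, "GGGGTAAACAGGGG"))
```

A model that allowed dependencies between positions would need terms over pairs of positions; the single sum in `score` is exactly what the independence assumption buys.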

14 Promoters

15 Enhancers

16 Terminology
Gene regulatory domain: the full repertoire of enhancers that affect the expression of a (protein coding or non-coding) gene, in some cells under some conditions.
–Gene regulatory domains do not have to be contiguous in genome sequence.
–Neither are they disjoint: One or more enhancers may well affect the expression of multiple genes (at the same or different times).
[Figure labels: TSS; promoter; enhancers for different contexts]

17 Imagine a giant state machine
Transcription factors bind DNA and turn different promoters and enhancers on or off, which in turn switch different genes on or off. Some of those genes may themselves be transcription factors, which again changes the presence of TFs in the cell, the state of active promoters/enhancers, etc.
[Figure labels: Gene; Proteins; DNA; transcription factor binding site]
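One common way to make the state-machine picture concrete is a Boolean network, where each gene is on or off and its next state is a function of the TFs currently present. The following is a minimal sketch under that assumption; the gene names and the wiring in `RULES` are invented for illustration and do not describe any real circuit.

```python
# Hypothetical four-gene circuit; the wiring is invented for illustration.
RULES = {
    "TF_A": lambda s: s["TF_A"],                    # external input, holds its value
    "TF_B": lambda s: s["TF_A"] and not s["TF_C"],  # activated by A, repressed by C
    "TF_C": lambda s: s["TF_B"],                    # activated by B
    "target": lambda s: s["TF_B"] or s["TF_C"],     # downstream gene
}


def step(state):
    """Compute the next state of every gene from the current state."""
    return {gene: bool(rule(state)) for gene, rule in RULES.items()}


def run(state, max_steps=20):
    """Update synchronously until a state repeats.

    Returns the list of distinct states visited and the first repeated state.
    """
    seen = []
    while state not in seen and len(seen) < max_steps:
        seen.append(state)
        state = step(state)
    return seen, state


start = {"TF_A": True, "TF_B": False, "TF_C": False, "target": False}
trajectory, repeated = run(start)
for s in trajectory:
    print(s)
```

Because TF_B activates its own repressor TF_C, switching TF_A on sends this toy network into a repeating cycle, while leaving TF_A off keeps it in a single fixed state. The same genome (the same `RULES`) serves both behaviors.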

18 One nice hypothetical example [Figure labels: requires active enhancers to function; functions independently of enhancers]

19 The State Space
Discrete, but very large. All states served by same genome(!)
[Figure labels: 10^12 cells; 1 cell]

20 Transcription Activation: Some measurements and observations

21 Transcription Factor Binding Sites (TFBS)
An antibody is a large Y-shaped protein used by the immune system to identify and neutralize foreign objects such as bacteria. Antibodies can instead be raised to recognize specific transcription factors.
Chromatin Immunoprecipitation followed by deep sequencing (ChIP-seq): Take DNA (region or whole genome) bound by TFs, crosslink the TFs to the DNA, shear the DNA, select the DNA fragments bound by the TF of interest using an antibody, get rid of the TF and antibody, and sequence the pool of DNA.
Result: the genomic regions bound by the TF.

22 ChIP-seq → Position Weight Matrix
Computational challenge: The sequenced DNA fragments are 200-500bp. In each is one or more instances of the 6-20bp motif. Find it…
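As a rough illustration of the motif-finding challenge, the sketch below asks which k-mer is present in the largest number of fragments. This is a deliberately crude stand-in: it assumes exact matches and a known motif length, whereas practical motif finders fit probabilistic models that tolerate mismatches. The function name `most_shared_kmer` and the short toy fragments are hypothetical.

```python
from collections import Counter


def most_shared_kmer(fragments, k):
    """Return (kmer, n_fragments) for the k-mer found in the most fragments."""
    support = Counter()
    for frag in fragments:
        # Use a set so each k-mer counts once per fragment; otherwise a
        # single repeat-rich fragment could dominate the tally.
        support.update({frag[i:i + k] for i in range(len(frag) - k + 1)})
    return support.most_common(1)[0]


# Toy fragments (far shorter than real 200-500bp fragments), each
# containing the same planted 8-mer at a different offset.
fragments = ["GCTAACAGGAATC", "TTAACAGGAAGCG", "CAACAGGAACGTA"]
print(most_shared_kmer(fragments, 8))
```

Once candidate instances are collected and aligned, their per-position base counts can be turned into a position weight matrix.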

23 Transcription Factors have Large “fan outs”
We could have had one TF regulate two TFs, each of which regulates two other TFs, etc., with each of those contributing to the regulation of a modest number of target genes (that do the real work). Instead, TFs reproducibly bind to thousands of genomic locations almost anywhere we’ve looked. Gene regulation forms a dense network.

24 Transfections
[Figure labels: enhancer; reporter gene; minimal promoter; in cellular context of choice]
As far as we’ve seen, enhancers work “the same” irrespective of distance (or orientation) to TSS, or identity of target gene.
Which enhancers work in what contexts?
What if you mutate enhancer bases (disrupt or introduce binding sites) and run the experiment again?
What if you co-transfect a TF you think binds to this enhancer?
What if you instead add siRNA for that TF?

25 Transcription factors bind synergistically, often with preferred spacing
Transcription factor complexes prefer specific spacings!
[Figure, adapted from Kamach et al., Genes Dev, 2001: fold activation for Sox2, Pax6 and Sox2 + Pax6 in two panels, Sox:1 bp:Pax and Sox:3 bp:Pax {+2}]

26 Strict spacing between binding sites is important for structural interactions

27 Complexes may leave genomic footprints
If a complex prefers TF : spacer : TF, this pattern may be abundant in the genome.
TAAACAGGAAGT
AAAACAGGAATA
ATAACAGGATGC
TTAACAGGAAAG
TAAACAGGATAG
AAAACAGGAAAA
Can we read complexes from individual predictions?
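A preferred spacing shows up as a peak when you tabulate the gap between occurrences of the two motifs. The sketch below does this for the six example sequences on the slide. The half-site strings `"AACA"` and `"GGA"` are simplifying assumptions chosen to match those sequences exactly; a real analysis would use PWM hits rather than exact strings. `find_all` and `spacer_histogram` are hypothetical helper names.

```python
from collections import Counter


def find_all(seq, motif):
    """Start positions of every (possibly overlapping) exact match."""
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i)
        i = seq.find(motif, i + 1)
    return hits


def spacer_histogram(sequences, motif_a, motif_b, max_spacer=10):
    """Count gaps (in bp) between the end of motif_a and the start of motif_b."""
    hist = Counter()
    for seq in sequences:
        for a in find_all(seq, motif_a):
            a_end = a + len(motif_a)
            for b in find_all(seq, motif_b):
                spacer = b - a_end
                if 0 <= spacer <= max_spacer:
                    hist[spacer] += 1
    return hist


# The six sequences shown on the slide.
SEQS = ["TAAACAGGAAGT", "AAAACAGGAATA", "ATAACAGGATGC",
        "TTAACAGGAAAG", "TAAACAGGATAG", "AAAACAGGAAAA"]
print(spacer_histogram(SEQS, "AACA", "GGA"))
```

In these six sequences the two assumed half-sites always sit directly next to each other, so all counts fall on a spacer of 0. On genome-scale input, the comparison of interest is how much one spacer value stands out over the others.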

28 Cooperative binding of complexes can be detected as the co-occurrence of individuals [Figure note: each dot = different spacer]

29 Co-occurrences can be filtered for only structurally feasible patterns
Fox { spacer } Ets
Remove physically incompatible configurations.
[Figure labels: compatible; incompatible]

30 Complex motifs were grouped to reduce redundancy
Started with: 300 transcription factor motifs
Searched: 6,548,947 motif spacing combinations (TF1 {spacer} TF2), e.g. Fox { spacer } Ets
Found: 6,180 significant motif spacing combinations (statistically significant, p < 1×10^-8, & valid motifs)
Grouping: 422 unique complex motifs

31 Transgenics
[Figure labels: enhancer; reporter gene; minimal promoter]
Observe enhancer behavior in vivo. Qualitative (not quantitative) assay. Can section and stain to obtain more specific cell-type information.

32 BAC transgenics: necessity vs sufficiency
You can take 100-200kb segments out of the genome, insert a reporter gene in place of gene X, and measure regulatory domain expression. You can then continue to delete or mutate individual enhancers.

33 Gene Regulation: Enhancers are modular and additive
Temporal gene expression pattern “equals” the sum of the promoter and enhancer expression patterns.
[Figure labels: Sall1; limb; neural tube; brain]
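Read literally, "modular and additive" says a gene's expression pattern is the union of the patterns its promoter and enhancers drive on their own. The toy sketch below encodes that reading and mimics a deletion experiment. The enhancer names and their tissue assignments are hypothetical (only the tissue labels come from the slide), and real enhancers need not combine this cleanly.

```python
# Hypothetical enhancers, each driving expression in one tissue.
ENHANCERS = {
    "E_limb": {"limb"},
    "E_neural_tube": {"neural tube"},
    "E_brain": {"brain"},
}


def expression(enhancers, promoter=frozenset()):
    """Additive model: union of the promoter pattern and every enhancer pattern."""
    pattern = set(promoter)
    for tissues in enhancers.values():
        pattern |= tissues
    return pattern


def delete(enhancers, name):
    """Return a copy of the regulatory domain without one enhancer."""
    return {k: v for k, v in enhancers.items() if k != name}


full = expression(ENHANCERS)
without_limb = expression(delete(ENHANCERS, "E_limb"))
# Tissues lost on deletion are the ones for which that enhancer is necessary.
print(sorted(full - without_limb))
```

Testing one enhancer alone in front of a minimal promoter asks about sufficiency; deleting it from the full domain asks about necessity. Under the additive model the two answers agree whenever no other enhancer covers the same tissue.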

34 Genome Engineering
Technologies that allow us to edit the nuclear genome itself are constantly improving. Edit the genome of an embryonic stem cell, then breed homozygous modified animals.

35 Chromosome conformation capture (3C)
People are also developing methods to detect when two genomic regions far apart in sequence are in fact interacting in space. Ultimately this will allow us to determine experimentally the regulatory domain of each gene (likely condition dependent).

36 4C example result (in a single context) [Figure labels: TSS probe; Irreproducible peaks]

37 Gene Regulation is HOT
Despite its complexity, gene regulation is currently one of the hottest topics in the study of the human genome. Large projects are pouring tons of money into generating huge descriptive datasets. The challenge now is to glean logic from these piles.
To be continued…

