1 Use a circular template to get redundant reads and so more accuracy. Pacific Biosciences.


Presentation on theme: "1 Use a circular template to get redundant reads and so more accuracy. Pacific Biosciences."— Presentation transcript:

1 Use a circular template to get redundant reads and so more accuracy. Pacific Biosciences
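Why redundant reads help can be sketched with a per-position majority vote. This is a minimal illustration only: the pass sequences are hypothetical, the passes are assumed to be already aligned and of equal length, and insertions/deletions are ignored. It is not Pacific Biosciences' consensus algorithm.

```python
from collections import Counter

def consensus(passes):
    """Majority vote per position across equal-length passes of one template."""
    if len({len(p) for p in passes}) != 1:
        raise ValueError("all passes must have the same length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*passes))

# Hypothetical data: three passes around the same circular template,
# two of them carrying a single substitution error at different positions.
passes = ["ACGTTGCA", "ACGATGCA", "ACGTTGGA"]
print(consensus(passes))
```

Because the errors fall at different positions in different passes, each one is outvoted by the other two passes.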

2 DNA methylation detection by bisulfite conversion

3 Detection of methylated adenine in Pacific Biosciences (SMRT) sequencing

4 IPD ratio = average interpulse duration ratio (methylated/non-methylated), plotted against template position

5 Pacific Biosciences
50,000 ZMWs (Aug. 2011), and density may climb
Long reads (e.g., full molecules to determine full-length splicing isoforms)
Direct RNA sequencing possible
DNA methylation detectable

6 Agilent SureSelect RNA Target Enrichment
Capture a subgenomic region of interest for economy and speed of sequencing, e.g.:
–the entire exome (all exons, without introns or intergenic regions)
–hundreds of cancer genes
–a particular genomic locus
Alternative: hybridize to a custom microarray. (Agilent)

7 Nimblegen (Roche) subgenomic DNA capture options: beads or microarrays

8 Some results using DNA capture for subgenomic sequencing:
Rehman et al., Targeted Capture and Next-Generation Sequencing Identifies C9orf75, Encoding Taperin, as the Mutated Gene in Nonsyndromic Deafness DFNB79. American Journal of Human Genetics 86, 378–388, 2010

9 Detection of methylated C (~all in CpG dinucleotides)
DS DNA: unmethylated ----CpG--> and methylated ----CmpG--> / <---GpCm---
Na bisulfite + heat deaminates cytosine to uracil: ----CpG--> becomes ----UpG-->; ----CmpG--> is unchanged
After PCR: ----UpG--> is amplified as ----TpG--> / <--ApC---; ----CmpG--> is amplified as ----CpG--> / <--GpC---
All NON-methylated Cs are changed to T. Sequence and compare to deduce the methylated Cs.
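The "sequence and compare" step can be sketched as follows. This is a minimal illustration under stated assumptions: the reference sequence and the methylated position are hypothetical, conversion is assumed to be 100% efficient, and only one strand is considered.

```python
def bisulfite_convert(seq, methylated):
    """Convert every unmethylated C to T (C -> U by bisulfite, U -> T after PCR)."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )

def call_methylated(reference, converted):
    """Positions where a reference C is still read as C after conversion."""
    return {
        i for i, (r, c) in enumerate(zip(reference, converted))
        if r == "C" and c == "C"
    }

reference = "ACGTCGACCA"   # hypothetical top-strand sequence
methylated = {4}           # assumption: only the C at index 4 (in a CpG) is methylated
converted = bisulfite_convert(reference, methylated)
print(converted)
print(call_methylated(reference, converted))
```

Any C that survives conversion is inferred to have been methylated; every other reference C reads as T.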

10 DEEP SEQUENCING (Next generation sequencing, High throughput sequencing, Massively parallel sequencing) applications:
–Human genome re-sequencing (mutations, SNPs, haplotypes, disease associations, personalized medicine)
–Tumor genome sequencing
–Microbial flora sequencing (microbiome, viruses)
–Metagenomic sequencing (without cell culturing)
–RNA sequencing (RNAseq; gene expression levels, miRNAs, lncRNAs, splicing isoforms)
–Chromatin structure (ChIP-seq; histone modifications, nucleosome positioning)
–Epigenetic modifications (DNA CpG methylation and hydroxymethylation)
–Transcription kinetics (GROseq; nascent RNA, BrdU pulse labeled RNA)
–High throughput genetics (QUEPASA; cis-acting regulatory motif discovery)
–Drug discovery (bar-coded organic molecule libraries) [Mannocci PNAS paper]

11 Ke et al. and Chasin, Quantitative evaluation of all hexamers as exonic splicing elements. Genome Res. 2011, 21: 1360-1374.
Order an equal mixture of all 4 bases at these 6 positions
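An equal mixture of 4 bases at 6 positions gives 4^6 = 4096 possible hexamers, which is why a complete ranking of hexamers runs from 1 to 4096. A short sketch that enumerates them (the variable name is illustrative only):

```python
from itertools import product

# Every sequence of length 6 over the alphabet A, C, G, T.
hexamers = ["".join(p) for p in product("ACGT", repeat=6)]
print(len(hexamers))   # 4**6
```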

12 Quantifying extensive phenotypic arrays from sequence arrays (= QUEPASA)

13 ESRseq scores of 6-mers

Rank   6-mer    ESRseq score (~ -1 to +1)

Best exonic splicing enhancers:
1      AGAAGA    1.0339
2      GAAGAT    0.9918
3      GACGTC    0.9836
4      GAAGAC    0.9642
5      TCGTCG    0.9517
6      TGAAGA    0.9434
7      CAAGAA    0.9219
8      CGTCGA    0.8853
:
Worst exonic splicing enhancers, = best exonic splicing silencers:
4086   TAGATA   -0.8609
4087   AGGTAG   -0.8713
4088   CGTCGC    0.8850
4089   CTTAAA   -0.8786
4090   CCTTTA   -0.8812
4091   GCAAGA    0.8911
4092   TAGTTA   -0.8933
4093   TCGCCG    0.9113
4094   CCAGCA   -0.8942
4093   CTAGTA   -0.9251
4094   TAGTAG   -0.9383
4095   TAGGTA   -0.9965
4096   CTTTTA   -1.0610

14 Composite exon (from ~100,000)
Constitutive exons
Alternative exons
Pseudo exons

15 What the data looks like:

Experiment: 1 1 1 2 2 1+2 2 2 1 2

Sequence of 36                         Quality code
CGCACTGTGCTGGAGCTCCCGGGGTTAACTCTAGAA   abU^Vaa`a\aaa]aWaTNZ`aa`Q][TE[UaP_U]
TACACTGTGCTGGAGCTCCCAACGGCAACTCTAGAA   a`P^Wa`[`Wa^`X_X_XWVa^NSP]_]S^X_T\X^
CGCACTGTGCTGGAGCTCCCATGGAGAACTCTAGAA   aTa`^b``baaaa^aab^YaTQLOHIa`^a``TX]]
TACACTGTGCTGGAGCTCCCCTCCCAAACTCTAGAA   I_`aaaa`aaaaaaa_a_^[KZIGIGZ`U`\^P^^`
CGCACTGTGCTGGAGCTCCCAATAGTAACTTTAGAA   aY_\abb[T\abaaa`a`bZ[HXXIZa_`_LGMS[`
TATACTGTGCTGGAGCTCCCGACGTAAACTCTAGAA   aba]^aa_a]`aa]_]`XWSMFGGIPX[P]X`V_Y^
TACACTGTGCTGGAGCTCCCTGGTAAAACTCTAGAA   a_^a^aa`aYaaa_aY`Y_^[I]VY\`]V]R\W]VV
TACACTGTGCTGGAGCTCCCAATAAAAACTCTAGAA   XZababa`aZaaaaaYaYXX`baa``\\TaUa\aW`

Read layout: 2 nt barcode (TA or CG), constant region, variable region, constant region. The constant regions are peculiar to our expt.
Error: mismatches are visible in the constant regions of some reads (e.g., TATACTG... in place of TACACTG..., and ...AACTTTAGAA in place of ...AACTCTAGAA).
Barcoding allows multiplexing of several or many experiments at once (in one channel of a sequencer) -> economy. Here, two biological replicates.
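Sorting such reads by barcode and pulling out the variable region can be sketched as below. This is a minimal illustration: the constant-region strings and the 2 + 18 + 6 + 10 layout are inferred from the reads shown above, the function name is hypothetical, and discarding any read with a constant-region mismatch is an assumed filtering policy, not necessarily the one used in the experiment.

```python
from collections import defaultdict

LEFT = "CACTGTGCTGGAGCTCCC"   # constant region after the 2-nt barcode (inferred)
RIGHT = "AACTCTAGAA"          # constant region after the variable hexamer (inferred)

def parse_read(read):
    """Return (barcode, hexamer), or None if the read does not match the layout."""
    barcode, left, hexamer, right = read[:2], read[2:20], read[20:26], read[26:]
    if barcode not in ("TA", "CG") or left != LEFT or right != RIGHT:
        return None
    return barcode, hexamer

reads = [
    "CGCACTGTGCTGGAGCTCCCGGGGTTAACTCTAGAA",
    "TACACTGTGCTGGAGCTCCCAACGGCAACTCTAGAA",
    "CGCACTGTGCTGGAGCTCCCAATAGTAACTTTAGAA",  # mismatch in the right constant region
    "TATACTGTGCTGGAGCTCCCGACGTAAACTCTAGAA",  # mismatch in the left constant region
]

by_barcode = defaultdict(list)
for read in reads:
    parsed = parse_read(read)
    if parsed is not None:
        by_barcode[parsed[0]].append(parsed[1])
print(dict(by_barcode))
```

Counting hexamers per barcode then gives one tally per biological replicate from a single sequencing channel.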

16 Next generation methods for high throughput genetic analysis:
Use custom oligo libraries to construct minigene libraries (40,000, up to 60 nt long), e.g., for saturation mutagenesis to identify all exonic bases contributing to splicing (or transcription or polyadenylation, …)
Use bar codes to detect sequences missing from the selected molecules
E.g., Patwardhan RP, Lee C, Litvin O, Young DL, Pe'er D, Shendure J. High-resolution analysis of DNA regulatory elements by synthetic saturation mutagenesis. Nat Biotechnol. 2009 27:1173-5. Long (200-mer) synthetic oligo library

17 OUTLINE OF LECTURE TOPICS COMING UP
Expression and manipulation of transgenes in the laboratory
In vitro mutagenesis to isolate variants of your protein/gene with desirable properties
–Single base mutations
–Deletions
–Overlap extension PCR
–Cassette mutagenesis
To study the protein: Express your transgene
–Usually in E. coli, for speed, economy
–Expression in eukaryotic hosts
–Drive it with a promoter/enhancer
–Purify it via a protein tag
–Cleave it to get the pure protein
Explore protein-protein interaction
–Co-immunoprecipitation (co-IP) from extracts
–2-hybrid formation
–Surface plasmon resonance
–FRET (fluorescence resonance energy transfer)
–Complementation readout

18 Site-directed mutagenesis by overlap extension PCR
PCR fragment, with subsequent cloning in a plasmid (or not; the PCR product itself can be used in many ways, e.g., transfection)
Cut with RE 1 and 2 (sites RS1 and RS2)
Ligate into similarly cut vector
Strachan and Read, Human Mol. Genet. 3, p. 148

19 Cassette mutagenesis = random mutagenesis but in a limited region: 1) by error-prone PCR
Original sequence coding for, e.g., a transcription enhancer region:
-----------------------------------------------------------
PCR fragment with high Taq polymerase and Mn2+ instead of Mg2+ -> errors:
------*--------*--*-**---------------*-----------*--*------
-*------------------------*-*-*------------*------------*--
Cut in primer sites and clone upstream of a reporter protein sequence.
Pick colonies
Analyze phenotypes
Sequence

20 Cassette mutagenesis = random mutagenesis but in a limited region: 2) by “doped” synthesis
Target = e.g., an enhancer element
Original enhancer sequence:
-----------------------------------------------------------
Buy 2 doped oligos; anneal:
-*------------------------*-*-*------------*------------*--
------*--------*--*-**---------------*-----------*--*------
OK for up to ~80 nt. Clone upstream of a reporter.
Doping = e.g., 90% G, 3.3% A, 3.3% C, 3.3% T at each position
Pick colonies
Analyze phenotypes
Sequence
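The effect of doping can be sketched with a small simulation. This is an illustration only: the 80-nt target sequence, the function name, and the library size are hypothetical, and each position is assumed to be doped independently at the 90% / 3.3% / 3.3% / 3.3% ratio quoted above.

```python
import random

def doped_oligo(wild_type, dope=0.10, rng=random):
    """At each position keep the wild-type base with probability 1 - dope,
    otherwise substitute one of the three other bases with equal chance."""
    out = []
    for base in wild_type:
        if rng.random() < dope:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)

rng = random.Random(0)                    # fixed seed for repeatability
enhancer = "GATTACAGGCTTAACGGATC" * 4     # hypothetical 80-nt target
library = [doped_oligo(enhancer, rng=rng) for _ in range(1000)]
mean_mutations = sum(
    sum(a != b for a, b in zip(enhancer, oligo)) for oligo in library
) / len(library)
print(mean_mutations)   # expected to be near 0.10 * 80 = 8 per oligo
```

With 10% doping, an 80-nt oligo carries about 8 substitutions on average, so the doping level would be lowered if mostly single or double mutants are wanted.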

21 E. coli as a host
PROs: Easy, flexible, high tech, fast, cheap; but problems
CONs:
–Folding (can misfold)
–Sorting within the cell -> can form inclusion bodies
–Purification -- endotoxins
–Modifications -- not done (glycosylation, phosphorylation, etc.)
Modifications:
–Glycoproteins
–Acylation: acetylation, myristoylation
–Methylation (arg, lys)
–Phosphorylation (ser, thr, tyr)
–Sulfation (tyr)
–Prenylation (farnesyl, geranylgeranyl on cys)
–Vitamin C-Dependent Modifications (hydroxylation of proline and lysine)
–Vitamin K-Dependent Modifications (gamma carboxylation of glu)
–Selenoproteins (seleno-cys tRNA at UGA stop)

22 E. coli expression vectors
Promoter examples:
1) Lac promoter (with operator)-YFG, + lac repressor (I gene): Induce expression by inactivation of the lac repressor with IPTG or lactose.
2) As above but with a hybrid Tac promoter (tryptophan operon + lac operon): Stronger. Use the i q mutant of the lac I gene, which produces high levels of the lac repressor. Expression regulatable over several orders of magnitude.
3) BAD promoter-YFG. Arabinose utilization operon. Inducible by arabinose via the endogenous araC gene for a transcriptional activator. Background levels driven down by including glucose.
4) Phage T7 promoter-YFG. Vector carries gene for T7 polymerase, under control of the lac promoter. Add IPTG or lactose to induce T7 polymerase and thence YFG.
IPTG = isopropylthiogalactoside (non-metabolizable inducer)
YFG = your favorite gene

23 Myristoylation – myristic acid to N-terminal glycine alpha amino group. Anchors protein to membrane.

24 Lysine epsilon amino group modifications; mono methyl, dimethyl also. Well-studied in histones, microtubules

25 Via seleno-cys tRNA at a UGA nonsense codon. Sequence context dictates efficiency.

26 Gamma carboxylation of glutamic acid. Binds calcium, used in coagulation proteins

27 Some alternative hosts
Yeasts (Saccharomyces, Pichia)
Insect cells with baculovirus vectors
Mammalian cells in culture (later)
Whole organisms (mice, goats, corn) (not discussed)
In vitro (cell-free), for analysis only, not preparatively (good for radiolabeled proteins, discussed later)

28 Some popular yeast promoters
ARS = autonomously replicating sequence element
Selectable marker
ori
http://biochemie.web.med.uni-muenchen.de/Yeast_Biol/04 Yeast Molecular Techniques.pdf

29 Yeast Expression Vector (example)
Saccharomyces cerevisiae (baker’s yeast)
Vector elements: GAPD prom - Your favorite gene (Yfg) - GAPD term’n; LEU2; 2μ; ori E; Amp r
2μ = 2 micron plasmid; 2μ sequence features: yeast ori
For growth in E. coli: ori E = bacterial ori; Amp r = bacterial selection
LEU2, e.g. = Leu biosynthesis, for yeast selection
Complementation of an auxotrophy can be used instead of drug resistance
Auxotrophy = state of a mutant in a biosynthetic pathway resulting in a requirement for a nutrient
GAPD = the enzyme glyceraldehyde-3-phosphate dehydrogenase

30 Got this far

31 Yeast - genomic integration via homologous recombination
Genomic DNA: HIS4 mutation (defective HIS4 gene)
Vector DNA: functional HIS4 gene, plus Yfg flanked by p and t
Genomic DNA after integration: functional HIS4 gene and Yfg

32 Yeast (integration in Pichia pastoris): double recombination
AOX1 gene = alcohol oxidase gene (~30% of total protein)
Diagram labels: genomic DNA with the AOX1 gene; vector DNA carrying AOX1p, Yfg, AOX1t, HIS4 and 3’AOX1; genomic DNA after recombination carrying AOX1p, Yfg, AOX1t, HIS4 and 3’AOX1
P. pastoris:
–tight control
–methanol induced (AOX1)
–large scale production (gram quantities)

33 Expression in mammalian cells
Lab examples of immortal cell lines:
HEK293  Human embryonic kidney (high transfection efficiency)
HeLa    Human cervical carcinoma (historical, low RNase)
CHO     Chinese hamster ovary (hardy, diploid DNA content, mutants)
Cos     Monkey cells with SV40 replication proteins (-> high transgene copies)
3T3     Mouse or human exhibiting ~regulated (normal-like) growth
+ various others, many differentiated to different degrees, e.g.:
BHK     Baby hamster kidney
HepG2   Human hepatoma
GH3     Rat pituitary cells
PC12    Rat neuronal-like tumor cells
MCF7    Human breast cancer
HT1080  Human fibroblastic cells with near diploid karyotype
iPS     Induced pluripotent stem cells
and: Primary cells cultured with a limited lifetime, e.g., MEF = mouse embryonic fibroblasts, HDF = human diploid fibroblasts
Common in industry:
NS1     mAbs                              Mouse plasma cell tumor cells
Vero    vaccines                          African green monkey cells
CHO     mAbs, other therapeutic proteins  Chinese hamster ovary cells
PER.C6  mAbs, other therapeutic proteins  Human retinal cells

34 Mammalian cell expression Generalized gene structure for mammalian expression: cDNA gene Mam.prom. polyA site intron 5’UTR 3’UTR Intron is optional but a good idea

35 Popular mammalian cell promoters
SV40 Large T Ag (Simian Virus 40)
RSV LTR (Rous sarcoma virus)
MMTV (steroid inducible) (Mouse mammary tumor virus)
HSV TK (low expression) (Herpes simplex virus)
Metallothionein (metal inducible, Cd++)
CMV early (Cytomegalovirus)
Actin
EIF2alpha
Engineered inducible / repressible: tet, ecdysone, glucocorticoid (tet = tetracycline)

36 Engineered regulated expression: Tetracycline-responsive promoters
Tet-OFF (add tet -> shut off) (Bujard et al.)
tTA gene must be in cell (permanent transfection, integrated): CMV prom. - tTA cDNA - polyA site
tTA = tet activator fusion protein: tetR domain + VP16 transcription activation domain (tetR = tet repressor, original role)
No tet: active. Binds tet operator (multiple copies) (if tet not also bound)
Tetracycline (tet), or, better, doxycycline (dox), bound: not active. Allosteric change in conformation

37 Tet-OFF, cont.
Construct: multiple tet operator elements - MIN. CMV prom. - your favorite gene - polyA site
No doxycycline: tTA (tetR domain + VP16 tc’n act’n domain) active, bound at the operators with RNA pol; plenty of transcription
Doxycycline present: tTA not active; little transcription (2%?, bkgd)

38 Tetracycline-responsive promoters
Tet-ON (add tet -> turn on gene)
Different fusion protein (tetR domain + VP16 tc’n act’n domain): does NOT bind tet operator (if tet not bound)
Without tet: not active. With tetracycline (tet), or, better, doxycycline (dox): active
Must be in cell (permanent transfection, integrated): Full CMV prom. - tTA cDNA - polyA site; commercially available (293, CHO) or do-it-yourself

39 Tet-ON, cont.
Construct: multiple tet operator elements - MIN. CMV prom. - your favorite gene - polyA site
Doxycycline absent: fusion protein (tetR domain + VP16 tc’n act’n domain) not active; little transcription (bkgd.)
Add dox: fusion protein active, bound at the operators with RNA pol II; plenty of transcription (> 50X)
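The two systems behave as mirror-image switches, which can be summarized in a few lines. This is a schematic sketch only: the function name is hypothetical and the output is a qualitative on/off state, not a measured expression level.

```python
def transcription_on(system, dox_present):
    """Tet-OFF: the activator binds the tet operators only without dox.
    Tet-ON: the activator binds the tet operators only with dox."""
    if system == "Tet-OFF":
        return not dox_present
    if system == "Tet-ON":
        return dox_present
    raise ValueError(f"unknown system: {system}")

for system in ("Tet-OFF", "Tet-ON"):
    for dox in (False, True):
        state = "active" if transcription_on(system, dox) else "background"
        print(system, "dox" if dox else "no dox", "->", state)
```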

