Presentation on theme: "Class 13 Two sequencing methods that aim to sequence"— Presentation transcript:
1 Class 13: Two sequencing methods that aim to sequence single DNA molecules. Pacific Biosciences “zero-mode” waveguide; Bayley group nanopore method. What steps in the sequencing methods we have considered so far would the ability to sequence single molecules avoid? What technical problem would be eliminated?
2 Pacific Biosciences sequencing strategy: single-molecule, real-time. Immobilize DNA pol on the glass surface at the bottom of very small wells (~100 nm radius); aim for 1 pol molecule/well. Add DNA template + primer. Use dNTPs with a fluor attached to the terminal phosphate so that it is cleaved off during incorporation (how is this different from the previous fluor-dNTPs we discussed?). Collect sequence in real time (the enzyme can go ~1-100 b/s).
3 How long should pulses last? Why would you expect to see only 1 dye at a time?
4 What steps in previous methods would enzymatic removal of the fluor during synthesis eliminate? How fast were previous methods (in terms of bases sequenced/s), given the need for chemical steps and washes between base additions?
5 The challenge in detecting single fluor molecules is usually not sensitivity but reducing background. “Zero-mode” waveguide (ZMW): for well diameters << λ with metallic walls, propagating waves are blocked; evanescent waves of exciting and emitted light decay exponentially. Detection depth ~30 nm (volume ~$10^{-19}$ liters) for 100 nm holes in aluminum. At µM dye conc., <1 dye/detection vol on average, and any such molecule diffuses out of the detection vol in ~100 µs (verify: $t \sim x^2/2D$, $D = 10^{-12}$ m²/s), whereas a dye on a dNTP being incorporated by DNA pol is expected to be retained by the pol for ms.
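The two estimates on this slide can be checked with a few lines of arithmetic. The sketch below uses the slide's detection volume, depth, and diffusion coefficient; the function names are illustrative, and the occupancy is evaluated at both 1 µM and 1 mM for comparison.

```python
AVOGADRO = 6.022e23  # molecules per mole


def mean_molecules(conc_molar, volume_liters):
    """Average number of molecules in a volume at a given molar concentration."""
    return conc_molar * volume_liters * AVOGADRO


def diffusion_time(distance_m, diff_coeff_m2_s):
    """Order-of-magnitude estimate t ~ x^2 / (2 D) for diffusing a distance x."""
    return distance_m ** 2 / (2.0 * diff_coeff_m2_s)


detection_volume_l = 1e-19  # liters, value quoted on the slide
detection_depth_m = 30e-9   # 30 nm, value quoted on the slide
diff_coeff = 1e-12          # m^2/s, value quoted on the slide

occupancy_1uM = mean_molecules(1e-6, detection_volume_l)
occupancy_1mM = mean_molecules(1e-3, detection_volume_l)
t_escape = diffusion_time(detection_depth_m, diff_coeff)

print(f"dyes per detection volume at 1 uM: {occupancy_1uM:.3f}")
print(f"dyes per detection volume at 1 mM: {occupancy_1mM:.1f}")
print(f"time to diffuse 30 nm: {t_escape * 1e6:.0f} microseconds")
```

Only the micromolar case gives fewer than one dye per detection volume on average, and the escape time comes out in the sub-millisecond range, shorter than the millisecond retention expected during incorporation.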
6 Science 299:683, 2003. E-beam lithography makes an array of <100 nm diam. holes in a ~100 nm thick aluminum film on a silica slide. For well diameter << λ, $I(z) \sim e^{-kz}$; excitation of the fluor is also inhibited by wall proximity; effective illuminated height theoretically ~30 nm (~10x smaller than TIRF); vol ~ liters
7 Optical set-up to excite and read fluorescence from each well. A holographic wave plate divides the input laser beams into an array of beamlets, 1/well? -> more light/well than if the whole field were illuminated. A prism disperses the emitted light to collect the different dye-dNTP signals in different pixels.
8 http://www. sciencemag 93 rows (1 µm spacing) x 33 columns (4 µm spacing). Light from each well is dispersed laterally onto different detectors.
10 Why do you want high (~µM) dNTP conc.? Binding rate = $k_{on}$[dNTP], where $k_{on}$ is the rate (#/s) at which dNTPs bind each polymerase molecule per M conc. of dNTP; if $k_{on} = 10^7$/(M·s), what conc. of dNTP do you need for pol to synthesize 10 b/s? If you use TIRF, the min. vol. of the illuminated spot ~ $\pi \lambda^2 h_{att}$ ~ $\pi (500\,\mathrm{nm})^2 \times 100\,\mathrm{nm} = 10^{-19}\,\mathrm{m^3} = 10^{-16}$ liter. How many dNTPs are in this vol. at 1 µM? Need <1. ZMW gets you to <1 by reducing the illum. vol ~1000-fold!
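The questions on this slide reduce to two short calculations, sketched here. The values of $k_{on}$, the target rate, the 500 nm spot radius, the 100 nm attenuation height, and the 1000-fold volume reduction are taken from the slide; the function names are illustrative only.

```python
import math

AVOGADRO = 6.022e23  # molecules per mole


def required_concentration(target_rate_per_s, k_on_per_molar_s):
    """dNTP concentration (M) at which binding events occur at the target rate."""
    return target_rate_per_s / k_on_per_molar_s


def tirf_volume_liters(spot_radius_m, attenuation_height_m):
    """Illuminated volume of a TIRF spot, pi * r^2 * h, converted from m^3 to liters."""
    volume_m3 = math.pi * spot_radius_m ** 2 * attenuation_height_m
    return volume_m3 * 1000.0  # 1 m^3 = 1000 liters


conc = required_concentration(10.0, 1e7)     # for 10 bases/s
v_tirf = tirf_volume_liters(500e-9, 100e-9)  # slide's TIRF dimensions
n_tirf = conc * v_tirf * AVOGADRO            # dNTPs in the TIRF volume
n_zmw = n_tirf / 1000.0                      # ZMW volume is ~1000-fold smaller

print(f"required dNTP concentration: {conc * 1e6:.1f} uM")
print(f"TIRF volume: {v_tirf:.2e} liters")
print(f"dNTPs in TIRF volume: {n_tirf:.0f}")
print(f"dNTPs in ZMW volume: {n_zmw:.3f}")
```

With these inputs the TIRF spot holds tens of labeled dNTPs at the concentration needed for 10 b/s, while the ZMW volume holds well under one.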
11 The idea hinges on using dNTPs with dye labels on the phosphate (different colors for each nt) so that the dye is cleaved off during incorporation (no separate chemical step needed) -> real-time sequencing. [Figure labels: base-labeled nucleotide; phosphate-labeled nucleotide]
12 Actual dNTPs used have big dyes + 6 phosphates. They “invented” these dye-NTs and then had to engineer (mutate) DNA pol to use them efficiently.
13 [Figure labels: C, A, T, G] Emission spectra: it would be easiest to distinguish C from G, harder to distinguish 1 from 2, and 3 from 4 (affects “substitution” error rates).
14 Start with a “simple” ss-template: 150 bases with alternating regions containing multiple G’s or C’s; synthesize using dGTP-yellow, dCTP-blue, other bases unlabeled.
15 Segment from previous trace; same data in graphical form. They can at least pick out the G’s and C’s separated by 0-2 other bases in this region of the template, but note the variability in peak duration.
16 Now try a circular 75 b ss template. Their pol enzyme has “strand-displacing” activity. What advantages might a circular template have? Should you see a pattern repeating every 75 b? What disadvantages would there be to having to use circular DNA templates?
17 Now try all 4 bases on a 150 b ss template. Note the variability in pulse widths.
18 Error rates. Single run on a 150 b template: ~30% of reads incomplete; 12 insertions, 8 deletions, 7 mismatches (~20%). What causes apparent insertions? What if a base sticks to the pol long enough to be detected but falls off before being incorporated, then sticks again and gets incorporated? What could you do about this? Try to engineer an enzyme that binds bases more tightly (might not work).
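The “~20%” figure follows from adding the three error counts and dividing by the template length. A minimal sketch, with an illustrative function name:

```python
def per_base_error_rate(insertions, deletions, mismatches, template_length):
    """Total errors of all three kinds divided by the template length."""
    total_errors = insertions + deletions + mismatches
    return total_errors / template_length


# Counts quoted on the slide for a single run on the 150 b template
rate = per_base_error_rate(insertions=12, deletions=8, mismatches=7,
                           template_length=150)
print(f"per-base error rate: {rate:.0%}")
```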
19 What causes apparent deletions? What if the base were photobleached before detection? What if the enzyme incorporates a base very quickly? If the enzyme has a constant average rate of incorporation of bound base per unit time (a Poisson process), expect an exponential distribution of “hold times”, with the largest number at the shortest durations.
20 Expected Poisson distribution of hold times. They have a problem detecting the shortest hold times because of photon-counting statistics and noise: dyes produce ~5000 photons/s -> 50 per 10 ms.
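To see why short pulses are hard to detect, it helps to count photons per pulse. The sketch below assumes pure Poisson shot noise (relative fluctuation $1/\sqrt{N}$) and ignores background and detector noise, which would make things worse; the 5000 photons/s rate is from the slide, and the pulse durations other than 10 ms are illustrative choices.

```python
import math


def expected_photons(rate_per_s, duration_s):
    """Mean number of photons collected from one dye during a pulse."""
    return rate_per_s * duration_s


def relative_shot_noise(mean_photons):
    """Relative fluctuation of a Poisson count, sqrt(N) / N = 1 / sqrt(N)."""
    return 1.0 / math.sqrt(mean_photons)


PHOTON_RATE = 5000.0  # photons per second per dye, from the slide

for duration_ms in (100, 10, 1):
    n = expected_photons(PHOTON_RATE, duration_ms / 1000.0)
    noise = relative_shot_noise(n)
    print(f"{duration_ms:>4} ms pulse: {n:6.1f} photons, "
          f"relative shot noise {noise:.0%}")
```

A 1 ms pulse yields only a handful of photons, spread over four spectral channels, so both detecting the pulse and calling its color become unreliable.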
21 What could you do about this problem? Try to slow down the incorporation rate chemically (they say pH change has a small effect); try to engineer an enzyme that incorporates bases more slowly (might not work).
22 Substitution errors = misreading dyes due to incomplete spectral separation, esp. hard to distinguish in short pulses. What could you do? Try to improve the dyes. More immediate solution to high error rates: read each DNA template multiple times to generate a consensus sequence (circular templates would be useful…).
23 They used sequence info from 449 reads of the same 150-base template in different wells. Generate a consensus sequence based on random samples of the data in which (?) each position appears in >15 sequences. Repeat 100x to generate 100 consensus sequences. Error rate in consensus sequences ~2-3%. If they had 3000 wells, why did they only use 449 reads? Suggests they are getting fewer useful wells than they say…
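How much a consensus should help can be estimated with a deliberately simplified model, sketched here. It is not the authors' procedure: it assumes each read is wrong at a given position independently with probability p, treats all errors as substitutions, and pessimistically counts the consensus as wrong whenever more than half the reads are wrong. The function name is illustrative.

```python
from math import comb


def majority_error(p_read_error, n_reads):
    """Probability that more than half of n independent reads are wrong at a position."""
    k_min = n_reads // 2 + 1  # smallest number of wrong reads that forms a majority
    return sum(
        comb(n_reads, k) * p_read_error ** k * (1.0 - p_read_error) ** (n_reads - k)
        for k in range(k_min, n_reads + 1)
    )


# ~20% raw error rate per read, as quoted for single reads
for n in (1, 5, 15):
    print(f"{n:>2} reads: consensus error ~ {majority_error(0.2, n):.4f}")
```

Under these assumptions, 15-fold coverage pushes the consensus error well below 1%. The reported 2-3% is higher, which is what one would expect if insertions, deletions, or sequence-dependent errors break the independence assumption.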
24 Over-sequencing makes sense if the errors are random, but what if the error rate depends on sequence? Summary: many technical hurdles have been overcome, but the error rate remains very high. Even if they got reliable sequence in real time at a rate of 3 b/s in each of 3000 wells, they would need >15 days to do a human genome 15-fold redundantly, which is competitors’ claimed current rate; 2 yrs ago they said a 1,000,000-well device was coming…
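The throughput estimate is a single division. The sketch below assumes a human genome of about $3 \times 10^9$ bases (a value not stated on the slide) and perfect, uninterrupted operation of every well; the function name is illustrative.

```python
HUMAN_GENOME_BASES = 3e9  # assumed genome size, not given on the slide
SECONDS_PER_DAY = 86400.0


def days_to_sequence(genome_bases, coverage, bases_per_s_per_well, n_wells):
    """Days needed if every well reads continuously at the stated rate."""
    total_bases = genome_bases * coverage
    seconds = total_bases / (bases_per_s_per_well * n_wells)
    return seconds / SECONDS_PER_DAY


days_3000 = days_to_sequence(HUMAN_GENOME_BASES, 15, 3.0, 3000)
days_million = days_to_sequence(HUMAN_GENOME_BASES, 15, 3.0, 1_000_000)

print(f"3000 wells:      {days_3000:.0f} days")
print(f"1,000,000 wells: {days_million * 24:.1f} hours")
```

With these inputs the 3000-well device needs on the order of two months, comfortably above the slide's lower bound of 15 days, while a million-well device would bring the same job down to hours.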
26 α-hemolysin: heptameric membrane pore-forming protein (bacterial toxin that punches holes in red blood cells). Spontaneously forms a 7-mer and inserts into the lipid membrane.
27 When inserted in a membrane in electrolyte solution, it creates a channel that allows ions to cross the membrane. Single channels are easily detected in an artificial membrane, as they cause step-like changes in current [figure: current trace, axis in pA] … how many ions go through the pore per sec? How much current do you expect from a 0.7 nm radius pore 10 nm long in 1 M KCl at 100 mV, if conductivity σ = 14 S/m? $I = V/R = VG = V\sigma A/L = 0.1 \times 14 \times \pi \times (0.7 \times 10^{-9})^2 / 10^{-8} \approx 200$ pA.
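Both questions on this slide can be answered numerically. The sketch below treats the pore as a uniform cylinder of bulk electrolyte, as the slide's formula does, and assumes every charge carrier is monovalent (K+ or Cl-); the function name is illustrative.

```python
import math

ELEMENTARY_CHARGE = 1.602e-19  # coulombs per monovalent ion


def pore_current(voltage_v, conductivity_s_per_m, radius_m, length_m):
    """Ohmic current through a cylindrical pore: I = V * sigma * A / L."""
    area_m2 = math.pi * radius_m ** 2
    return voltage_v * conductivity_s_per_m * area_m2 / length_m


# Values from the slide: 100 mV, 14 S/m, 0.7 nm radius, 10 nm length
current_a = pore_current(0.1, 14.0, 0.7e-9, 10e-9)
ions_per_s = current_a / ELEMENTARY_CHARGE

print(f"current: {current_a * 1e12:.0f} pA")
print(f"ions through pore: {ions_per_s:.2e} per second")
```

The current comes out near the slide's ~200 pA, which corresponds to roughly a billion ions crossing the pore each second.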
28 How can you make membranes and introduce pores? [Figure labels: 1 cm; −/+; 2 nm; Cl−, K+; Teflon barrier with ~50 µm hole] Add lipid to the solution, raise and lower the meniscus over the hole, and a 5 nm lipid bilayer forms spontaneously (!). Add α-hemolysin protein to 1 chamber: it inserts itself!
29 β-cyclodextrin: heptameric ring of sugars that spontaneously inserts inside the α-hemolysin pore, stabilized by coordination with 7 identical sites, one in each α-hemolysin monomer.
30 β-CD insertion lowers conductivity. Why might you want to reduce the pore diameter? Would you expect charges in the pore to influence current? As dNMPs go through the pore, they further decrease conductivity; can β-CD be modified so that different bases will -> different decreases in conductivity?
31 Bayley’s group has made extensive mutations in α-HL and tested many β-CD derivatives to try to make pores that distinguish DNA bases by the extent of the decrease in conductivity. Here, they devise a covalent S-S linkage between α-HL and β-CD, based on a single cys in α-HL and an S in modified β-CD, in order to have a stable small-diameter pore. How do they get a single cys in heptameric α-HL? Mix 2 types of α-HL, 1 with 1 cys and a tail of 8 charged (asp) aa’s, the other with no cys or tail; they form different hetero-7mers; select the desired 7mer by electrophoresis.
32 β-CD with an -SH group stably associates with cys-modified α-HL. β-CD without S reversibly enters α-HL; with S, it inserts stably but can be removed by reducing the -S-S- with DTT.
33 α-HL with stably inserted β-CD senses a mixture of bases. Residual pore currents come in 4 types: why?
35 Scatter plot of dwell time vs residual pore current. Dwell times have a wide distribution, but the averages differ for different bases. What does this suggest?
36 The channel can also distinguish methyl-dCMP, a variant of C associated with silenced gene expression: this system can detect such “epigenetic” changes more easily than other sequencing methods.
37 Idea for sequencing: use an exonuclease to degrade the template to dNMPs and read them going through the pore in the order they are produced.
38 Problem #1: exonuclease doesn’t work in high salt, which is required for good base discrimination; they lower salt on the side w/ exo. Tests of the mixed-salt system on simpler templates show they can degrade templates with exo’ase and read the bases produced. [Figure panels: No A’s; No T’s]
39 Technical challenges. Reducing [KCl]cis to 200 mM allowed the exonuclease to work, but decreased the ability to distinguish A’s and T’s to ~90%, not good enough for sequencing; they might be able to select exo’s that can work in high salt (e.g. from brine bacteria). <t_dwell> ~10 ms, but Poisson distribution => most dwells are short; for short dwells it is harder to distinguish different bases; very short dwells may -> “deletions”. Ability to distinguish the 4 bases is enhanced by a + charge on the linker arm; it may help to trap bases electrostatically; ? more chemical modifications might improve discrimination.
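The claim that most dwells are short can be made quantitative if dwell times are assumed to be exponentially distributed, as expected when a base leaves the pore at a constant rate. The mean of ~10 ms is from the slide; the cutoff durations and the function name below are illustrative choices.

```python
import math


def fraction_shorter_than(cutoff_s, mean_dwell_s):
    """Fraction of exponentially distributed dwells shorter than the cutoff."""
    return 1.0 - math.exp(-cutoff_s / mean_dwell_s)


MEAN_DWELL_S = 0.010  # ~10 ms mean dwell time, from the slide

for cutoff_ms in (1, 5, 10):
    frac = fraction_shorter_than(cutoff_ms / 1000.0, MEAN_DWELL_S)
    print(f"dwells shorter than {cutoff_ms:>2} ms: {frac:.0%}")
```

Under this assumption nearly two thirds of dwells are shorter than the mean, and about one in ten lasts less than a tenth of it, which is where miscalls and missed bases would be concentrated.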
40 No data yet that the exonuclease can be held near enough to the pore (e.g. via an α-HL-exo fusion protein) that chewed-off bases can be read sequentially. Will the path of some bases (drift + diffusion) be such that they are read out of order, or not read at all? Can pore formation be automated and multiplexed? Nevertheless, an extraordinary accomplishment in terms of chemical adaptation of a nanopore to make a real-time, label-free, single-molecule detector that distinguishes 5 bases.
41 Some similarities between solid-state FETs and ion channels: charge on the channel walls regulates the rate at which charged objects (ions, DNA molecules, peptides) go through the pore; if the pore is small enough compared to the transported object, changes in charge -> exponential changes in transport. Chips: use simple geometries, Coulomb interactions, engineered structures ~ mm x 30 nm laterally, a few nm vertically. Biology: more complex geometries, Coulomb + chemical interactions (H-bonding, covalent bonds), greater control of nanoscale positioning via protein & DNA engineering but less control at mm scale.
42 Some big ideas from the course: Biology at the nano-scale is not very different from chemistry and physics at a similar scale. Biological macromolecules (DNA, protein) allow new forms of engineering and control over nanoscale phenomena. Some tools are novel to biology, e.g. replication via PCR. Biology provides new things we want to sense (like DNA) and new tools to sense them.
43 At the nano-level, single things become detectable. Detecting single-molecule events can provide info about molecular structure not obtainable in bulk measurements, e.g. when replicate molecules cannot be kept in the same state for bulk measurements (e.g. the phasing problem in DNA sequencing). Aim for detailed and quantitative understanding: how many molecules per sec, unit area, vol; what do they stick to; for how long; how is the signal generated; how many photons, etc. Think critically as you learn!
44 Suggestions for student presentations: Go over the paper with me beforehand if at all possible! Stick to a few main points; you have only ~20 minutes. Try to teach us something interesting you learned.