1
28-Apr 8:15AM – 8:45AM Next-Gen Seq Data Management
Advancing Personal Genetics with Second Generation Sequencing
Thanks to:
2
Context: Personal Genomics Landscape
direct-to-consumer -- hybrid -- research only
REVEAL
3
23andMe
4
Over 600 alleles of BRCA1 (Myriad/DNAdirect: sequencing, not chips)
5
PersonalGenomes.org Project Goals
1) Low cost: <$1K for 98% exome (or more)
2) Active subject participation, informed redaction
3) Avoid over-promising de-identification
4) Entrance exam to ensure highly informed consent
5) Multiple samples to ensure consistent IDs
6) Open access (not just researcher subset)
7) Trait questionnaire, stem cell RNA, biome
8) Cells available for personal functional genomics
9) Scalable to 100,000 diverse research subjects
Coriell GM2: 0431 1070 1660 1677 1687 1731 1781 1833 1846
Employers/Insurers > Non-Discrimination Act
Actionable alleles are rare > all at risk
Non-actionable alleles > activism
6
3 Exponential technologies: 3 to 18 month doubling times
Figure labels: Computation & Communication; Analytic; Synthetic chemistry; urea, B12, tRNA, telegraph, tRNA, human Gbp, chips
Sources: Shendure J, Mitra R, Varma C, Church GM, 2004 Nature Reviews Genetics; Kurzweil 2002; Moore 1965
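To make the doubling-time range concrete, here is a minimal Python sketch of the growth factor implied by a fixed doubling time. The function name `fold_increase` and the ten-year horizon are illustrative assumptions, not something taken from the slide; only the 3-month and 18-month doubling times come from it.

```python
def fold_increase(elapsed_months: float, doubling_months: float) -> float:
    """Growth factor after elapsed_months, assuming a constant doubling time."""
    return 2.0 ** (elapsed_months / doubling_months)


# Compare the two ends of the slide's range over a hypothetical 10-year span.
for doubling in (3, 18):
    factor = fold_increase(120, doubling)
    print(f"{doubling}-month doubling: {factor:.3g}x over 10 years")
```

The gap between the two ends of the range compounds quickly, which is why the doubling time matters more than the starting point.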
7
Chips vs. Gen-2 Sequencing
Chips (Illumina, Affymetrix, bead-array): 0.02% of the genome; assumes common DNA variants stay associated with deleterious variants over 50,000 years
Sequencing (Roche-454, Illumina, ABI-SOLiD, Harvard-Danaher Polonator-G007, Helicos): 98% of the genome; accesses the deleterious variants directly
8
Multiplex Cyclic Sequencing by Synthesis
Single instrument, multiple chemistries: polonies on slides or beads
Polymerase -or- Ligase
Platforms: Illumina, IBS, AB-SOLiD, CGI
References: Shendure, Porreca, et al. 2005 Science; Mitra, et al. 2003 Analyt. Biochem.; 1999 NAR
9
36 to 64 flowcells (+ DNA barcodes) 2 to 4 billion beads 8.5 thick sequence image
10
Open-source hardware, software, wetware: Polonator G.007 (12TB image > 120 Gbp/run)
Enzyme/oligo kits
Polymerase or Ligase chemistries
$150K including computer & 1 yr service, software, support
Danaher Inc.
11
Effect of improvements on cost

| Improvement | Factor | Feature cost/run | Sequencing cost/run | Gb/run | Reagent cost/Gb | Fold decrease |
|---|---|---|---|---|---|---|
| None | 1 | $1,677 | $685 | 10 | $292 | |
| Flowcell volume | 5 | $1,677 | $137 | 10 | $181 | 1.4 |
| Useable yield | 6 | $1,677 | $685 | 60 | $39 | 6 |
| Instrument speed | 2 | $3,354 | $1,370 | 20 | $236 | 1 |
| Emulsion sorting | 18 | $93 | $685 | 10 | $78 | 3 |
| Readlength 48bp | 3.7 | $1,677 | $2,534 | 37 | $114 | 2 |
| ALL | | $186 | $1,014 | 444 | $2.70 | 88 |

Polonator instrument 3 yr amortization:
$150k / 300 runs = $500/run = $50/Gb
$150k / 81 runs = $1850/run = $4.2/Gb
($10 vs. $2000 / Gb for other 2nd gen)
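The per-Gb figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes that reagent cost per Gb is (feature cost + sequencing cost) divided by Gb per run, which is consistent with the "Emulsion sorting" and "ALL" rows but is an inference from the numbers rather than a formula stated on the slide. It also assumes the 81-run amortization goes with the 444 Gb/run yield. The function names are hypothetical.

```python
def reagent_cost_per_gb(feature_cost: float, sequencing_cost: float,
                        gb_per_run: float) -> float:
    """Assumed formula: total per-run reagent cost divided by yield in Gb."""
    return (feature_cost + sequencing_cost) / gb_per_run


def amortized_instrument_cost(price: float, runs: int,
                              gb_per_run: float) -> tuple[float, float]:
    """Return (instrument cost per run, instrument cost per Gb)."""
    per_run = price / runs
    return per_run, per_run / gb_per_run


# "Emulsion sorting" and "ALL" rows from the slide.
emulsion = reagent_cost_per_gb(93, 685, 10)
all_rows = reagent_cost_per_gb(186, 1014, 444)

# Amortization of the $150k instrument over 3 years.
baseline = amortized_instrument_cost(150_000, 300, 10)
improved = amortized_instrument_cost(150_000, 81, 444)

print(f"Emulsion sorting: ${emulsion:.0f}/Gb, ALL: ${all_rows:.2f}/Gb")
print(f"Baseline: ${baseline[0]:.0f}/run, ${baseline[1]:.0f}/Gb")
print(f"Improved: ${improved[0]:.0f}/run, ${improved[1]:.1f}/Gb")
```

With all improvements applied, instrument amortization and reagents end up at a similar order of magnitude per Gb, which is the point of the slide's closing comparison.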
12
Personal genome sequencing options/goals

| Technology | Genome | Cost | Raw bp |
|---|---|---|---|
| AB3730 | 98% | $30M | 7x = 42 Gb (3.5x each) |
| Knome | 98% | $350K | 15x = 84 Gb |
| SNP-chip | 0.02% | $1K | 2 Mbp |
| PGP coding | 1% | $90 | 30x = 1 Gb |
| PGP RNA | 99% | $20 | 30x 20K*n = 60 Mb (n=100 cell types for RNA) |
| -path/resistome | - | $20 | rRNA + 20K genes |
| VDJ-Immunome | - | $20 | ? |
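The "Raw bp" figures are products of coverage and target size. A minimal sketch follows, assuming a 6 Gbp diploid genome. That value is an assumption chosen because it reproduces the slide's "7x = 42 Gb" figure; the slide does not state its genome size, and other rows may use a different basis. The helper `raw_bases` is hypothetical.

```python
DIPLOID_GENOME_BP = 6_000_000_000  # assumption, not stated on the slide


def raw_bases(coverage: int, target_size: int, samples: int = 1) -> int:
    """Raw output needed: fold coverage x target size x number of samples."""
    return coverage * target_size * samples


# AB3730 row: 7x over the assumed diploid genome.
ab3730 = raw_bases(7, DIPLOID_GENOME_BP)

# PGP RNA row: 30x over 20K genes, for n = 100 cell types.
pgp_rna = raw_bases(30, 20_000, samples=100)

print(f"AB3730: {ab3730 / 1e9:.0f} Gb, PGP RNA: {pgp_rna / 1e6:.0f} Mb")
```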
13
Selective genome sequencing
8 ways to capture alleles from genomic or cDNA
1. In vitro Paired-end tags (PET), for rearrangements
2. Gap fill
3. Cleave & ligate
4. Hybr-select-chip
5. Hybr-select-solution
6. fluidic PCR
7. Multiplex PCR
Red=Synthetic; Yellow=genome/cDNA
How do we optimize >100K 100mers?
Zhang, Chou, Shendure, Li, Leproust, Dahl, Davis, Nilsson, Church
Shendure, et al. Science 309(5741):1728-32. Nilsson et al. (2006) Trends Biotechnol 24:83.
14
Circle Capture DNA from Chips
15
Gap fill Circle-capture, 1% genome
Aug 2007: R=.53
Jan 2008: R=.986
Zhang, Li et al. unpublished
16
Genome to Phenome: Population Variation
Diagram labels: Genome, Environment, cis, trans, Gene Expression, Gene products, Traits
Zhang & Church unpublished
17
Allele-specific expression (ASE)
Combine all cis element variants: enhancer, promoter, splicing, polyA, termination, transport, decay.
Eliminate environmental & trans-acting variation among individuals.
Allele-specific transcription factor binding (TF ChIP-Seq)
Digital RNA allelotyping
Zhang, Li, Church unpublished; Forton et al. Genome Res. 2007
18
Tissue specific & allele specific gene expression: confirmatory assays
rs1264899, ATP5F1, ATP synthase
Panels: Genomic DNA Lymphocyte; cDNA Lymphocyte; cDNA Fibroblast; cDNA Keratinocyte
T/C = 0.51; T/C = 3.47; T/C = 3.73
Kun Zhang & Alice Li
19
25X probe * 72X time = 1800X better efficiency.
Panels: Genomic DNA Aug 2007; Genomic DNA Jan 2008; cDNA Jan 2008
Kun Zhang & Billy Li
20
Challenge: Multiple cell types from healthy adults
3mm skin sample
21
Complex Traits via Allele-Specific Gene Expression
Diagram labels: PGP Physicians Network; Volunteers; Primary fibroblasts; Induced Stem Cells; Multiplexed Reprogramming; Multiplexed Differentiation; Induction of Multiple Gene Sets (not necessarily functional tissues); mRNA; Sequence tag quantitation
Jay Lee et al. unpublished
22
Induced Pluripotent Stem Cell Generation & Transdifferentiation (Oct4/Sox2/Myc/Klf4)
Retroviral route: Retroviral Infection; Tissue Culture on a Mouse Feeder Layer; ES Cell Colony Identification; Clonal Isolation and Propagation; Embryoid Body Induction & Guided Differentiation. 2 months. Multiple integration sites. Yamanaka, Daley, Thomson, Hochedlinger, Jaenisch labs.
Adenoviral route: Adenoviral Infection; Mixture of differentiated cell types & Guided Differentiation. 1 week. No genomic integration. Lee & Church.
23
Multiple cell-types with transdifferentiation
Retroviral Infection; Adenoviral Infection
Markers: MyoD, CD34, Collagen
24
Haplotyping by amplification of single chromosomes or fragments
Green: phase contrast image. Red: Cy5-labelled Alu probe.
Nunc or UCSD
Kun Zhang & Fan Liang
25
Single-cell or Single DNA-fragment (haplotype) sequencing: 5 Mbp
Ultra-clean conditions for reduction of background amplification + Real-Time monitoring
Post-amplification chip hybridization distinguishes alleles
Amplification variation random & easily filled by PCR
Error rate < 1.7 × 10^-5
Zhang et al. Nature Biotech 2006
26
Environments of Genomes
Diagram labels: PERSONAL GENOME, VDJ-ome, RNAome, biome, TRAITS
Once-in-a-lifetime genome + yearly (to daily) tests
Bio-weather map: Allergens, Microbes, Viruses
27
PGP Resistome: 18 Antibiotics Dantas, Sommer, Church unpublished
28
Bacteria Subsisting on 18 Antibiotics
Dantas, Sommer, Church, Science 2008