28-Apr 8:15AM – 8:45AM Next-Gen Seq Data Management Thanks to: Advancing Personal Genetics with Second Generation Sequencing.


1 28-Apr 8:15AM – 8:45AM Next-Gen Seq Data Management Thanks to: Advancing Personal Genetics with Second Generation Sequencing

2 Context: Personal Genomics Landscape direct-to-consumer -- hybrid -- research only REVEAL * * * *

3 23andme

4 Over 600 alleles of BRCA1 (Myriad/DNAdirect sequencing, not chips)

5 PersonalGenomes.org Project Goals
1) Low cost: <$1K : 98% exome (or more)
2) Active subject participation, informed redaction
3) Avoid over-promising de-identification
4) Entrance exam to ensure highly informed consent
5) Multiple samples to ensure consistent IDs
6) Open access (not just researcher subset)
7) Trait questionnaire, stem cell RNA, μbiome
8) Cells available for personal functional genomics
9) Scaleable to 100,000 diverse research subjects
Employers/Insurers > Non-Discrimination Act
Actionable alleles are rare > all at risk
Non-actionable alleles > activism
Coriell GM2: 0431 1070 1660 1677 1687 1781 1833 1846 1731

6 3 Exponential technologies: 3 to 18 month doubling times. Shendure J, Mitra R, Varma C, Church GM, 2004 Nature Reviews Genetics. Kurzweil 2002; Moore 1965. Figure labels: urea B12 tRNA telegraph Computation & Communication Analytic tRNA Synthetic chemistry human Gbp chips

7 Chips vs. Gen-2 Sequencing
Chips (Illumina, Affymetrix bead-array): 0.02% of the genome; assumes common DNA variants stay associated with deleterious variants over 50,000 years
Sequencing (Roche-454, Illumina, ABI-SOLiD, Harvard-Danaher Polonator-G007, Helicos): 98% of the genome; accesses the deleterious variants directly

8 Multiplex Cyclic Sequencing by Synthesis (G A C T). Single instrument, multiple chemistries: polonies on slides or beads. Polymerase -or- Ligase. Shendure, Porreca, et al. 2005 Science. Illumina, IBS; AB-SOLiD, CGI. Mitra, et al. 2003 Analyt. Biochem.; 1999 NAR

9 36 to 64 flowcells (+ DNA barcodes) 2 to 4 billion beads 8.5  thick sequence image

10 Open-source hardware, software, wetware: Polonator G.007 (12TB image > 120 Gbp /run) Enzyme/oligo kits Polymerase or Ligase chemistries $150K including computer & 1 yr service, software, support Danaher Inc.

11 Effect of improvements on cost
Improvement        Factor   Feature cost/run   Sequencing cost/run   Gb/run   Reagent cost/Gb   Fold decrease
None               1        $1,677             $685                  10       $292
Flowcell volume    5        $1,677             $137                  10       $181              1.4
Useable yield      6        $1,677             $685                  60       $39               6
Instrument speed   2        $3,354             $1,370                20       $236              1
Emulsion sorting   18       $93                $685                  10       $78               3
Readlength 48bp    3.7      $1,677             $2,534                37       $114              2
ALL                         $186               $1014                 444      $2.70             88
Polonator instrument 3 yr amortization:
$150k / 300 runs = $500/run = $50/Gb
$150k / 81 runs = $1850/run = $4.2/Gb
($10 vs. $2000 / Gb for other 2nd gen)
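The per-Gb figures on this slide appear to follow from simple arithmetic. The sketch below is one reading of it, not a formula stated on the slide: it assumes reagent cost/Gb = (feature cost/run + sequencing cost/run) / Gb per run, and amortized instrument cost = price / number of runs. Under that assumption the $181, $39, $236, $78, $114 and $2.70 entries and both amortization lines are reproduced after rounding; the "None" row comes out near $236 rather than the listed $292, so the relation should be treated as approximate. The function names are illustrative only.

```python
def reagent_cost_per_gb(feature_cost_per_run, sequencing_cost_per_run, gb_per_run):
    """Assumed relation: per-run reagent costs spread over the run's yield in Gb."""
    return (feature_cost_per_run + sequencing_cost_per_run) / gb_per_run


def amortized_instrument_cost(price, runs, gb_per_run):
    """Return (cost per run, cost per Gb) for an instrument amortized over `runs`."""
    per_run = price / runs
    return per_run, per_run / gb_per_run


if __name__ == "__main__":
    # (feature cost/run, sequencing cost/run, Gb/run), copied from the slide
    rows = {
        "Flowcell volume": (1677, 137, 10),
        "Useable yield": (1677, 685, 60),
        "ALL": (186, 1014, 444),
    }
    for name, (feature, sequencing, gb) in rows.items():
        print(f"{name}: ${reagent_cost_per_gb(feature, sequencing, gb):,.2f}/Gb")

    # $150k instrument: 300 runs at 10 Gb/run, or 81 runs at 444 Gb/run
    for runs, gb in ((300, 10), (81, 444)):
        per_run, per_gb = amortized_instrument_cost(150_000, runs, gb)
        print(f"{runs} runs: ${per_run:,.0f}/run, ${per_gb:,.1f}/Gb")
```

Pairing 81 runs with 444 Gb/run is itself an inference from the "ALL" row; the slide does not state which yield each amortization line uses.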

12 Personal genome sequencing options/goals
Technology          Genome   Cost    Raw bp
AB3730              98%      $30M    7x = 42 Gb (3.5x each)
Knome               98%      $350K   15x = 84 Gb
SNP-chip            0.02%    $1K     2 Mbp
PGP coding          1%       $90     30x = 1Gb
PGP RNA             99%      $20     30x 20K*n = 60 Mb (n=100 cell types for RNA)
-path/resistome     -        $20     rRNA + 20K genes
VDJ-Immunome        -        $20     ?
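The "Raw bp" column mixes fold coverage with total bases. As a rough check, and assuming raw bases = fold coverage × target size (an assumption, not something the slide states), the AB3730 entry (7x = 42 Gb) implies a 6 Gb target, and the PGP RNA entry is the product 30 × 20K × n with n = 100. The function names below are illustrative.

```python
def implied_target_size(raw_bases, fold_coverage):
    """Target size implied by a 'fold = raw bases' entry, assuming raw = fold * size."""
    return raw_bases / fold_coverage


def rna_budget(depth, genes, cell_types):
    """Product used in the PGP RNA row: depth * genes * cell types."""
    return depth * genes * cell_types


if __name__ == "__main__":
    print(implied_target_size(42_000_000_000, 7))   # AB3730 row
    print(implied_target_size(84_000_000_000, 15))  # Knome row
    print(rna_budget(30, 20_000, 100))              # PGP RNA row, n = 100
```

The two whole-genome rows imply slightly different target sizes (6 Gb and 5.6 Gb), so the slide's figures are best read as rounded estimates.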

13 Selective genome sequencing
Shendure, et al. Science 309(5741):1728-32. Nilsson et al. (2006) Trends Biotechnol 24:83.
Red=Synthetic; Yellow=genome/cDNA
How do we optimize >100K 100mers ?
8 ways to capture alleles from genomic or c-DNA
1. In vitro Paired-end-tags (PET), for rearrangements
2. Gap fill
3. Cleave & ligate
4. Hybr-select-chip
5. Hybr-select-solution
6. μfluidic PCR
7. Multiplex PCR
Zhang, Chou, Shendure, Li, Leproust, Dahl, Davis, Nilsson, Church

14 Circle Capture DNA from Chips

15 Aug 2007: R=.53; Jan 2008: R=.986. Zhang, Li et al. unpublished. Gap fill, Circle-capture, 1% genome

16 Genome to Phenome: Population Variation G A T C Zhang & Church unpublished cis Trans Gene products Gene Expression Genome Environment Traits

17 Allele-specific expression (ASE). Combine all cis element variants: enhancer, promoter, splicing, polyA, termination, transport, decay. Eliminate environmental & trans-acting variation among individuals. Allele-specific transcription factor binding: TF ChIP-Seq. Digital RNA allelotyping. Zhang, Li, Church unpublished. Forton et al. Genome Res. 2007

18 Tissue specific & allele specific gene expression confirmatory assays: rs1264899, ATP5F1, ATP synthase. Panel labels: Genomic DNA Lymphocyte cDNA Lymphocyte cDNA Fibroblast cDNA Keratinocyte. T/C = 0.51; T/C = 3.47; T/C = 3.73. Kun Zhang & Alice Li
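The T/C values on this slide are allelic ratios at a single SNP. A minimal sketch of how such a ratio could be computed from allele-specific read counts follows, assuming simple read counting in the spirit of "digital" allelotyping. The read counts are made up for illustration and are not data from the talk, and dividing the cDNA ratio by the genomic DNA ratio to correct for assay bias is a common convention assumed here, not a step the slide describes.

```python
def allelic_ratio(reads_first_allele, reads_second_allele):
    """Ratio of read counts for two alleles at one SNP (e.g. T over C)."""
    if reads_second_allele == 0:
        raise ValueError("second allele has zero reads; ratio is undefined")
    return reads_first_allele / reads_second_allele


def normalized_ratio(cdna_counts, gdna_counts):
    """cDNA allelic ratio divided by the genomic DNA ratio (assumed bias correction)."""
    return allelic_ratio(*cdna_counts) / allelic_ratio(*gdna_counts)


if __name__ == "__main__":
    # Hypothetical (T reads, C reads) pairs, for illustration only
    cdna = (300, 100)
    gdna = (100, 100)
    print(allelic_ratio(*cdna))
    print(normalized_ratio(cdna, gdna))
```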

19 25X probe * 72X time = 1800X better efficiency. Kun Zhang & Billy Li. Panels: Genomic DNA Aug 2007; Genomic DNA Jan 2008; cDNA Jan 2008

20 Challenge: Multiple cell types from healthy adults 3mm skin sample

21 PGP Physicians Network Volunteers Induction of Multiple Gene Sets (not necessarily functional tissues) Primary fibroblasts Complex Traits via Allele-Specific Gene Expression Induced Stem Cells mRNA Multiplexed Differentiation Multiplexed Reprogramming Sequence tag quantitation Jay Lee et al. unpublished

22 Induced Pluripotent Stem Cell Generation & Transdifferentiation
(Oct4/Sox2/Myc/Klf4) Retroviral Infection > Tissue Culture on a Mouse Feeder Layer > ES Cell Colony Identification > Clonal Isolation and Propagation > Embryoid Body Induction & Guided Differentiation: 2 months, multiple integration sites
Adenoviral Infection > Mixture of differentiated cell types & Guided Differentiation: 1 week, no genomic integration
Yamanaka, Daley, Thomson, Hochedlinger, Jaenisch labs; Lee & Church

23 Multiple cell-types with transdifferentiation Retroviral Infection Adenoviral Infection MyoD CD34 Collagen

24 Haplotyping by amplification of single chromosomes or fragments. Green: phase contrast image; Red: Cy5-labelled Alu probe. Nunc or UCSD. Kun Zhang & Fan Liang

25 Single-cell or Single DNA-fragment (haplotype) sequencing: 5 Mbp. Ultra-clean conditions for reduction of background amplification + Real-Time monitoring. Post-amplification chip hybridization distinguishes alleles. Amplification variation random & easily filled by PCR. Error rate <1.7 × 10^-5. Zhang et al. Nature Biotech 2006

26 Environments of Genomes: PERSONAL GENOME, VDJ-ome, μbiome, RNAome, TRAITS. One in a life-time genome + yearly (to daily) tests. Bio-weather map: Allergens, Microbes, Viruses

27 PGP  Resistome: 18 Antibiotics Dantas, Sommer, Church unpublished

28 Bacteria Subsisting on 18 Antibiotics. Dantas, Sommer, Church, Science 2008

29 Personal genome sequencing options/goals
Technology          Genome   Cost    Raw bp
AB3730              98%      $30M    7x = 42 Gb (3.5x each)
Knome               98%      $350K   15x = 84 Gb
SNP-chip            0.02%    $1K     2 Mbp
PGP coding          1%       $90     30x = 1Gb
PGP RNA             99%      $20     30x 20K*n = 60 Mb (n=100 cell types for RNA)
-path/resistome     -        $20     rRNA + 20K genes
VDJ-Immunome        -        $20     ?

