CS273a Lecture 9, Fall 2008 (Aut08), Batzoglou


1 Quality of assemblies: mouse. 7.7X sequence coverage. Terminology: N50 contig length. If we sort contigs from largest to smallest and start covering the genome in that order, N50 is the length of the contig that just covers the 50th percentile.
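The N50 definition above can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function name `n50` and the optional `genome_size` argument are assumptions. When no genome size is given, the sketch falls back to the total assembled length, which is the common convention for reporting N50.

```python
def n50(contig_lengths, genome_size=None):
    """Return the N50 contig length, or None if half the target is never covered.

    Contigs are sorted from largest to smallest and accumulated in that order;
    N50 is the length of the contig at which the running total first reaches
    50% of the target size. `genome_size` is an assumed optional parameter:
    if omitted, the target is the total assembled length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig")
    target = genome_size if genome_size is not None else sum(contig_lengths)
    half = target / 2
    covered = 0
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if covered >= half:
            return length
    return None  # the contigs do not cover half of the stated genome size


# Hypothetical contig lengths (in kb), chosen only for illustration.
example = [80, 70, 50, 40, 30, 20, 10]
print(n50(example))                   # target = total assembled length (300)
print(n50(example, genome_size=400))  # target = an assumed genome size
```

With these made-up lengths the running total reaches 150 (half of 300) at the second contig, so the first call returns 70. Against an assumed genome size of 400 the halfway point moves to 200 and the result drops to 50.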

2 Quality of assemblies: dog. 7.5X sequence coverage

3 Quality of assemblies: chimp. 3.6X sequence coverage. Assisted assembly

4 History of WGA
1982: λ-virus, 48,502 bp
1995: H. influenzae, 1 Mbp
2000: fly, 100 Mbp
2001 – present: human (3 Gbp), mouse (2.5 Gbp), rat *, chicken, dog, chimpanzee, several fungal genomes
Gene Myers: “Let’s sequence the human genome with the shotgun strategy”
Phil Green: “That is impossible, and a bad idea anyway”
1997

5 $399 Personal Genome Service $2,500 Health Compass service $985 deCODEme (November 2007) (April 2008) $350,000 Whole-genome sequencing (November 2007) Genetic Information Nondiscrimination Act (May 2008)

6 Applications: whole-genome sequencing; comparative genomics; genome resequencing; structural variation analysis; polymorphism discovery; metagenomics; environmental sequencing; gene expression profiling; genotyping; population genetics; migration studies; ancestry inference; relationship inference; genetic screening; drug targeting; forensics

7 Sequencing applications: demand for more sequencing; sequencing technology improvement; increase in sequencing data output; new sequencing applications

8 Sequencing technology: Sanger sequencing (Fred Sanger)
[Timeline figure. Years: 1975, 1980, 1990, 2000, 2008. Cost per finished bp: $10.00, $1.00, $0.10, $0.01]
Read length: 15 – 200 bp, later 500 – 1,000 bp
Throughput: “grad-student years”, later 2 ∙ 10^6 bp/day

9 Sequencing technology: Sanger sequencing
Genome: 3 ∙ 10^9 bp; 1x coverage vs. 10x coverage
Time: (10x coverage × 3 ∙ 10^9 bp) / (2 ∙ 10^6 bp/day) = 15,000 days ≈ 40 years
Cost: 10x coverage × 3 ∙ 10^9 bp × $0.001/bp = $30 million

10 Pyrosequencing on a chip. Mostafa Ronaghi, Stanford Genome Technologies Center; 454 Life Sciences

11 Sequencing technology: next-generation sequencing. Genome Sequencer / FLX, “short reads”. Read length: 250 bp. Throughput: 300 Mb/day. Cost: ~10,000 bp/$. De novo: yes

12 Single Molecule Array for Genotyping: Solexa

13 Sequencing technology: next-generation sequencing. Genome Analyzer, SOLiD Analyzer, “microreads”. Read length: ~35 bp. Throughput: 300 – 500 Mb/day. Cost: ~100,000 bp/$. De novo: yes

14 Sequencing technology: next-generation sequencing. Genome Analyzer, SOLiD Analyzer, reads. Read length: ~50-150 bp. Throughput: 3 Gb/day. Cost: ~3,000,000 bp/$. De novo: yes

15 Illumina Projections

16 Complete Genomics: $5,000 this summer; Quality?...; 1,000 genomes in 2009; 20,000 genomes in 2010

17 Pacific Biosciences

18 So, how fast is cost going down? 2006: $10 million; 2008: $100,000; 2009: $10,000; ?: $1,000; ???: $100
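One way to put a number on "how fast" is to turn the dated price points into a yearly fold-drop and a halving time. The sketch below uses only the three dated figures from the slide. The function names are illustrative, and the extrapolation to $1,000 and $100 assumes the same rate simply continues, which the slide itself leaves as question marks.

```python
import math

# Dated price points from the slide: year -> cost in dollars.
costs = {2006: 10_000_000, 2008: 100_000, 2009: 10_000}


def annual_fold_drop(year_a, cost_a, year_b, cost_b):
    """Geometric-mean factor by which the cost falls per year between two points."""
    return (cost_a / cost_b) ** (1 / (year_b - year_a))


def halving_time_months(fold_per_year):
    """Months needed for the cost to halve at a constant yearly fold-drop."""
    return 12 * math.log(2) / math.log(fold_per_year)


def years_to_reach(current_cost, target_cost, fold_per_year):
    """Years until target_cost, ASSUMING the fold-drop stays constant."""
    return math.log(current_cost / target_cost) / math.log(fold_per_year)


fold_06_08 = annual_fold_drop(2006, costs[2006], 2008, costs[2008])
fold_08_09 = annual_fold_drop(2008, costs[2008], 2009, costs[2009])
print(f"2006-2008: {fold_06_08:.1f}x cheaper per year")
print(f"2008-2009: {fold_08_09:.1f}x cheaper per year")
print(f"halving time: {halving_time_months(fold_06_08):.1f} months")
for target in (1_000, 100):
    extra = years_to_reach(costs[2009], target, fold_08_09)
    print(f"${target}: about {extra:.0f} more year(s) after 2009 at this rate")
```

Both intervals work out to a 10-fold drop per year, which corresponds to the cost halving roughly every 3.6 months.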

19 Molecular Inversion Probes

20 Illumina Genotype Arrays

21 Sequencing technology: next-generation sequencing. Infinium Assay, GeneChip Array, genotypes, “SNP chips”. Read length: 1 bp. Throughput: 1 – 2 Mb/day. Cost: 5,000 bp/$. De novo: no

22 Nanopore Sequencing http://www.mcb.harvard.edu/branton/index.htm

23 Sequencing technology: next-generation sequencing

24 Sequencing technology
Technology     Read length (bp)   Throughput (Mb/day)   Cost (bp/$)   De novo
Sanger         1,000              2
454            250                300                   10,000
Solexa / ABI   35                 500                   100,000
SNP chip       1                  2                     5,000
Application (columns: Sanger, 454, Solexa/ABI, SNP chip)
Bacterial sequencing: $
Mammalian sequencing: $$$$$ not likely today
Mammalian resequencing: $$$$$$
Metagenomics: $$$
Genotyping: $$$
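The table's figures can be turned into rough project estimates using the same arithmetic as the Sanger slide (coverage × genome size, divided by throughput or by bp per dollar). The sketch below is illustrative only. It assumes a 3 ∙ 10^9 bp genome at 10x coverage on a single instrument, takes the Sanger cost as 1,000 bp/$ from the $0.001/bp figure on the Sanger slide (the table entry is missing), and leaves out the SNP chip because it is not a de novo technology. The name `project_estimate` is hypothetical.

```python
import math

GENOME_BP = 3e9  # genome size used on the Sanger slide
COVERAGE = 10    # 10x coverage, as in the Sanger estimate

# name -> (read length in bp, throughput in Mb/day, cost in bp per dollar)
# Sanger cost is an assumption derived from $0.001/bp on the Sanger slide.
TECHNOLOGIES = {
    "Sanger": (1_000, 2, 1_000),
    "454": (250, 300, 10_000),
    "Solexa / ABI": (35, 500, 100_000),
}


def project_estimate(read_len, mb_per_day, bp_per_dollar,
                     genome_bp=GENOME_BP, coverage=COVERAGE):
    """Estimate reads, instrument-days and dollars for one sequencing project."""
    total_bp = coverage * genome_bp
    return {
        "reads": math.ceil(total_bp / read_len),  # coverage = reads * length / genome
        "days": total_bp / (mb_per_day * 1e6),    # one instrument, running nonstop
        "dollars": total_bp / bp_per_dollar,
    }


estimates = {name: project_estimate(*spec) for name, spec in TECHNOLOGIES.items()}
for name, est in estimates.items():
    print(f"{name:13s} reads={est['reads']:>13,d} "
          f"days={est['days']:>9,.0f} cost=${est['dollars']:>12,.0f}")
```

The Sanger row reproduces the 15,000 days (about 40 years) and $30 million from the earlier slide. The shorter-read technologies need far more reads for the same coverage but finish in a fraction of the time and cost.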

25 Multiple Sequence Alignment

26 Evolution at the DNA level
…ACGGTGCAGTTACCA…
…AC----CAGTCCACCA…
Figure labels: Mutation; SEQUENCE EDITS; REARRANGEMENTS; Deletion; Inversion; Translocation; Duplication

27 Evolutionary Rates. Figure labels: OK; X; X; Still OK?; next generation

28 Orthology, Paralogy, Inparalogs, Outparalogs
