CS273a Lecture 9, Aut08, Batzoglou
CS273a Lecture 9, Fall 2008

Presentation transcript:

Quality of assemblies: mouse
Terminology: N50 contig length. If we sort contigs from largest to smallest and start covering the genome in that order, N50 is the length of the contig that just covers the 50th percentile.
7.7X sequence coverage
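The N50 definition above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the function name `n50` and the example contig lengths are hypothetical, and the sketch assumes the "genome" being covered is the total length of all contigs in the assembly (a common convention when the true genome size is not supplied).

```python
def n50(contig_lengths):
    """Return the N50 contig length.

    Assumption: the 50th percentile is taken over the sum of all
    contig lengths, not over an externally known genome size.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig")
    total = sum(contig_lengths)
    covered = 0
    # Walk the contigs from largest to smallest, accumulating coverage.
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        # Stop at the contig that first brings us to half of the total.
        if 2 * covered >= total:
            return length


# Hypothetical contig lengths (total = 300).
example = [80, 70, 50, 40, 30, 20, 10]
print(n50(example))  # 80 + 70 = 150 reaches half of 300, so this prints 70
```

A larger N50 means that half of the assembled sequence sits in long contigs, which is why the slides use it as a quality measure.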

Quality of assemblies: dog
7.5X sequence coverage

Quality of assemblies: chimp
3.6X sequence coverage
Assisted assembly

History of WGA
1982: λ-virus, 48,502 bp
1995: H. influenzae, 1 Mbp
2000: fly, 100 Mbp
2001 – present: human (3 Gbp), mouse (2.5 Gbp), rat*, chicken, dog, chimpanzee, several fungal genomes
Gene Myers: “Let’s sequence the human genome with the shotgun strategy”
Phil Green: “That is impossible, and a bad idea anyway”
1997

$399 Personal Genome Service
$2,500 Health Compass service
$985 deCODEme
(November 2007) (April 2008)
$350,000 Whole-genome sequencing (November 2007)
Genetic Information Nondiscrimination Act (May 2008)

Applications
Whole-genome sequencing, comparative genomics, genome resequencing, structural variation analysis, polymorphism discovery, metagenomics, environmental sequencing, gene expression profiling, genotyping, population genetics, migration studies, ancestry inference, relationship inference, genetic screening, drug targeting, forensics

Sequencing applications
Demand for more sequencing → Sequencing technology improvement → Increase in sequencing data output → New sequencing applications

Sequencing technology: Sanger sequencing (Fred Sanger)
Cost per finished bp (scale): $10.00, $1.00, $0.10, $0.01
Read length: 15 – 200 bp → 500 – 1,000 bp
Throughput: “grad-student years” → 2 × 10^6 bp/day

Sequencing technology: Sanger sequencing
Genome size: 3 × 10^9 bp (1x coverage); target: 10x coverage
Time: (10x coverage × 3 × 10^9 bp) / (2 × 10^6 bp/day) = 15,000 days ≈ 40 years
Cost: 10x coverage × 3 × 10^9 bp × $0.001/bp = $30 million
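The arithmetic on this slide can be checked with a short Python sketch. The numbers are the ones on the slide (3 × 10^9 bp genome, 10x coverage, 2 × 10^6 bp/day, $0.001/bp); the function and constant names are hypothetical, and the sketch assumes a single sequencing pipeline running at the stated throughput with 365-day years.

```python
GENOME_BP = 3_000_000_000      # 3 x 10^9 bp, from the slide
COVERAGE = 10                  # 10x coverage, from the slide
THROUGHPUT_BP_PER_DAY = 2_000_000   # 2 x 10^6 bp/day, from the slide
COST_PER_BP = 0.001            # $0.001/bp, from the slide


def sequencing_days(genome_bp, coverage, bp_per_day):
    """Days needed to produce coverage * genome_bp bases at a fixed rate."""
    return genome_bp * coverage / bp_per_day


def sequencing_cost(genome_bp, coverage, dollars_per_bp):
    """Dollars needed to produce coverage * genome_bp bases."""
    return genome_bp * coverage * dollars_per_bp


days = sequencing_days(GENOME_BP, COVERAGE, THROUGHPUT_BP_PER_DAY)
cost = sequencing_cost(GENOME_BP, COVERAGE, COST_PER_BP)

print(f"{days:,.0f} days, about {days / 365:.0f} years")
print(f"${cost:,.0f}")
```

The result is 15,000 days (roughly 41 years, which the slide rounds to 40) and $30 million.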

Pyrosequencing on a chip
Mostafa Ronaghi, Stanford Genome Technologies Center
454 Life Sciences

Sequencing technology: Next-generation sequencing
“short reads” (Genome Sequencer / FLX)
Read length: 250 bp
Throughput: 300 Mb/day
Cost: ~ 10,000 bp/$
De novo: yes

Single Molecule Array for Genotyping: Solexa

Sequencing technology: Next-generation sequencing
“microreads” (Genome Analyzer, SOLiD Analyzer)
Read length: ~ 35 bp
Throughput: 300 – 500 Mb/day
Cost: ~ 100,000 bp/$
De novo: yes

Sequencing technology: Next-generation sequencing
reads (Genome Analyzer, SOLiD Analyzer)
Read length: ~ bp
Throughput: 3 Gb/day
Cost: ~ 3,000,000 bp/$
De novo: yes
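To put the next-generation figures on the same footing as the Sanger estimate, the sketch below converts the quoted throughput and bp/$ values into dollars and days for one genome. It is an illustration under stated assumptions: the 3 × 10^9 bp genome and 10x coverage are carried over from the Sanger slide (short reads may in practice call for deeper coverage), "Mb" and "Gb" are read as megabases and gigabases, the bp/$ figures are treated as raw sequenced bases, and the function name and dictionary labels are hypothetical.

```python
GENOME_BP = 3_000_000_000   # assumed: same genome size as the Sanger slide
COVERAGE = 10               # assumed: same 10x coverage as the Sanger slide


def genome_cost_and_days(bp_per_dollar, bp_per_day,
                         genome_bp=GENOME_BP, coverage=COVERAGE):
    """Return (dollars, days) to sequence coverage * genome_bp raw bases."""
    total_bp = genome_bp * coverage
    return total_bp / bp_per_dollar, total_bp / bp_per_day


# (bp per dollar, bp per day) as quoted on the slides; where a slide gives
# a range (300 - 500 Mb/day), the upper end is used here.
platforms = {
    "short reads (Genome Sequencer / FLX)": (10_000, 300_000_000),
    "microreads": (100_000, 500_000_000),
    "3 Gb/day slide": (3_000_000, 3_000_000_000),
}

for name, (bp_per_dollar, bp_per_day) in platforms.items():
    dollars, days = genome_cost_and_days(bp_per_dollar, bp_per_day)
    print(f"{name}: ${dollars:,.0f}, {days:,.0f} days")
```

Under these assumptions the three settings come to $3,000,000 in 100 days, $300,000 in 60 days, and $10,000 in 10 days, compared with $30 million and about 40 years for the Sanger estimate.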

Illumina Projections

Complete Genomics
- $5,000 this summer
- Quality?...
- 1,000 genomes in 2009
- 20,000 genomes in 2010

Pacific Biosciences

So, how fast is cost going down?
: $10 million 2008: $100, : $10,000 ? $1,000 ??? $100

Molecular Inversion Probes

Illumina Genotype Arrays

Sequencing technology: Next-generation sequencing
“SNP chips” (Infinium Assay, GeneChip Array): genotypes
Read length: 1 bp
Throughput: 1 – 2 Mb/day
Cost: 5,000 bp/$
De novo: no

Nanopore Sequencing

Sequencing technology: Next-generation sequencing

Sequencing technology
Technology | Read length (bp) | Throughput (Mb/day) | Cost (bp/$) | De novo
Sanger 1, ,000
Solexa / ABI ,000
SNP chip 1 2 5,000
Application | Sanger | 454 | Solexa/ABI | SNP chip
Bacterial sequencing: $
Mammalian sequencing: $$$$$ not likely today
Mammalian resequencing: $$$$$$
Metagenomics: $$$
Genotyping: $$$

Multiple Sequence Alignment

Evolution at the DNA level
…ACGGTGCAGTTACCA…
…AC----CAGTCCACCA…
Sequence edits: mutation, deletion
Rearrangements: inversion, translocation, duplication
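The two sequence edits visible in an alignment, substitutions (mutations) and gaps (deletions or insertions), can be counted column by column. The Python sketch below is a toy illustration: the function name `column_counts` is hypothetical, and the second row is a hypothetical variant of the slide's sequence, adjusted so that both rows have the same length, because the two rows as transcribed do not line up one-to-one.

```python
def column_counts(row_a, row_b, gap="-"):
    """Classify each alignment column as a match, mismatch, or gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    counts = {"match": 0, "mismatch": 0, "gap": 0}
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            counts["gap"] += 1        # deletion in one row / insertion in the other
        elif a == b:
            counts["match"] += 1
        else:
            counts["mismatch"] += 1   # point mutation (substitution)
    return counts


ROW_A = "ACGGTGCAGTTACCA"   # first row from the slide
ROW_B = "AC----CAGTCACCA"   # hypothetical second row, equal length

print(column_counts(ROW_A, ROW_B))
# {'match': 10, 'mismatch': 1, 'gap': 4}
```

Rearrangements such as inversions, translocations, and duplications do not show up as simple column differences, which is part of why they need separate treatment.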

Evolutionary Rates
Figure labels: OK, X, X, Still OK?, next generation

Orthology, Paralogy, Inparalogs, Outparalogs
