Population sequencing using short reads: HIV as a case study. Vladimir Jojic et al. PSB 13:114-125 (2008). Presenter: Yong Li.


1 Population sequencing using short reads: HIV as a case study
Vladimir Jojic et al. PSB 13:114-125 (2008)
Presenter: Yong Li

2 Overview
Background
– Population sequencing & metagenomics
– Pyrosequencing & classical sequencing
The problem and the challenge
– Low concentration; short reads; sequencing errors
The model
– Sequence & frequency → reads
The EM algorithm
Validation

3 Background
Population sequencing & metagenomics
– Multiple strains vs. multiple species
– HIV drug resistance from rare variants
Pyrosequencing & chromatogram-based (classical) sequencing
– Ultra-deep sequencing, 454 sequencing
– Short reads; high error rate; homopolymers
– Sensitivity 0.1% vs. 20%
To clone or not to clone?
– Two protocols to detect mutational variants
– Cloning bias; stoichiometry
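To make the 0.1% sensitivity figure concrete, the sketch below computes how likely a rare variant is to show up at a given read depth. It is a back-of-the-envelope illustration, not part of the paper: it assumes reads sample strains independently in proportion to their abundance and ignores sequencing errors, and the function name `prob_at_least` is hypothetical.

```python
from math import comb


def prob_at_least(k, depth, freq):
    """Probability that at least k of `depth` reads covering a site
    come from a variant present at relative abundance `freq`.

    Assumption: reads are independent draws (binomial sampling) and
    sequencing errors are ignored.
    """
    # Sum the binomial probabilities of seeing fewer than k variant reads,
    # then take the complement.
    below = sum(
        comb(depth, i) * freq**i * (1.0 - freq) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - below


if __name__ == "__main__":
    # A 0.1% variant at several coverage depths, requiring 5 supporting reads.
    for depth in (1_000, 10_000, 100_000):
        print(depth, round(prob_at_least(5, depth, 0.001), 4))
```

Requiring several supporting reads rather than one is a simple guard against sequencing errors; the threshold of 5 used here is arbitrary.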

4 Genome Res. Wang et al. 17: 1195-1201, 2007 Clonal amplification

5 Genome Res. Wang et al. 17: 1195-1201, 2007

6 The computational problem
Given:
– 454 sequencing reads
Get:
– Reconstruct the population sequences (epitome)
– Estimate their relative quantities
Statistical model

7 The statistical model (1) Indel frequency Sequencing error parameter

8 The statistical model (2)

9 The hidden variable:
Model parameters:
Observed variable: , t = 1…T
EM algorithm?
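The general shape of an EM iteration for this kind of mixture can be sketched as follows. This is a deliberately simplified illustration and not the model from the paper: it assumes the strain sequences are already known, each read comes with a known alignment offset, and the only error type is substitution at a fixed rate. Only the strain frequencies are estimated, and the hidden variable is the strain each read came from. All function and parameter names are hypothetical.

```python
def read_likelihood(read, offset, strain, error_rate):
    """P(read | strain) under an independent per-base substitution model."""
    lik = 1.0
    for i, base in enumerate(read):
        if strain[offset + i] == base:
            lik *= 1.0 - error_rate
        else:
            # A substitution lands on one of the three other bases.
            lik *= error_rate / 3.0
    return lik


def em_strain_frequencies(reads, strains, error_rate=0.01, n_iter=100):
    """Estimate strain frequencies from (offset, sequence) reads by EM."""
    n_strains = len(strains)
    freqs = [1.0 / n_strains] * n_strains  # uniform initialization
    # Likelihoods do not depend on the frequencies, so compute them once.
    liks = [
        [read_likelihood(seq, off, s, error_rate) for s in strains]
        for off, seq in reads
    ]
    for _ in range(n_iter):
        counts = [0.0] * n_strains
        # E-step: posterior probability that each read came from each strain.
        for row in liks:
            weighted = [f * l for f, l in zip(freqs, row)]
            total = sum(weighted)
            for s in range(n_strains):
                counts[s] += weighted[s] / total
        # M-step: frequencies become the average posterior responsibility.
        freqs = [c / len(reads) for c in counts]
    return freqs


if __name__ == "__main__":
    strains = ["ACGTACGT", "ACGTTCGT"]  # toy strains differing at one site
    reads = [(2, "GTAC")] * 3 + [(2, "GTTC")]
    print(em_strain_frequencies(reads, strains))
```

In the full problem the strain sequences themselves are unknown and indels must be handled, so the M-step would also have to update the sequences and error parameters; the sketch only shows the frequency update.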

10 Computational tricks
One tau
Clustering of reads
Initialization
Determining the number of strains: S
– Trials

11 Validation
Data is partially simulated
– e is composed of real HIV variants
– Artificial values for
– x generated from the same probabilistic model, with 1% substitution, 2% insertion, 0.5% deletion
Two datasets
– 1. Varied strain frequencies and coverage
– 2. Varied mutation density
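A read simulator along these lines might look like the sketch below. The 1% substitution, 2% insertion, and 0.5% deletion rates are taken from the slide; everything else is an assumption made for illustration: errors are applied independently per base, read start positions are uniform, and the function names are hypothetical. It is not the generative model from the paper.

```python
import random

BASES = "ACGT"


def simulate_read(strain, start, length, rng,
                  p_sub=0.01, p_ins=0.02, p_del=0.005):
    """Copy strain[start:start+length], adding per-base errors (assumed model)."""
    out = []
    for base in strain[start:start + length]:
        if rng.random() < p_ins:
            out.append(rng.choice(BASES))  # spurious inserted base
        u = rng.random()
        if u < p_del:
            continue  # base dropped from the read
        if u < p_del + p_sub:
            # Substitute with one of the three other bases.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate_reads(strains, freqs, n_reads, read_length, seed=0):
    """Draw reads from a strain mixture; returns (strain_index, start, read)."""
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        # Pick the source strain in proportion to its frequency.
        s = rng.choices(range(len(strains)), weights=freqs)[0]
        start = rng.randrange(len(strains[s]) - read_length + 1)
        reads.append((s, start, simulate_read(strains[s], start, read_length, rng)))
    return reads


if __name__ == "__main__":
    strains = ["ACGTACGTACGTACGTACGT", "ACGTACGTTCGTACGTACGT"]
    for s, start, read in simulate_reads(strains, [0.9, 0.1], 5, 10):
        print(s, start, read)
```

Keeping the true strain index with each read makes it easy to compare estimated frequencies against the simulated ground truth.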

12

13 Discussion
High sensitivity compared with the chromatogram-based approach
– 0.1% relative abundance
May be applied to metagenomic sequencing
Needs validation using real data
Needs comparison with other methods

14 Questions?

