By Alfonso Farrugio, Hieu Nguyen, and Antony Vydrin Sequencing Technologies and Human Genetic Variation.


1 By Alfonso Farrugio, Hieu Nguyen, and Antony Vydrin Sequencing Technologies and Human Genetic Variation

2 Overview
• Introduction
• Simulating genomic variation and sequencing
• Analyzing and comparing different sequencing technologies
• Algorithms for detecting human genetic variation

3 Introduction
• Different people carry different mutations in their genomes
• A recent study (Nature 453, 56-64, 5/1/2008) compared 8 human genomes and found 1,695 structural variants

4
• Whole-genome shotgun sequencing allows fast and relatively cheap sequencing of human genomes
• New technologies are being developed to detect human genomic variation accurately
• Most of these technologies use short paired reads
• How long should the reads be to optimize the detection of human genomic variation?
• What algorithms can be used to detect variations in a new individual’s genome?

5 Simulating Genomic Variation
• Program to take a human genome and add randomly distributed inversions, insertions, deletions, and SNPs
• The user controls the number of mutations of each type and their mean lengths
• To simplify, no two mutations may overlap (SNPs are the exception)
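The no-overlap rule above can be sketched with simple rejection sampling. This is only an illustration, not the authors' program: the function name `place_non_overlapping`, its parameters, and the exponential length distribution are assumptions (the slides say only that the mean length is user-controlled).

```python
import random


def place_non_overlapping(genome_len, count, mean_len, rng):
    """Pick `count` non-overlapping [start, end) intervals by rejection sampling."""
    intervals = []
    attempts = 0
    while len(intervals) < count:
        attempts += 1
        if attempts > 1000 * count:
            raise RuntimeError("genome too crowded for the requested mutations")
        # Assumption: lengths are exponentially distributed around mean_len.
        length = max(1, round(rng.expovariate(1.0 / mean_len)))
        if length > genome_len:
            continue
        # Uniformly distributed start position that keeps the interval in range.
        start = rng.randrange(0, genome_len - length + 1)
        end = start + length
        # Accept the candidate only if it overlaps no interval chosen so far.
        if all(end <= s or start >= e for s, e in intervals):
            intervals.append((start, end))
    return sorted(intervals)


if __name__ == "__main__":
    for start, end in place_non_overlapping(10_000, 5, 50, random.Random(1)):
        print(start, end)
```

Rejection sampling is easy to reason about but slows down as the genome fills up; the attempt cap keeps a crowded configuration from looping forever.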

6 [Diagram: inversions, insertions, and deletions applied to the original genome, giving the “intermediate” mutated genome]

7 [Diagram: subtract the deletions from the “intermediate” mutated genome]

8 [Diagram: SNPs added to the “intermediate” mutated genome, giving the output mutated genome]
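The steps in slides 6 to 8 (structural changes first, then SNPs) could be sketched as follows. The names `apply_structural_variants` and `add_snps`, the `(kind, start, length)` tuple layout, the use of the reverse complement for inversions, and random inserted sequence are all assumptions made for illustration; the slides do not specify these details.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def apply_structural_variants(genome, variants, rng):
    """Apply non-overlapping (kind, start, length) variants to a genome string."""
    pieces = []
    cursor = 0
    for kind, start, length in sorted(variants, key=lambda v: v[1]):
        pieces.append(genome[cursor:start])  # unchanged sequence before the variant
        if kind == "inversion":
            # Assumption: an inversion is modeled as the reverse complement.
            pieces.append(genome[start:start + length].translate(COMPLEMENT)[::-1])
            cursor = start + length
        elif kind == "insertion":
            # Assumption: the inserted sequence is random; no original bases are lost.
            pieces.append("".join(rng.choice("ACGT") for _ in range(length)))
            cursor = start
        elif kind == "deletion":
            cursor = start + length  # skip the deleted bases
        else:
            raise ValueError(f"unknown variant kind: {kind}")
    pieces.append(genome[cursor:])
    return "".join(pieces)


def add_snps(genome, count, rng):
    """Substitute `count` distinct positions with a different base."""
    bases = list(genome)
    for pos in rng.sample(range(len(bases)), count):
        bases[pos] = rng.choice([b for b in "ACGT" if b != bases[pos]])
    return "".join(bases)


if __name__ == "__main__":
    rng = random.Random(0)
    variants = [("inversion", 0, 4), ("insertion", 6, 3), ("deletion", 8, 4)]
    intermediate = apply_structural_variants("AAAACCCCGGGGTTTT", variants, rng)
    print(add_snps(intermediate, 2, rng))
```

SNPs are applied last and may land anywhere, which matches the exception to the no-overlap rule.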

9 Simulating Genomic Sequencing
• Program to take a human genome and create paired reads (the read pairs are written to a file)
• All reads have the same length; the separation between the two reads in a pair is drawn from a normal distribution
• The program can simulate sequencing errors when creating the paired reads

10 Simulating Genomic Sequencing
• The user can control the total number of reads, the read length, the mean read separation, and the sequencing error rate

11 Choose uniformly distributed random locations [Diagram: genome to be sequenced]

12 Create a read pair at each location. Choose a random direction for each read. L is a constant, while d is random (normally distributed). [Diagram: genome to be sequenced; two reads of length L separated by d1, with read directions marked]

13 [Diagram: further read pairs, each with two reads of length L, separated by d2 and d3]

14 Resulting paired reads [Diagram: read pairs of length L with separations d1, d2, and d3]
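The read-pair construction walked through above might look like the sketch below. It is an illustration under stated assumptions rather than the authors' program: `simulate_read_pairs`, `sd_sep` (the standard deviation of the separation), and the 50/50 choice of direction per read are hypothetical, and a reversed read is modeled as the reverse complement.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def simulate_read_pairs(genome, num_pairs, read_len, mean_sep, sd_sep, rng):
    """Return (left, right) read pairs with fixed length L and normal separation d."""
    max_sep = len(genome) - 2 * read_len
    if max_sep < 0:
        raise ValueError("genome is shorter than two reads")
    pairs = []
    for _ in range(num_pairs):
        # d is normally distributed; clamp it so the pair fits inside the genome
        # (a simplification that slightly distorts the tails of the distribution).
        d = min(max(0, round(rng.gauss(mean_sep, sd_sep))), max_sep)
        span = 2 * read_len + d
        # Uniformly distributed location for the start of the pair.
        start = rng.randrange(0, len(genome) - span + 1)
        left = genome[start:start + read_len]
        right = genome[start + read_len + d:start + span]
        # Assumption: each read's direction is chosen independently at random.
        if rng.random() < 0.5:
            left = reverse_complement(left)
        if rng.random() < 0.5:
            right = reverse_complement(right)
        pairs.append((left, right))
    return pairs


if __name__ == "__main__":
    rng = random.Random(7)
    genome = "".join(rng.choice("ACGT") for _ in range(2000))
    for left, right in simulate_read_pairs(genome, 3, 25, 200, 20, rng):
        print(left, right)
```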

15 Paired reads with simulated sequencing errors [Diagram: the same read pairs with errors introduced]
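A simple way to picture the error step is a per-base substitution model. The sketch below assumes that errors are independent substitutions occurring at the user-supplied error rate; the slides do not say which error types the program simulates, and `add_sequencing_errors` is a hypothetical name.

```python
import random


def add_sequencing_errors(read, error_rate, rng):
    """Replace each base with a different base with probability `error_rate`."""
    out = []
    for base in read:
        if rng.random() < error_rate:
            # Assumption: an error is a substitution by one of the other bases.
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


if __name__ == "__main__":
    rng = random.Random(3)
    print(add_sequencing_errors("ACGTACGTACGTACGT", 0.1, rng))
```

Applying the function to both reads of every pair gives the "paired reads with simulated sequencing errors" of the final slide.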

