CS 6293 AT: Current Bioinformatics HW2 Papers 1


1 CS 6293 AT: Current Bioinformatics HW2 Papers 1
Papers:
1. BLAT--The BLAST-Like Alignment Tool
2. Classification of DNA sequences using Bloom Filters
Course Instructor: Dr. Jianhua Ruan
Presenters: Husnu Narman, Nihat Altiparmak

2 BLAT--The BLAST-Like Alignment Tool
W. James Kent (2002), UCSC. Cited by 2229 (Google Scholar).

3 Brief Information About BLAST
BLAST: Basic Local Alignment Search Tool
Finds a gene in different kinds of databases
Divides the query into short words and compares them against the database
High-Scoring Segment Pairs (HSPs):
Scan for exact word matches
Extend exact matches into HSPs
Evaluate, handle exceptions, and report
List all of the HSPs found in the database
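The seed-and-extend steps listed on this slide can be sketched as follows. This is a minimal illustration, not BLAST itself: the function names, the word size of 4, and the rule of stopping at the first mismatch are all assumptions made for brevity, whereas real BLAST scores extensions and allows mismatches.

```python
def find_seeds(query, subject, word_size=4):
    """Return (query_pos, subject_pos) pairs where a query word matches exactly."""
    # Index every overlapping word of the query by its start positions.
    words = {}
    for i in range(len(query) - word_size + 1):
        words.setdefault(query[i:i + word_size], []).append(i)
    seeds = []
    # Scan the subject (database sequence) and look each word up in the index.
    for j in range(len(subject) - word_size + 1):
        for i in words.get(subject[j:j + word_size], []):
            seeds.append((i, j))
    return seeds


def extend_seed(query, subject, qpos, spos, word_size=4):
    """Extend an exact seed in both directions while characters match (ungapped)."""
    start_q, start_s = qpos, spos
    while start_q > 0 and start_s > 0 and query[start_q - 1] == subject[start_s - 1]:
        start_q -= 1
        start_s -= 1
    end_q, end_s = qpos + word_size, spos + word_size
    while end_q < len(query) and end_s < len(subject) and query[end_q] == subject[end_s]:
        end_q += 1
        end_s += 1
    # Query start, subject start, and length of the extended match.
    return start_q, start_s, end_q - start_q


if __name__ == "__main__":
    query, subject = "ACGTACGG", "TTACGTACGGTT"
    qpos, spos = find_seeds(query, subject)[0]
    print(extend_seed(query, subject, qpos, spos))
```

Note that the index here is built over the query and the database sequence is scanned, which is the arrangement the slides attribute to BLAST.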

4 BLAT
BLAT: The BLAST-Like Alignment Tool
Finds a gene in different kinds of databases
Why a new search tool?

5 Differences between BLAST and BLAT
BLAST:
Builds an index of the query
Triggers an extension when one or two hits occur
Returns a list of exons sorted by size
BLAT:
Builds an index of the database
Triggers extensions on any number of perfect or near-perfect hits
Looks up the location of a sequence in the genome or determines the exon structure of an mRNA
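The key structural difference, indexing the database rather than the query, can be sketched as below. This is a simplified illustration under stated assumptions: the function names and k = 4 are hypothetical, and the index stores non-overlapping database k-mers while the query is scanned with overlapping k-mers, which is how the BLAT paper describes its index. Hit clumping and alignment stitching are omitted.

```python
from collections import defaultdict


def build_database_index(database, k=4):
    """Index the non-overlapping k-mers of the database by start position."""
    index = defaultdict(list)
    for pos in range(0, len(database) - k + 1, k):
        index[database[pos:pos + k]].append(pos)
    return index


def scan_query(query, index, k=4):
    """Look up every overlapping query k-mer; return (query_pos, db_pos) hits."""
    hits = []
    for qpos in range(len(query) - k + 1):
        # .get avoids inserting empty lists into the defaultdict.
        for dpos in index.get(query[qpos:qpos + k], []):
            hits.append((qpos, dpos))
    return hits


if __name__ == "__main__":
    index = build_database_index("ACGTTTGACCAG")
    print(scan_query("GGTTGACC", index))
```

Because the database index is built once and reused for every query, the per-query cost is a linear scan of the query, which is the motivation for this design in the sketch.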

6 Classification of DNA sequences using Bloom Filters
Stranneheim et al. (2010), Stockholm, Sweden

7 Classification of DNA sequences using Bloom Filters
New-generation sequencing technologies produce complex datasets
Need for new, efficient, specialized sequence analysis algorithms
Often only novel sequences are required; unnecessary sequences (those belonging to a known genome) need to be removed
A new algorithm (FACS) classifies sequences as belonging or not belonging to a reference sequence
Source code available at:

8 Bloom Filter
A memory-efficient data structure for testing whether an element is part of a reference set
An m-bit vector with k hash functions
Never returns a false negative; may, however, return a false positive
Optimal number of hash functions: $k = \frac{m}{n} \ln 2$, where n is the number of elements in the set
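A minimal Bloom filter sketch may help make the slide concrete. The class and function names are assumptions, and deriving the k hash functions from SHA-256 with a seed prefix is just one convenient choice, not the scheme used by any particular tool.

```python
import hashlib
import math


class BloomFilter:
    """An m-bit vector probed by k hash functions."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.bits = bytearray(m)  # one byte per bit, for clarity rather than compactness

    def _positions(self, item):
        # Derive k hash values by prefixing the item with a seed (an assumed scheme).
        for seed in range(self.k):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(item))


def optimal_k(m, n):
    """Round k = (m / n) * ln 2 to the nearest usable integer (at least 1)."""
    return max(1, round(m / n * math.log(2)))


if __name__ == "__main__":
    bf = BloomFilter(m=1000, k=optimal_k(1000, 100))
    for item in ("x", "y", "z"):
        bf.add(item)
    print("x" in bf, "w" in bf)
```

Every inserted item sets all of its k positions, so a later lookup of that item always finds them set; this is why false negatives cannot occur.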

9 Example Bloom Filter
[Figure: example Bloom filter bit array showing elements x, y, z and w; m = 18, k = 3]
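For the example parameters, the expected false positive rate can be estimated with the standard approximation (1 - e^(-kn/m))^k. The sketch below assumes n = 3 inserted elements (x, y and z, read off the figure labels); the function name is hypothetical.

```python
import math


def false_positive_rate(m, k, n):
    """Standard approximation of the Bloom filter false positive probability."""
    return (1.0 - math.exp(-k * n / m)) ** k


if __name__ == "__main__":
    # m = 18 bits, k = 3 hash functions, n = 3 elements (assumed from the figure)
    print(round(false_positive_rate(18, 3, 3), 4))  # 0.0609
```

With m = 18 and n = 3, the optimum from the previous slide would be about (18 / 3) ln 2 ≈ 4.16 hash functions, so k = 3 is close to, but slightly below, the optimum.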

10 Method
A Bloom filter is created from the reference sequence with the desired k-mer size and false positive rate.
The query sequences are then classified using the Bloom filter.
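The two steps of the method can be sketched as below. This is an illustration under stated assumptions rather than the FACS implementation: all function names are hypothetical, the filter size comes from the standard formula m = -n ln p / (ln 2)^2, and the match cutoff of 0.8 is an arbitrary placeholder for whatever threshold the tool actually applies.

```python
import hashlib
import math


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def filter_size(n, p):
    """Bits needed for n elements at false positive rate p (standard formula)."""
    return math.ceil(-n * math.log(p) / (math.log(2) ** 2))


def positions(kmer, m, num_hashes):
    # Seeded SHA-256 stands in for num_hashes independent hash functions (assumption).
    for seed in range(num_hashes):
        digest = hashlib.sha256(f"{seed}:{kmer}".encode()).digest()
        yield int.from_bytes(digest[:8], "big") % m


def build_filter(reference, k, m, num_hashes):
    """Step 1: insert every k-mer of the reference into an m-bit filter."""
    bits = bytearray(m)
    for kmer in kmers(reference, k):
        for pos in positions(kmer, m, num_hashes):
            bits[pos] = 1
    return bits


def match_fraction(read, bits, k, num_hashes):
    """Fraction of the read's k-mers that the filter reports as present."""
    total = hits = 0
    for kmer in kmers(read, k):
        total += 1
        if all(bits[pos] for pos in positions(kmer, len(bits), num_hashes)):
            hits += 1
    return hits / total if total else 0.0


def classify(read, bits, k, num_hashes, cutoff=0.8):
    """Step 2: call the read 'belongs to reference' if enough k-mers match."""
    return match_fraction(read, bits, k, num_hashes) >= cutoff


if __name__ == "__main__":
    reference = "ACGTACGTTAGCCGATAGGCTA"
    k, num_hashes = 8, 3
    bits = build_filter(reference, k, filter_size(len(reference), 0.01), num_hashes)
    print(classify(reference[3:15], bits, k, num_hashes))
```

A read taken from the reference always scores a match fraction of 1.0, because Bloom filters have no false negatives; reads from elsewhere can only be misclassified through false positive k-mer hits.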

11 Evaluation
Experimental metagenome dataset (Allander et al. 2005) containing reads
Analysis using the human genome as a reference
FACS, BLAT and SSAHA2 compared
[Figure labels: 31x, 21x]

12 Evaluation
False Positive Rate (Missed)

13 Any Questions?

