Basics of BLAST: Basic BLAST Search


1 Basics of BLAST
Basic BLAST Search
- What is BLAST?
- The framework of BLAST
- Different protocols of BLAST
- The databases you can search
- Where can I BLAST?

2 What is BLAST?
BLAST stands for Basic Local Alignment Search Tool.
- NCBI-BLAST vs. WU-BLAST
- BLAST != BLAT
Why is BLAST popular?
- Good balance of sensitivity and speed: BLAST applies a heuristic approach, so it does not necessarily find the best hit for your search.
- Reliability: from both a statistical standpoint and a software development standpoint.
- Flexibility: it can be adapted to many sequence analysis scenarios.

3 The Framework of BLAST (1)
Scoring matrix
- High BLOSUM (low PAM) → closely related sequences
- Low BLOSUM (high PAM) → distantly related sequences
- Default is BLOSUM62

4 BLOSUM Matrix
BLOSUM = BLOcks SUbstitution Matrix

5 The Framework of BLAST (2)
Sequence alignment
- W (word size): blastn 11; others 2 or 3. A lower W makes the search more sensitive.
- G (gap open penalty): blastn 5; others 11
- E (gap extension penalty): blastn 2; others 1
Statistical interpretation
- e (threshold for expectation value): default 10
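The effect of the word size W and of the two gap penalties can be illustrated with a minimal sketch. It assumes exact word matching only (real BLAST seeding is more elaborate) and assumes the gap cost convention "open + length * extend". The function names `seed_hits` and `gap_cost` and the two toy sequences are hypothetical, chosen for illustration.

```python
from collections import defaultdict


def seed_hits(query, subject, w):
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly."""
    # Index every word of length w in the query by its start position.
    index = defaultdict(list)
    for i in range(len(query) - w + 1):
        index[query[i:i + w]].append(i)
    # Scan the subject and look each of its words up in the index.
    hits = []
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], []):
            hits.append((i, j))
    return hits


def gap_cost(length, gap_open, gap_extend):
    """Affine gap cost, assuming the convention: open + length * extend."""
    return gap_open + length * gap_extend


# Two toy sequences that differ at a single position.
query = "ACGTTGCAAC"
subject = "ACGTAGCAAC"
for w in (3, 5, 7):
    print("W =", w, "seed hits:", len(seed_hits(query, subject, w)))

# Cost of a 3-residue gap with the blastn penalties listed on the slide.
print("gap cost:", gap_cost(3, gap_open=5, gap_extend=2))
```

With these toy sequences the number of seed hits falls as W grows, which is the sense in which a lower W is more sensitive: more seeds survive a mismatch, at the price of more candidates to extend.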

6 BLAST Protocols
The most common BLAST searches use five protocols:
Program   Database        Query
BLASTN    Nucleotide      Nucleotide
BLASTP    Protein         Protein
BLASTX    Protein         Nt. → Protein
TBLASTN   Nt. → Protein   Protein
TBLASTX   Nt. → Protein   Nt. → Protein
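The five protocols can also be written down as a small lookup table. This is only a sketch of how the program choice follows from the query and database types; the names `PROTOCOLS` and `candidate_programs` are hypothetical and not part of any BLAST distribution.

```python
# Each protocol: sequence type of the query, of the database, and which side
# (if any) is translated in six frames before comparison.
PROTOCOLS = {
    "blastn":  {"query": "nucleotide", "database": "nucleotide", "translated": ()},
    "blastp":  {"query": "protein",    "database": "protein",    "translated": ()},
    "blastx":  {"query": "nucleotide", "database": "protein",    "translated": ("query",)},
    "tblastn": {"query": "protein",    "database": "nucleotide", "translated": ("database",)},
    "tblastx": {"query": "nucleotide", "database": "nucleotide", "translated": ("query", "database")},
}


def candidate_programs(query_type, db_type):
    """List the programs that accept this query type and database type."""
    return [
        name
        for name, spec in PROTOCOLS.items()
        if spec["query"] == query_type and spec["database"] == db_type
    ]


print(candidate_programs("nucleotide", "protein"))
print(candidate_programs("nucleotide", "nucleotide"))
```

A nucleotide query against a nucleotide database matches two programs; the choice between them depends on whether the comparison should happen at the DNA level or at the translated protein level.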

7 BLASTN
- The query is a nucleotide sequence
- The database is a nucleotide database
- No conversion is done on the query or database
DNA :: DNA homology
- Mapping oligos to a genome
- Cross-species sequence exploration
- Annotating genomic DNA with ESTs

8 BLASTP
- The query is an amino acid sequence
- The database is an amino acid database
- No conversion is done on the query or database
Protein :: Protein homology
- Protein function exploration
- Novel gene → make the parameters more sensitive (score matrix, e value, other parameters)

9 BLASTX
- The query is a nucleotide sequence
- The database is an amino acid database
- All six reading frames of the query are translated and used to search the database
Coding nucleotide seq :: Protein homology
- Gene finding in genomic DNA
- Annotating ESTs (and shotgun sequence)
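The six-frame translation step can be sketched as follows. This is a minimal illustration that assumes the standard genetic code and an unambiguous A/C/G/T input; the helper names are hypothetical, and BLAST's own implementation is not shown in the slides.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(dna):
    """Translate three frames on the forward strand and three on the reverse strand."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for name, protein in six_frames("ATGGCCTAA").items():
    print(name, protein)
```

Each of the six translated sequences is then treated like a protein query, which is why translated searches cost more than a plain BLASTN or BLASTP run.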

10 TBLASTN
- The query is an amino acid sequence
- The database is a nucleotide database
- All six frames of the database are translated and searched with the protein sequence
Protein :: Coding nucleotide DB homology
- Mapping a protein to a genome
- Mining ESTs (shotgun DNA) for protein similarities

11 TBLASTX
- The query is a nucleotide sequence
- The database is a nucleotide database
- All six frames are translated on both the query and the database
Coding :: Coding homology
- For searching distantly related species
- Sensitive but expensive

12 BLAST output
• List of sequences with scores
– Raw score: higher is better (length dependent)
– Expect value: smaller is better (adjusted for sequence length and database size)
• List of alignments
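The link between a raw score and its Expect value is usually described with the Karlin-Altschul relation E = K * m * n * exp(-lambda * S), where m is the query length and n the database length. The sketch below assumes that formula; the lambda and K values are illustrative placeholders and do not come from the slides, since the real values depend on the scoring matrix and gap penalties.

```python
import math

# Illustrative statistical parameters (assumed, not taken from the slides).
LAMBDA = 0.267
K = 0.041


def bit_score(raw_score, lam=LAMBDA, k=K):
    """Normalize a raw score so it no longer depends on the scoring system."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(raw_score, query_len, db_len, lam=LAMBDA, k=K):
    """Expected number of chance hits scoring at least raw_score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Hypothetical search: a 300-residue query against 1e8 database residues.
for score in (50, 100, 150):
    print(score, round(bit_score(score), 1), expect_value(score, 300, 1e8))
```

The search-space term m * n is how sequence length and database size enter the calculation: the same raw score is less surprising in a larger database, so its Expect value is higher.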

13 The Databases (1)
GenBank NR (protein and nucleotide versions)
- Non-redundant large databases (compiled, with duplicates removed)
- Anyone can submit, and you can call your sequence anything
- Quality is low; names can be meaningless
EST databases
- Short single reads of cDNA clones
- Other short single reads
- High error rates

14 The Databases (2)
Swiss-Prot
- Curated from literature
- REAL proteins; REAL functions; small
Genomic databases
- Human, Mouse, Drosophila, Arabidopsis, etc.
- NCBI, species-specific web pages

15 Where Can I Run BLAST?
Three choices:
– NCBI (www.ncbi.nih.gov): databases updated constantly (daily); very slow at times
– Goose web (goose.wustl.edu/blast/blast.html)
– Command line (blastall)
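For the command-line option, a blastall call can be assembled from the parameters introduced earlier (program, database, e threshold, W, G, E). The sketch below only builds and prints the command; it does not run it. The helper name `blastall_command` and the file and database names are hypothetical. The flags used (-p, -d, -i, -o, -e, -W, -G, -E) are those of the legacy blastall program.

```python
import shlex


def blastall_command(program, database, query_file, out_file,
                     evalue=10, word_size=None, gap_open=None, gap_extend=None):
    """Build a blastall argument list; optional settings are added only when given."""
    cmd = [
        "blastall",
        "-p", program,      # protocol: blastn, blastp, blastx, tblastn, tblastx
        "-d", database,     # formatted database name
        "-i", query_file,   # query sequence file
        "-o", out_file,     # output report file
        "-e", str(evalue),  # expectation value threshold
    ]
    if word_size is not None:
        cmd += ["-W", str(word_size)]
    if gap_open is not None:
        cmd += ["-G", str(gap_open)]
    if gap_extend is not None:
        cmd += ["-E", str(gap_extend)]
    return cmd


# Hypothetical file and database names, for illustration only.
cmd = blastall_command("blastp", "swissprot", "query.fasta", "hits.txt",
                       evalue=1e-5, word_size=3)
print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string makes it straightforward to hand to `subprocess.run` later without quoting problems.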

