Published by Paige Johnston. Modified over 10 years ago.
2
Speeding the Database Searches and Sequence Alignments with Multi-Motif PHI-BLAST. Nitin Bhardwaj, Dept. of Chemical Engineering, IIT Bombay.
3
National Centre for Biological Sciences (NCBS), Bangalore. A unit of the Tata Institute of Fundamental Research (TIFR).
4
What is sequence alignment? The process of lining up two or more sequences to achieve maximal levels of similarity. Why make sequence alignments? To detect structural and functional relationships, and evolutionary relationships.
5
Some basic terms. Global alignment: covers the entire sequence. Local alignment: restricted to regions of identity and strong similarity. Query sequence: the sequence of interest. Subject sequence: the database sequence the query is compared against.
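The difference between global and local alignment can be made concrete with a minimal sketch. It assumes a simple match/mismatch/linear-gap scoring scheme (the values +1, -1, -2 are illustrative, not taken from the slides) and uses the textbook Needleman-Wunsch and Smith-Waterman recurrences, which are standard examples of the two alignment types rather than what any BLAST program runs internally.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch score: both sequences are aligned end to end."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Smith-Waterman score: best-scoring pair of substrings."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets the alignment restart anywhere.
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best


# Toy sequences (made up): only the central "ACGT" is shared.
query, subject = "TTTACGTAAA", "GGGACGTCCC"
print(global_score(query, subject), local_score(query, subject))
```

With these toy sequences the local score rewards only the shared core, while the global score is pulled down by the dissimilar flanks.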
6
More terms. Scoring matrices: used to score a match or mismatch. Search outcomes: true positive, true negative, false positive, false negative. Motif: a short conserved region of a sequence. Hits: sequences picked up from the database.
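As a small illustration of the four outcome terms, the sketch below counts them for a search result. The function name `classify_hits` and the toy identifiers are hypothetical; it assumes the true family members in the database are known in advance, as they would be in a benchmark.

```python
def classify_hits(hits, true_members, database):
    """Count search outcomes given the reported hits and the known family."""
    hits, true_members, database = set(hits), set(true_members), set(database)
    return {
        "TP": len(hits & true_members),                 # related and reported
        "FP": len(hits - true_members),                 # unrelated but reported
        "FN": len((true_members & database) - hits),    # related but missed
        "TN": len(database - hits - true_members),      # unrelated and not reported
    }


# Toy example: five database entries, three of them truly related.
counts = classify_hits(hits={"a", "b", "d"},
                       true_members={"a", "b", "c"},
                       database={"a", "b", "c", "d", "e"})
print(counts)
```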
7
What comes after alignment? Calculate the score of each alignment. Sort the aligned sequences in order of decreasing score. Go ahead with the analysis to find the relationships and similarities.
8
Pattern-Hit Initiated Basic Local Alignment Search Tool (PHI-BLAST). Takes a query sequence, a motif, and a database to search. Aligns the query sequence with every database sequence that contains the motif. Produces a score for each sequence. Reports all sequences whose score is above a given threshold, sorted in order of score.
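The workflow on this slide (filter by motif, score, apply a threshold, sort) can be sketched as follows. This is a toy illustration, not the PHI-BLAST implementation: the motif is written as a Python regular expression, `identity_score` is a hypothetical stand-in for a real alignment score, and the sequences are made up.

```python
import re


def identity_score(query, subject):
    """Toy score: identical residues at equal positions (no gaps)."""
    return sum(1 for x, y in zip(query, subject) if x == y)


def motif_filtered_search(query, pattern, database, threshold,
                          score_fn=identity_score):
    """Score only subjects that contain the motif; report those above threshold."""
    motif = re.compile(pattern)
    hits = []
    for name, seq in database.items():
        if motif.search(seq) is None:
            continue                      # subject lacks the motif: skipped
        score = score_fn(query, seq)
        if score > threshold:
            hits.append((name, score))
    # Highest score first, as in the reported hit list.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


database = {
    "s1": "MKRCTCAA",   # contains the motif, identical to the query
    "s2": "MKRCSCGG",   # contains the motif, partly similar
    "s3": "MKRLTLAA",   # no motif
    "s4": "GGACGCGG",   # contains the motif, dissimilar
}
result = motif_filtered_search("MKRCTCAA", "[RA]C.C", database, threshold=3)
print(result)
```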
9
A Typical PHI-BLAST Output
1 occurrence(s) of pattern in query
pattern [RA][C][ACDEFGHIKLMNPQRSTVWY][C] at position 3 of query sequence
Significant matches for pattern occurrence 1 at position 3
                                                                      Score  E
                                                                     (bits)  Value
pdb|1ILP|A Chain A, Cxcr-1 N-Terminal Peptide Bound To Interleuk...     128  2e-37
pdb|1QE6|D Chain D, Interleukin-8 With An Added Disulfide Betwee...     121  2e-35
pdb|1ICW|A Chain A, Interleukin-8, Mutant With Glu 38 Replaced B...     121  3e-35
pdb|1ROD|A Chain A, Chimeric Protein Of Interleukin 8 And Human...       98  2e-28
pdb|1TVX|B Chain B, Neutrophil Activating Peptide-2 Variant Form...      50  6e-14
pdb|1NAP|A Chain A, Mol_id: 1; Molecule: Neutrophil Activating P...      50  6e-14
pdb|1MSG|A Chain A, Human Melanoma Growth Stimulatory Activity (...      48  3e-13
pdb|1MGS|A Chain A, Human Melanoma Growth Stimulating Activity (...      48  3e-13
pdb|1QNK|A Chain A, Truncated Human Grob[5-73], Nmr, 20 Structur...      47  5e-13
pdb|1MI2|A Chain A, Solution Structure Of Murine Macrophage Infl...      46  1e-12
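The pattern in this output uses bracketed residue classes, which happen to be valid regular-expression syntax as written. The sketch below shows one way to locate such a pattern and report 1-based positions as the output does. It assumes PROSITE-style conventions (`-` as a separator, `x` for any residue, `x(n,m)` for repeats); the helper names and the toy sequence are hypothetical, and the converter does not cover the full pattern grammar.

```python
import re


def pattern_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regex string."""
    regex = pattern.replace("-", "")                        # '-' only separates positions
    regex = re.sub(r"[xX]", ".", regex)                     # x matches any residue
    regex = re.sub(r"\((\d+(?:,\d+)?)\)", r"{\1}", regex)   # x(2,4) -> .{2,4}
    return regex


def find_occurrences(pattern, seq):
    """Return 1-based start positions of all (possibly overlapping) matches."""
    lookahead = re.compile("(?=(" + pattern_to_regex(pattern) + "))")
    return [m.start() + 1 for m in lookahead.finditer(seq)]


# Toy query (made up) with the pattern from the slide.
positions = find_occurrences("[RA][C][ACDEFGHIKLMNPQRSTVWY][C]", "MKRCTCAA")
print(positions)
```

The lookahead is used so that overlapping occurrences are all reported, since each occurrence can seed its own alignment.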
10
Strategy behind PHI-BLAST. Locate the motif in the query and in the subject sequence. Extend in both directions from the motif with local alignment. Calculate the score for the alignment. [Figure: query and subject sequences lined up at the motif.]
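The seed-and-extend idea can be sketched in a few lines. This is a deliberately simplified version under stated assumptions: the extension is ungapped, scoring is +1/-1 instead of a substitution matrix, and the motif occurrences are assumed to have the same length in both sequences. The function names are hypothetical.

```python
def best_extension(a, b, match=1, mismatch=-1):
    """Best cumulative score over ungapped extensions of increasing length."""
    best = running = 0
    for x, y in zip(a, b):
        running += match if x == y else mismatch
        best = max(best, running)   # stop wherever the score peaked
    return best


def seeded_score(query, subject, q_start, s_start, motif_len,
                 match=1, mismatch=-1):
    """Anchor the alignment on the motif, then extend left and right."""
    q_end, s_end = q_start + motif_len, s_start + motif_len
    seed = sum(match if x == y else mismatch
               for x, y in zip(query[q_start:q_end], subject[s_start:s_end]))
    right = best_extension(query[q_end:], subject[s_end:], match, mismatch)
    # Reverse the left flanks so the extension walks away from the motif.
    left = best_extension(query[:q_start][::-1], subject[:s_start][::-1],
                          match, mismatch)
    return left + seed + right


# Toy sequences (made up); the motif starts at 0-based index 3 in both.
score = seeded_score("AAKRCTCGG", "TTKRCSCGA", q_start=3, s_start=3, motif_len=4)
print(score)
```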
11
Problems with PHI-BLAST. Only one motif can be given as input, so several runs are required, which increases the time. Consequently, there is no way to attach a weight to any motif. No parallel comparison is possible. There is no control over the specificity of the program.
12
The solution(s): Multi-Motif PHI-BLAST (MMPB) and Ranked Motif PHI-BLAST (RMPB).
13
Multi-Motif PHI-BLAST. Takes a query sequence, any number of motifs, and a database to search. Aligns the query sequence with every database sequence that contains at least a minimum number of the motifs. Produces a score for each sequence. Reports all sequences whose score is above a given threshold, sorted in order of score.
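The selection step, keeping only subjects that contain a minimum number of the input motifs, might look like the sketch below. Motifs are written as Python regular expressions, and the function names, motifs, and sequences are all illustrative assumptions rather than part of MMPB itself.

```python
import re


def count_motifs_present(seq, patterns):
    """Number of distinct motifs that occur at least once in seq."""
    return sum(1 for p in patterns if re.search(p, seq))


def select_subjects(database, patterns, min_motifs):
    """Names of subjects containing at least min_motifs of the motifs."""
    return [name for name, seq in database.items()
            if count_motifs_present(seq, patterns) >= min_motifs]


patterns = ["[RA]C.C", "HPH", "WW"]
database = {
    "s1": "MKRCTCGGHPHWW",   # all three motifs
    "s2": "MKRCTCGGHPA",     # first motif only
    "s3": "GGHPHWWGG",       # second and third motifs
}
selected = select_subjects(database, patterns, min_motifs=2)
print(selected)
```

Raising or lowering `min_motifs` is one way the specificity of the search could be tuned in a single run.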
14
Strategy behind MMPB. Locate the motifs in the two sequences. Extend outward in both directions with local alignment, and align the part between the motifs with global alignment. Calculate the score for the alignment. [Figure: query and subject sequences, each with Motif 1 and Motif 2; the flanking regions are labelled (Local) and the region between the motifs (Global).]
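A rough sketch of this two-part scoring follows. It makes several simplifying assumptions that are not from the slides: +1/-1/-2 scoring, ungapped outward extension, and a single global alignment over the whole stretch from the start of motif 1 to the end of motif 2 (motifs included) rather than strictly the part between them. All names are hypothetical.

```python
def pair_score(x, y, match=1, mismatch=-1):
    return match if x == y else mismatch


def global_score(a, b, gap=-2):
    """Needleman-Wunsch score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            cur.append(max(prev[j - 1] + pair_score(a[i - 1], b[j - 1]),
                           prev[j] + gap,
                           cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def best_extension(a, b):
    """Best cumulative score over ungapped extensions of increasing length."""
    best = running = 0
    for x, y in zip(a, b):
        running += pair_score(x, y)
        best = max(best, running)
    return best


def two_motif_score(query, subject, q_span, s_span):
    """q_span/s_span: (start of motif 1, end of motif 2), half-open, 0-based."""
    q_start, q_end = q_span
    s_start, s_end = s_span
    left = best_extension(query[:q_start][::-1], subject[:s_start][::-1])
    core = global_score(query[q_start:q_end], subject[s_start:s_end])
    right = best_extension(query[q_end:], subject[s_end:])
    return left + core + right


# Toy sequences (made up): motif 1 is RC.C, motif 2 is HPH; the subject
# has one residue fewer between the motifs, so the core needs a gap.
total = two_motif_score("AARCTCGGGHPHLL", "TARCSCGGHPHLM", (2, 12), (2, 11))
print(total)
```

Global alignment suits the core because both of its ends are pinned by the motifs, whereas the flanks have only one fixed end and are free to stop early.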
15
Comparison of results: il8, Macrophage Inflammatory Protein 1beta. The middle columns correspond to PHI-BLAST (e=1) and the last one corresponds to MMPB.
16
il8 (1ikl) Interleukin-8
17
4helud (1bbh) Cytochrome c' (c prime)
18
4helud (256b) Cytochrome b562
19
Flav (1ord) Ornithine Decarboxylase
20
Flav (1cus) Cutinase
21
Ranked Motif PHI-BLAST. Takes a query sequence, a number of motifs in order of their rank, and a database to search. Aligns the query sequence with every database sequence that contains the minimum number of highest-ranked motifs. Produces a score for each sequence. Reports all sequences whose score is above a given threshold, sorted in order of score.
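One reading of the ranked selection rule is that a subject qualifies only if it contains each of the k highest-ranked motifs, where k is the minimum number requested. The sketch below implements that interpretation, which is an assumption about the slide's wording; the function names, motifs, and sequences are made up.

```python
import re


def has_top_ranked(seq, ranked_patterns, min_motifs):
    """True if seq contains every one of the min_motifs highest-ranked motifs."""
    return all(re.search(p, seq) for p in ranked_patterns[:min_motifs])


def select_ranked(database, ranked_patterns, min_motifs):
    """Names of subjects that contain the top-ranked motifs."""
    return [name for name, seq in database.items()
            if has_top_ranked(seq, ranked_patterns, min_motifs)]


ranked_patterns = ["[RA]C.C", "HPH", "WW"]   # rank 1 first
database = {
    "s1": "MKRCTCGGHPHWW",   # ranks 1, 2 and 3
    "s2": "MKRCTCGGHPA",     # rank 1 only
    "s3": "GGHPHWWGG",       # ranks 2 and 3, but not rank 1
}
top_two = select_ranked(database, ranked_patterns, min_motifs=2)
print(top_two)
```

Under this reading, `s3` is rejected even though it contains two motifs, because it lacks the highest-ranked one; an unranked count of motifs would have accepted it.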
22
Comparison of results for il8 (1hum), Macrophage Inflammatory Protein 1beta. The unmarked columns correspond to RMPB with at least 3 and at least 2 motifs.
23
il8 (1ikl) Interleukin-8
24
The problems are solved. Multiple motifs can be given as input. Weights can be attached to the motifs via their ranks. Only one run is required for any number of motifs, so less time is needed. A deeper analysis is possible.
25
That's all, and thanks to all of you.