Speeding the Database Searches and Sequence Alignments with Multi-Motif PHI-BLAST Nitin Bhardwaj, Dept. of Chemical Engineering, IIT Bombay.



2 Speeding the Database Searches and Sequence Alignments with Multi-Motif PHI-BLAST Nitin Bhardwaj, Dept. of Chemical Engineering, IIT Bombay.

3 National Centre for Biological Sciences (NCBS), Bangalore. A unit of the Tata Institute of Fundamental Research (TIFR)

4 What is Sequence Alignment? The process of lining up two or more sequences to achieve maximal levels of similarity. Why make Sequence Alignments? To detect: structural and functional relationships; evolutionary relationships.

5 Some Basic Terms. Global Alignment: spans the entire sequence. Local Alignment: restricted to regions of identity and strong similarity. Query Sequence: the sequence of interest. Subject Sequence: the other sequence, which the query is aligned against.
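The difference between global and local alignment can be made concrete with a small sketch. The scoring values (match +1, mismatch -1, gap -1) and the function names are assumptions chosen for illustration, not values from the talk; real searches would use a scoring matrix and tuned gap penalties.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score: both sequences are aligned end to end."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman score: only the best-matching pair of regions counts."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # The floor at 0 lets an alignment restart anywhere.
            cur.append(max(0, prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[-1])
        prev = cur
    return best
```

For two sequences that share only a short region, such as `"AAACGT"` and `"CGTGGG"`, the local score rewards the shared `CGT` while the global score is pulled down by the unrelated ends.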

6 And... Scoring Matrices: used to score a match/mismatch. True Positive, True Negative, False Positive, False Negative. Motif: a short conserved region of a sequence. Hits: sequences picked up from the database.

7 What after alignment? Calculate the score of the alignment. Sort the aligned sequences in order of decreasing score. Go ahead with your analysis to find out the relationships/similarities.

8 Pattern-Hit Initiated Basic Local Alignment Search Tool (PHI-BLAST). Takes a query sequence, a motif, and a database to search. Aligns the query sequence with all the sequences that contain the motif. Produces a score for each sequence. Reports all the sequences whose score is above a particular threshold value, sorted by score.

9 A Typical PHI-BLAST Output
1 occurrence(s) of pattern in query
pattern [RA][C][ACDEFGHIKLMNPQRSTVWY][C] at position 3 of query sequence
Significant matches for pattern occurrence 1 at position 3
                                                                       Score   E
                                                                       (bits)  Value
pdb|1ILP|A Chain A, Cxcr-1 N-Terminal Peptide Bound To Interleuk...     128    2e-37
pdb|1QE6|D Chain D, Interleukin-8 With An Added Disulfide Betwee...     121    2e-35
pdb|1ICW|A Chain A, Interleukin-8, Mutant With Glu 38 Replaced B...     121    3e-35
pdb|1ROD|A Chain A, Chimeric Protein Of Interleukin 8 And Human...       98    2e-28
pdb|1TVX|B Chain B, Neutrophil Activating Peptide-2 Variant Form...      50    6e-14
pdb|1NAP|A Chain A, Mol_id: 1; Molecule: Neutrophil Activating P...      50    6e-14
pdb|1MSG|A Chain A, Human Melanoma Growth Stimulatory Activity (...      48    3e-13
pdb|1MGS|A Chain A, Human Melanoma Growth Stimulating Activity (...      48    3e-13
pdb|1QNK|A Chain A, Truncated Human Grob[5-73], Nmr, 20 Structur...      47    5e-13
pdb|1MI2|A Chain A, Solution Structure Of Murine Macrophage Infl...      46    1e-12
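The "pattern ... at position 3 of query sequence" line can be reproduced with a short sketch that translates a PROSITE-style pattern into a regular expression and reports 1-based match positions. The helper names and the test sequence are hypothetical, and only a subset of the pattern syntax is handled (residue letters, `[..]` classes, `x` for any residue, `(n)` or `(n,m)` repeats, `-` separators); this is not PHI-BLAST's own matcher.

```python
import re


def pattern_to_regex(pattern):
    """Translate a simplified PROSITE-style pattern into a Python regex."""
    regex = pattern.replace("-", "")  # '-' only separates pattern elements

    def repeat(m):
        low, high = m.group(1), m.group(2)
        return "{" + low + ("," + high if high else "") + "}"

    # (n) or (n,m) repeat counts become {n} or {n,m}
    regex = re.sub(r"\((\d+)(?:,(\d+))?\)", repeat, regex)
    return regex.replace("x", ".")  # lowercase x matches any residue


def find_pattern(pattern, sequence):
    """Return 1-based start positions of all (possibly overlapping) matches."""
    regex = pattern_to_regex(pattern)
    # A lookahead keeps overlapping occurrences from being skipped.
    return [m.start() + 1 for m in re.finditer("(?=" + regex + ")", sequence)]


# Made-up fragment: the pattern from the output above starts at residue 3.
print(find_pattern("[RA][C][ACDEFGHIKLMNPQRSTVWY][C]", "KERCQCIKTY"))  # [3]
```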

10 Strategy behind PHI-BLAST. Locate the motif in the sequences [diagram: Motif (Query) lined up with Motif (Subject)]. Extend in both directions with local alignment. Calculate the score for the alignment.
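The seed-and-extend idea on this slide can be sketched as follows. This is a deliberately simplified illustration: the extension is ungapped, the scores (+1/-1) are assumed, and the function names are hypothetical. The real program uses a scoring matrix and a more elaborate extension.

```python
def extend(query_part, subject_part, match=1, mismatch=-1):
    """Best cumulative score of an ungapped extension starting at index 0."""
    best = running = 0
    for q, s in zip(query_part, subject_part):
        running += match if q == s else mismatch
        best = max(best, running)  # the extension may stop at its best point
    return best


def seeded_score(query, q_start, subject, s_start, motif_len,
                 match=1, mismatch=-1):
    """Score an alignment anchored on one motif occurrence (0-based starts)."""
    q_end, s_end = q_start + motif_len, s_start + motif_len
    # The motif columns are aligned to each other position by position.
    core = sum(match if q == s else mismatch
               for q, s in zip(query[q_start:q_end], subject[s_start:s_end]))
    # Leftward extension: walk away from the motif, so reverse both prefixes.
    left = extend(query[:q_start][::-1], subject[:s_start][::-1], match, mismatch)
    right = extend(query[q_end:], subject[s_end:], match, mismatch)
    return left + core + right
```

Anchoring on the motif means only sequences that contain the pattern are ever extended, which is what restricts the search compared with an unseeded alignment.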

11 Problems with PHI-BLAST. Only one motif is accepted as input, so one run is required per motif, which increases the time. Consequently, there is no room for attaching any weightage to any motif. No parallel comparison is possible. No control over the specificity of the program.

12 The Solution(s)! Multi-Motif PHI-BLAST (MMPB) and Ranked Motif PHI-BLAST (RMPB)

13 Multi-Motif PHI-BLAST. Takes a query sequence, any number of motifs, and a database to search. Aligns the query sequence with all the sequences that contain a minimum number of the motifs. Produces a score for each sequence. Reports all the sequences whose score is above a particular threshold value, sorted by score.
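The selection and reporting steps described here can be sketched as a small filter. Everything in the sketch is an assumption for illustration: motifs are given as regular expressions, the database is a plain dictionary, and the scoring function is passed in by the caller rather than being the program's actual alignment score.

```python
import re


def count_motifs_present(sequence, motif_regexes):
    """Number of distinct motifs that occur at least once in the sequence."""
    return sum(1 for rx in motif_regexes if re.search(rx, sequence))


def mmpb_report(query, database, motif_regexes, min_motifs, score_fn, threshold):
    """Score sequences holding >= min_motifs motifs; report those above threshold."""
    hits = []
    for name, seq in database.items():
        if count_motifs_present(seq, motif_regexes) < min_motifs:
            continue  # never aligned, which is where the time is saved
        score = score_fn(query, seq)
        if score > threshold:
            hits.append((name, score))
    # Highest score first, as in the reported output.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def identity(a, b):
    """Toy score for the example: number of identical aligned positions."""
    return sum(x == y for x, y in zip(a, b))


database = {"s1": "KRCQCAAWGD", "s2": "KRCQCTTTTT", "s3": "TTTTTTWGDT"}
motifs = ["[RA]C.C", "WGD"]
print(mmpb_report("KRCQCAAWGD", database, motifs, 2, identity, 0))
```

Raising `min_motifs` makes the search more specific; lowering it to 1 admits any sequence that carries at least one of the motifs.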

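14 Strategy behind MMPB. Locate the motifs in the two sequences. Extend outward in both directions with local alignment, and align the part in between the motifs with global alignment. Calculate the score for the alignment. [Diagram: Query and Subject each carry Motif 1 and Motif 2; the flanking regions are labelled (Local) and the region between the motifs is labelled (Global).]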
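A rough sketch of this combined scoring for two motifs follows. It assumes the motif spans in each sequence are already known (0-based, half-open), uses +1/-1/-1 scores, and keeps the flanking extensions ungapped for brevity; the function names are hypothetical and the original program's scoring details are not reproduced here.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score over the full length of both segments."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def extend(a, b, match=1, mismatch=-1):
    """Best cumulative score of an ungapped extension starting at index 0."""
    best = running = 0
    for x, y in zip(a, b):
        running += match if x == y else mismatch
        best = max(best, running)
    return best


def mmpb_score(query, q_spans, subject, s_spans):
    """Local extension on the flanks, global alignment between the motifs."""
    (q1s, q1e), (q2s, q2e) = q_spans
    (s1s, s1e), (s2s, s2e) = s_spans
    left = extend(query[:q1s][::-1], subject[:s1s][::-1])      # (Local)
    motif1 = global_score(query[q1s:q1e], subject[s1s:s1e])
    between = global_score(query[q1e:q2s], subject[s1e:s2s])   # (Global)
    motif2 = global_score(query[q2s:q2e], subject[s2s:s2e])
    right = extend(query[q2e:], subject[s2e:])                 # (Local)
    return left + motif1 + between + motif2 + right
```

The region between the motifs is aligned globally because both of its ends are pinned by motif occurrences, whereas the outer flanks have one free end and can stop wherever the similarity runs out.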

15 Comparison of Results: il8, Macrophage Inflammatory 1beta (the middle columns correspond to PHI-BLAST (e=1) and the last one corresponds to MMPB)

16 il8 (1ikl) Interleukin-8

17 4helud (1bbh) Cytochrome c' (c prime)

18 4helud (256b) Cytochrome b562

19 Flav (1ord) Ornithine Decarboxylase

20 Flav (1cus) Cutinase

21 Ranked Motif PHI-BLAST. Takes a query sequence, a number of motifs in order of their rank, and a database to search. Aligns the query sequence with all the sequences that contain the minimum number of highest-ranked motifs. Produces a score for each sequence. Reports all the sequences whose score is above a particular threshold value, sorted by score.
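One possible reading of the ranked selection rule is sketched below: with a minimum of k, a sequence qualifies only if it contains each of the k highest-ranked motifs. That interpretation, the regular-expression motifs, and the function names are all assumptions made for illustration; the slide does not spell out the exact rule.

```python
import re


def has_top_ranked(sequence, ranked_motifs, min_top):
    """True if the sequence contains each of the min_top highest-ranked motifs."""
    return all(re.search(rx, sequence) for rx in ranked_motifs[:min_top])


def rmpb_candidates(database, ranked_motifs, min_top):
    """Names of database sequences that pass the ranked-motif filter."""
    return [name for name, seq in database.items()
            if has_top_ranked(seq, ranked_motifs, min_top)]


# Motifs listed from highest to lowest rank (made-up examples).
ranked = ["[RA]C.C", "WGD", "PP"]
database = {
    "s1": "KRCQCAAWGDPP",
    "s2": "KRCQCAAWGD",
    "s3": "WGDPPTTT",
    "s4": "KACDCTT",
}
print(rmpb_candidates(database, ranked, 2))
```

Under this reading `s3` is rejected even with a minimum of 1, because it lacks the top-ranked motif although it carries two lower-ranked ones; a plain count of motifs present would have admitted it.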

22 Comparison of results. Results for il8 (1hum), Macrophage Inflammatory 1beta. The unmarked columns correspond to RMPB with at least 3 and at least 2 motifs.

23 il8 (1ikl) Interleukin-8

24 The problems are solved! Room for multiple motifs as input. Room for attaching weightage to the motifs via their ranks. Only one run is required for any number of motifs, so less time is needed. A deeper analysis is possible.

25 That's All & Thanks to All of You

