Fast Search Protein Structure Prediction Algorithm for Almost Perfect Matches. By Jayakumar Rudhrasenan (S3047315). Primary Supervisor: Prof. Heiko Schroder.


1 Fast Search Protein Structure Prediction Algorithm for Almost Perfect Matches. By Jayakumar Rudhrasenan (S3047315). Primary Supervisor: Prof. Heiko Schroder. Secondary Supervisor: Dr. Margaret Hamilton.

2 Introduction: Bio-Informatics. What is Bio-Informatics? Bio-Informatics is the science of developing computer databases and algorithms to facilitate biological research, especially in the area of genomics. Genomics is the study of genes and their functions.

3 Background: Protein Structure. How can we find the structure of a protein?
- X-ray crystallography
- NMR spectroscopy
[Figure: protein structure, with labels Phi, Psi and Amino acid; example sequence a k r n d c a r a.]

4 Where does Computer Science come into it? Limitations of traditional lab work:
- Expensive: finding the structure through these methods is costly.
- Time consuming: it takes 6 to 12 months to determine the structure of a single protein.
Reasons:
- Some proteins don't crystallise.
- Some don't give good diffraction patterns.
- All proteins are fragile and difficult to handle.

5 Methods Available. This problem is being tackled in many ways. The methods fall into two broad groups:
- ab initio prediction
- homology modelling
What is homology modelling?

6 What is homology modelling? Homology modelling works on the principle that, although each protein adopts a unique structure, there are only ~2,000 common folds among the superfamilies identified thus far. If two protein sequences are aligned and their percentage similarity is above the 'twilight zone' (about 20%), we can conclude that the sequences are homologous, that is, they share a common ancestry. Below this zone it is not possible to say whether identical amino acid residues are in fact evolutionarily linked or have arisen by chance.

7 What is Protein Structure Prediction? In its most general form, it is the prediction of the relative position of each amino acid in the protein structure, using the structural details of other known proteins.

8 Why predict protein structure?
- The sequence-structure gap: 750,000 known sequences, 17,000 known structures.
- Structural knowledge brings understanding of function and mechanism of action.
- It can help in the prediction of function.
- Predicted structures can be used in structure-based drug design.
- It can help us understand the effects of mutations on structure or function.
- It is a very interesting scientific problem, still unsolved in its most general form after more than 20 years of effort.

9 Protein Structure Prediction Algorithm
[Figure: sequences nfsbcar....., arndcqeghilkmnfssd, eghilnfsearlkspqga, nhe...........; labels: window, Protein Database, Protein sequence for which the structure is unknown.]
Window size = 3 in the figure. The algorithm can also be implemented with a window size of 5, 7 or 9. With a window size of 9 we look for almost perfect matches, as we won't get a perfect match with the database we have.
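The window comparison on this slide can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names `windows` and `find_matches` are hypothetical, and treating "nfsbcar" as the query and the two complete sequences from the figure as the database is an assumption. It only finds exact matches; the almost-perfect matching described for window size 9 is not attempted here.

```python
from typing import Dict, Iterator, List, Tuple


def windows(sequence: str, size: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) for every window of the given size."""
    for start in range(len(sequence) - size + 1):
        yield start, sequence[start:start + size]


def find_matches(
    query: str, database: List[str], size: int = 3
) -> Dict[int, List[Tuple[int, int]]]:
    """For each query window, list the (protein index, offset) pairs
    where exactly the same subsequence occurs in the database."""
    hits: Dict[int, List[Tuple[int, int]]] = {}
    for q_start, q_sub in windows(query, size):
        hits[q_start] = [
            (p_idx, d_start)
            for p_idx, protein in enumerate(database)
            for d_start, d_sub in windows(protein, size)
            if d_sub == q_sub
        ]
    return hits


# Sequences taken from the slide; which one is the query is an assumption.
database = ["arndcqeghilkmnfssd", "eghilnfsearlkspqga"]
hits = find_matches("nfsbcar", database, size=3)
print(hits[0])  # positions where the first window, "nfs", occurs
```

Every query window is compared with every database window, which is the brute-force behaviour the later slides set out to speed up.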

10 Algorithm, continued. [Figure: Phi graph and Psi graph, each showing Number of Occurrences.]

11 Limitations of this algorithm
- Time consuming.
Time taken to predict the structure of one protein: 2 hr PC time.
Time taken to predict the structure of 20,000 proteins: 2 x 20,000 = 40,000 hrs PC time (roughly 4.6 years on a single PC).

12 Why does it take time? Each subsequence of the unknown protein is compared with all the subsequences of the proteins in the database. With a window size of 9, the number of substrings in the database is around 2 million, so there are 2 million comparisons for each subsequence of the unknown protein. "Unknown protein" here means a protein whose sequence is known but whose structure is not.
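To make the cost concrete, here is a small worked calculation. The figure of 2 million database substrings comes from the slide; the query length of 300 residues and the function names are assumptions chosen only for illustration.

```python
def window_count(length: int, window: int) -> int:
    """Number of windows of the given size in a sequence of the given length."""
    return max(length - window + 1, 0)


def brute_force_comparisons(query_length: int, window: int, database_windows: int) -> int:
    """Total window-to-window comparisons when every query window
    is checked against every database window."""
    return window_count(query_length, window) * database_windows


# Hypothetical 300-residue query, window size 9, 2 million database substrings.
total = brute_force_comparisons(300, 9, 2_000_000)
print(total)  # 292 windows x 2,000,000 = 584,000,000
```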

13 Fast Search Protein Structure Prediction Algorithm for Almost Perfect Matches. Arrange the subsequences so that each one is at a Hamming distance of one from its neighbour. What is Hamming distance? The number of disagreeing bits between two binary vectors; it is used as a measure of dissimilarity. E.g. 1000011 and 1000001 differ by one bit. A Hamming distance of one here means that each subsequence differs from the one next to it by just one amino acid.
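The definition above extends directly from bits to amino-acid letters. A minimal sketch (the function name and the example peptide strings are illustrative, not from the slides):

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


print(hamming_distance("1000011", "1000001"))  # the slide's example: 1
print(hamming_distance("arndcqegh", "arndcqegk"))  # one amino acid differs: 1
```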

14 Continued... Maintain a table which stores the hop index value for a mismatch. For example:

Row number   Sub sequence   Jump to row number
1023         111110000      1027
1024         111110001
1025         111110002
1026         111110003
1027         111110013      1031
1028         111110012
1029         111110011
1030         111110010
1031         111110020      1035
...
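The row ordering in the table resembles a reflected Gray code over a small alphabet, where neighbouring rows differ in exactly one position. The sketch below is one possible reading of the slide, not the author's implementation: `gray_order`, `jump_table` and `search` are hypothetical names, digits stand in for amino acids as in the table, and the rule "skip the whole block when the prefix already mismatches" is an assumption about how the jump column is used.

```python
from typing import List, Tuple


def gray_order(alphabet: str, length: int) -> List[str]:
    """Reflected Gray-code ordering: neighbouring strings differ in one position."""
    if length == 0:
        return [""]
    ordered = []
    for i, prefix in enumerate(gray_order(alphabet, length - 1)):
        # Reverse the last symbol's direction on every other prefix.
        symbols = alphabet if i % 2 == 0 else alphabet[::-1]
        ordered.extend(prefix + s for s in symbols)
    return ordered


def jump_table(rows: List[str], prefix_len: int) -> List[int]:
    """For each row, the index of the next row whose prefix differs."""
    jumps = [len(rows)] * len(rows)
    for i in range(len(rows) - 2, -1, -1):
        same = rows[i + 1][:prefix_len] == rows[i][:prefix_len]
        jumps[i] = jumps[i + 1] if same else i + 1
    return jumps


def search(rows: List[str], jumps: List[int], query: str, prefix_len: int) -> Tuple[int, int]:
    """Return (row index or -1, number of rows examined)."""
    i, examined = 0, 0
    while i < len(rows):
        examined += 1
        if rows[i] == query:
            return i, examined
        # A prefix mismatch rules out every row that shares this prefix.
        i = jumps[i] if rows[i][:prefix_len] != query[:prefix_len] else i + 1
    return -1, examined


rows = gray_order("0123", 2)   # 00 01 02 03 13 12 11 10 20 ...
jumps = jump_table(rows, prefix_len=1)
print(search(rows, jumps, "12", prefix_len=1))
```

In this toy ordering the search examines three rows instead of scanning six, because a mismatch in the prefix lets it hop over the rest of that block.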

