Nadia Léonard Unité de Recherche en Biologie Moléculaire F.U.N.D.P. Developing a reliable methodology to align a sequence of known structure and a sequence.


1 Nadia Léonard, Unité de Recherche en Biologie Moléculaire, F.U.N.D.P. Developing a reliable methodology to align a sequence of known structure with a sequence of low homology, in order to model it

2 Introduction. 3D structure: information needed to understand function and to plan directed mutagenesis. The number of known structures (8,000) is much smaller than the number of known sequences (500,000). Experimental techniques are long and expensive. Alternative: modeling. Homology modeling: two homologues adopt the same structure.

3 [Figure: modeling approaches along the % identity axis. Labels: pairwise alignment (most features well predicted); multiple alignment and consensus of alignments (some features well predicted); Twilight zone; Midnight zone; homology modeling (reliable); fold recognition (not very reliable); not homologous.] BUT proteins of different sequences can adopt the same structure.

4 Sequence alignment is the critical step in homology modeling. Below 30% identity, no automatic method allows reliable protein modeling.

5 Aim of our work: to propose a reliable alignment method for proteins sharing a low percentage of identity (<30%) with their template.

6 General strategy for homology modeling. [Flowchart: PDB template; search databanks (PSI-BLAST); multiple alignment of sequences; target-template alignment (critical step); modeling; theoretical model evaluation; comparison of model to real structure.]

7 Our methodology. 1. Target selection: PDB proteins whose template shares between 10 and 30% identity (ALIGN). 2. Improvement of the sequence-structure alignment: three alignments are built, two from our method (consensus 1 and 2) and one pairwise PSI-BLAST alignment (the best alignment method for Twilight Zone proteins). 3. Homology modeling from each target-template alignment. 4. Evaluation: geometrical features of the models. 5. Comparison of each model to the real structure.

8 Our approach consists in building a consensus of several alignment programs. [Diagram: target and template; several programs; multiple alignments.]

9 Our approach consists in building a consensus of several alignment programs. [Diagram: target and template; several programs; multiple alignments; a pairwise alignment is taken from each multiple alignment, giving several pairwise alignments.]

10 Our approach consists in building a consensus of several alignment programs. [Diagram: target and template; several programs; multiple alignments; several pairwise alignments; consensus building; consensus.]
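The slides do not say how the consensus is computed, so the following is only a minimal sketch under a stated assumption: each program's pairwise alignment is reduced to a set of aligned (target position, template position) pairs, and a pair is kept when a strict majority of programs agree on it. The function names and the `min_fraction` parameter are hypothetical.

```python
from collections import Counter


def aligned_pairs(target_row, template_row, gap="-"):
    """Return the (target_index, template_index) pairs aligned in one pairwise alignment."""
    if len(target_row) != len(template_row):
        raise ValueError("alignment rows must have the same length")
    pairs = set()
    i = j = 0  # residue counters for target and template (gaps are not counted)
    for a, b in zip(target_row, template_row):
        if a != gap and b != gap:
            pairs.add((i, j))
        if a != gap:
            i += 1
        if b != gap:
            j += 1
    return pairs


def consensus_pairs(alignments, min_fraction=0.5):
    """Keep residue pairs supported by more than min_fraction of the alignments.

    ASSUMPTION: simple majority voting; the original method may differ.
    """
    votes = Counter()
    for target_row, template_row in alignments:
        votes.update(aligned_pairs(target_row, template_row))
    threshold = min_fraction * len(alignments)
    return {pair: n for pair, n in votes.items() if n > threshold}


# Three hypothetical program outputs for the same target/template pair.
programs = [("ACD-E", "AC-FE"), ("ACD-E", "AC-FE"), ("ACDE", "ACFE")]
consensus = consensus_pairs(programs)
```

With a strict majority (`min_fraction` of 0.5 or more), two retained pairs always co-occur in at least one input alignment, so the retained pairs cannot contradict each other. Positions where the programs disagree are simply left unaligned.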

11 1) Alignment and modeling. [Workflow diagram: databank searching (PSI-BLAST); multiple alignments (8 alignments) and multiple alignments (12 alignments); 8 pairwise alignments -> Consensus 1 -> Model 1; 13 pairwise alignments -> Consensus 2 -> Model 2; PSI-BLAST pairwise alignment -> Model PSI-BLAST.]

12 2) Comparison of models to the real structure. Global RMSD between model and structure after superposition. Local RMSD: percentage of well-predicted residues. The lower the distance, the closer the model is to the real structure. A wrongly modeled region can dramatically increase the global RMSD.
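The two measures can be illustrated with a short sketch. It assumes the model and the real structure are already superposed and given as matching lists of C-alpha coordinates; the 2.0 Å cutoff for a "well-predicted" residue is an assumption, not a value from the slides, and the function names are hypothetical.

```python
import math


def per_residue_distances(model, reference):
    """Distance between equivalent atoms of two already-superposed structures."""
    if len(model) != len(reference):
        raise ValueError("structures must have the same number of residues")
    return [math.dist(p, q) for p, q in zip(model, reference)]


def global_rmsd(model, reference):
    """Root-mean-square deviation over all residues."""
    d = per_residue_distances(model, reference)
    return math.sqrt(sum(x * x for x in d) / len(d))


def fraction_well_predicted(model, reference, cutoff=2.0):
    """Share of residues closer than `cutoff` to their true position.

    ASSUMPTION: cutoff of 2.0 (in the coordinate units, e.g. angstroms).
    """
    d = per_residue_distances(model, reference)
    return sum(1 for x in d if x <= cutoff) / len(d)


# Toy coordinates: three residues placed exactly, one misplaced by 8 units.
model = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
real = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 8)]
rmsd = global_rmsd(model, real)
local = fraction_well_predicted(model, real)
```

In this toy case a single misplaced residue raises the global RMSD to 4.0 while 75% of residues are still well predicted, which is why the local measure is reported alongside the global one.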

13 3pte: D-alanyl-D-alanine carboxypeptidase from Streptomyces sp. R61. [Figure panels: Model 2; PSI-BLAST model; real structure.]

14 Results. 9 proteins have been modeled. We can distinguish 3 proteins of the midnight zone (<20% id.) and 6 proteins of the twilight zone (20-30% id.).
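As an illustration of how targets might be sorted into these zones, here is a minimal sketch. It assumes percent identity is computed as identical residues over aligned (non-gap) columns of a pairwise alignment; the slides use ALIGN for this value, whose exact definition may differ. The function names and the "above twilight" label are hypothetical.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Identical residues as a percentage of aligned (gap-free) columns.

    ASSUMPTION: the denominator is the number of gap-free columns.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have the same length")
    aligned = [(a, b) for a, b in zip(row_a, row_b) if a != gap and b != gap]
    if not aligned:
        return 0.0
    matches = sum(1 for a, b in aligned if a == b)
    return 100.0 * matches / len(aligned)


def zone(pid):
    """Classify a percent identity using the thresholds quoted in the slides."""
    if pid < 20:
        return "midnight"
    if pid <= 30:
        return "twilight"
    return "above twilight"


pid = percent_identity("ACDEFGHIKL", "ACDQFGHQQQ")  # 6 identical columns out of 10
```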

15 Comparison of models to the real structure: Midnight Zone proteins (<20% id.). All methods (models 1, 2 and PSI-BLAST) give very poor results: most residues are badly modeled. No reliable alignment method currently exists below 20% identity, and our method (models 1 and 2) cannot lower this threshold. Modeling these 3 proteins confirms the limits of alignment methods below 20%.

16 Twilight Zone proteins (20-30% id.). Global and local RMSD: the most accurate models (4/6 and 5/6) come from our method (consensus 1 and 2). In general, model 2 gives better results than model 1 and the PSI-BLAST model. It is better to use many alignment programs. Models built with our methodology seem to be better than PSI-BLAST models.

17 Comparison to CASP (Critical Assessment of techniques for protein Structure Prediction). Entrants model proteins whose structure is unknown to them (it is revealed after the competition). Models are compared to the real structure (global RMSD). The best CASP models are taken as reference.

18

19 Conclusions. The limit of our method lies at 20% identity: below it, alignments are not reliable. Above 20% identity, our alignment method appears to be better than PSI-BLAST. Our results are comparable to the best CASP performances (cf. graph). Consensus sequence alignment has a future in homology modeling of Twilight Zone proteins.

20 Perspectives (1). Test our approach on a large set of proteins. Improve our method by: giving more weight to better alignment programs; increasing the number of alignment programs; using several templates; using SSP and fold recognition.
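The first improvement, giving more weight to better programs, could look like the sketch below. It is an illustration only: it assumes each program's alignment is already expressed as a set of (target position, template position) pairs, and the weights shown are invented, not measured program accuracies.

```python
def weighted_consensus(pair_sets, weights, min_fraction=0.5):
    """Keep residue pairs whose summed program weight exceeds a share of the total.

    ASSUMPTION: weighted majority voting; weights are hypothetical reliabilities.
    """
    if len(pair_sets) != len(weights):
        raise ValueError("one weight is required per alignment")
    total = sum(weights)
    score = {}
    for pairs, w in zip(pair_sets, weights):
        for pair in pairs:
            score[pair] = score.get(pair, 0.0) + w
    return {pair for pair, s in score.items() if s > min_fraction * total}


# Three hypothetical programs; the first is trusted three times as much.
pair_sets = [{(0, 0), (1, 1)}, {(0, 0), (1, 2)}, {(0, 0), (1, 2)}]
weighted = weighted_consensus(pair_sets, [3.0, 1.0, 1.0])
unweighted = weighted_consensus(pair_sets, [1.0, 1.0, 1.0])
```

Here the trusted program overrules the two others on the disputed position, whereas equal weights would follow the simple majority.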

21 Perspectives (2). Evaluate the confidence of regions predicted by many programs. Take part in the CASP competition. Automate the method: expert system (PhD thesis).

22

23

24 [Alignment of 1d2f with consensus 1 and consensus 2; the score rows are empty in the source.]
61
1d2f        MHGVFGYSRW KNDE-FLAAI AHWFSTQHYT AIDSQTVVYG PSVIYMVSEL IRQWSETGEG
consensus1  AQGKTKYAPP AGIPELREAL AEKFRRENGL SVTEEETIVT VGGKQALFNL FQAILDPGDE
consensus2  AQGKTKYAPP AGIPELREAL AEKFRRENGL SVTPEETIVT VGGKQALFNL FQAILDPGDE

1d2f        VVIHTPAYDA FYKAIEGNQR TVMPVALEKQ ADGWFCDMGK LEAVLAKPEC KIMLLCSPQN
consensus1  VIVLSPYWVS YPEMVRFAGG VVVEVETL R----R-T KALVVNSPNN
consensus2  VIVLSPYWVS YPEMVRFAGG VVVEVETL-P EEGFVPD-PE RVRRAITPRT KALVVNSPNN

1d2f        PTGKVWTCDE LEIMADLCER HGVRVISDEI HMDMVWGEQP HIPWSNVARG DWALLTSGSK
consensus1  PTGAVYPKEV LEALARLAVE HDFYLVSDEI YEHLLYEG-E HFSPGRVAPE HTLTVNGAAK
consensus2  PTGAVYPKEV LEALARLAVE HDFYLVSDEI YEHLLYEGEH FSPGRVA-PE HTLTVNGAAK

25 1nec

26 1d2f model 2

27

