Bayesian Refinement of Protein Functional Site Matching

Bayesian Refinement of Protein Functional Site Matching
Kanti V Mardia, Vysaul B Nyirongo*, Peter J Green, Nicola D Gold, David R Westhead Presented by Deephan, Mohan

Presentation Flow Background Conventional Methods Bayesian Refinement
Results Conclusion Disclaimer : Contrary to the assumption made by the authors, the paper presenter does have a thorough understanding of all the concepts related to the topics of advanced statistical, graph theory and structural genomics discussed in the paper..

Motivation Structural Genomics Structural Site comparison
Functional Site comparison Knowledge based methods Similarity Search Algorithms

Protein Functional Site Matching
Modeled as a graph theoretic problem Shape analysis of Proteins Crucial for prediction of molecular interactions Infer functional relationship of proteins Classification of Binding Patterns Resource: SITESDB Database Contains Protein Structural data Entries formed from PDB (Protein Data Bank)

The Methodology Graph Similarity Problem Refining the Graph Match
Objective: Matching Functional sites -Comparing amino acid configurations (Cα and Cβ atoms) Functional site – Graph Amino acid positions – Vertices Refining the Graph Match Application of Bayesian Strategy Markov Chain Monte Carlo (MCMC) procedure

Need for Bayesian Refinement??
Bayesian Inference: Complete Distribution of matches Solution space Noise Adaptation Flexibility Edge over combinatorial methods

Bayesian Model Common Tool used in Statistical Inference
Based on Posterior Joint Distribution Product of Prior density and Likelihood Biologically speaking, Prior Density - Distribution of Transformation Parameters Likelihood - Related to matches between functional sites

Representation and Matching
Functional sites X and Y represented as Graphs G1 and G2 Vertex sets V1 = {Xj, j = 1, 2, ..., m} , V2 = {Yk, k = 1, 2, ..., n} Xj , Yk represents coordinates of amino acids in jth and kth positions of X,Y x1j, y1k – Cα coordinates for X,Y x2j, y2k – Cβ coordinates for X,Y x1 = {x1j : j = 1 ..., m}, x2 = {x2j : j = 1 ..., m} y1 = {y1k : k = 1 ..., n}, y2 = {y2k : k = 1 ..., n}

Graph Theoretic Approach
Objective: Creation of Vertex Product Graph (Hv) Hv = G1 ○v G2 VH=V1 x V2 An edge between two vertices vh = (Xj, Yk), vh' = (Xj', Yk') ∈ VH exists for j ≠ j' and k ≠ k' when 1. the absolute difference between distances |x1j - x1j'| and |y1k - y1k'| and 2. also the absolute difference between distances |x2j - x2j'| and | y2k - y2k'| are both less than 1.5Å (matching distance threshold).

Bayesian Alignment Matching between amino acids X and Y represented by matrix M, Mjk = Transformations to bring the configurations into alignment is given by xij = Ayik + τ for Mjk = 1, i = 1, 2 A – Rotation Matrix, τ – Translation vector 1 if jth amino acid corresponds to kth amino acid 0 otherwise

Bayesian Modeling (contd)
Joint Posterior Distribution: p(A), p(τ) and p(σ) denote prior distributions for A, τ and σ |A| - Jacobian Transformation presence of Gaussian noise N(0, σ2) in in the atomic positions for x1j and y1k

Bayesian Modeling (contd)
Side chains orientation: Extending the model by taking into account the relative orientation of Cα and Cβ in matching amino acids

MCMC Refinement Step Markov Chain Monte Carlo (MCMC) – used to sample the full joint distribution function p(M, A, τ, σ, x1, y1, x2, y2) p(M, A, τ, σ, x1, y1, x2, y2) – function of RMSD and angle for orientation difference between amino acids

Significance of RMSD RMSD – Root Mean Square Distribution
Matches of lower RMSD over larger numbers of matching residues are more statistically significant MCMC Refinement improved the RMSD (reduction) and the number of matching residues ( increase)

Decision tree for refining the graph solution by the MCMC method
Decision tree for refining the graph solution by the MCMC method. Boxes with curved corners show processes and their output while boxes with sharp corners are for branching conditions. The procedure starts with graph solution MG. The graph solution's RMSD and number of matches are denoted by RMSDG and LG respectively. MCMC is re-iterated until the MCMC solution: MB is better. The RMSD and number of matches for MB are denoted by RMSDB and LB respectively. MB and MG are compared using 1) RMSDs and the number of matches or 2) P-values for MG and MG, denoted by PG and PB respectively.

Results Two Binding Sites: Alcohol dehydrogenase structure
(60 amino acids) 17 – β hydroxysteroid dehydrogenase ( 63 amino acids) 4 Matching Studies were performed Each study was performed with and without considering the physico-chemical properties of amino-acids.

Case-I Case 1: Site 1hdx_1 matching against its own SCOP family
125/145 sites produced significant matches – increased to 131/145 (after refinement) RMSD is improved from > 1.5Å to less than 1Å Increase in the number of matching residues

Case 2: 17 – β hydroxysteroid dehydrogenase and family
After MCMC Refinement step significant matches increased from 248 to 318 of 326 sites Increased number of matching residues at a similar RMSD RMSD improvement in minority of the sites

Case 3: alcohol dehydrogenase and superfamily
Matching sites increased form 200 to 324 Case 4: Alcohol dehydrogenase and FAD/NAD(P)-binding domain 12 sites improved after MCMC refinement

Discussion of Results MCMC refinement step provides significant improvement over Graph Matching Techniques Success – Lack of dependence on strict distance matching criteria Computationally expensive Refinements adapts to shape variations in binding sites

Thank You!!!!

Bayesian Refinement of Protein Functional Site Matching

Similar presentations

Presentation on theme: "Bayesian Refinement of Protein Functional Site Matching"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Bayesian Refinement of Protein Functional Site Matching

Similar presentations

Presentation on theme: "Bayesian Refinement of Protein Functional Site Matching"— Presentation transcript:

Similar presentations

About project

Feedback