NUS CS5247: A dimensionality reduction approach to modeling protein flexibility. By Miguel L. Teodoro, George N. Phillips Jr.* and Lydia E. Kavraki, Rice.

Presentation transcript:

NUS CS5247
A dimensionality reduction approach to modeling protein flexibility
By Miguel L. Teodoro, George N. Phillips Jr.* and Lydia E. Kavraki
Rice University and University of Wisconsin-Madison*
Presented by Zhang Jingbo

Outline
- Motivation, background, and our goal
- Protein flexibility
- The problems in current methods and the benefit of the method in this paper
- Dimensionality reduction techniques
- Obtaining conformational data
- Application to specific systems
- Summary

Motivation
- Introduce a method to obtain a reduced basis representation of protein flexibility.

Background
- Proteins are involved either directly or indirectly in all biological processes in living organisms.
- Conformational changes of proteins can critically affect their ability to bind other molecules.
- Any progress in modeling protein motion and flexibility will contribute to the understanding of key biological functions.
- Today there is a large body of knowledge available on protein structure and function, and this amount of information is expected to grow even faster in the future.

Our method and goal
- Method: a dimensionality reduction technique, Principal Component Analysis (PCA)
- Goal:
  1. Transform the original high-dimensional representation of protein motion into a lower-dimensional representation that captures the dominant modes of motion of the protein.
  2. Obtain conformations that have been observed in laboratory experiments.

The focus of this paper
- How to obtain a reduced representation of protein flexibility from raw protein structural data

What is protein flexibility?
- Definition: a crucial aspect of the relation between protein structure and function.
- Proteins change their three-dimensional shapes when binding to or unbinding from other molecules.

Why do we want to model protein flexibility?
- Several applications for our work:
  1. Pharmaceutical drug development.
  2. Modeling conformational changes that occur during protein-protein and protein-DNA/RNA interactions.

RII molecular "handshake" (donut with two holes). Backbone of the RII dimer showing glycan binding sites. Models for the binding of RII to the glycophorin A receptor on red blood cells (erythrocytes).

The problems in current methods
- The computational complexity of explicitly modeling all the degrees of freedom of a protein is too high.
- Modeling proteins as rigid structures limits the effectiveness of currently used molecular docking methods.

The benefit of our method in this paper (1)
- Using the approximation makes it computationally efficient to include protein flexibility in the drug development process.

Two most common structural biology experimental methods in use today
- Protein X-ray crystallography
- Nuclear magnetic resonance (NMR)
- Limits:

An alternative to experimental methods
- Computational methods based on classical or quantum mechanics to approximate protein flexibility.
- Limits:

The benefit of our method in this paper (2)
- Transform the basis of representation of molecular motion.
- The new degrees of freedom will be linear combinations of the original variables.
- Some degrees of freedom are significantly more representative of protein flexibility than others.
- Consider only the most significant degrees of freedom; the transformed degrees of freedom are collective motions affecting the entire configuration of the protein.
- There is a tradeoff between the loss of information and effectively modeling protein flexibility in a subspace of greatly reduced dimensionality.
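
One way to make the tradeoff above concrete is to ask how many transformed degrees of freedom are needed to retain a chosen fraction of the total variance. The sketch below is only an illustration: the function name `components_needed`, the 0.75 and 0.95 targets, and the singular values are hypothetical and do not come from the paper.

```python
import numpy as np

def components_needed(singular_values, target=0.9):
    """Smallest number of components whose cumulative variance
    fraction reaches `target` (variance is taken as singular value squared)."""
    var = np.asarray(singular_values, dtype=float) ** 2
    cumulative = np.cumsum(var) / var.sum()
    k = int(np.searchsorted(cumulative, target)) + 1
    return min(k, var.size)  # guard against rounding when target is 1.0

# Hypothetical singular values, sorted from largest to smallest.
s = np.array([10.0, 5.0, 2.0, 1.0, 0.5])
k75 = components_needed(s, 0.75)
k95 = components_needed(s, 0.95)
print(k75, k95)
```

A lower target keeps fewer collective motions and loses more information; a higher target does the opposite.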

What do we actually do in this paper?
- Start from initial coordinate information from different data sources.
- Apply the principal component analysis method of dimensionality reduction.
- Obtain a new structural representation using collective degrees of freedom.
- Here, we will focus on:
  a. the interpretation of the principal components as biologically relevant motions
  b. how combinations of a reduced number of these motions can approximate alternative conformations of the protein.

Dimensionality reduction techniques
- Aim: find a mapping between the data in a space and its subspace.
- Two methods:
  a. Multidimensional scaling (MDS)
  b. Principal component analysis (PCA)
- Merits:
- Limits:

PCA of conformational data
- Merits:
  1. The most established method.
  2. The most efficient algorithms.
  3. Guaranteed convergence of the computation.
  4. An upper bound on how much we can reduce the representation of conformational flexibility in proteins.
  5. The principal components have a direct physical interpretation.
  6. High-dimensional data can readily be projected to a low-dimensional space, and the inverse mapping recovers a representation of the original data with minimal reconstruction error.
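
The projection and its inverse can be sketched in a few lines. This is a minimal illustration, assuming conformations are stored as rows and using random toy data; the helper names `project` and `reconstruct` are hypothetical, and NumPy's dense SVD is used only to obtain the components.

```python
import numpy as np

def project(X, mean, components):
    """Map conformations (rows of X) to low-dimensional coordinates."""
    return (X - mean) @ components

def reconstruct(Y, mean, components):
    """Map low-dimensional coordinates back to full-length conformations."""
    return Y @ components.T + mean

rng = np.random.default_rng(2)
X = rng.normal(size=(30, 12))      # 30 toy conformations, 3N = 12
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)

errors = []
for k in (2, 6, 12):
    components = Vt[:k].T          # first k principal components as columns
    X_hat = reconstruct(project(X, mean, components), mean, components)
    errors.append(float(np.sqrt(np.mean((X - X_hat) ** 2))))
print(errors)
```

The reconstruction error shrinks as more components are kept and vanishes (up to rounding) when all 12 are used.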

PCA of conformational data (continued)
- Linear and non-linear

PCA of conformational data (continued)
- Conformational data
  1. The input data for PCA: several atomic displacement vectors (each of dimension 3N) corresponding to different structural conformations. Each vector has the form $[x_1, y_1, z_1, \ldots, x_N, y_N, z_N]^T$, where $(x_i, y_i, z_i)$ corresponds to the Cartesian coordinate information for the ith atom.
  2. All atomic displacement vectors together constitute the conformational vector set.
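
As a rough illustration of this input format, the sketch below flattens a set of conformations into 3N-dimensional vectors and subtracts the mean structure to obtain displacements. The array layout (conformations as rows), the function name `to_conformation_matrix`, and the random toy coordinates are assumptions made for the example.

```python
import numpy as np

def to_conformation_matrix(coords):
    """Flatten (n_conformations, n_atoms, 3) coordinates into rows
    [x1, y1, z1, ..., xN, yN, zN] and subtract the mean structure."""
    coords = np.asarray(coords, dtype=float)
    n_conf, n_atoms, _ = coords.shape
    flat = coords.reshape(n_conf, 3 * n_atoms)  # one 3N vector per conformation
    mean = flat.mean(axis=0)                    # average structure
    return flat - mean, mean

# Toy data: 5 conformations of a 4-atom molecule (hypothetical values).
rng = np.random.default_rng(0)
toy = rng.normal(size=(5, 4, 3))
displacements, mean_structure = to_conformation_matrix(toy)
print(displacements.shape)
```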

Singular value decomposition (SVD)
- We use the singular value decomposition (SVD) as an efficient computational method to calculate the principal components.
- The SVD of a matrix A is defined as $A = U \Sigma V^T$, where U and V are orthonormal matrices and $\Sigma$ is a nonnegative diagonal matrix whose diagonal elements are the singular values of A.
- The columns of U and V are called the left and right singular vectors, respectively.
- The square of each singular value corresponds to the variance of the data in A along the corresponding left singular vector.
- The SVD of matrix A was computed using the ARPACK library.
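
A small sketch of PCA via SVD follows. It uses NumPy's dense `np.linalg.svd` as a stand-in for the ARPACK routine mentioned on the slide, assumes that the columns of A are mean-centered conformations, and runs on synthetic toy data; the function name `pca_by_svd` is hypothetical.

```python
import numpy as np

def pca_by_svd(A):
    """A holds mean-centered conformations as columns (3N x m).
    Returns the left singular vectors (principal components), the
    singular values, and the fraction of variance along each component."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    variance_fraction = s**2 / np.sum(s**2)
    return U, s, variance_fraction

rng = np.random.default_rng(1)
# Toy data: 20 conformations with 3N = 9, dominated by one direction of motion.
direction = rng.normal(size=(9, 1))
A = direction @ rng.normal(size=(1, 20)) + 0.01 * rng.normal(size=(9, 20))
A = A - A.mean(axis=1, keepdims=True)   # mean-center each coordinate
U, s, frac = pca_by_svd(A)
print(frac[:3])
```

Because the toy data moves almost entirely along one direction, the first component carries nearly all of the variance.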

Obtaining conformational data
- The most common data sources:
  1. Experimental laboratory methods:
     a. X-ray crystallography
     b. NMR
  2. Computational sampling methods based on a force field, such as molecular dynamics.
- Laboratory methods vs. computational methods:
  - Laboratory methods generate less data.
  - Computational methods have a lower accuracy.

Application to specific systems
- Now, let’s see about

HIV-1 Protease

The advantages of using the PCA methodology to analyze protein flexibility
- Can be used at different levels of detail:
  1. The overall motion of the backbone.
  2. The simplified flexibility of the protein as a whole.
  3. Include only the atoms that constitute the binding site.
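
Choosing a level of detail amounts to choosing which atoms enter the conformation vectors before PCA is applied. The sketch below shows one possible way to do that selection by atom name; the helper `select_atoms`, the atom names, and the toy coordinates are all hypothetical and are not taken from the paper.

```python
import numpy as np

def select_atoms(coords, atom_names, keep):
    """Keep only atoms whose name is in `keep`, then flatten each
    conformation to a vector of length 3 * (number of kept atoms)."""
    mask = np.isin(atom_names, list(keep))
    return coords[:, mask, :].reshape(coords.shape[0], -1)

# Toy system: 2 residues, each with N, CA, C, O, CB atoms; 8 conformations.
atom_names = np.array(["N", "CA", "C", "O", "CB"] * 2)
coords = np.random.default_rng(3).normal(size=(8, atom_names.size, 3))

backbone = select_atoms(coords, atom_names, {"N", "CA", "C"})
alpha_carbons = select_atoms(coords, atom_names, {"CA"})
print(backbone.shape, alpha_carbons.shape)
```

A binding-site selection would work the same way, with the mask built from a list of residue or atom indices instead of atom names.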

In the first experiment situation

The second situation
- The advantage of PCA

The last situation
- As a validation of our method.

Another application: Aldose Reductase

Summary

Thank you