Graphical Models for Protein Kinetics Nina Singhal CS374 Presentation Nov. 1, 2005.


1 Graphical Models for Protein Kinetics Nina Singhal CS374 Presentation Nov. 1, 2005

2 Outline Background material on proteins Why study protein kinetics Graphical models for kinetics Motion planning view (Apaydin et al., 2003) Molecular dynamics view (Singhal et al., 2004) Conclusions

3 Background on Proteins Alpha Helix Beta Strand and Sheet Beta Barrel

4 Structure Prediction Given an amino acid sequence, what 3D structure will the protein form? MTYKLILNGKTLKGETTTEAVDAATAEKVFK QYANDNGVDGEWTYDDATKTFTVTE ?

5 Pathways and Kinetics How does a protein actually get from an unfolded configuration to a folded configuration? ?

6 Folding Kinetics Rate of folding Uniqueness of pathway Order of secondary structure formation Secondary or tertiary structure

7 Applications Misfolded proteins and diseases Alzheimer's Cystic fibrosis Mad cow disease Intermediates may be important as drug targets Protein design

8 Representation of a Protein [Figure: protein backbone atoms N1, Cα, C, N2 with side chain R and the dihedral angles phi, psi and omega] A protein with n amino acids can be represented using 2n phi-psi angles, each in the range [0, 2π)

9 Graphical Models for Protein Kinetics Protein conformations have different energies Graphical models discretize the conformation space and connect nearby regions with edges

10 Robotics Motion Planning [Figure: robot with 2 degrees of freedom at configurations c1 and c2, next to its 2D configuration space with both axes running from 0 to 2π and the points c1 and c2 marked] Moving the robot arm from c1 to c2 amounts to finding a path in the configuration space from c1 to c2.

11 Roadmap Method Randomly sample points in configuration space. Keep feasible ones. Connect these points to form a graph. Process path queries using standard graph search techniques. [Figure: roadmap in the 2D configuration space, both axes from 0 to 2π, with c1 and c2 marked]
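
The four steps of the roadmap method can be sketched in a few lines of Python. This is a minimal illustration, not the planner used in the cited work: the disc-shaped obstacle in `is_feasible`, the connection `radius`, the sample count and the coordinates of `c1` and `c2` are all assumptions chosen for the example, and edges are accepted on endpoint distance alone without checking feasibility along the edge.

```python
import math
import random
from collections import deque

TWO_PI = 2 * math.pi

def torus_distance(a, b):
    """Distance between two angle pairs, wrapping each angle around 2*pi."""
    total = 0.0
    for x, y in zip(a, b):
        d = abs(x - y) % TWO_PI
        d = min(d, TWO_PI - d)
        total += d * d
    return math.sqrt(total)

def is_feasible(q):
    """Hypothetical obstacle: a disc of radius 1 centred at (pi, pi)."""
    return torus_distance(q, (math.pi, math.pi)) > 1.0

def build_roadmap(points, radius):
    """Keep feasible points and connect every pair closer than `radius`."""
    nodes = [p for p in points if is_feasible(p)]
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            if torus_distance(nodes[i], nodes[j]) < radius:
                edges[i].append(j)
                edges[j].append(i)
    return nodes, edges

def find_path(edges, start, goal):
    """Breadth-first search; returns a list of node indices, or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in edges[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None

rng = random.Random(0)
c1, c2 = (0.5, 0.5), (5.5, 5.5)  # query configurations (assumed values)
samples = [c1, c2] + [
    (rng.uniform(0, TWO_PI), rng.uniform(0, TWO_PI)) for _ in range(300)
]
nodes, edges = build_roadmap(samples, radius=0.8)
path = find_path(edges, 0, 1)  # c1 and c2 are nodes 0 and 1
```

Once the roadmap is built, any number of path queries can reuse it; only `find_path` runs per query.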

12 Protein Folding as a Search Problem Protein folding can be represented as a search through the protein’s configuration space Replace the collision-free constraint with a preference for low-energy configurations Instead of finding any path, we want to find all the energetically favorable paths

13 Stochastic Roadmap Simulation (Apaydin et al., 2003) Sample protein configurations at random Add edges between nearby nodes Take advantage of the many folding pathways contained within a roadmap Efficiently calculate many properties of the entire landscape

14 Roadmap Construction Nodes in the graph are sampled uniformly at random Edges are added between nearest neighbors with a transition probability P_ij given by a two-case formula: one expression if ΔE_ij > 0, another otherwise
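
The slide states the two cases of the edge probability but not the expressions themselves. One form consistent with those cases is a Metropolis-style rule, written out here as an assumption rather than as the slide's own formula. The symbols N_i (number of neighbors of node i), k_B T (thermal energy) and the sign convention ΔE_ij = E_j − E_i are introduced for this sketch.

```latex
P_{ij} =
\begin{cases}
\dfrac{1}{N_i}\exp\!\left(-\dfrac{\Delta E_{ij}}{k_B T}\right) & \text{if } \Delta E_{ij} > 0,\\[6pt]
\dfrac{1}{N_i} & \text{otherwise,}
\end{cases}
\qquad
P_{ii} = 1 - \sum_{j \neq i} P_{ij}.
```

Under this form, downhill moves are taken freely, uphill moves are penalized by a Boltzmann factor, and the self-transition P_ii absorbs the remaining probability so that each row sums to one.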

15 Roadmap as a Markov Chain We can view the molecular motion as a random walk over the roadmap The roadmap can be regarded as a discretely sampled version of a Monte Carlo simulation In fact, in the limit, the probability distributions of the Monte Carlo simulation and the roadmap converge

16 Transmission Coefficients Measures “kinetic distance” Probability that a conformation will fold before unfolding Can calculate by starting many Monte Carlo simulations from the conformation Very computationally expensive [Figure: a conformation lying between the folded state and the unfolded state]

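17 Algebraic Method for Calculating Transmission Coefficients [Figure: roadmap with folded set F and unfolded set U, nodes v_i and v_j, and the transition probability P_ij on the edge between them]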

18 Transmission Coefficients (cont) System of linear equations One equation and one unknown for each node Can be solved iteratively Low connectivity of the graph results in a sparse matrix
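
The linear system described here can be sketched as follows, assuming the standard first-step relation tau_i = sum_j P_ij * tau_j with tau fixed to 1 on folded nodes and 0 on unfolded nodes. The function name `p_fold`, the dictionary-of-dictionaries sparse storage and the 5-node chain are illustrative assumptions, and Gauss-Seidel is only one of several iterative solvers that would fit the slide's description.

```python
def p_fold(P, folded, unfolded, tol=1e-12, max_iter=100_000):
    """Gauss-Seidel iteration for tau_i = sum_j P[i][j] * tau_j.

    P is a sparse row-stochastic matrix stored as {i: {j: probability}}.
    Boundary conditions: tau = 1 on folded nodes, tau = 0 on unfolded nodes.
    """
    tau = {i: (1.0 if i in folded else 0.0) for i in P}
    for _ in range(max_iter):
        change = 0.0
        for i, row in P.items():
            if i in folded or i in unfolded:
                continue  # boundary values stay fixed
            new = sum(p * tau[j] for j, p in row.items())
            change = max(change, abs(new - tau[i]))
            tau[i] = new
        if change < tol:
            break
    return tau

# Hypothetical 5-node chain: node 0 is unfolded, node 4 is folded,
# and interior nodes step left or right with equal probability.
P = {
    0: {0: 1.0},
    1: {0: 0.5, 2: 0.5},
    2: {1: 0.5, 3: 0.5},
    3: {2: 0.5, 4: 0.5},
    4: {4: 1.0},
}
tau = p_fold(P, folded={4}, unfolded={0})
```

Each sweep touches only the nonzero entries of a row, which is where the sparsity of the roadmap pays off.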

19 Results Studied a synthetic landscape and a real protein, ROP Protein was represented with 6 degrees of freedom, two vectors connected by a loop Correlation of transmission coefficients calculated by roadmaps and Monte Carlo simulations

20 Benefits and Drawbacks Extremely efficient at calculating kinetic properties like transmission coefficients Unclear whether low-dimension representation of protein is adequate Monte Carlo simulations may not be accurate enough for protein kinetics

21 Molecular dynamics Simulate protein movement using Newton’s laws of motion [Figure: timescale axis running from 10^-15 s (femto) through 10^-12 (pico), 10^-9 (nano), 10^-6 (micro) and 10^-3 (milli) to 10^0 seconds; events marked along it: bond vibration, isomerization, water dynamics, helix formation, fastest folders, typical folders, slow folders; annotations: MD step, long MD run, where we need to be, where we’d love to be]

22 Folding@Home: Worldwide desktop grid computing ~150,000 CPUs over the world (CPU locations from IP address)

23 Markovian Model Method (Singhal et al., JCP 2004) Generate molecular dynamics trajectories from transition path sampling or independently Cluster nearby points into macrostates to build a roadmap whose edges also include transition times Calculate the mean first passage time (MFPT) and P_fold using linear algebra

24 Step 1: sampling of paths Pick a random point from current path Shoot a path from this point If path reaches initial or final state by some cutoff time, stop simulation and accept it Define new current path

25 Step 2: Generation of roadmap Nodes are accepted points, edges connect successive nodes Cluster nearby points to make roadmap more connected Calculate edge weights by counting number of transitions between nodes and normalize
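
A minimal sketch of the counting-and-normalizing step, assuming the trajectories have already been mapped to cluster (macrostate) labels sampled at a fixed time interval. The function name and the labels "U", "I" and "F" are hypothetical.

```python
from collections import defaultdict

def transition_probabilities(trajectories):
    """Estimate edge weights by counting transitions between successive states.

    Each trajectory is a list of cluster labels. Returns {i: {j: probability}}
    in which every row sums to 1.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for traj in trajectories:
        for a, b in zip(traj, traj[1:]):
            counts[a][b] += 1
    probs = {}
    for a, row in counts.items():
        total = sum(row.values())
        probs[a] = {b: n / total for b, n in row.items()}
    return probs

# Hypothetical cluster labels for two short trajectories.
trajs = [["U", "U", "I", "F"], ["U", "I", "I", "F"]]
P = transition_probabilities(trajs)
```

States that never appear as the source of a transition, such as "F" here, get no outgoing row; a later analysis would typically treat them as absorbing.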

26 Step 3 (opt): Re-weighting of edges Can analyze roadmap at parameter values other than the simulated ones without need for additional simulations For temperature, can re-weight edges by the relative probabilities at the two temperatures according to the dynamics Renormalize edges so outgoing probability sums to one

27 Calculating P_fold and MFPT Equation for each node is conditioned on which neighbor it transitions to One equation and one unknown for each node Can be solved iteratively
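
For the MFPT, conditioning on the next neighbor gives one equation per node of the form m_i = sum_j P_ij * (t_ij + m_j), with m = 0 on the final (folded) nodes. The sketch below iterates that relation; it assumes this standard form, and the function name, the per-edge time table `T` and the 3-state model are illustrative.

```python
def mfpt(P, T, folded, tol=1e-12, max_iter=1_000_000):
    """Iterate m_i = sum_j P[i][j] * (T[i][j] + m_j) with m = 0 on folded nodes.

    P[i][j] is the transition probability and T[i][j] the time attached to
    edge i -> j. Assumes every node can reach the folded set.
    """
    m = {i: 0.0 for i in P}
    for _ in range(max_iter):
        change = 0.0
        for i, row in P.items():
            if i in folded:
                continue  # passage time from the target to itself is zero
            new = sum(p * (T[i][j] + m[j]) for j, p in row.items())
            change = max(change, abs(new - m[i]))
            m[i] = new
        if change < tol:
            break
    return m

# Hypothetical 3-state model U -> I -> F with unit time per step.
P = {
    "U": {"U": 0.5, "I": 0.5},
    "I": {"U": 0.25, "I": 0.25, "F": 0.5},
    "F": {"F": 1.0},
}
T = {i: {j: 1.0 for j in row} for i, row in P.items()}
times = mfpt(P, T, folded={"F"})
```

For this toy model the equations can be solved by hand (m_I = 3 and m_U = 5 time units), which makes it a convenient check on the iteration.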

28 Energy landscape and initial pathway 2-D energy landscape Initial and final regions defined by circles around the two minima Initial paths generated by Monte Carlo or Langevin dynamics [Figure: energy landscape with the initial region I and the final region F marked]

29 Results - P_fold Compare P_fold values to those from many direct simulations Correlation coefficients are 0.99 for both

30 Results - MFPT Compare MFPT at different temperatures to those from 10,000 direct simulations

31 Results – Trp zipper β-hairpin Analyzed existing simulation data of a small, 12-residue protein 1750 trajectories, each 10-450 ns, resolution of 10 ns for non-folding and 250 ps for folding Combine into roadmap Depending on clustering cutoffs, MFPT = 2-9 μs Agrees with experimental results of 2.47 ± 0.05 μs and previous analysis of simulation data of 4.5 μs

32 Conclusions Graphical methods produce a network of possible protein pathways These networks can be efficiently analyzed to compute kinetic properties Very fast method for looking at simple protein models or analyzing existing molecular dynamics data

