CD and MD
What’s my problem with MD?
1. Its development has been manifestly unscientific
2. Its answers (numbers, trajectories, minima) are as unreliable as (or more unreliable than) those of simpler methods
3. Yet its manifest societal advantages (“physics”, movies, CPU time, complexity, jargon) lead to cognitive dissonance (hopeful thinking) concerning its actual value to drug discovery
CD: Cognitive Dissonance
Cognitive dissonance theory explains human behavior by positing that people have a bias to seek consonance between their expectations and reality. According to Festinger, people engage in a process he termed "dissonance reduction," which can be achieved in one of three ways: lowering the importance of one of the discordant factors, adding consonant elements, or changing one of the dissonant factors. This bias sheds light on otherwise puzzling, irrational, and even destructive behavior. (Source: Wikipedia)
- Lowering importance: actually agreeing (numerically) with experiment
- Adding consonance: “It’s an idea generator”
- Changing the dissonance: reparameterizing
(+ Effort Justification Paradigm)
AM I CD? Came from Barry’s Lab (the Great PB MD Wars) Don’t sell MD (perhaps I’m jealous) Why should you believe me? -Don’t write/ need grants -Don’t need tenure -PB is not a significant OE income stream -Been observing MD for > 25 years -I hired an MD guy (who I sent to China!) -I manifestly want this to be a better industry
Also.. The fastest PB- DelPhi, ZAP The fastest surfacing algorithms- GRASP, ZAP The fastest 3D shape alignment- ROCS, FastROCS The fastest conformer generator- OMEGA The fastest, non-stochastic docker- FRED The fastest (accurate) Surface Area, RMSD, AM1, protein pka, proton placement.. If I wanted to do MD, mine would rock I believe the effort/reward ratio is (way) too low
How Galileo Transformed Science
1. Resolution: think something up
2. Demonstration: see if it matches available evidence
3. Experiment: think of a new experiment to test it (to differentiate from old theories)
A Galilean Value Scale for Experiments
- Retrospective Data that shapes the theory: MD, most of molecular modeling, economics
- Prospective without Controls: Rich Friesner, Xavier Barril
- Unanticipated Retrospective Data: SAMPL solvation energies
- Prospective designed with NULL model Controls: Lyall Isaacs, SAMPL host-guest; Bertrand Garcia Moreno, protein pKa Collective
- Prospective to distinguish against Best-of-Class Controls: Nobody
(Ordered from worse to better. Vast Majority)
Prospective Without Controls Surgeons coming up with new procedures – Osteoarthritis & Arthroscopic knee surgery US Foreign policy – Just do something, claim success when it works, bury it when it doesn’t Anecdotal stories – The “hot hand” phenomenon – I did “X”, it worked.
I did “X”, it worked Two chief fallacies (i) Fallacy of Composition -What else did you actually do (ii) Fallacy of Selection -File Drawer effect (False Positives) -Parameterization (implicit or explicit) to the result (False Negatives)
Fallacy of Composition Method X, e.g. MD, is but one part of a multipart process (filtering, chemists' inspection, database bias), yet success is claimed for X alone. The same procedure with X replaced by a different method is never done or presented.
Example of Composition Error We predicted affinity with MM/QM and “It Worked” Was QM getting you anything? Did you do MM with QM-level charges, multipoles? MM alone? A scoring function?
Example of Composition Error We used a polarizable force field and got these results for the (SAMPL4) host-guest systems. “It Worked”, so polarization worked. Did you also try it without polarization? With better quality charges? With equivalent CPU time but without polarization (more sampling)?
Example of Composition Error We ran MD for a bit, looked at how the ligands wiggled and designed six drugs (Christopher Bayly & others at Merck Frosst) Did you compare to MM? To other simple heuristics? Without any chemist's input? It’s not “Science” until someone else does it
Fallacy of Selection: The Tanimoto of Truth (TM)
Bit strings compared (Reality vs. Predictions): 0110100101 and 1110010100. Legend: an event happened / an event didn’t.
ToT = (events that happened and were predicted) / (events predicted or happened)
The Tanimoto of Truth: what happens to each kind of outcome
- Predicted and happened: published, especially by academia
- False positives: “file drawer”, especially by industry
- False negatives: parameterize till publishable, especially by academia
- True negatives: not sexy, “Hempel’s Ravens”; largely ignored by academia & industry
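The ToT ratio on these slides can be sketched in a few lines of Python. This is only an illustration: the slides do not say which bit string is Reality and which is Predictions, so the assignment below is an assumption (the ratio is symmetric, so ToT itself does not depend on it), as is reading 1 as "an event happened". The function names are hypothetical.

```python
def confusion_counts(reality: str, predicted: str) -> dict:
    """Count the four outcome classes for two equal-length bit strings."""
    if len(reality) != len(predicted):
        raise ValueError("bit strings must have equal length")
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for r, p in zip(reality, predicted):
        if r == "1" and p == "1":
            counts["TP"] += 1   # happened and was predicted
        elif r == "0" and p == "1":
            counts["FP"] += 1   # predicted but did not happen
        elif r == "1" and p == "0":
            counts["FN"] += 1   # happened but was not predicted
        else:
            counts["TN"] += 1   # neither predicted nor happened
    return counts


def tanimoto_of_truth(reality: str, predicted: str) -> float:
    """ToT = (happened AND predicted) / (happened OR predicted)."""
    c = confusion_counts(reality, predicted)
    union = c["TP"] + c["FP"] + c["FN"]
    return c["TP"] / union if union else 0.0


# Assumption: first string is Reality, second is Predictions.
REALITY = "0110100101"
PREDICTED = "1110010100"

if __name__ == "__main__":
    print(confusion_counts(REALITY, PREDICTED))
    print(round(tanimoto_of_truth(REALITY, PREDICTED), 3))
```

True negatives never enter the ratio, which is consistent with the slides' remark that they are largely ignored. Reporting only true positives leaves the denominator out entirely.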
“Similarity” methods, Docking, Machine Learning: all are judged by some kind of ToT. Quantification for MD ‘events’? Never. MD is mostly uncontrolled, anecdotal & unscientific.
Psychology, Philosophy, Social Dynamics
Underlying Physics, Examination of Successes
Molecular Dynamics: Types of Applications 1) Global sampling- thermodynamic averages -FEP etc. Absolute or Relative Energies 2) Simulate time evolution (movies) -D.E. Shaw, Vijay Pande- Mechanism 3) Local sampling (thermally accessible barriers) -Bayly & co., WaterMap, MM/PBSA. Qualitative Assessment
Thermodynamic energies and Fables of Physics
“We all know that if we had the perfect force field and simulated for an infinite time, we’d get the right answer” - Woody Sherman, ACS San Francisco, March 24th, 2010
1) pKa, Tautomers
2) Finite temperature, MD & Stat Mech
3) Ergodicity?
4) The illusion of a “perfect” force field (that ≠ QM)
Typical FF Thinking: Polarization
- Polarization is tricky
- But it makes dipoles bigger, e.g. water: 1.85 D (vacuum), 2.5-2.6 D (condensed phase)
- So therefore increase charges by ~15%, e.g. use HF/6-31G*
- Now molecules are roughly correct
Polarization of Dipoles
[Figure: two panels of dipole arrays, one in a Favorable and one in an Unfavorable alignment, each showing the applied field E0 and the polarization field Epol.]
Scaling vs Polarization
Alignment      Scaling Charges    Polarization
Favorable      Lowers Energy      Lowers Energy
Unfavorable    Raises Energy      Lowers Energy
Scaling dipoles can only be accurate on average (with parameterization), not locally!
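The contrast in this slide can be illustrated with a toy one-dimensional model. This is a sketch under stated assumptions, not the author's calculation: a dipole mu sits in the field of its neighbors; "scaling" multiplies both the dipole and the field its (equally scaled) neighbors generate by the same factor; "polarization" adds a linear-response induced dipole alpha * field, with net induction energy -0.5 * alpha * field**2. All numbers are arbitrary units and all names are hypothetical.

```python
def permanent_energy(mu: float, field: float) -> float:
    """Energy of a fixed dipole in a field along its axis: U = -mu * E."""
    return -mu * field


def scaled_energy(mu: float, field: float, scale: float) -> float:
    """Scale every charge: the dipole and the neighbors' field both grow."""
    return -(scale * mu) * (scale * field)


def polarized_energy(mu: float, field: float, alpha: float) -> float:
    """Permanent term plus linear-response induction, which is never positive."""
    return -mu * field - 0.5 * alpha * field ** 2


MU, ALPHA, SCALE = 1.0, 0.3, 1.15  # toy values; 1.15 echoes the ~15% scaling

if __name__ == "__main__":
    # Positive field = favorable alignment, negative field = unfavorable.
    for label, field in (("favorable", 0.5), ("unfavorable", -0.5)):
        base = permanent_energy(MU, field)
        print(label,
              "scaling:", round(scaled_energy(MU, field, SCALE) - base, 4),
              "polarization:", round(polarized_energy(MU, field, ALPHA) - base, 4))
```

In this toy, scaling changes the energy by a factor of scale squared with the sign of the original interaction, so it lowers favorable energies and raises unfavorable ones. The induction term is negative for either field direction.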
JF PID AMOEBA EPIC Quantum mechanics Kim Sharp: Ah, but then there’s AMOEBA (“PB”!) (Jean-Francois Truchon)
Hydrogen Bonds: Formamide dimer
“Close agreement between the orientation dependence of hydrogen bonds observed in protein structures and quantum mechanical calculations” A. V. Morozov, T. Kortemme, K. Tsemekhman and D. Baker, PNAS, Volume 101, page 6946, 2004.
Method      δ(H-A)   ψ        Θ        X
DFT         1.94     112.34   159.43   -177.51
MP2         1.97     110.49   155.33   -179.49
HF          2.10     138.16   170.94   -179.54
CHARMM27    1.82     170.25   170.83   -106.83
OPLS-AA     1.75     165.04   175.61    145.12
MM3-2000    1.98     121.16   161.07    149.63
PDB         1.93     115.00   175.00
Geometry optimizations starting from the Baker MP2 minimum
Model                      Electrostatic Energy (kcal/mol)
Point Monopole             -4.33
Point Dipole               -5.81
Point Quadrupole           -6.36
Point Octupole             -6.31
Exponential Monopole       -7.68
Exponential Dipole         -8.32
Exponential Quadrupole     -8.52
Exponential Octupole       -8.18
CCSD/aug-cc-pVTZ           -8.23
Fitting to the electron density (Denny Elking, Tom Darden)
Or… increase the dipole from 1.85 D to 2.56 D
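To make the comparison in this table explicit, the short sketch below computes each model's deviation from the CCSD/aug-cc-pVTZ value. The energies are copied from the slide; treating the CCSD number as the reference, and the helper names, are assumptions made for illustration.

```python
REFERENCE = -8.23  # CCSD/aug-cc-pVTZ, kcal/mol (taken as the reference here)

MODELS = {
    "Point Monopole": -4.33,
    "Point Dipole": -5.81,
    "Point Quadrupole": -6.36,
    "Point Octupole": -6.31,
    "Exponential Monopole": -7.68,
    "Exponential Dipole": -8.32,
    "Exponential Quadrupole": -8.52,
    "Exponential Octupole": -8.18,
}


def deviations(models: dict, reference: float) -> dict:
    """Signed error of each model relative to the reference, in kcal/mol."""
    return {name: round(energy - reference, 2) for name, energy in models.items()}


def closest_model(models: dict, reference: float) -> str:
    """Name of the model with the smallest absolute error."""
    return min(models, key=lambda name: abs(models[name] - reference))


if __name__ == "__main__":
    for name, err in deviations(MODELS, REFERENCE).items():
        print(f"{name:24s} {err:+.2f}")
    print("closest:", closest_model(MODELS, REFERENCE))
```

On these numbers every point-multipole model is off by roughly 2 to 4 kcal/mol, while every exponential model lands within about 0.6 kcal/mol of the reference.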
Details, Details.. 1) Just incorporate Volume Terms (PB) 2) And all those other terms: - Exchange interactions - VdW anisotropy - pKa & Tautomers - Cross-terms between valence and non-bonded - Three (N) body terms…. Eventually it’ll be right! Woody’ll be right. Inconceivable it can’t ever be right. (Wolynes)
Concrete MD Examples Binding Energies- Shirts - Also Solvation (Simpler system) Protein Trajectories- Shaw - Also Peptides (Simpler systems) “Minimization” – Shoichet - Is a simple system
FKBP-12 Yet Again Retrospective Data that shapes the theory
Contributions to Affinity VdW Coulombic Buried Area Desolvation Entropy Discrete Waters Polarization
Correlations to Affinity VdW Desolvation Entropy Discrete Waters Polarization Buried Area Shape Electrostatics Coulombic
E.g. VdW
Train on 17 HIV-1 Protease Inhibitors
1) Minimization (MM2X)
2) pIC50 = -0.15 * E_inter - 8.1
Prospectively used on 16 more
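The fitted line on this slide is a one-variable linear model, which can be written out as below. This is a sketch: the slope and intercept are read from the slide as -0.15 and -8.1, the unit of E_inter is assumed to be kcal/mol (the slide does not say), and the function name and the example energy are hypothetical.

```python
def predicted_pic50(e_inter: float,
                    slope: float = -0.15,
                    intercept: float = -8.1) -> float:
    """Linear model from the slide: pIC50 = slope * E_inter + intercept.

    e_inter is the minimized intermolecular interaction energy
    (unit assumed to be kcal/mol; not stated in the source).
    """
    return slope * e_inter + intercept


if __name__ == "__main__":
    # Hypothetical interaction energy, chosen only to show the arithmetic.
    print(round(predicted_pic50(-100.0), 2))
```

With a negative slope, a more negative interaction energy maps to a higher predicted pIC50, so the model amounts to ranking ligands by a single minimized energy term.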
E.g. Coulombic Urokinase Coulombic Interaction Brown & Muchmore, JCIM, 2007, (47) 4
E.g. Buried Area (MM-PBSA, DHFR)
“Fast and Accurate Predictions of Binding Free Energies using MM-PBSA and MM-GBSA” Rastelli, G., Del Rio, A., Degliesposti, G., Sgobba, M. J. Comp. Chem. Vol 31, #4, pg 797-810
My observation over 20 years
- For congeneric series, something basic often correlates, sometimes well (VdW, Coulombic)
- For non-congeneric series, usually nothing works
- If something works for non-congenerics, it’s usually something basic (mass, buried area)
SAMPL4: 50 Solvation Energies My PB Method Best MD QM + Specific Group-wise Parameterization
Structural basis for modulation of a G-protein-coupled receptor by allosteric drugs- D. E. Shaw 1)Where they bind - Confirmed by mutagenesis 2) A surprise in how they bind -pi-charge interactions -not charge-charge 3) Cause of allostery: (i)Charge (ii)Binding pocket width -Confirmed by synthesis
IMHO
1) Where they bind - Confirmed by mutagenesis
   Response: Docking with Glide did almost as well. Confirmation is WEAK.
2) How they bind - pi-charge interactions, not charge-charge
   Response: THIS IS NOT A SURPRISE!
3) Cause of allostery:
   (i) Charge
   Response: Already known & follows charge multiplicity exactly.
   (ii) Binding pocket width - Confirmed by synthesis
   Response: ONE CMPD (better than most!)
Also.. Local ionizable residues never (de)protonate – Binding +3 ligands NMS was modeled, not simulated Experimental errors claimed are <0.1 kcal in vivo
Simpler Story: Peptides
Poly-Ala propensities (2010)
- Have to modify FF to get helicity right
Side-chain conformation preferences (2012)
- Little agreement between force-fields
- Poor agreement with crystals (2013)
H-bond geometries (2005)
- Flawed Baker study
Beta-hairpin simulations (2012)
- Little agreement between force-fields
Simple System: Shoichet - Relative binding energies in a cavity
A signal! Maybe not!
- RMSE from Phenol = 2.5 kcal/mol
- RMSE from Catechol = 1.1 kcal/mol
- RMSE of the “NULL” hypothesis = 1.2 kcal/mol
- From “closest” Phenol|Catechol = 0.8 kcal/mol
Poses selected, not found, so is this dynamics or minimization?
NULL MODELS
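Comparing a method's RMSE with a null model's RMSE, as this slide does, takes only a few lines. The sketch below assumes one common null hypothesis, "every ligand has the mean measured value"; the slide does not define its NULL model, so this choice is an assumption, and the data in the example are made up purely to show the mechanics.

```python
import math


def rmse(predicted: list, observed: list) -> float:
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def null_rmse(observed: list) -> float:
    """RMSE of the assumed null model: predict the mean for every ligand."""
    mean = sum(observed) / len(observed)
    return rmse([mean] * len(observed), observed)


def beats_null(predicted: list, observed: list) -> bool:
    """True only if the method's RMSE is strictly below the null model's."""
    return rmse(predicted, observed) < null_rmse(observed)


if __name__ == "__main__":
    # Made-up relative binding energies (kcal/mol), for illustration only.
    observed = [-1.0, 0.0, 1.0, 2.0]
    method = [0.5, -1.5, 2.5, 0.5]
    print(round(rmse(method, observed), 2), round(null_rmse(observed), 2),
          beats_null(method, observed))
```

A method whose RMSE is not clearly below the null value has not shown a signal, however elaborate the calculation behind it.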
One, Inescapable, Conclusion
We cannot calculate the energies of protein microstates with any accuracy
It is unclear even how bad we are
Even ranking must be suspect
Of Dubious Value:
- Ranking ligands, absolute or relative
- Flexible docking
- Protein folding to atomic resolution
- Evaluating unfolded states
- Excursions from the crystal structure
So how can we fold (small) proteins? Luck- are small proteins self-selectingly robust? Some parameterization (Shaw) Stability of kinetic pathways might be more robust than energetics suggest (Pande) ?
But what’s the alternative? To Local Minimization – Sample (MC, Low Mode etc) and minimize To Energy evaluation – Exhaustively sample and minimize To time evolution – Elastic network? Low mode dynamics? – Run MD!
Experiments I Wish Were Done Protein Crystallography – Predict the room temperature density Small molecule NMR – Predict the dominant low energy conformer Protein Electrostatics – Predict potentials in the active site Host-guest systems – Binding energies, salt effects
And how I wish they were done: Maximal Disinformation Testing
1. FIRST calculate for two or more methods, e.g. polarization vs static, PB vs MD, MD vs MM
2. Prospectively measure those systems that most distinguish methods (mutual disinformation)
3. Adapt theories: no one’s perfect!
4. Repeat steps 1, 2 & 3
5. Does a prediction ‘gap’ persist? E.g. Kepler vs Epicycles.
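Step 2 of this procedure, picking the systems on which two methods disagree most, can be sketched as a simple ranking. The slide gives no formula for "mutual disinformation", so using the absolute difference between two methods' predictions is an assumption, and the system names and numbers are hypothetical.

```python
def most_distinguishing(pred_a: dict, pred_b: dict, k: int) -> list:
    """Return the k systems where the two methods' predictions differ most.

    Only systems predicted by both methods are considered; ties are
    broken alphabetically so the result is deterministic.
    """
    shared = pred_a.keys() & pred_b.keys()
    ranked = sorted(shared, key=lambda s: (-abs(pred_a[s] - pred_b[s]), s))
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical predicted binding energies (kcal/mol) from two methods.
    method_a = {"guest1": -5.0, "guest2": -7.5, "guest3": -3.0, "guest4": -6.0}
    method_b = {"guest1": -5.2, "guest2": -4.0, "guest3": -3.1, "guest4": -8.0}
    # Measure these first: they discriminate between the methods the most.
    print(most_distinguishing(method_a, method_b, k=2))
```

Systems where the methods agree cannot tell them apart whatever the experiment shows, so this ranking puts experimental effort where a result is most likely to falsify one of the methods.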
Final Thoughts
- I’d love MD to work! It would make my job easier
- It doesn’t. At least not as advertised/believed
- Its nature (“physics”, big calculations, movies) leads to overconfidence
- Until a more scientific approach is adopted, it’s unlikely to get better. GPUs won’t save MD
- What’s needed is Maximal Disinformation Testing & Model systems