A Probabilistic Approach to Protein Backbone Tracing in Electron Density Maps Frank DiMaio, Jude Shavlik Computer Sciences Department George Phillips Biochemistry.


1 A Probabilistic Approach to Protein Backbone Tracing in Electron Density Maps Frank DiMaio, Jude Shavlik Computer Sciences Department George Phillips Biochemistry Department University of Wisconsin – Madison USA Presented at the Fourteenth Conference on Intelligent Systems for Molecular Biology (ISMB 2006), Fortaleza, Brazil, August 7, 2006

2 X-ray Crystallography (figure: X-ray beam → protein crystal → collection plate → FFT → electron density map, a “3D picture”)

3 Given: Sequence + Electron Density Map

4 Find: Each Atom’s Coordinates

5 Our Subtask: Backbone Trace (figure: Cα positions along the chain)

6 The Unit Cell  3D density function ρ(x,y,z) provided over unit cell  Unit cell may contain multiple copies of the protein

8 Density Map Resolution (figure: resolution axis at 2Å, 3Å, 4Å, showing ARP/wARP (Perrakis et al. 1997), TEXTAL (Ioerger et al. 1999), Resolve (Terwilliger 2002), and our focus)

9 Overview of ACMI (our method)  Local Match: algorithm searches for sequence-specific 5-mers centered at each amino acid (many false positives)  Global Consistency: use probabilistic model to filter false positives and find most probable backbone trace

10 5-mer Lookup and Cluster (figure: a 5-mer from the sequence … VKH V LVSPEKIEELIKGY … is looked up in the PDB and the matches are clustered: Cluster 1, wt=0.67; Cluster 2, wt=0.33) NOTE: can be done in precompute step

11 5-mer Search  6D search (rotation + translation) for representative structures in density map  Compute “similarity”  Computed by Fourier convolution (Cowtan 2001)  Use tuneset to convert similarity score to probability
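The translational half of the search on slide 11 can be illustrated with an FFT-based cross-correlation. This is a minimal sketch, not the authors' implementation: it assumes NumPy, uses a plain correlation as the "similarity", handles one fixed template orientation (a full 6D search would repeat it per rotation), and the function name `translation_scores` is hypothetical.

```python
import numpy as np

def translation_scores(density, template):
    """Circular cross-correlation of a template with a density grid.

    score[s] = sum_x density[x + s] * template[x], with indices wrapping
    around, which suits a periodic unit cell. Computed for all shifts s
    at once as a product in Fourier space.
    """
    f_density = np.fft.fftn(density)
    # Zero-pad the template to the grid size before transforming.
    f_template = np.fft.fftn(template, s=density.shape)
    return np.real(np.fft.ifftn(f_density * np.conj(f_template)))

# Toy example (synthetic data): plant a small template in a noisy grid.
rng = np.random.default_rng(0)
density = rng.normal(size=(8, 8, 8)) * 0.1
template = rng.random((3, 3, 3)) + 1.0
density[2:5, 5:8, 1:4] += template  # planted at offset (2, 5, 1)

scores = translation_scores(density, template)
best_shift = tuple(int(i) for i in np.unravel_index(np.argmax(scores), scores.shape))
```

The FFT evaluates every translation in O(X log X) for a grid of X points, rather than O(X²) by direct summation.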

12 Convert Scores to Probabilities (figure: 5-mer representative → search density map → scores t_i(u_i) → match to tuneset score distributions (POS, NEG) → Bayes’ rule → probability distribution over unit cell, P(5-mer at u_i | Map))
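The score-to-probability step on slide 12 can be sketched as a direct application of Bayes' rule. The Gaussian form of the POS and NEG score distributions, their parameters, and the prior below are all illustrative assumptions; the slide only says that tuneset score distributions are used.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def match_probability(score, pos, neg, prior):
    """P(match | score) by Bayes' rule.

    pos, neg: (mean, std) of the score distribution for true matches and
              non-matches (assumed Gaussian, fitted on a tuning set).
    prior:    P(match) before seeing the score.
    """
    like_pos = gaussian_pdf(score, *pos)
    like_neg = gaussian_pdf(score, *neg)
    numerator = like_pos * prior
    return numerator / (numerator + like_neg * (1.0 - prior))

# Hypothetical tuneset fits and prior, for illustration only.
POS = (0.6, 0.15)
NEG = (0.2, 0.15)
PRIOR = 0.01

p_low = match_probability(0.2, POS, NEG, PRIOR)
p_high = match_probability(0.8, POS, NEG, PRIOR)
```

Applying `match_probability` to the score at every grid point yields a probability map over the unit cell for that amino acid.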

13 In This Talk…  Where we are now: for each amino acid in the protein, we have a probability distribution over the unit cell  Where we are headed: find the most probable backbone layout given the density map

14 Pairwise Markov Field Models  A type of undirected graphical model  Represent joint probabilities as product of vertex and edge potentials  Similar to (but more general than) Bayesian networks (figure: vertices u1, u2, u3 connected to evidence y)
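In symbols, the "product of vertex and edge potentials" for labels U = (u_1, …, u_N) and evidence y can be written as below. The names ψ_i and ψ_ij are notation assumed here for the vertex and edge potentials; the transcript does not preserve the slide's own symbols.

```latex
P(U \mid y) \;\propto\; \prod_{i} \psi_i(u_i \mid y) \; \prod_{i<j} \psi_{ij}(u_i, u_j)
```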

15 Protein Backbone Model (figure: chain ALA GLY LYS LEU)  Each vertex is an amino acid  Each label is location + orientation  Evidence y is the electron density map  Each vertex (or observational) potential comes from the 5-mer matching

16 Protein Backbone Model (figure: chain ALA GLY LYS LEU)  Two types of edge (or structural) potentials: adjacency constraints ensure adjacent amino acids are ~3.8 Å apart and in the proper orientation

17 Protein Backbone Model (figure: chain ALA GLY LYS LEU)  Two types of structural (edge) potentials: adjacency constraints ensure adjacent amino acids are ~3.8 Å apart and in the proper orientation; occupancy constraints ensure nonadjacent amino acids do not occupy the same 3D space
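The two structural potentials can be sketched as simple functions of the distance between two Cα positions. Only the ~3.8 Å spacing comes from the slides; the Gaussian shape, the `sigma` width, the `min_separation` cutoff, and the omission of the orientation term are simplifying assumptions for illustration.

```python
import math

CA_DISTANCE = 3.8  # Å between adjacent Cα atoms (value from the slides)

def adjacency_potential(pos_i, pos_j, sigma=0.3):
    """High when two sequence-adjacent residues are ~3.8 Å apart.

    Gaussian falloff with width sigma is an assumed form; the
    orientation part of the constraint is not modelled here.
    """
    d = math.dist(pos_i, pos_j)
    return math.exp(-0.5 * ((d - CA_DISTANCE) / sigma) ** 2)

def occupancy_potential(pos_i, pos_j, min_separation=3.0):
    """Zero when two nonadjacent residues would occupy the same space.

    A hard step at min_separation (an assumed cutoff in Å).
    """
    return 0.0 if math.dist(pos_i, pos_j) < min_separation else 1.0
```

For example, `adjacency_potential((0, 0, 0), (3.8, 0, 0))` is at its maximum, while `occupancy_potential((0, 0, 0), (1, 0, 0))` rules the pair out.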

18 Backbone Model Potential Constraints between adjacent amino acids: [equation figure]

19 Backbone Model Potential Constraints between nonadjacent amino acids: [equation figure]

20 Backbone Model Potential Observational (“amino-acid-finder”) probabilities: [equation figure]

21 Probabilistic Inference  Exact methods are intractable  Use belief propagation (BP) to approximate marginal distributions  Want to find the most probable backbone layout given the density map

22 Belief Propagation (BP)  Iterative, message-passing method (Pearl 1988)  A message from amino acid i to amino acid j indicates where i expects to find j  An approximation to the marginal (or belief) is given as the product of incoming messages
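Message passing can be illustrated on a toy chain with a handful of discrete positions. This sketch is not ACMI itself: it uses only adjacency edges, so the graph is a chain and sum-product BP is exact (the brute-force check confirms this), whereas the full model with occupancy edges has loops and BP is only approximate. The belief here also multiplies in the vertex's own observational potential, as in standard BP. All function names and potential values are made up for the example.

```python
from itertools import product

def normalize(v):
    total = sum(v)
    return [x / total for x in v]

def chain_beliefs(obs, edge):
    """Sum-product BP on a chain.

    obs[i][a]:  vertex potential of residue i at position a.
    edge[a][b]: potential for residue i at a and residue i+1 at b.
    """
    n, k = len(obs), len(obs[0])
    fwd = [[1.0] * k for _ in range(n)]  # message into i from i-1
    bwd = [[1.0] * k for _ in range(n)]  # message into i from i+1
    for i in range(1, n):
        fwd[i] = normalize([
            sum(obs[i - 1][a] * fwd[i - 1][a] * edge[a][b] for a in range(k))
            for b in range(k)])
    for i in range(n - 2, -1, -1):
        bwd[i] = normalize([
            sum(obs[i + 1][b] * bwd[i + 1][b] * edge[a][b] for b in range(k))
            for a in range(k)])
    # Belief: own potential times the product of incoming messages.
    return [normalize([obs[i][a] * fwd[i][a] * bwd[i][a] for a in range(k)])
            for i in range(n)]

def brute_force_marginals(obs, edge):
    """Exact marginals by enumerating every layout (tiny problems only)."""
    n, k = len(obs), len(obs[0])
    marg = [[0.0] * k for _ in range(n)]
    for labels in product(range(k), repeat=n):
        w = 1.0
        for i, a in enumerate(labels):
            w *= obs[i][a]
        for i in range(n - 1):
            w *= edge[labels[i]][labels[i + 1]]
        for i, a in enumerate(labels):
            marg[i][a] += w
    return [normalize(m) for m in marg]

# Three residues, four positions on a line; neighbours prefer to be one step apart.
edge = [[1.0 if abs(a - b) == 1 else 0.05 for b in range(4)] for a in range(4)]
obs = [[0.7, 0.1, 0.1, 0.1],
       [0.25, 0.25, 0.25, 0.25],   # middle residue: uninformative evidence
       [0.1, 0.1, 0.6, 0.2]]
beliefs = chain_beliefs(obs, edge)
exact = brute_force_marginals(obs, edge)
```

Although the middle residue has flat evidence, the messages from its neighbours concentrate its belief on the position between them.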

23 Belief Propagation Example (figure: messages between ALA and GLY)

24 Technical Challenges  Representation of potentials: store Fourier coefficients in Cartesian space; at each location x, store a single orientation r  Speeding up the O(N²X²) naïve implementation, where X = the unit cell size (# Fourier coefficients) and N = the number of residues in the protein

25 Speeding Up O(N²X²) Implementation  O(X²) computation for each occupancy message: each message must integrate over the unit cell; O(X log X) as multiplication in Fourier space  O(N²) messages computed & stored: approximate N-3 occupancy messages with a single message; O(N) messages using a message product accumulator  Improved implementation: O(NX log X)
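The accumulator idea can be sketched in a generic form: when every node needs the product of the messages from all other nodes, one running product over all N messages is built once and each node's own term is divided out, giving O(N) work per grid point instead of O(N²). This illustrates the general trick only and is not the exact ACMI scheme; it assumes strictly positive messages, works in log space for stability, and the function names are hypothetical.

```python
import math

def products_excluding_self(messages):
    """For each sender i, the elementwise product of all other senders' messages.

    Builds one log-space accumulator over all messages, then subtracts
    each sender's own contribution. Requires every entry to be > 0.
    """
    k = len(messages[0])
    log_acc = [sum(math.log(m[a]) for m in messages) for a in range(k)]
    return [[math.exp(log_acc[a] - math.log(m[a])) for a in range(k)]
            for m in messages]

def products_excluding_self_naive(messages):
    """Reference version: recompute each product from scratch, O(N^2)."""
    k = len(messages[0])
    out = []
    for i in range(len(messages)):
        prod = [1.0] * k
        for j, m in enumerate(messages):
            if j != i:
                prod = [p * m[a] for a, p in enumerate(prod)]
        out.append(prod)
    return out

# Four senders, three grid points (made-up values).
messages = [[0.9, 0.5, 0.2],
            [0.8, 0.4, 0.6],
            [0.3, 0.7, 0.5],
            [0.6, 0.6, 0.9]]
fast = products_excluding_self(messages)
slow = products_excluding_self_naive(messages)
```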

26 1XMT at 3Å Resolution: 1.12Å RMSd, 100% coverage (color scale: prob(AA at location), LOW 0.17 to HIGH 0.82)

27 1VMO at 4Å Resolution: 3.63Å RMSd, 72% coverage (color scale: prob(AA at location), LOW 0.02 to HIGH 0.25)

28 1YDH at 3.5Å Resolution: 1.47Å RMSd, 90% coverage (color scale: prob(AA at location), LOW 0.02 to HIGH 0.27)

29 Experiments  Tested ACMI against other map interpretation algorithms: TEXTAL and Resolve  Used ten model-phased maps  Smoothly diminished reflection intensities yielding 2.5, 3.0, 3.5, 4.0 Å resolution maps

30 RMS Deviation (plot: Cα RMS deviation vs. density map resolution for ACMI, TEXTAL, and Resolve)

31 Model Completeness (plots: % chain traced and % residues identified vs. density map resolution for ACMI, TEXTAL, and Resolve)

32 Per-protein RMS Deviation (plot axes: ACMI RMS error, TEXTAL RMS error, Resolve RMS error)

33 Conclusions  ACMI effectively combines weakly-matching templates to construct a full model  Produces an accurate trace even with poor-quality density map data  Reduces computational complexity from O(N²X²) to O(NX log X)  Inference possible for even large unit cells

34 Future Work  Improve “amino-acid-finding” algorithm  Incorporate sidechain placement / refinement  Manage missing data Disordered regions Only exterior visible (e.g., in CryoEM)

35 Acknowledgements  Ameet Soni  Craig Bingman  NLM grants 1R01 LM008796 and 1T15 LM007359

