CAPRA: C-Alpha Pattern Recognition Algorithm Thomas R. Ioerger Department of Computer Science Texas A&M University.

1 CAPRA: C-Alpha Pattern Recognition Algorithm Thomas R. Ioerger Department of Computer Science Texas A&M University

11 Overview of CAPRA
goal: predict CA chains from density map
not just “tracing” - more than Bones
desire 1:1 correspondence, ~3.8A apart
based on principles of pattern recognition
–use neural net to estimate which pseudo-atoms in trace “look” closest to true C-alphas
–use feature extraction to capture 3D patterns in density for input to neural net
–use other heuristics for “linking” together into chains, including geometric analysis (s.s.)

12 What can you do with CA chains?
build in side-chain and backbone atoms
–TEXTAL, Segment-Match Modeling (Levitt), Holm and Sander
recognize fold from secondary structure
–identify candidates for molecular replacement
evaluate map quality (num/len of chains)
density modification
–create poly-alanine backbone and use it to do phase recombination

13 Role in Automated Model Building
Model building is one of the bottlenecks in high-throughput Structural Genomics
Automation is needed
[diagram labels: TEXTAL, CAPRA, PHENIX, reflections, map, model, CA chains, (ha/dm/ncs), refinement]

14 Steps in CAPRA

15 Examples of CAPRA Steps

16 Tracer

17 Neural Network

18 Feature Extraction
characterize 3D patterns in local density
must be “rotation invariant”
examples:
–average density in region
–standard deviation, kurtosis...
–distance to center of mass
–moments of inertia, ratios of moments
–“spoke angles”
calculated over spheres of 3A and 4A radius
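
A minimal sketch of a few of the rotation-invariant features listed above (average density, standard deviation, distance to center of mass) over a sphere of given radius. The function name `local_features`, the input layout (a list of grid samples with their density values), and the assumption of non-negative density are illustrative choices, not taken from CAPRA itself.

```python
import math

def local_features(points, center, radius):
    """Summarize density within `radius` of `center`.

    points: iterable of ((x, y, z), density) grid samples (hypothetical layout).
    Every returned value is unchanged if the map is rotated about `center`.
    """
    inside = [(p, d) for p, d in points if math.dist(p, center) <= radius]
    if not inside:
        raise ValueError("no grid points inside the sphere")
    n = len(inside)
    total = sum(d for _, d in inside)
    mean = total / n
    std = math.sqrt(sum((d - mean) ** 2 for _, d in inside) / n)
    # Density-weighted center of mass; assumes the summed density is positive.
    if total > 0:
        com = tuple(sum(p[i] * d for p, d in inside) / total for i in range(3))
        com_dist = math.dist(com, center)
    else:
        com_dist = 0.0
    return {"mean": mean, "std": std, "com_dist": com_dist}
```

Calling it once with `radius=3.0` and once with `radius=4.0` would give two feature sets per pseudo-atom, matching the two sphere sizes mentioned on the slide.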

19 Forward Propagation: Backward Propagation:
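
The equations for this slide are not preserved in the transcript. As a generic illustration only, the sketch below shows forward and backward propagation for a small feed-forward network with one sigmoid hidden layer and a linear output trained on squared error. The architecture, the dictionary layout of `net`, and the learning rate are assumptions and may differ from the network used in CAPRA; the output stands in for the predicted distance to the nearest true C-alpha.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(net, x):
    """Forward propagation: hidden activations, then a linear output unit."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["W1"], net["b1"])]
    y = sum(w * hj for w, hj in zip(net["w2"], h)) + net["b2"]
    return h, y

def train_step(net, x, target, lr=0.1):
    """Backward propagation: one gradient-descent update on 0.5*(y - target)**2.

    Returns the loss measured before the update.
    """
    h, y = forward(net, x)
    err = y - target
    for j, hj in enumerate(h):
        # Error signal at hidden unit j, computed with the pre-update w2[j].
        delta = err * net["w2"][j] * hj * (1.0 - hj)
        net["w2"][j] -= lr * err * hj
        net["b1"][j] -= lr * delta
        for i, xi in enumerate(x):
            net["W1"][j][i] -= lr * delta * xi
    net["b2"] -= lr * err
    return 0.5 * err * err
```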

20

21 Selection of Candidate C-alphas
method:
–pick candidates in order of lowest predicted distance first,
–among all pseudo-atoms in trace,
–as long as not closer than 2.5A to an already-picked candidate
notes:
–no 3.8A constraint; distance can be as high as 5A
–don’t rely on branch points (though often near)
–picked in random order throughout map
–initially covers whole map, including side-chains and disconnected regions (e.g. noise in solvent)
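
The selection rule above can be sketched as a greedy pass over the pseudo-atoms sorted by predicted distance. The function name `select_candidates` and the input layout are hypothetical; only the ordering rule and the 2.5A exclusion radius come from the slide.

```python
import math

def select_candidates(pseudo_atoms, min_sep=2.5):
    """Greedy pick of C-alpha candidates.

    pseudo_atoms: list of ((x, y, z), predicted_distance_to_true_CA).
    A pseudo-atom is accepted if it lies at least `min_sep` angstroms
    from every candidate accepted before it.
    """
    chosen = []
    for pos, pred in sorted(pseudo_atoms, key=lambda a: a[1]):
        if all(math.dist(pos, c) >= min_sep for c, _ in chosen):
            chosen.append((pos, pred))
    return chosen
```

Because acceptance depends only on the exclusion radius, neighbouring candidates can end up anywhere from 2.5A to roughly 5A apart, which is consistent with the note that no 3.8A constraint is imposed at this stage.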

22 Linking into Chains
initial connectivity of CA candidates based on the trace
“over-connected” graph - branches, cycles...
start by computing connected components (islands, or clusters)
two strategies:
–for small clusters (<=20 candidates), find longest internal chain with “good” atoms
–for large clusters (>20 candidates), incrementally clip branch points using heuristics
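
A minimal sketch of the first step, computing connected components of the candidate graph and routing them to the two strategies by size. Candidates are assumed to be numbered 0..n-1 and links given as index pairs; these conventions and the function names are illustrative, while the 20-candidate threshold is from the slide.

```python
from collections import deque

def connected_components(n, links):
    """Return the clusters of an undirected graph on nodes 0..n-1."""
    adj = {i: set() for i in range(n)}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), []
        while queue:  # breadth-first sweep of one cluster
            u = queue.popleft()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        comps.append(sorted(comp))
    return comps

def split_by_size(comps, small_max=20):
    """Small clusters go to exhaustive search, large ones to branch clipping."""
    small = [c for c in comps if len(c) <= small_max]
    large = [c for c in comps if len(c) > small_max]
    return small, large
```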

23 Extracting Chains from Small Clusters
exhaustive depth-first search of all paths
scoring function:
–length
–penalty for inclusion of points with high predicted distance to true CA by neural net
–preference for following secondary structure (locally straight or helical)
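
A sketch of the exhaustive depth-first search over simple paths in a small cluster. The score here uses only the first two terms of the scoring function (length, minus a penalty for "bad" atoms); the secondary-structure preference is omitted, and the values of `bad_cutoff` and `penalty` are made-up placeholders, not CAPRA's parameters.

```python
def best_chain(adj, pred_dist, bad_cutoff=1.0, penalty=2.0):
    """Return the highest-scoring simple path in a small cluster.

    adj: dict node -> set of neighbouring nodes.
    pred_dist: dict node -> predicted distance to a true C-alpha.
    """
    def score(path):
        bad = sum(1 for n in path if pred_dist[n] > bad_cutoff)
        return len(path) - penalty * bad

    best = []

    def dfs(path, visited):
        nonlocal best
        if score(path) > score(best):
            best = list(path)
        for nxt in sorted(adj[path[-1]]):
            if nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                dfs(path, visited)
                path.pop()
                visited.remove(nxt)

    for start in sorted(adj):
        dfs([start], {start})
    return best
```

The number of simple paths grows very quickly with cluster size, which is presumably why this strategy is reserved for clusters of at most 20 candidates.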

24 Secondary Structure Analysis
generate all 7-mers (connected fragments of candidate CAs of length 7)
evaluate “straightness”
–ratio of end-to-end distance to sum of link lengths
–straightness>0.8 ==> potential beta-strand
evaluate “helicity”
–average absolute deviation of angles and torsions along 7-mer from ideal values (95° and 50°)
–helicity below a cutoff (value lost in transcript) ==> potential alpha-helix
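
A minimal sketch of the straightness test for a 7-mer. Straightness is taken here as end-to-end distance divided by the summed link lengths, so that it lies between 0 and 1 and the 0.8 cutoff is meaningful. The function names are hypothetical, and the helicity measure is left out.

```python
import math

def straightness(coords):
    """End-to-end distance over summed link lengths (1.0 = perfectly straight)."""
    path = sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))
    return math.dist(coords[0], coords[-1]) / path

def is_potential_strand(coords, cutoff=0.8):
    """Apply the beta-strand test to a fragment of exactly 7 candidate CAs."""
    return len(coords) == 7 and straightness(coords) > cutoff
```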

25 Handling Large Clusters
start by breaking cycles (near “bad” atoms)
clip links at branch points till only linear chains remain
clip the most “obvious” links first, e.g.
–if other two links are part of sec. struct.
–if clipped branch has “bad” atom nearby
–if clipped branch is small and other 2 are large

26 Results

27 Analysis of RMS by Sec. Struct. (DSSP)

28 Example of CA-chains for CzrA fit by CAPRA

29 Results for MVK

30 Effect of Resolution
IF5a
–initial map: 2.1A, RMS error: 1.23A
–limited map: 2.8A, RMS error: 0.86A
PCAa (2Fo-Fc)
–initial map: 2.0A, RMS error: 1.1A
–limited map: 2.8A, RMS error: 0.82A

31 Effect of Density Modification
anecdotal evidence from ICL
–before DM: many short, broken chains
–after DM: longer chains, reasonable model
hard to quantify, but the moral is:
–the accuracy of CAPRA results depends on “quality” of density, and CAPRA might not give useful results in noisy maps
experiments with “blurring” maps
–convolution with Gaussian by FFT

32 Future Work
build poly-alanine
–must determine directionality
–currently done as part of TEXTAL (fits backbone carbonyls as well as side-chain atoms)
connect ends of chains
–improve robustness to breaks in density
use partial models to improve phases and hence make better maps (iteratively)
–a new form of density modification?

33 Related Approaches
Resolve (Terwilliger)
–template convolution search, max. likelihood
MAID (D. Levitt)
–density correlation search, grow ends
Critical-point analysis (Glasgow/Fortier)
ARP/wARP (Perrakis and Lamzin)
MAIN (D. Turk)
–chiral carbons; iterate: extend ends, phase recomb.
X-Powerfit (T. Oldfield, MSI)

34 Availability
on pompano, add /xray/textal/bin/capra to your path
run ‘capra ’ where.xplor is your map in X-PLOR fmt
map should cover at least one whole molecule, though smaller=faster
takes minutes to an hour (especially for feature calculations)
any space group & unit cell
resolution: 2.2-3.2A, 2.8A recommended
remember: quality of density must be high, e.g. post-solvent-flattening, etc.

35 Acknowledgements
Funding
–National Institutes of Health
–Welch Foundation
People
–Dr. James C. Sacchettini
–The TEXTAL Group! Tod Romo, Kreshna Gopal, Reetal Pai

