Winners and Losers: Ranking Crystals from Diffraction Images Angela R. Criswell Automation Scientist.


1 Winners and Losers: Ranking Crystals from Diffraction Images Angela R. Criswell Automation Scientist

2 ACTOR Installations
Pharmaceutical Companies (11)
  - Abbott Laboratories (Chicago, IL)
  - Astex Technology (UK)
  - AstraZeneca (UK)
  - Aventis (Frankfurt)
  - BMS (Princeton, NJ)
  - Exelixis (San Francisco, CA)
  - Merck (West Point, PA)
  - Novartis (Basel, Switzerland)
  - Novartis (Cambridge, MA)
  - Pfizer (St. Louis, MO)
  - Schering-Plough Research Inst. (NJ)
Structural Genomics Groups (3)
  - SGC – Oxford (UK)
  - University of Georgia
  - University of Toronto
Beamlines (2)
  - Daresbury Laboratory (UK)
  - IMCA-CAT (APS)
Future Installations (4)
  - 2 additional beamlines (SLS, Diamond)
  - 1 pharmaceutical company
AGENT Installations (3)
  - ActiveSight (San Diego, CA)
  - 2 future pharmaceutical sites

3 High Throughput Optimization
Automate the processes
  - Crystallization robots
  - Sample mounting robots
  - Automated structure solution
Increase robustness for automated processes
  - Hardware and software improvements
  - Sample tracking methods and database management
Ever increasing complexity
  - Incorporate intelligence and examine success/failure
  - Heuristic and learning methods
  - Remote access and control of automated processes
  - VNC and mail-in crystallography
  - Diffraction improvement by controlled hydration
  - Free-mounting system (Proteros)

4 Crystal Ranking: An Evolution

5 How do Crystallographers Rank Crystals??
  - Do I have another crystal??
  - Is the crystal twinned?
  - How far does the crystal diffract?
  - Are there ice rings?
  - Do peaks have decent spot shapes?
  - Can I assign a unit cell for the sample?
  - What are the unit cell dimensions and space group?
I/sig(I) analysis is not sufficient
Single image is probably not sufficient

6 Crystal Ranking Efforts
d*TREK (Rigaku/MSC - Pflugrath)
  - Automatic indexing, ranking, strategy, integration, scaling
DISTL and LABELIT (SSRL & LBNL)
  - Automatic ranking and indexing, data processing
DNA (SPINE)
  - Automatic ranking and indexing
CrySis (Brookhaven – Bernston, Stojanoff, and Takai)
  - Ranking with a neural network trained with 500 diffraction images
BEST (EMBL – Popov)
  - Data collection strategy based upon statistical modeling

7 SpamAssassin
SCORE: Advertisement for SuperBowl Celebration Event
No. hits=3.9 Required=4.0
  - tests= HTML_60_70 HTML_FONTCOLOR_RED HTML_FONTCOLOR_UNSAFE HTML_FONT_INVISIBLE HTML_MESSAGE HTTP_ESCAPED_HOST HTTP_EXCESSIVE_ESCAPES LINES_OF_YELLING
Performs cursory header analysis: spots messages that try to mask their identities
Performs in-depth text analysis: spam mails often have a characteristic style (to put it politely)
  - characteristic disclaimers and lots of !!!!!
  - webpage links
Enables blacklisting: blocks mail from existing blacklist sites
Adaptive: learns to recognize spam based upon user scores and amends blacklists

8 Strategic Ranking Goals
Incorporate image analysis tools alone
  - Diffraction limits
  - Bragg peak intensities
  - Background radiation
  - Ice ring identification – strong and diffuse
Incorporate indexing and refinement results
  - Spot shape
  - Lattice quality
  - Spot prediction analysis (discriminates twinned from non-twinned crystals)
Incorporate comparative analysis
  - Between samples (rank comparisons)
  - Images collected for same sample (different crystal orientations)
  - Automatic exposure time determination

9 Rules 1 and 2
Divide image into 10 resolution bins. Ignore lowest 3 bins.
Analyze 7 highest resolution shells:
  - # reflns / shell
  - S:N of reflns / shell
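The binning behind Rules 1 and 2 can be sketched as follows. This is only an illustration, not the d*TREK implementation: the slide does not say how the 10 bins are spaced, so the sketch assumes shells of equal reciprocal-space volume (equal steps in 1/d³). The function name `shell_stats` and the `(d_spacing, i_over_sigma)` input format are hypothetical.

```python
from statistics import mean


def shell_stats(spots, d_low, d_high, n_bins=10, n_skip=3):
    """Count spots and average I/sigma in the highest-resolution shells.

    spots  : iterable of (d_spacing_in_angstrom, i_over_sigma) pairs
    d_low  : low-resolution limit in angstrom (large d)
    d_high : high-resolution limit in angstrom (small d)

    Assumption (not stated on the slide): bins are equally spaced in 1/d^3.
    """
    s_lo, s_hi = d_low ** -3, d_high ** -3
    width = (s_hi - s_lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for d, i_sig in spots:
        s = d ** -3
        if s < s_lo or s > s_hi:
            continue  # outside the analysed resolution range
        idx = min(int((s - s_lo) / width), n_bins - 1)
        bins[idx].append(i_sig)

    # Ignore the lowest-resolution bins; keep the remaining shells.
    results = []
    for shell_no, values in enumerate(bins[n_skip:], start=1):
        results.append({
            "shell": shell_no,
            "count": len(values),                       # Rule 1 input
            "mean_i_sig": mean(values) if values else 0.0,  # Rule 2 input
        })
    return results
```

With the defaults, the function returns seven entries, one per analysed shell, numbered from lowest to highest resolution.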

10 Rule 3: Spot Sharpness
Calculated for every peak:
  $\mathrm{output} = \mathrm{avg}\,[\,2(A/B)\,]$
  $A = \text{peak max position} - \text{peak center position}$
  $B = (\Delta x^2 + \Delta y^2)^{1/2}$
B is the effective diameter of the peak.
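A rough sketch of the sharpness measure, under stated assumptions: A is taken as the distance between the brightest pixel and the peak centroid, and Δx, Δy are taken as the peak extents x2 − x1 and y2 − y1. Neither reading is confirmed by the slide, and the dictionary keys and the function name `spot_sharpness` are hypothetical.

```python
import math


def spot_sharpness(peaks):
    """Average of 2*A/B over all peaks.

    Each peak is a dict with (hypothetical) keys:
      max_x, max_y : position of the brightest pixel
      cen_x, cen_y : position of the peak centre (centroid)
      x1, x2, y1, y2 : extent of the peak on the detector
    """
    ratios = []
    for p in peaks:
        # A: offset between peak maximum and peak centre (assumed a distance)
        a = math.hypot(p["max_x"] - p["cen_x"], p["max_y"] - p["cen_y"])
        # B: effective diameter, (dx^2 + dy^2)^(1/2)
        b = math.hypot(p["x2"] - p["x1"], p["y2"] - p["y1"])
        if b > 0:
            ratios.append(2.0 * a / b)
    return sum(ratios) / len(ratios) if ratios else 0.0
```

A peak whose maximum sits on its centre contributes 0; the contribution grows as the maximum moves toward the edge of the peak.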

11 Rules 4 – 5: Ice Ring Detection
Step 1: filter out peaks from images
Step 2: bin pixels by 2θ
Step 3: for each bin, sum pixel intensities
Example plot:
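The three steps can be sketched as a radial intensity profile. This is a minimal illustration assuming a flat detector perpendicular to the beam; the function name `radial_profile`, the peak mask input, and the geometry parameters are hypothetical, and the slide does not say how rings are then identified in the profile.

```python
import math


def radial_profile(image, peak_mask, beam_x, beam_y,
                   distance_mm, pixel_mm, bin_deg=0.1):
    """Sum non-peak pixel intensities in bins of 2-theta.

    image     : 2D list of pixel intensities (rows of columns)
    peak_mask : 2D list of booleans, True where a Bragg peak was found
    Returns a list of (bin_start_in_degrees, summed_intensity).
    """
    sums = {}
    for row, (values, masked) in enumerate(zip(image, peak_mask)):
        for col, (value, is_peak) in enumerate(zip(values, masked)):
            if is_peak:
                continue  # Step 1: filter out peaks
            # Step 2: convert pixel position to 2-theta and pick a bin
            r_mm = math.hypot(col - beam_x, row - beam_y) * pixel_mm
            two_theta = math.degrees(math.atan2(r_mm, distance_mm))
            k = int(two_theta / bin_deg)
            # Step 3: accumulate intensity per bin
            sums[k] = sums.get(k, 0.0) + value
    return [(k * bin_deg, sums[k]) for k in sorted(sums)]
```

An ice ring would show up as a bump in this profile at the corresponding 2θ.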

12 Lysozyme 2_05 rank = 202

13 Lysozyme 2_01 rank = 179

14 Lysozyme 2_10 rank = 124

15 Rules 6 – 11
6. Indexing: Award for percentage of indexed spots
7. Refinement: Penalty based upon RMS MM residual
8. Mosaicity: Penalty based upon refined mosaicity
9. Refinement Coverage: Award for percentage of accepted reflections in prediction list
10. Prediction: Re-evaluate highest 7 resolution shells based upon number of found spots that match predicted reflection list
11. Refined Reflection Resolution: Re-evaluate highest 7 resolution shells based upon the signal-to-noise ratio of predicted reflections

16 Ranking Results
Rule 1: Spot count in resolution shells (found spots)
Rule 2: I/Sigma in resolution shells (found spots)
Rule 3: Spot sharpness
Rule 4: Strong ice rings
Rule 5: Diffuse ice rings
Rule 6: Percentage of spots indexed
Rule 7: RMS residual after refinement
Rule 8: Mosaicity
Rule 9: Percentage of spots refined
Rule 10: Spot count in resolution shells (predicted and found spots)
Rule 11: I/Sigma in resolution shells (predicted and found spots)
Columns: Sample / Rules / Total
Sample: L:\Images\lyso101_????.osc

17 Sample Group #1 Tests with Lysozyme crystals

18 Lysozyme 2_05 rank = 202
Category | Points | Cumul
  >=5 reflns found in 2nd shell ( )Å
  >=5 reflns found in 3rd shell ( )Å
  >=5 reflns found in 4th shell ( )Å
  >=5 reflns found in 5th shell ( )Å
  >=5 reflns found in 6th shell ( )Å
  >=5 reflns found in 7th shell ( )Å
  I/sig == 44.8 in 2nd found shell ( )Å | 7 | 67
  I/sig == 56.8 in 3rd found shell ( )Å | 9 | 76
  I/sig == 60.1 in 4th found shell ( )Å
  I/sig == 67.7 in 5th found shell ( )Å
  I/sig == 74.2 in 6th found shell ( )Å
  I/sig == 89.7 in 7th found shell ( )Å
  Penalty for spot sharpness of
  Penalty for strong ring (2.82%) near resln
  Penalty for diffuse ring (0.70%) near resln
  Indexed 404 spots, or 75% of all spots used in indexing
  Penalty for RMS residual value of
  Penalty for Mosaicity value of
  Refined 44 spots, or 4% of all predictions
  >=5 reflns predicted and found in 5th shell ( )Å
  >=5 reflns predicted and found in 6th shell ( )Å
  >=5 reflns predicted and found in 7th shell ( )Å
  I/sig == 77.7 in 5th predicted and found shell ( )Å
  I/sig == 80.8 in 6th predicted and found shell ( )Å
  I/sig == 94.5 in 7th predicted and found shell ( )Å
Cumulative 202

19 Lysozyme 2_01 rank = 179
Category | Points | Cumul
  >=5 reflns found in 2nd shell ( )Å
  >=5 reflns found in 3rd shell ( )Å
  >=5 reflns found in 4th shell ( )Å
  >=5 reflns found in 5th shell ( )Å
  >=5 reflns found in 6th shell ( )Å
  >=5 reflns found in 7th shell ( )Å
  I/sig == 49.8 in 2nd found shell ( )Å | 8 | 68
  I/sig == 47.0 in 3rd found shell ( )Å | 7 | 75
  I/sig == 52.8 in 4th found shell ( )Å | 8 | 83
  I/sig == 65.7 in 5th found shell ( )Å
  I/sig == 69.9 in 6th found shell ( )Å
  I/sig == 86.8 in 7th found shell ( )Å
  Penalty for spot sharpness of
  Penalty for strong ring (2.78%) near resln
  Penalty for diffuse ring (0.55%) near resln
  Indexed 342 spots, or 56% of all spots used in indexing
  Penalty for RMS residual value of
  Penalty for Mosaicity value of
  Refined 24 spots, or 2% of all predictions
  >=5 reflns predicted and found in 4th shell ( )Å
  >=5 reflns predicted and found in 5th shell ( )Å
  >=5 reflns predicted and found in 6th shell ( )Å
  I/sig == 44.4 in 4th predicted and found shell ( )Å
  I/sig == 87.2 in 5th predicted and found shell ( )Å
  I/sig == 67.0 in 6th predicted and found shell ( )Å
Cumulative 179

20 Lysozyme 2_10 rank = 124
Category | Points | Cumul
  >=5 reflns found in 3rd shell ( )Å
  >=5 reflns found in 4th shell ( )Å
  >=5 reflns found in 5th shell ( )Å
  >=5 reflns found in 6th shell ( )Å
  >=5 reflns found in 7th shell ( )Å
  I/sig == 54.8 in 3rd found shell ( )Å | 9 | 59
  I/sig == 55.3 in 4th found shell ( )Å | 9 | 68
  I/sig == 64.3 in 5th found shell ( )Å
  I/sig == 72.1 in 6th found shell ( )Å
  I/sig == 86.1 in 7th found shell ( )Å
  Penalty for spot sharpness of
  Penalty for strong ring (2.64%) near resln
  Penalty for strong ring (2.05%) near resln
  Penalty for strong ring (1.84%) near resln
  Penalty for strong ring (6.76%) near resln
  Penalty for strong ring (7.87%) near resln
  Penalty for strong ring (4.78%) near resln
  Indexed 305 spots, or 58% of all spots used in indexing
  Penalty for RMS residual value of
  Penalty for Mosaicity value of
  >=5 reflns predicted and found in 5th shell ( )Å
  >=5 reflns predicted and found in 6th shell ( )Å
  >=5 reflns predicted and found in 7th shell ( )Å
  I/sig == 57.8 in 5th predicted and found shell ( )Å
  I/sig == 61.2 in 6th predicted and found shell ( )Å
  I/sig == in 7th predicted and found shell ( )Å
Cumulative 124

21 Lysozyme 4_12 rank = 112
Category | Points | Cumul
  >=5 reflns found in 5th shell ( )Å
  >=5 reflns found in 6th shell ( )Å
  >=5 reflns found in 7th shell ( )Å
  I/sig == 15.7 in 5th found shell ( )Å | 2 | 32
  I/sig == 19.5 in 6th found shell ( )Å | 3 | 35
  I/sig == 22.9 in 7th found shell ( )Å | 3 | 38
  Penalty for spot sharpness of
  Penalty for strong ring (1.09%) near resln
  Indexed 242 spots, or 57% of all spots used in indexing
  Penalty for RMS residual value of
  Penalty for Mosaicity value of
  Refined 186 spots, or 19% of all predictions
  >=5 reflns predicted and found in 5th shell ( )Å
  >=5 reflns predicted and found in 6th shell ( )Å
  >=5 reflns predicted and found in 7th shell ( )Å
  I/sig == 17.6 in 5th predicted and found shell ( )Å
  I/sig == 19.7 in 6th predicted and found shell ( )Å
  I/sig == 22.4 in 7th predicted and found shell ( )Å
Cumulative 112
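The Points and Cumul columns in these score sheets read as a running total that ends in the sample's rank value. A small sketch of that bookkeeping follows, using only numbers that survive in the transcript. The starting value of 60 is inferred (67 − 7 for 2_05, 68 − 8 for 2_01) rather than stated, and the function names `running_tally` and `rank_samples` are hypothetical.

```python
def running_tally(start, points):
    """Return the cumulative score after each award or penalty."""
    total = start
    cumul = []
    for p in points:
        total += p  # awards are positive, penalties negative
        cumul.append(total)
    return cumul


def rank_samples(totals):
    """Order sample names from highest to lowest cumulative score."""
    return sorted(totals, key=totals.get, reverse=True)


# Cumulative totals as listed on the four Lysozyme score sheets.
totals = {
    "Lysozyme 2_05": 202,
    "Lysozyme 2_01": 179,
    "Lysozyme 2_10": 124,
    "Lysozyme 4_12": 112,
}

# I/sig rows for Lysozyme 2_05 (2nd and 3rd found shells).
print(running_tally(60, [7, 9]))
print(rank_samples(totals))
```

Sorting by the cumulative total puts 2_05 first and 4_12 last, which is the winners-and-losers ordering the talk is after.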

22 Effect of Indexing on Rank Values

23 Score Variability: Rank Values vs. Exposure Time
Columns: Images / Rules / Total
Thaumatin – 5 sec/0.5°: R merge = 12.9 % (32.5 %)
  - Images: thau3 501, thau, thau, thau, thau
Thaumatin – 10 sec/0.5°: R merge = 10.3 % (27.5 %)
  - Images: thau3 1001, thau, thau, thau, thau
Thaumatin – 30 sec/0.5°: R merge = 8.4 % (25.8 %)
  - Images: thau3 3001, thau, thau, thau, thau

24 Score Variability: Data sets collected with VariMax optics
Columns: Images / Rules / Total
VariMax-HR: R merge = 2.9 % (22.3 %)
  - Images: LYS0503_screen (3 images), LYS0503_ (4 images)
VariMax-HR: R merge = 2.8 % (15.0 %)
  - Images: LYS0503_screen (3 images), LYS0503_ (4 images)

25 What Have We Learned?
Signal-to-noise is the predominant factor in the current d*TREK release
  - This is intentional! Should it be?
  - Each of the 11 rules has independent parameters that can be adjusted to optimize for your case
Image processing adds a domino effect to ranking
  - Better refinement, higher rank
  - Lower mosaicity, higher rank
  - Fewer twin spots, higher rank
Spot sharpness analysis is not robust
  - Incorporate graph theory
Potential Pitfalls
  - Weak diffractors
      - lowest 3 resolution bins should not be excluded from spot analysis
  - Image Header Accuracies
  - Anisotropy
      - Need images at multiple angles
      - These effects become effectively ‘averaged’ across images
  - Merohedral twinning

26 Recent d*TREK Improvements
Don’t ignore lowest resolution bins
Image Header Accuracies
  - Command line override
Anisotropy
  - Incorporated anisotropy check and another rule
  - Rank each image, calculate average and ESD
  - Apply penalty as multiple of ESD
Data Collection Strategy improvements
  - Automatic exposure time calculation (using ‘intelligent’ algorithm)
  - Optimize detector space for diffraction resolution
  - Multiple scan strategy, if possible
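The anisotropy rule (rank each image, take the average and ESD, penalize by a multiple of the ESD) might look like the sketch below. The multiplier `k`, the use of the sample standard deviation as the ESD, and the function name `anisotropy_adjusted_rank` are all assumptions; the slide does not give the actual multiple or the per-image ranks used here.

```python
from statistics import mean, stdev


def anisotropy_adjusted_rank(image_ranks, k=1.0):
    """Average per-image rank minus k times the spread between images.

    image_ranks : rank values of images of one sample taken at different
                  crystal orientations
    k           : penalty multiplier (hypothetical; not given on the slide)
    """
    avg = mean(image_ranks)
    # ESD taken here as the sample standard deviation (an assumption).
    esd = stdev(image_ranks) if len(image_ranks) > 1 else 0.0
    return avg - k * esd


# Hypothetical ranks for three orientations of one crystal.
print(anisotropy_adjusted_rank([100, 110, 90]))
```

A crystal that scores the same at every orientation keeps its average rank, while one whose score swings with orientation is pulled down in proportion to the swing.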

27 Acknowledgements
Russ Athay, Robert Bolotovsky, Joseph D. Ferrara, Thad Niemeyer, Karen Opersteny, J.W. Pflugrath

28 ACTOR Acknowledgements
Rigaku/MSC (top 2 rows)
  - James Pflugrath, Angela Criswell, Joseph Ferrara, David Edwards, Russ Athay, Keith Crane, John Edwards, Kris Tesh, Thomas Hendrixson, Thaddeus Niemeyer
  - Not Shown: Robert Bolotovsky, Charlie Stence, Karen Opersteny, Stephen Sherbert, John Ziegler
Oceaneering Space Systems (middle row)
  - Richard Shafer, Terry Nienaber, Kent Copeland, Bill Robertson
Abbott Laboratories (bottom row)
  - Jeff Olson, Steve Muchmore, Jonathan Greer, Ronald Jones, Jeffrey Pan

