Distributed Computing for Crystallography: experiences and opportunities. Dr. Kenneth Shankland & Tom Griffin, ISIS Facility, CCLRC Rutherford Appleton Laboratory.

1 Distributed Computing for Crystallography: experiences and opportunities. Dr. Kenneth Shankland & Tom Griffin, ISIS Facility, CCLRC Rutherford Appleton Laboratory / STFC

2 Background – Parallel Computing Options
Supercomputer: expensive; state of the art; good results; dedicated.
Cluster: cheaper; can easily expand; dedicated.
Grid: cheaper; increases with time; can expand; not dedicated.
Distributed: many separate machines; easy to use; can easily expand; may be dedicated.

3 Spare cycles concept
Typical PC usage is about 10%, and usage is minimal after 5pm. Most desktop PCs are really fast. Can we use ("steal"?) unused CPU cycles to solve computational problems?
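A back-of-the-envelope sketch of the spare-cycles argument, using the ~10% utilisation figure from the slide (the helper function itself is illustrative, not from the talk):

```python
def spare_cycle_hours(n_pcs, avg_usage=0.10, hours_per_day=24):
    """Estimate idle CPU-hours per day across a pool of desktop PCs,
    assuming the ~10% average utilisation quoted on the slide."""
    return n_pcs * hours_per_day * (1 - avg_usage)

# e.g. 200 desktop PCs at 10% average use leave ~4320 CPU-hours/day idle
idle_today = spare_cycle_hours(200)
```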

4 Suitable apps
CPU intensive
Low to moderate memory use
Licensing issues
Not too much file output
Coarse grained
Command line / batch driven

5 The United Devices GridMP System
Server hardware: two dual-Xeon 2.8 GHz servers; RAID 10.
Software: servers run RedHat Linux Advanced Server / DB2; unlimited Windows (and other) clients.
Programming: Web Services interface (XML, SOAP); accessed with C++ and Java.
Management Console: web browser based; can manage services, jobs, devices etc.
Large industrial user base: GSK, J&J, Novartis etc.

6 GridMP Platform Object Model
[Diagram: application "Docking" contains services "GOLD" and "Ligandfit"; job "MyDockTest" pairs datasets "ligands" (molec 1 … molec m) and "proteins" (protein 1 … protein n); program "GOLD 2.0" has Windows and Linux modules, gold20win.exe and gold20_rh.exe.]

7 Adapting a program for GridMP
1) Think about how to split your data
2) Wrap your executable
3) Write the application service (pre- and post-processing)
4) Use the Grid
Fairly easy to write; interface to the grid is via Web Services. So far used: C++, Java, Perl, C# (any .NET language).
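Step 1 above (splitting data into coarse-grained workunits) can be sketched generically; this chunking helper is illustrative and not part of the GridMP API:

```python
def split_into_workunits(items, chunk_size):
    """Split a dataset into coarse-grained chunks, one chunk per workunit,
    so each grid client gets enough work to amortise transfer overhead."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]

runs = list(range(80))                     # e.g. 80 simulated-annealing runs
workunits = split_into_workunits(runs, 4)  # 20 workunits of 4 runs each
```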

8 Package your executable
PROGRAM → MODULE → EXECUTABLE
[Diagram: a module bundles the executable, DLLs, standard data files and environment variables; optionally compressed and/or encrypted; uploaded to, and resident on, the server.]

9 Create / run a job
[Diagram: the client side creates the job over https; the server side generates the cross product of the datasets (Molecules × Proteins) into workunits Pkg1–Pkg4, then the job is started.]
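The cross-product step can be sketched like this (illustrative only; GridMP generates the pairings server-side, and the dataset names here are placeholders):

```python
from itertools import product

molecules = ["molec1", "molec2"]
proteins = ["protein1", "protein2"]

# One workunit per (molecule, protein) pair, as in the docking example:
# 2 molecules x 2 proteins -> 4 workunits (Pkg1..Pkg4 on the slide)
workunits = [{"molecule": m, "protein": p} for m, p in product(molecules, proteins)]
```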

10 Job execution

11 Current status at ISIS
218 registered devices; 321 total CPUs. Potential power ~300 Gflops (cf. HPCx @ 500 Gflops).

12 Application: structures from powders
CT-DMF2: solvated form of a polymorphic pharmaceutical from a crystallisation screen.
a=12.2870(7) Å, b=8.3990(4) Å, c=37.021(2) Å, β=92.7830(10)°
V=3816.0(4) Å³
P2₁/c, Z′=2 (N_fragments = 6)
[Figure: asymmetric unit]

13 DASH
Optimise molecular models against diffraction data. Multi-solution simulated annealing: execute a number of SA runs (say 25) and pick the best one.
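The multi-solution strategy (many independent SA runs, keep the best) can be sketched generically. Here `run_sa` is a stand-in that just returns a mock figure of merit, not real DASH code; the point is that independent runs parallelise trivially across a grid:

```python
import random

def run_sa(seed):
    """Stand-in for one simulated-annealing run: returns a figure of merit
    (e.g. a profile chi-squared), lower is better. Mocked with a seeded RNG."""
    rng = random.Random(seed)
    return rng.uniform(10.0, 100.0)

def multi_solution_sa(n_runs=25):
    """Execute n independent SA runs and keep the best (lowest) result.
    Each run is independent, so each can go to a separate grid client."""
    results = [(run_sa(seed), seed) for seed in range(n_runs)]
    return min(results)

best_chi2, best_seed = multi_solution_sa(25)
```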

14 Grid adapt – straightforward
Run GUI DASH as normal up to the SA run point; create the .duff file.
Submit SA runs to the grid from your own PC:
  c:\dash-submit famot.grd
  uploading data to server…
  your job_id is 4300
Retrieve and collate SA results from the grid to your own PC:
  c:\dash-retrieve 4300
  retrieving job data from server…
  results stored in famot.dash
View results as normal with DASH.

15 Example of speedup
Execute 80 SA runs on famotidine with #SA moves set to 4 million.
Elapsed time: 6 hrs 40 mins on a 2.8 GHz P4.
Elapsed time on grid: 27 mins.
Speedup factor = 15 with only 24 PCs.
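The quoted speedup checks out arithmetically:

```python
serial_minutes = 6 * 60 + 40  # 6 h 40 min on a single 2.8 GHz P4
grid_minutes = 27             # elapsed time on the grid

speedup = serial_minutes / grid_minutes  # ~14.8, i.e. ~15x with 24 PCs
```

The factor is below 24 because of per-workunit upload/download overhead and because the slowest client gates the final result.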

16 Think big… or think different…
13 torsions + disordered benzoate; Z′=4; 72 non-H atoms per asymmetric unit.

17 HMC structure solution

18 Algorithm 'sweet spot'
Calculations embody ca. 6 months of CPU time. On our current grid, runs would be completed in ca. 20 hours.
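That time compression is consistent with the size of the grid reported earlier (a rough consistency check, assuming a 30-day month):

```python
cpu_hours = 6 * 30 * 24  # ca. 6 months of CPU time = 4320 CPU-hours
elapsed_hours = 20       # ca. 20 hours elapsed on the grid

# Effective parallelism: how many machines kept busy on average (~216,
# in line with the ~218 registered devices quoted on slide 11)
effective_parallelism = cpu_hours / elapsed_hours
```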

19 MD Manager
Slightly different from the previous 'batch' style jobs; more 'interactive'.

20 Large run – McStas
Instrument simulation: parameter scan with a fixed neutron count.
The submit program breaks up –n#####, uploads a new command line + data + executable, and sends each run to a separate machine.
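Splitting a fixed neutron count across machines, as the submit program does with the –n option, can be sketched as follows (a hypothetical helper, not the actual submit program or McStas code):

```python
def split_neutron_count(total_n, n_machines):
    """Divide a total neutron count across machines so the per-machine
    counts sum exactly to the requested total (remainder spread evenly)."""
    base, rem = divmod(total_n, n_machines)
    return [base + (1 if i < rem else 0) for i in range(n_machines)]

# Each machine then gets its own '-n <count>' on its command line
counts = split_neutron_count(10_000_000, 24)
```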

21 Full diffraction simulation for HRPD
[Figure: calculated, observed (full MC simulation) and difference profiles]
CPU time: 5537 hours = 230 days. Elapsed time on grid = 2.5 days.

22 Problems / issues
Hardware: very few.
Software: a few, but excellent support.
Security concerns: encryption and tampering.
System administrators are suspicious of us!
End user obtrusiveness: perceived, real (memory grab with povray), unanticipated.

23 Programs in the wild
Drug analogy: clinical trial patients → side effects; general population → drug interactions.
Grid analogy: test computer pool → runs OK; all connected PCs → program interactions.

24 Top tips
Get a 'head honcho' involved early on.
Test, test and test again.
Employ small test groups of friendly users.
Know your application.
Don't automatically dismiss complaints about obtrusiveness.

