Virtual Laboratory: Data Intensive Science during Robinson Village in Italy! Rajkumar Buyya Melbourne, Australia WW Grid
2 Grid Warning! This is a science fiction story about the future of grid computing. All actors mentioned in this talk are fictitious. The application under consideration is fictitious. Prof. Watson-II is researching drug design. The complete story is fictitious except the Grid technology!
3 Prof. Watson-II spends all his time at the University of Lecce
4 Watson-II's wife was unhappy since he was not spending any time with her & the kids. Every day he goes to work at 8am and comes back home at 11pm. After a few days he and his wife had a big argument at home. She gives him a warning: if he does not come home tomorrow by 6pm, he will have to face lifetime consequences.
5 Prof. Watson-II works up to 5pm at the University of Lecce. Returns home by 5.30pm! Goes to work at 9am.
6 Watson-II having a moonlight dinner with his wife
7 Prof. Watson-II works up to 5pm at the University of Lecce. Returns home by 5.30pm! Goes to work at 9am.
8 Watson-II promises his wife that he will soon take her to Robinson Village
9 Prof. Watson-II hires an assistant and works smarter! Returns home by 5.30pm! Goes to work at 9am.
10 Watson-II & Family start their holiday
11 Watson-II & Family on a 5-day holiday at Robinson Village
Day 1 at Robinson Village
14 Watson-II happens to meet a Grid researcher on the beach!
15 Watson-II quickly reads a news clipping that he got from the Grid researcher
17 Watson-II having a moonlight dinner with his wife
Day 2 at Robinson Village
19 Goes to Internet Room & does some surfing of the Grid researcher's page
20 Drug Design: Data Intensive Computing on Grid A Virtual Laboratory for Molecular Modelling for Drug Design on Peer-to-Peer Grid. It provides tools for examining millions of chemical compounds (molecules) in the Protein Data Bank (PDB) to identify those having potential use in drug design. In collaboration with: Kim Branson, Structural Biology, Walter and Eliza Hall Institute (WEHI)
21 Architecture: A Virtual Lab for Molecular Modeling for Drug Design on P2P Grid
Request: "Screen 2K molecules in 30min. for $10"
Components: Resource Broker (RB), Grid Market Directory, Grid Info. Service, Data Replica Catalogue, GTS (Grid Trade Server), PDB1, PDB2
Messages in the diagram: "Give me list of PDB sources of type aldrich_300?"; "service providers?"; "service cost?"; "get mol.10 from pdb1 & screen it"; "mol.10 please?"; "mol.5 please?"
(RB maps suitable Grid nodes and Protein Data Bank)
22 Software Tools
Molecular Modelling Tools (DOCK)
Parameter Modelling Tools (Nimrod/enFuzion)
Grid Resource Broker (Nimrod-G)
Data Grid Broker
Protein Data Bank (PDB) Management and Intelligent Access Tools:
    PDB database Lookup/Index Table Generation.
    PDB and associated index-table Replication.
    PDB Replica Catalogue (that helps in Resource Discovery).
    PDB Servers (that serve PDB client requests).
    PDB Brokering (Replica Selection).
    PDB Clients for fetching Molecule Record (Data Movement).
Grid Middleware (Globus and GrACE)
Grid Fabric Management (Fork/LSF/Condor/Codine/ … )
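The slide does not say how the lookup/index table is built. As a minimal sketch, assuming the molecule database is a multi-record Tripos MOL2 file (suggested only by the `S_1.mol2` ligand file used elsewhere in the talk), an index could map each molecule name to the byte offset of its record so that a PDB server can return one record without scanning the whole file. The function names and the sample data are hypothetical.

```python
import io
from typing import BinaryIO, Dict

# In MOL2 files each molecule record starts with this tag line,
# followed by a line holding the molecule name.
MOLECULE_TAG = b"@<TRIPOS>MOLECULE"


def build_index(stream: BinaryIO) -> Dict[str, int]:
    """Map molecule name -> byte offset of its record (hypothetical helper)."""
    index: Dict[str, int] = {}
    while True:
        offset = stream.tell()
        line = stream.readline()
        if not line:
            break
        if line.strip() == MOLECULE_TAG:
            name = stream.readline().strip().decode("ascii")
            index[name] = offset
    return index


def fetch_record(stream: BinaryIO, index: Dict[str, int], name: str) -> bytes:
    """Return the bytes of one molecule record using the index."""
    stream.seek(index[name])
    lines = [stream.readline()]  # tag line of the requested record
    while True:
        position = stream.tell()
        line = stream.readline()
        if not line or line.strip() == MOLECULE_TAG:
            stream.seek(position)  # stop before the next record
            break
        lines.append(line)
    return b"".join(lines)


# Made-up two-record database, for illustration only.
SAMPLE = (
    b"@<TRIPOS>MOLECULE\nmol.5\n@<TRIPOS>ATOM\n"
    b"@<TRIPOS>MOLECULE\nmol.10\n@<TRIPOS>ATOM\n"
)
sample_index = build_index(io.BytesIO(SAMPLE))
record = fetch_record(io.BytesIO(SAMPLE), sample_index, "mol.10")
```

Because the index is small compared with the database, it can be replicated alongside each database copy, which matches the "PDB and associated index-table Replication" item in the list.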
23 DOCK code* (Enhanced by WEHI, U of Melbourne) A program to evaluate the chemical and geometric complementarity between a small molecule and a macromolecular binding site. It explores ways in which two molecules, such as a drug and an enzyme or protein receptor, might fit together. Compounds which dock to each other well, like pieces of a three-dimensional jigsaw puzzle, have the potential to bind. So, why is it important to be able to identify small molecules which may bind to a target macromolecule? A compound which binds to a biological macromolecule may inhibit its function, and thus act as a drug, for example by disabling the ability of the (HIV) virus to attach itself to a molecule/protein! With system-specific code changed, we have been able to compile it for Sun-Solaris, PC Linux, SGI IRIX, Compaq Alpha/OSF1 * Original Code: University of California, San Francisco:
24 Dock input file
score_ligand yes
minimize_ligand yes
multiple_ligands no
random_seed 7
anchor_search no
torsion_drive yes
clash_overlap 0.5
conformation_cutoff_factor 3
torsion_minimize yes
match_receptor_sites no
random_search yes
maximum_cycles 1
ligand_atom_file S_1.mol2
receptor_site_file ece.sph
score_grid_prefix ece
vdw_definition_file parameter/vdw.defn
chemical_definition_file parameter/chem.defn
chemical_score_file parameter/chem_score.tbl
flex_definition_file parameter/flex.defn
flex_drive_file parameter/flex_drive.tbl
ligand_contact_file dock_cnt.mol2
ligand_chemical_file dock_chm.mol2
ligand_energy_file dock_nrg.mol2
Molecule to be screened: ligand_atom_file (S_1.mol2)
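Since the molecule to be screened is the one setting that changes from job to job, a parameter sweep can be pictured as rewriting that single line of the input file. The sketch below is an illustration, not the Nimrod plan-file mechanism: it parses a shortened copy of the input file into key/value pairs and renders one variant per molecule. The `S_<n>.mol2` naming is an assumption extrapolated from `S_1.mol2`, and the function names are hypothetical.

```python
from typing import Dict

# Shortened copy of the input file; the remaining lines are handled the same way.
TEMPLATE = """\
score_ligand yes
minimize_ligand yes
random_seed 7
ligand_atom_file S_1.mol2
receptor_site_file ece.sph
score_grid_prefix ece
"""


def parse_dock_input(text: str) -> Dict[str, str]:
    """Split each 'key value' line at the first space."""
    params: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        params[key] = value.strip()
    return params


def render_dock_input(params: Dict[str, str]) -> str:
    """Write the parameters back out, one 'key value' pair per line."""
    return "".join(f"{key} {value}\n" for key, value in params.items())


def input_for_molecule(template: str, molecule_index: int) -> str:
    """Return a copy of the input file that screens molecule number n."""
    params = parse_dock_input(template)
    # Assumed naming scheme: S_<n>.mol2
    params["ligand_atom_file"] = f"S_{molecule_index}.mol2"
    return render_dock_input(params)


job_input = input_for_molecule(TEMPLATE, 5)
```

Every other setting, including the receptor files, stays identical across jobs, which is why those files are good candidates for pre-staging.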
28 Nimrod/TurboLinux enFuzion GUI tools for Parameter Modeling
29 Docking Experiment Preparation
Setup PDB DataGrid:
    Index PDB databases.
    Pre-stage (all) Protein Data Bank (PDB) on replica sites.
    Start PDB Server.
Create Docking GridScore (receptor surface details) for a given receptor on home node.
Pre-Staging Large Files required for Docking:
    Pre-stage Dock executables and PDB access client on Grid nodes, if required (e.g., dock.Linux, dock.SunOS, dock.IRIX64, and dock.OSF1 on Linux, Sun, SGI, and Compaq machines respectively). Use globus-rcp.
    Pre-stage/cache all data files (~3-13MB each) representing receptor details on Grid nodes. This can be done on demand by Nimrod/G for each job, but a few input files are too large and are required for all jobs. So pre-staging/caching at the http-cache or broker level is necessary to avoid the overhead of copying the same input files again and again!
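The two pre-staging ideas above, one executable per platform and copying a shared file only once, can be sketched as follows. This is a local illustration under stated assumptions: `shutil.copy` stands in for the remote copy (the slide uses globus-rcp), and `dock_executable` and `stage_once` are hypothetical helper names.

```python
import os
import shutil
import tempfile

# Platform suffixes taken from the executable names listed on the slide.
SUPPORTED_SYSTEMS = {"Linux", "SunOS", "IRIX64", "OSF1"}


def dock_executable(system_name: str) -> str:
    """Return the DOCK binary name for a given system name."""
    if system_name not in SUPPORTED_SYSTEMS:
        raise ValueError(f"no DOCK build for {system_name!r}")
    return f"dock.{system_name}"


def stage_once(source: str, cache_dir: str) -> bool:
    """Copy source into cache_dir unless it is already cached.

    Returns True if a copy was made, False on a cache hit.
    """
    target = os.path.join(cache_dir, os.path.basename(source))
    if os.path.exists(target):
        return False
    os.makedirs(cache_dir, exist_ok=True)
    shutil.copy(source, target)  # stand-in for a remote copy
    return True


# Illustration with a made-up receptor file in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    receptor = os.path.join(workdir, "ece.sph")
    with open(receptor, "w") as handle:
        handle.write("receptor site data\n")
    node_cache = os.path.join(workdir, "node_cache")
    first_copy = stage_once(receptor, node_cache)   # copies the file
    second_copy = stage_once(receptor, node_cache)  # cache hit, no copy
```

On a node, the system name could come from Python's `platform.system()`, which returns the `uname` system name. With thousands of jobs per node, the cache hit on every job after the first is what avoids copying the same input files again and again.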
30 Protein Data Bank: Databases consist of small molecules from commercially available organic synthesis libraries and natural product databases. There is also the ability to screen virtual combinatorial databases in their entirety. This methodology allows only the required compounds to be subjected to physical screening and/or synthesis, reducing both time and expense.
31 Target Testcase: The target for the test case is endothelin converting enzyme (ECE). This is involved in heart stroke and other transient ischemia. Ischemia: a decrease in the blood supply to a bodily organ, tissue, or part caused by constriction or obstruction of the blood vessels.
32 Resource Brokering Architecture for Molecular Screening on World Wide Grid
Request: "Screen 2K molecules in 30min. for $10"
Components: Nimrod/G Computational Grid Broker; PDB Broker (Algorithm1 ... AlgorithmN); Data Replica Catalogue; Grid Info. Service; PDB Service; PDB2; Grid Service Providers (GSP): GSP1, GSP2, GSP3, GSP4, ... GSPm, GSPn
Numbered labels in the diagram: 1 "advise PDB source?"; 2 "selection & advise: use GSP4!"; 3 "Is GSP4 healthy?"; 4 "mol.5 please?"; 5 Grid Info. Service; 6 "PDB replicas please?"; 7 "process & send results"
Other label: "Screen mol.5 please?"
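The diagram names the questions exchanged but not the selection logic behind "use GSP4!". A minimal sketch, assuming the PDB broker asks the replica catalogue for candidate providers and takes the first one the information service reports as healthy, might look like this. The catalogue and health tables are made-up in-memory stand-ins, and the function names are hypothetical. The real broker may weigh cost or load through its pluggable algorithms (Algorithm1 ... AlgorithmN).

```python
from typing import Callable, Dict, List, Optional


def select_replica(
    candidates: List[str], is_healthy: Callable[[str], bool]
) -> Optional[str]:
    """Return the first candidate provider reported healthy, or None."""
    for provider in candidates:
        if is_healthy(provider):
            return provider
    return None


# Hypothetical stand-ins for the Data Replica Catalogue and Grid Info. Service.
REPLICA_CATALOGUE: Dict[str, List[str]] = {"aldrich_300": ["GSP2", "GSP4"]}
HEALTH: Dict[str, bool] = {"GSP2": False, "GSP4": True}


def advise_pdb_source(database: str) -> Optional[str]:
    """Look up replicas of a database and pick a healthy provider."""
    candidates = REPLICA_CATALOGUE.get(database, [])
    return select_replica(candidates, lambda name: HEALTH.get(name, False))


chosen = advise_pdb_source("aldrich_300")
```

Passing the health check in as a function keeps the selection rule separate from how health is measured, so a different rule can be swapped in without touching the lookup.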
33 Nimrod/G in Action: Screening on World-Wide Grid
35 Watson-II again sees the Grid researcher on the beach and asks him a favor: "Can I borrow your Grid identity for 2 days?" The nice Grid researcher trusts Watson and gives him his Grid identity, including access to his World Wide Grid testbed! Grid Trust on the Beach!
Day 3 at Robinson Village
39 Watson Gets an Idea while surfing
40 Goes to Internet Room & connects to the Grid researcher's machine
41 Connects to his U.Lecce lab machine and copies all the protein samples he prepared before going on holiday
42 Copies the Grid researcher's test experiment & modifies it to use his lab experiment data.
43 Starts Parameter Exploration
44 Starts Molecular Experimentation
Request: "Screen 50K molecules in 120min. for $200"
Components: Nimrod/G Computational Grid Broker; PDB Broker (Algorithm1 ... AlgorithmN); Data Replica Catalogue; Grid Info. Service; PDB Service; PDB2; GSP1, GSP2, GSP3, GSP4, ... GSPm, GSPn
Numbered labels in the diagram: 1 "advise PDB source?"; 2 "use GSP4!"; 3 "Is GSP4 healthy?"; 4 "mol.5 please?"; 5 Grid Info. Service; 6 "PDB replicas please?"
Other label: "Screen mol.5 please?"
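The request "50K molecules in 120min. for $200" fixes both a deadline and a budget, and simple arithmetic shows what the broker has to achieve. The sketch below computes the required screening rate and the budget per molecule from those figures. The time per docking job is not given in the talk, so the 3 minutes used for the node estimate is a purely hypothetical value.

```python
import math


def required_rate(molecules: int, deadline_min: float) -> float:
    """Molecules that must be screened per minute to meet the deadline."""
    return molecules / deadline_min


def budget_per_molecule(budget: float, molecules: int) -> float:
    """Average amount that may be spent on each molecule."""
    return budget / molecules


def nodes_needed(
    molecules: int, deadline_min: float, minutes_per_molecule: float
) -> int:
    """Nodes required if every node screens one molecule at a time."""
    return math.ceil(molecules * minutes_per_molecule / deadline_min)


rate = required_rate(50_000, 120)               # about 416.7 molecules per minute
unit_budget = budget_per_molecule(200, 50_000)  # 0.004 dollars per molecule
nodes = nodes_needed(50_000, 120, 3.0)          # 1250 nodes at the assumed 3 min each
```

For comparison, the earlier request of 2K molecules in 30 minutes for $10 works out to about 66.7 molecules per minute and $0.005 per molecule.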
45 Nimrod/G in Action: Screening on World-Wide Grid
47 Comes back to Internet room after 2 hours and asks his assistant to test the results
48 Watson-II's assistant conducts tests in the afternoon. Writes to Watson in the evening: "Looks like our client is improving…"
Day 4 at Robinson Village
50 Watson-II does some more exploration: this time with one million molecules. Asks Nimrod to send the results to his assistant for testing...
51 Starts Parameter Exploration
53 After lunch, Watson-II reads the message he received from his assistant and is pleased with the results of his experiment. Writes to the VC of his university asking for a press release on his breakthrough discovery.
54 Did Watson-II invent a cure for AIDS? Yes, of course.
Vice Chancellor calls a press meeting. The news spreads like the "I Love You" virus around the world, including Sweden and Norway!
Day 5 at Robinson Village
59 Sweden Announces Nobel Award for a Scientist at Robinson Village!
60 Watson-II the Great at Robinson Village
61 Watson-II shares the success with Grid researcher!!!
Day 6 and Beyond!
63 Watson-II & Family return home happily.
64 Watson-II having a moonlight dinner with his wife at home
65 Prof. Watson-II works up to 5pm at the University of Lecce. Returns home by 5.30pm! Goes to work at 9am.
66 Watson-II had moonlight dinners with his wife at home all his life!
67 Do you want to repeat Watson-II's success in High Energy Physics?
68 If so, download Software & Explore it in 2006 when LHC expt. starts
Nimrod & Parametric Computing:
Economy Grid & Nimrod/G:
Virtual Grid Simulation (Java based):
World Wide Grid testbed:
Looking for new volunteers to grow. Please contact me to barter your & our machines!
Want to build on our work/collaborate: Talk to me now or
69 Thank You… Any ??