Presentation on theme: "Nimrod-G and Virtual Lab Tools for Data Intensive Computing on Grid: Drug Design Case Study Rajkumar Buyya Melbourne, Australia http://www.buyya.com/ecogrid."— Presentation transcript:
1 Nimrod-G and Virtual Lab Tools for Data Intensive Computing on Grid: Drug Design Case Study. Rajkumar Buyya, Melbourne, Australia
3 Contents: Introduction; Resource Management challenges; Nimrod-G Toolkit; SPMD/Parameter-Study Creation Tools; Grid enabling Drug Design Application; Nimrod-G Grid Resource Broker; Scheduling Experiments on World Wide Grid; Conclusions. [Figure labels: Scheduling, Economics, Grid, Economy Grid]
4 A typical Grid environment and Players. [Figure labels: Resource Broker, Application]
5 Grid Characteristics. Heterogeneous: resource types (PC, WS, clusters); resource architecture (CPU arch, OS); applications (CPU-, I/O-, or message-intensive); users' and owners' requirements; access price (different for different users, resources, and times); availability (varies from time to time). Distributed: resources, ownership, users. Each has its own (private) policies and objectives. This closely resembles the heterogeneity and decentralization present in “human economies” (democratic and capitalist world). Hence, we propose the use of “economics” as a metaphor for resource management and scheduling. It regulates supply and demand for resources and offers an incentive for resource owners to contribute resources to the Grid.
6 Grid Tools for Handling: Uniform Access; System Management; Computational Economy; Security; Resource Discovery; Resource Allocation & Scheduling; Data Locality; Network Management; Application Development.
7 Nimrod-G: Grid Resource Broker. A resource broker for managing, steering, and executing task farming (parametric sweep/SPMD model) applications on the Grid based on deadline and computational economy. Based on users’ QoS requirements, our broker dynamically leases services at runtime depending on their quality, cost, and availability. Key features: a single window to manage and control an experiment; persistent and programmable task farming engine; resource discovery; resource trading; scheduling and predictions; generic dispatcher and Grid agents; transportation of data and results; steering and data management; accounting.
8 Parametric Processing. [Figure labels: Parameters; Magic Engine for Manufacturing Humans!] Multiple Runs, Same Program, Multiple Data: Killer Application for the Grid! Courtesy: Anand Natrajan, University of Virginia
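The "multiple runs, same program, multiple data" idea can be sketched as a cross product over parameter values. This is only an illustration, not Nimrod-G's plan-file syntax: the program name `./simulate` and the parameters `temperature` and `pressure` are hypothetical.

```python
from itertools import product

def expand_sweep(parameters):
    """Return one dict per job: the cross product of all parameter values."""
    names = sorted(parameters)
    return [dict(zip(names, combo))
            for combo in product(*(parameters[n] for n in names))]

# Hypothetical parameters, chosen only for illustration.
sweep = {"temperature": [280, 300, 320], "pressure": [1.0, 2.0]}
jobs = expand_sweep(sweep)

# Every job runs the same (hypothetical) program on a different data point.
for i, job in enumerate(jobs):
    print(f"job {i}: ./simulate --temperature {job['temperature']} "
          f"--pressure {job['pressure']}")
```

Each resulting job is independent of the others, which is what makes this class of application easy to farm out across Grid resources.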
9 Sample P-Sweep Applications. Bioinformatics: Drug Design / Protein Modelling; Combinatorial Optimization: Meta-heuristic parameter estimation; Ecological Modelling: Control Strategies for Cattle Tick; Sensitivity experiments on smog formation; Data Mining; High Energy Physics: Searching for Rare Events; Electronic CAD: Field Programmable Gate Arrays; Computer Graphics: Ray Tracing; Finance: Investment Risk Analysis; VLSI Design: SPICE Simulations; Civil Engineering: Building Design; Automobile: Crash Simulation; Network Simulation; Aerospace: Wing Design; Astrophysics.
10 Virtual Drug Design: Data Intensive Computing on Grid. A Virtual Laboratory for “Molecular Modelling for Drug Design” on Peer-to-Peer Grid. It provides tools for examining millions of chemical compounds (molecules) in the Protein Data Bank (PDB) to identify those having potential use in drug design. In collaboration with: Kim Branson, Structural Biology, Walter and Eliza Hall Institute (WEHI).
11 Dock input file [Callout: Molecule to be screened]
    score_ligand yes
    minimize_ligand yes
    multiple_ligands no
    random_seed
    anchor_search no
    torsion_drive yes
    clash_overlap
    conformation_cutoff_factor 3
    torsion_minimize yes
    match_receptor_sites no
    random_search yes
    maximum_cycles
    ligand_atom_file S_1.mol2
    receptor_site_file ece.sph
    score_grid_prefix ece
    vdw_definition_file parameter/vdw.defn
    chemical_definition_file parameter/chem.defn
    chemical_score_file parameter/chem_score.tbl
    flex_definition_file parameter/flex.defn
    flex_drive_file parameter/flex_drive.tbl
    ligand_contact_file dock_cnt.mol2
    ligand_chemical_file dock_chm.mol2
    ligand_energy_file dock_nrg.mol2
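To screen many molecules, the same Dock input file can be reused with only the ligand entry changed per job. The sketch below shows one way to do that with a template. It is not the Virtual Lab's own parameterisation tool: the `S_<n>.mol2` naming beyond `S_1.mol2` and the output names `dock_<n>.in` are assumptions, and the template lists only a few of the keys from the slide.

```python
from string import Template

# Abbreviated template: only a few keys from the slide are shown.
# ${ligand_file} is the one value that varies from job to job.
DOCK_TEMPLATE = Template(
    "score_ligand yes\n"
    "minimize_ligand yes\n"
    "ligand_atom_file ${ligand_file}\n"
    "receptor_site_file ece.sph\n"
    "score_grid_prefix ece\n"
)

def make_dock_inputs(ligand_ids):
    """Map a hypothetical input-file name to its text, one per ligand."""
    return {
        f"dock_{i}.in": DOCK_TEMPLATE.substitute(ligand_file=f"S_{i}.mol2")
        for i in ligand_ids
    }

inputs = make_dock_inputs(range(1, 4))
print(inputs["dock_2.in"])
```

Each generated file describes one independent docking job, so the screening run becomes a parameter sweep over ligand numbers.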
18 Scheduling Experiment on World Wide Grid Testbed. [Map labels] North America: ANL/Chicago, USC-ISI/LA, UTK/Tennessee, UVa/Virginia, Dartmouth/NH, BU/Boston. Europe: Cardiff/UK, Portsmouth/UK, ZIB/Germany, PC2/Germany, AEI/Germany, Lecce/Italy, CNR/Italy, Calabria/Italy, Poznan/Poland, Lund/Sweden, CERN/Swiss. Asia: TI-Tech/Tokyo, ETL/Tsukuba, AIST/Tsukuba, Kasetsart/Bangkok. Australia: Monash/Melbourne, VPAC/Melbourne. South America: Santiago/Chile.
19 Deadline and Budget Constrained Scheduling Experiment. Workload: 165 jobs, each needing 5 minutes of CPU time. Deadline: 2 hrs. and budget: units. Strategy: minimise time / cost. Execution cost with cost optimisation: Optimise Cost: (G$) (finished in 2 hrs.); Optimise Time: (G$) (finished in 1.25 hr.). In this experiment, the time-optimised scheduling run costs double that of the cost-optimised run. Users can now trade off time against cost.
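The trade-off between the two strategies can be illustrated with a small greedy scheduler. This is a sketch under stated assumptions, not Nimrod-G's actual scheduling algorithm: the three resources, their prices, their CPU counts, and the budget of 400000 are invented, and every CPU is assumed to run a job in the same 300 seconds. Only the workload (165 jobs of 5 CPU-minutes) and the 2-hour deadline come from the slide.

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    price: float  # G$ per CPU-second (illustrative value, not from the slides)
    cpus: int     # CPUs usable at once (illustrative)

def finish_time(jobs_on_resource, cpus, job_seconds):
    """Completion time when jobs run in waves of `cpus` at a time."""
    waves = -(-jobs_on_resource // cpus)  # ceiling division
    return waves * job_seconds

def schedule(resources, n_jobs, job_seconds, deadline, budget, strategy):
    """Greedy placement, one job at a time; strategy is "cost" or "time"."""
    counts = {r.name: 0 for r in resources}
    spent = 0.0
    for _ in range(n_jobs):
        candidates = []
        for r in resources:
            finish = finish_time(counts[r.name] + 1, r.cpus, job_seconds)
            cost = r.price * job_seconds
            # Keep only placements that respect both constraints.
            if finish <= deadline and spent + cost <= budget:
                key = (r.price, finish) if strategy == "cost" else (finish, r.price)
                candidates.append((key, r.name, cost))
        if not candidates:
            raise ValueError("no resource meets the deadline within the budget")
        _, name, cost = min(candidates)
        counts[name] += 1
        spent += cost
    cpus = {r.name: r.cpus for r in resources}
    makespan = max((finish_time(c, cpus[n], job_seconds)
                    for n, c in counts.items()), default=0)
    return counts, spent, makespan

# Hypothetical grid; workload and deadline follow the slide.
grid = [Resource("cheap", 2, 10), Resource("mid", 4, 5), Resource("pricey", 8, 10)]
cost_run = schedule(grid, 165, 300, 7200, 400000, "cost")
time_run = schedule(grid, 165, 300, 7200, 400000, "time")
print("cost-optimised:", cost_run)
print("time-optimised:", time_run)
```

With these made-up numbers, the cost-optimised run places every job on the cheapest resource because it can still meet the deadline, while the time-optimised run spreads jobs over all resources, finishing sooner at a higher total cost. Being greedy, the sketch can fail to find a feasible schedule in cases where one exists.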
20 World Wide Grid (WWG). Australia: Monash Uni. North America: ANL: SGI/Sun/SP2; USC-ISI: SGI; UVa: Linux cluster; UD: Linux cluster; UTK: Linux cluster. Asia/Japan: Tokyo I-Tech.; ETL, Tsukuba. Europe: ZIB/FUB: T3E/Mosix; Cardiff: Sun E6500; Paderborn: HPCLine; Lecce: Compaq SC; CNR: Cluster; Calabria: Cluster; CERN: Cluster; Poznan: SGI/SP2. South America: Chile: Cluster. [Other figure labels: Internet; Nimrod/G; Linux cluster; Solaris WS; Globus+Legion; Globus/Legion; Globus + GRACE_TS; GRACE_TS]
21 Resources Selected & Price/CPU-sec.
Resource & Location | Grid services & Fabric | Cost/CPU sec. or unit | Jobs executed (Time_Opt) | Jobs executed (Cost_Opt)
Linux Cluster-Monash, Melbourne, Australia | Globus, GTS, Condor | 2 | 64 | 153
Linux-Prosecco-CNR, Pisa, Italy | Globus, GTS, Fork | 3 | 7 | 1
Linux-Barbera-CNR, Pisa, Italy | | 4 | 6 |
Solaris/Ultra2, TITech, Tokyo, Japan | | | 9 |
SGI-ISI, LA, US | | 8 | 37 | 5
Sun-ANL, Chicago, US | | | 42 |
Total Experiment Cost (G$) | | | 237000 | 115200
Time to Complete Exp. (Min.) | | | 70 | 119
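The totals on this slide can be checked against the claim that the time-optimised run costs about double. The short calculation below uses only the four summary figures, read as 237000 G$ in 70 minutes for time optimisation and 115200 G$ in 119 minutes for cost optimisation. The variable names are illustrative.

```python
# Summary figures as read from the slide's totals rows.
time_opt = {"cost": 237000, "minutes": 70}
cost_opt = {"cost": 115200, "minutes": 119}

cost_ratio = time_opt["cost"] / cost_opt["cost"]         # how much more time-opt costs
minutes_saved = cost_opt["minutes"] - time_opt["minutes"]
extra_cost = time_opt["cost"] - cost_opt["cost"]         # G$ paid for the speed-up

print(f"cost ratio: {cost_ratio:.2f}")
print(f"minutes saved: {minutes_saved}")
print(f"extra G$ per minute saved: {extra_cost / minutes_saved:.0f}")
```

On these figures the time-optimised run costs a little over twice as much and finishes 49 minutes earlier.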
24 Conclusions. P2P and Grid computing are emerging as a next-generation computing platform for solving large-scale problems through sharing of geographically distributed resources. Resource management is a complex undertaking as systems need to be adaptive, scalable, competitive,…, and driven by QoS. We proposed a framework based on “computational economies” and discussed several economic models for resource allocation and for regulating supply and demand for resources. Scheduling experiments on the World Wide Grid demonstrate our Nimrod-G broker's ability to dynamically lease or rent services at runtime based on their quality, cost, and availability, depending on consumers' QoS requirements. Easy-to-use tools for composing applications to run on the Grid are essential to attracting the application community and getting it on board. An economics paradigm for QoS-driven resource management is essential to push P2P/Grids into mainstream computing!
25 Download Software & Information. Nimrod & Parametric Computing; Economy Grid & Nimrod/G; Virtual Laboratory/Virtual Drug Design; Grid Simulation (GridSim) Toolkit (Java based); World Wide Grid (WWG) testbed. Looking for new volunteers to grow. Please contact me to barter your & our machines! Want to build on our work/collaborate: Talk to me now or