A Grid approach to Environmental Molecular Simulations: Deployment and use of Condor within the eMinerals Mini Grid.

Paul Wilson 1, Mark Calleja 2, John Brodholt 1, Martin Dove 2, Maria Alfredsson 1, Zhimei Du 3, Nora H. de Leeuw 3, Arnaud Marmier 4 and Rik Tyer 5

1. Department of Earth Sciences, University College London, Gower Street, London WC1E 6BT, UK
2. Department of Earth Sciences, University of Cambridge, Downing Street, Cambridge, CB2 3EQ, UK
3. Birkbeck College, University of London, Malet Street, London WC1E 7HX, UK
4. Department of Chemistry, University of Bath, Bath, BA2 7AY, UK
5. Daresbury Laboratory, Daresbury, Cheshire, WA4 4AD, UK

Environment from the Molecular Level: a NERC eScience testbed project

This talk: Part 1
1. The eMinerals problem area
2. The computational job types this generates
3. How Condor can help to sort these jobs out
4. What we gain from Condor and where to go next
5. UK institutional Condor programmes and the road ahead

This talk: Part 2
1. Condor's additional features and how we use them
2. The eMinerals minigrid
3. Conclusion

THE PROBLEM AREA

1. Simulation of pollutants in the environment: binding of heavy metals and organic molecules in soils.
2. Studies of materials for long-term nuclear waste encapsulation: radioactive waste leaching through ceramic storage media.
3. Studies of weathering and scaling: mineral/water interface simulations, e.g. oil well scaling.

Codes relying on empirical descriptions of interatomic forces:
- DL_POLY – molecular dynamics simulations
- GULP – lattice energy/lattice dynamics simulations
- METADISE – interface simulations

Codes using a quantum mechanical description of interactions between atoms:
- CRYSTAL – Hartree-Fock implementation
- SIESTA – density functional theory, with numerical basis sets to describe the electronic wave function
- ABINIT – DFT, with plane-wave descriptions of the electronic wave functions

WHAT TYPE OF JOBS WILL THESE PROBLEMS BE MANIFESTED AS?

TWO TYPES OF JOB:

1) High- to mid-performance: requiring powerful resources, potential process intercommunication, long execution times; CPU and memory intensive.
2) Low-performance/high-throughput: requiring access to many hundreds or thousands of PC-level CPUs; no process intercommunication, short execution times, low memory usage.

WHERE CAN WE GET THE POWER?
TYPE 1 JOB: Masses of UK HPC resources around; it seems that UK grid resources are largely HPC!
TYPE 2 JOB: ????????

THERE HAS GOT TO BE A BETTER WAY TO OPTIMISE TYPE 2 JOBS!

…AND THERE IS: WE USE WHAT'S ALREADY THERE: 930 Windows 2000 PCs (1 GHz P3, 256/512 MB RAM, 1 Gbit Ethernet), clustered in 30 student cluster rooms across every department on the UCL campus, with the potential to scale up to ~3000 PCs. These machines waste 95% of their CPU cycles 24/7: A MASSIVE UNTAPPED RESOURCE AND A COUP FOR eMINERALS!

This is where Condor enters the scene. It is the only available off-the-shelf resource manager and job broker for Windows: install Condor on our clusters, and we harness 95% of the power of 930+ machines 24 hours a day, without spending any money.

Is it really this simple?

YES! It has surpassed all expectations, with diverse current use and ever-rising demand from smiley happy people (our current group of users, increasing monthly): the eMinerals project, the eMaterials project, UCL Computer Science, the UCL Medical School, the University of Marburg, the Universities of Bath and Cambridge, Birkbeck College, The Royal Institution…

- Over 900,000 hours of work completed in 6 months (105 CPU-years equivalent and counting).
- Codes migrated to Windows represent a huge variety: environmental molecular work (all eMinerals codes!), materials polymorph prediction, financial derivatives research, quantum mechanical codes, climate research, medical image realisation…

NUMBER 1 METRIC FOR SUCCESS: Users love it. It is simple to use, it doesn't break, and they can forget about their jobs.
NUMBER 2 METRIC FOR SUCCESS: UCL administrators love it. 100% utilisation 24/7 on the entire cluster network, with no drop in performance and negligible costs, satisfies our dyed-in-the-wool, naturally paranoid sysadmins.
NUMBER 3 METRIC FOR SUCCESS: eMinerals developers love it: fast deployment, tweakable, can be built upon, low administration, integrable with Globus, great metadata, great free support, great workflow capabilities, Condor-G.
NUMBER 4 METRIC FOR SUCCESS: eScience loves it. Other institutions are following our example, and interest is high.
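To make the "simple to use" claim concrete, here is a minimal sketch of the kind of vanilla-universe submit description used for a single run on the Windows pool. The executable, file names and the Windows 2000 OpSys string are assumptions based on standard Condor conventions of the period, not details taken from the talk:

    # gulp_win.sub - hypothetical submit description for one GULP run
    # on the Windows 2000 pool (names and paths are illustrative)
    universe        = vanilla
    executable      = gulp.exe
    input           = calcite.gin
    output          = calcite.got
    error           = calcite.err
    log             = calcite.log
    should_transfer_files   = YES
    when_to_transfer_output = ON_EXIT
    requirements    = (OpSys == "WINNT50") && (Arch == "INTEL")
    queue

Submitting is then a single command (condor_submit gulp_win.sub), and a parameter sweep over many input files only needs additional queue statements or a small script to generate them.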

- This is the largest single Condor pool in the UK (according to Condor).
- This is the first fully cross-departmental institutional Condor pool in the UK.
- Several other institutions have followed our lead: Cambridge, Cardiff.
- Much scope for combining resources (flocking, glide-in).

WHAT IS MOST IMPORTANT? Condor ENABLES any scientist to do their work in a way they previously only dreamed about: beginning to make real the ability to match unbounded science with unbounded resources. Condor has slashed time-to-results from years to weeks. Scientists using our Condor resource have redefined their ability to achieve their goals.

Condor has organised resources at many levels:
Desktop - June 2002 (2 nodes)
Cluster - September 2002 (18 nodes)
Department - January 2003 (150 nodes)
Campus - 16 October 2003 (930 nodes)
WHERE NEXT? (?????? nodes, ???? pools)… One million Condor nodes in a hollowed-out volcano! Mwahahaha…
…Regional and national Condor resources are next…

Here is an example, CamGrid: the current plan.

Environment 1:
- A single pool of ~400 Linux boxes (plus ~500 Windows and Mac OS X machines to follow later).
- Owned and administered by the University Computing Service (UCS).
- Small number of submit nodes.
- X.509 certificate host authentication.
- No firewalls or private IP addresses.

Environment 2:
- Desktop and teaching machines (some hundreds) in many colleges and departments, each running its own pool. These pools will be flocked together.
- Heterogeneous mix of architectures and operating systems.
- Many firewalls and private IP addresses; hence a single VPN (secnet) is used:
  - Each machine has an additional IP address on the VPN.
  - Each pool has a gateway on the VPN (the only machine that needs a public IP).
  - The gateway needs just one UDP port allowed through a firewall.
  - Traffic between pools/gateways is automatically encrypted.
- The VPN model has already been tested between two flocked pools.

Cardiff University are also following their own, similar Condor programme…
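As an illustration of the flocking mentioned above, the Condor configuration involved is small. The sketch below shows the knobs on each side of a one-way flock between two hypothetical departmental pools; the hostnames are invented, and the separate VPN/secnet addressing is not shown:

    # On the submit machines of pool A (condor_config.local), to let
    # idle jobs flock to pool B's central manager:
    FLOCK_TO = condor-gw.deptB.cam.ac.uk

    # On pool B's central manager, to accept flocked jobs from pool A
    # and grant its schedds write access:
    FLOCK_FROM      = condor-gw.deptA.cam.ac.uk
    HOSTALLOW_WRITE = $(HOSTALLOW_WRITE), condor-gw.deptA.cam.ac.uk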

…Regional and national Condor resources, continued.

- Many UK institutions have small or medium Condor pools.
- Many UK institutions have resources wasting millions of CPU cycles.
- We have proved the usefulness of large Windows Condor resources.
- Assurances regarding security, authorisation, authentication, access and reliable job execution are essential to the take-up of Condor on this scale in the UK.
- Many potential resources are Windows machines, which complicates matters (for example, the poor GSI port to Windows and the lack of Windows checkpointing).
- With education, awareness, support and a core group to lead the way, UK institutions can form a national-level Condor infrastructure leveraging HTC resources for scientists within UK eScience.

The UK eScience programme is heavily Grid-oriented. What can Condor provide to Grid-enable eMinerals and other UK eScience projects and resources?

Here's the eMinerals answer:

Workflow and scripting: DAGMan
Grid connectivity: Condor-G

These two extremely useful Condor tools provide the means to build an integrated, usable eMinerals 'minigrid' for our scientists, embracing several tools OUT OF THE BOX:
- Globus 2.4: gatekeeper to all compute resources
- PBS: 3 x 16-node Beowulf cluster MPI job queues
- Condor: 2 pools (UCL and Cambridge)
- SRB: Storage Resource Broker, a virtual file system with 4 distributed eMinerals vaults

Here is what the eMinerals minigrid looks like (illustrative sketches of a DAGMan workflow and a Condor-G submission follow the diagram):

THE eMINERALS MINIGRID (architecture diagram). Recoverable elements of the diagram:
- Client machine: runs a command-line, Perl-based DAGMan/Condor-G submission script generator for PBS and Condor; Condor-G; GT2.4 client; user certificate proxy.
- CGI web interfaces: Condor queue/status and real-time job output viewing.
- University College London resources: 930-node UCL Condor pool (vanilla, Java universes); 16-node PBS queue for MPI jobs; Globus gatekeeper with PBS and Condor job managers; SRB vault on lake.geol.ucl.ac.uk.
- Cambridge resources: 24-node CES Condor pool (standard, vanilla universes); 16-node PBS queue for MPI jobs; Globus gatekeeper with PBS and Condor job managers; SRB vault on lake.esc.cam.ac.uk.
- Daresbury resources: SRB MCAT server; SRB vault.
- Non-minigrid facilities: HPCx, JISC clusters, etc.
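To show how DAGMan and Condor-G tie the pieces of the diagram together, two hedged sketches follow; all file names, hostnames and jobmanager labels are illustrative rather than taken from the eMinerals setup. First, a DAGMan description chaining data staging, a simulation and archiving of results:

    # workflow.dag - each .sub file is an ordinary Condor or Condor-G submit file
    JOB  stage_in   stage_in.sub
    JOB  simulate   simulate.sub
    JOB  stage_out  stage_out.sub
    PARENT stage_in CHILD simulate
    PARENT simulate CHILD stage_out
    # retry the simulation step twice before failing the DAG
    RETRY simulate 2

Second, a Condor-G (globus universe) submit description of the general sort the Perl script generator would emit, targeting a GT2 gatekeeper sitting in front of a PBS queue:

    # simulate.sub - hypothetical Condor-G job routed through a GT2.4 gatekeeper
    universe        = globus
    globusscheduler = gatekeeper.example.ac.uk/jobmanager-pbs
    executable      = run_siesta.sh
    output          = siesta.out
    error           = siesta.err
    log             = siesta.log
    queue

The whole workflow is then launched with condor_submit_dag workflow.dag, with the user's certificate proxy providing authentication to the gatekeepers.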

Summary. Condor has enabled eMinerals scientists and their UK colleagues to perform their science:
1. in significantly new ways,
2. on previously untapped resources,
3. on previously unutilised operating systems,
4. in weeks rather than years,
5. in an integrated, heterogeneous, grid-enabled environment,
6. easily, painlessly and at no cost,
7. with equal importance given to data handling,
8. using out-of-the-box tools.

Conclusion: THIS MUST CONTINUE! Condor has an important part to play in the UK eScience programme:
1. Through meeting the increasing demand from users for large-scale, accessible, Condor-enabled HTC resources.
2. Through harnessing the significant volumes of existing, under-utilised, heterogeneous UK institutional hardware.
3. Through providing functionality to facilitate secure access to heterogeneous compute and data resources.
4. Through engaging with the UK eScience programme in Condor's grid/web service and standardisation developments.

Thanks for listening!

Paul Wilson (eMinerals, UCL)
Mark Calleja (eMinerals, Cambridge)
Bruce Beckles (CamGrid)

eMinerals project: Environment from the Molecular Level, a NERC eScience testbed project