
1 From Campus Resources to Federated International Grids: Bridging the Gap with the Application Hosting Environment Joohyun Kim Center for Computation and Technology at Louisiana State University Stefan Zasada, Peter Coveney University College London

2 The Application Hosting Environment AHE is a tool for building Science Gateways – currently used at UCL to provide a gateway for our users Based on the idea of applications as stateful WSRF web services Lightweight hosting environment for running unmodified applications on grid and local resources Community model: expert user installs and configures an application and uses the AHE to share it with others Simple clients with very limited dependencies No intrusion onto target grid resources Also being looked at by Campus Champions as a way of bridging the gap between local and TeraGrid resources.

3 Bridging the gap [Diagram. Resources: DEISA; UK NGS (Leeds, Manchester, Oxford, RAL); HPCx; TeraGrid; local resources. Middleware: GridSAM, Globus, UNICORE.]

4 Motivation for the AHE Problems with current middleware solutions: Difficult for an end user to configure and/or install Dependent on lots of supporting software also being installed Require modified versions of common libraries Require non-standard ports to be opened on firewall We have access to many international grids running different middleware stacks, and need to be able to seamlessly interoperate between them

5 AHE Design Constraints Client does not require any other middleware to be installed locally Client is NAT'd and firewalled Client does not have to be a single machine Client needs to be able to upload and download files but doesn’t have a local installation of GridFTP Client doesn’t maintain information on how to run the application Client doesn’t care about changes to the backend resources

6 Meeting the Constraints AHE Client behind firewall => polls server to update job state etc. Uses intermediate filestaging area => don’t need to know how to upload data to each target machine All application specific information for running simulations on the grid resource is maintained on a central service => user can switch clients etc. Location of binary on grid resource configured on server => user doesn’t need to know AHE submits to middleware on target resource (Globus/Unicore/GridSAM)
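The polling behaviour described on this slide can be sketched in a few lines. This is an illustration only, not the AHE client code: `fetch_state`, the state names and the polling interval are assumptions made for the example. The point is that the client only ever makes outbound requests, so it keeps working behind NAT and a firewall.

```python
import time
from typing import Callable

# Hypothetical terminal states; the real AHE state names are not given on the slides.
TERMINAL_STATES = {"FINISHED", "FAILED"}


def poll_until_done(fetch_state: Callable[[], str],
                    interval_s: float = 30.0,
                    max_polls: int = 1000,
                    sleep: Callable[[float], None] = time.sleep) -> list[str]:
    """Ask the server for the job state until it is terminal.

    Returns the distinct states observed, in order. `fetch_state` stands in
    for whatever outbound request the client makes to the server.
    """
    seen: list[str] = []
    for _ in range(max_polls):
        state = fetch_state()
        # Record a state only when it differs from the last one seen.
        if not seen or seen[-1] != state:
            seen.append(state)
        if state in TERMINAL_STATES:
            return seen
        sleep(interval_s)
    raise TimeoutError("job did not reach a terminal state")
```

Passing `sleep` as a parameter keeps the loop easy to exercise without waiting; a real client would leave the default in place.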

7 Architecture of the AHE

8 Virtualizing Applications Application Instance/Simulation is central entity; represented by a stateful WS-Resource. State properties include: simulation owner target grid resource job ID simulation input files and urls simulation output files and urls job status
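As a rough picture of the state listed on this slide, the properties of one application instance could be modelled like this. The class and field names are invented for illustration and the `"CREATED"` status value is an assumption; the actual WS-Resource property names are not given in the slides.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SimulationResource:
    """Illustrative stand-in for the state of one application instance."""
    owner: str                                   # simulation owner
    target_resource: Optional[str] = None        # target grid resource, once chosen
    job_id: Optional[str] = None                 # ID assigned by the middleware
    input_files: dict[str, str] = field(default_factory=dict)   # file name -> URL
    output_files: dict[str, str] = field(default_factory=dict)  # file name -> URL
    status: str = "CREATED"                      # hypothetical initial job status


# Because the state lives on the server, any client that knows the resource
# can pick the simulation up again later.
sim = SimulationResource(owner="alice")
sim.target_resource = "example-cluster"
sim.input_files["config.xml"] = "https://staging.example.org/alice/config.xml"
```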

9 AHE 2.0 Provides a Single Platform to: Launch applications on Unicore and Globus 4 grids by acting as an OGSA-BES and GT4 client Create advanced reservations using HARC and launch applications into the reservation Steer applications using the RealityGrid steering API or GENIUS project steering system Launch cross-site MPIg applications AHE 3 plans to support: SPRUCE urgent job submission Lightweight certificate sharing mechanisms

10 Service Architecture of the AHE

11 GINSC2008-001 – Architecture Scientific Application (e.g. bloodflow simulation HemeLB) Application Hosting Environment (AHE) as scientific-specific Client technology Globus GRAM 4 Interface TeraGrid Globus 4 stack Security Policies with gridmaps GLUE-based information GridFTP Interface HPC Resource within NGS HPC Resource within DEISA Storage Resources (e.g. tape archives, robots) Common Environment massively parallel jobs with HemeLB Common Environment DEISA Modules with HemeLB massively parallel jobs with HemeLB OGSA-BES Interface Grid Middleware UNICORE with BES Security Policies (e.g. XACML) GLUE-based information Stefan Zasada, Steven Manos, Morris Riedel, Johannes Reetz, Michael Rambadt et al., Preparation for the Virtual Physiological Human (VPH) project that requires interoperability of numerous Grids

12 Cross-site Runs with MPI-g MPI-g has been designed to allow single applications to run across multiple machines Some problems won’t fit on a single machine, and require the RAM/processors of multiple machines on the grid. MPI-g allows for jobs to be turned around faster by using small numbers of processors on several machines - essential for clinicians

13 MPI-g Requires Co-Allocation We can reserve multiple resources for specified time periods Co-allocation is useful for meta-computing jobs using MPIg, viz and for workflow applications. We use HARC - Highly Available Robust Co-scheduler (developed by Jon Maclaren at LSU). Slide courtesy Jon Maclaren

14 HARC HARC provides a secure co-allocation service –Multiple Acceptors are used –Works well provided a majority of Acceptors stay alive –Paxos Commit keeps everything in sync –Gives the (distributed) service high availability –Deployment of 7 acceptors --> Mean Time To Failure ~ years –Transport-level security using X.509 certificates HARC is a good platform on which to build portals/other services –XML over HTTPS - simpler than SOAP services –Easy to interoperate with –Very easy to use with the Java Client API Highly Available Robust Co-scheduler
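The "majority of Acceptors" condition can be illustrated with a short calculation. The sketch assumes each acceptor is up independently with the same probability `p`, which is a simplification, and the value 0.9 is only an example. It is not the derivation behind the Mean Time To Failure figure on the slide.

```python
from math import comb


def majority(n: int) -> int:
    """Smallest number of acceptors that forms a majority of n."""
    return n // 2 + 1


def quorum_availability(n: int, p: float) -> float:
    """Probability that at least a majority of n acceptors are up.

    Assumes independent failures with per-acceptor availability p.
    """
    k = majority(n)
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# With 7 acceptors, any 4 must stay alive; adding acceptors raises the
# chance that a majority survives.
for n in (1, 3, 7):
    print(n, majority(n), round(quorum_availability(n, 0.9), 4))
```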

15 AHE 2.0 - HARC integration Users can co-reserve resources from the AHE GUI client using HARC When submitting a job, users can either run their jobs in normal queues, or use one of their reservations AHE passes reservation through to the Globus GRAM AHE client uses HARC Java client API to manage reservations

16 Closing the loop between simulation and user Monitoring a running simulation: –Pause, resume or stop if required Altering parameters in a running simulation –Receive feedback via on-line visualization Restarting a simulation from a known checkpoint –Reproducibility, provenance Computational Steering: Render, Simulate, Control, Visualize

17 AHE/ReG Steering Integration Simulation Steering library Visualization Registry Steering WS connect register bind data transfer (sockets/files) bind Steering library AHE launch start Extended AHE Client launch Slide adapted from Andrew Porter

18 Client Implementation GUI & command line clients implemented in Java Client allows user to: –Discover appropriate resources –Launch application –Monitor running jobs –Query registry of running jobs –Stage files to and from resource –Terminate jobs GUI client implements application launching as a wizard

19 AHE Deployment AHE Server Released as a VirtualBox VM image – download image file and import to VirtualBox All required services, containers etc pre-configured User just needs to configure services for their target resources/applications AHE Client User’s machine must have Java installed User downloads and untars client package Imports X.509 certificate into Java keystore using provided script Configures client with endpoints of AHE services supplied by expert user

20 Hosting a New Application Expert user must: Install and configure application on all resources on which it is being shared Create an XSLT template for the application (easily cloned from existing template) Add the application to the RMInfo.xml file Run a script to reread the configuration Documentation covers whole process of deploying AHE & applications on NGS and TeraGrid

21 Future plans We’re working to incorporate several new technologies into AHE 3.0 –SPRUCE –Lightweight certificate sharing –SAGA Hosting –Workflow hosting and integration –Resource brokering

22 SPRUCE Applications with dynamic data and result deadlines are being deployed Late results are useless –Wildfire path prediction –Storm/Flood prediction –Patient specific medical treatment Some jobs need priority access “Right-of-Way Token” Special PRiority and Urgent Computing Environment

23 Deployed and Available on TeraGrid - –UC/ANL –NCSA –SDSC –NCAR –Purdue –TACC Other sites –LSU –Virginia Tech –LONI Computational Infrastructure: Urgent Computing

24 Making Grids Work (In)Security on the Grid Grid “security”: –PKI (digital certificate) based –Crosses organisational trust boundaries –Unfamiliar to users and system administrators –Complex to deploy and understand Problems for users: –Digital certificates difficult to obtain and use …so users dislike them –So inconvenient that users share certificates (and not just within a single institution) –Multiple copies of certificates scattered across the grid (not always protected) –Some users refuse to use the Grid if it will involve certificates

25 User Friendly Security: A Proposed Solution A designated individual puts a single certificate into a credential repository controlled by our new “gateway” service. User uses a local authentication service to authenticate to our gateway service. Our gateway service provides a session key (not shown) to our modified AHE client and our modified AHE Server to enable the AHE client to authenticate to the AHE Server. Our gateway service obtains a proxy certificate from its credential repository as necessary and gives it to our modified AHE Server to interact with the grid. User now has no certificate interaction. Private key of the certificate is never exposed to the user. User Credential Repository Our “gateway” service Grid Middleware Computational Grid PI, sysadmin, … certificate Modified AHE Server using our modified AHE client and a local authentication service

26 Virtual Physiological Human Funded under EU FP7. 15 projects: 1 NoE, 3 IPs, 9 STREPs, 2 CAs. “a methodological and technological framework that, once established, will enable collaborative investigation of the human body as a single complex system... It is a way to share observations, to derive predictive hypotheses from them, and to integrate them into a constantly improving understanding of human physiology/pathology, by regarding it as a single system.”

27 Summary AHE provides a single user interface to both grid and local resources AHE middleware stack allows federated access to resources from multiple independent grids, running different middlewares – Globus, Unicore etc. AHE 2.0 provides a single interface to several other tools, including HARC and RealityGrid steering The client is easily installed by any end user, requiring no intervention by system/network administrators By calling the command line clients from scripts, complex scientific workflows can be implemented User federation of resources allows us to make efficient use of our allocations, and investigate problems that would take too long on a single grid

28 Further Information stefan.zasada@ucl.ac.uk RealityGrid web site: http://www.realitygrid.org/AHE Wiki: http://wiki.realitygrid.org/wiki/AHE

