First Steps in the Clouds


1 First Steps in the Clouds
Kate Keahey, University of Chicago, Argonne National Laboratory

2 Why Clouds?
- Resource consumers
  - Individual users or Virtual Organizations
  - Requirements
    - Customized environments for their services/applications
    - Services/applications can be short-lived
    - New environments/services deployed quickly and often
- Resource providers
  - Own and operate physical resources
  - Ability to monitor and control their resources
  - Provide resources at reasonable operational cost
  - Protection from activities performed by resource consumers
- Consumers need to be able to lease (potentially short-term) platforms that they can customize and control
Virtual Workspaces: http://workspace.globus.org

3 Cloud Computing for Grid Communities: The STAR Application Use Case

4 The STAR Application
- Complex experimental application codes
  - Developed over more than 10 years by more than 100 scientists; comprises ~2 M lines of C++ and Fortran code
- Require complex, customized environments
  - Rely heavily on the right combination of compiler versions and available libraries
  - Dynamically load external libraries depending on the task to be performed
- Environment validation
  - To ensure reproducibility and result uniformity across environments
- Why do we need a cloud?
  - Resources with the right configuration are hard to find
  - A VM-based cloud gives us the required control

5 Running STAR in a Cloud
- First challenge: finding VM-enabled resources
  - Amazon Elastic Compute Cloud (EC2)
- More challenges:
  - Can we use X.509 certs to submit to a cloud?
  - Can we use Grid access protocols?
  - How much manual configuration do we need to do for a cluster that we need for 4 hours?
  - How do we integrate the cluster into the Grid infrastructure?
- Workspace Service
  - X.509 certificates are mapped to a project account
  - Grid access protocols
- Creating a virtual cluster dynamically
  - Contextualization (cluster context): the cluster node VMs find out about each other and integrate that information at boot time
- Integrating the cluster into the Grid
  - Contextualization (grid context): cluster is configured with appropriate host certs, gridmapfiles, etc.
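The cluster-context step on this slide can be illustrated with a small sketch. It is not the Workspace Service implementation: the `ContextBroker` class, its methods, and the host names and addresses are all hypothetical, and the exchange point is an in-memory object rather than a network service. The idea it shows is that each node registers its identity at boot, and once every expected node has reported, any node can pull the full roster and write it into local configuration.

```python
from dataclasses import dataclass, field


@dataclass
class ContextBroker:
    """Hypothetical in-memory exchange point for cluster context."""
    expected_nodes: int
    roster: dict = field(default_factory=dict)  # hostname -> IP address

    def register(self, hostname: str, ip: str) -> None:
        # Called by each VM at boot to announce itself.
        self.roster[hostname] = ip

    def is_complete(self) -> bool:
        # The context is usable only once every expected node has reported.
        return len(self.roster) >= self.expected_nodes

    def hosts_file(self) -> str:
        # Render the roster in /etc/hosts style so nodes can resolve each other.
        if not self.is_complete():
            raise RuntimeError("cluster context is not complete yet")
        lines = [f"{ip}\t{name}" for name, ip in sorted(self.roster.items())]
        return "\n".join(lines) + "\n"


# Three hypothetical nodes boot and register; names and addresses are made up.
broker = ContextBroker(expected_nodes=3)
for name, ip in [("head", "10.0.0.1"), ("worker1", "10.0.0.2"), ("worker2", "10.0.0.3")]:
    broker.register(name, ip)
hosts = broker.hosts_file()
```

A grid-context step would follow the same pattern, with host certificates and gridmapfile entries taking the place of the address roster.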

6 with thanks to Jerome Lauret and Doug Olson of the STAR project, presented at CHEP’07
[Figure: snapshots of running-job counts per site (VWS/EC2, BNL, WSU, Fermi, PDSF), with Job Completion and File Recovery indicators]

7 NERSC PDSF, EC2 (via Workspace Service), WSU
with thanks to Jerome Lauret and Doug Olson of the STAR project, presented at CHEP’07
[Figure: accelerated display of workflow job state for NERSC PDSF, EC2 (via Workspace Service), and WSU; Y = job number, X = job state]

8 What Did We Learn?
- Performance was not an issue
  - The real comparison is having a resource to run on vs. not having a resource to run on
- Contextualization is key for dynamic virtual cluster deployment
- Next steps: a more challenging application

9 Cloud Computing for Grid Providers: Building the Science Cloud at the University of Chicago

10 Challenges
- Virtualization adoption has been relatively slow among Grid providers
- Challenge: integrating VMs into current provisioning models
  - Integrate into a site without disrupting the current operation of resources
    - I.e., be able to run jobs as well as VMs
  - Non-invasive from the perspective of currently used tools
    - E.g., no modification to the currently used schedulers and resource managers
  - Can be used alongside the current mode of operation
    - Batch jobs
  - Represent as small a change as possible
    - Operate within familiar metaphors
    - Avoid error-generating complexity

11 Roll Your Own Cloud
- The Workspace Pilot
  - Operates on resources that can support jobs as well as VMs
    - E.g., have been booted into Xen domain 0
  - Non-invasive extension to batch schedulers (e.g., PBS)
    - Wrappers for submission operation, scheduler signals to operate on VMs
  - Glidein approach: submits a “pilot program” that prepares a resource slot for VM deployment
    - E.g., adjusts Xen domain 0 memory
  - Comes with administrator tools
    - E.g., kill-all

12 Workspace Pilot in Action
[Diagram: Workspace Service, LRM/PBS, and Xen dom0 nodes hosting a VM. Level 1: provision raw resources. Level 2: provision VMs. Teardown: VMs are decommissioned; raw resources are decommissioned.]
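The two provisioning levels can be read as nested lifetimes: a VM exists only inside a raw resource slot. The sketch below is only an illustration of that nesting. The function names, the node name `node01`, and the image name `star-worker` are hypothetical, and the assumption that VMs are torn down before the slot is released is an inference from the slide, not a documented guarantee.

```python
from contextlib import contextmanager

log = []  # records the order of provisioning and teardown events


@contextmanager
def raw_slot(node):
    # Level 1: the pilot obtains a resource slot on a node (hypothetical step).
    log.append(f"level 1: slot on {node} provisioned")
    try:
        yield node
    finally:
        log.append(f"level 1: slot on {node} decommissioned")


@contextmanager
def virtual_machine(node, image):
    # Level 2: a VM is deployed into the slot held by the pilot.
    log.append(f"level 2: VM {image} deployed on {node}")
    try:
        yield
    finally:
        log.append(f"level 2: VM {image} decommissioned")


# Nesting guarantees the VM is torn down before the slot is released.
with raw_slot("node01") as node:
    with virtual_machine(node, "star-worker"):
        log.append("workload runs inside the VM")
```

Expressing the levels as nested context managers means the inner teardown always runs first, even if the workload raises an exception.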

13 The Pilot Program
- Uses Xen balloon driver to reduce/restore domain0 memory so that guest domains (VMs) can be deployed
- Secure VM deployment
  - The pilot requires sudo privilege and thus can be used only with the site administrator’s approval
  - The workspace service provides fine-grained authorization for all requests
- Signal handling
  - SIGTERM: pilot exceeded its allotted time
    - Notifies VWS, allows it to clean up
    - After a configurable time period, takes things into its own hands
- Default policy: one VM per physical node
- Available for download
  - Workspace Release 1.3.1:
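The SIGTERM behavior described on this slide (notify the service, wait a configurable period, then clean up unilaterally) might look roughly like the following. This is a sketch under stated assumptions, not the pilot's actual code: the `Pilot` class, its three callbacks, and the default timings are hypothetical stand-ins for whatever the real pilot uses to talk to the workspace service and to remove VMs.

```python
import signal
import time


class Pilot:
    """Hypothetical pilot that holds a node slot for VM deployment."""

    def __init__(self, notify_service, vms_gone, force_cleanup,
                 grace_seconds=30.0, poll_seconds=1.0):
        self.notify_service = notify_service  # tells the service the slot is ending
        self.vms_gone = vms_gone              # returns True once the service cleaned up
        self.force_cleanup = force_cleanup    # last resort: remove VMs locally
        self.grace_seconds = grace_seconds
        self.poll_seconds = poll_seconds

    def on_sigterm(self, signum, frame):
        # The scheduler sends SIGTERM when the pilot exceeds its allotted time.
        self.notify_service()
        deadline = time.monotonic() + self.grace_seconds
        while time.monotonic() < deadline:
            if self.vms_gone():
                return "clean"  # the service cleaned up within the grace period
            time.sleep(self.poll_seconds)
        self.force_cleanup()    # grace period expired: the pilot acts on its own
        return "forced"         # return value is for testing; signal ignores it

    def install(self):
        # Register the handler for SIGTERM in the running process.
        signal.signal(signal.SIGTERM, self.on_sigterm)
```

Passing the notification and cleanup steps in as callables keeps the timing logic testable without a scheduler or hypervisor present.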

14 Nimbus @ UC
- What is it?
  - The Science Cloud at University of Chicago
  - UC TeraPort cluster configured with the workspace pilot
  - Currently 16 nodes
- What can it do for me?
  - Allow you to “lease out” a cluster of VMs
- Who can use it?
  - Members of the scientific community
  - Inasmuch as usage policies will allow
- What do I need to do if I want to use it?
  - Contact us:
  - You will need a VM image (we can help and know others who can), a certificate, and a simple client

15 Cloud Interoperability
- Moving an app from a hardware platform to a cloud is relatively hard
  - Need to develop a VM image, learn about cloud computing, figure out logistics
- Moving between clouds is very easy
  - E.g., STAR app EC2 -> Science Cloud and vice versa
- Rough consensus on the interfaces needed to provision resources in the cloud
- OGF gridvirt-wg
  - Chairs: Erol Bozak, Wolfgang Reichert
  - Define the requirements for integration of Grid architecture with system virtualization platforms
  - Exploring the impact of virtualization on Grid use cases
  - Exploring the relationship with standards (DMTF, etc.)

