Virtualization, Cloud Computing, and TeraGrid Kate Keahey (University of Chicago, ANL) Marlon Pierce (Indiana University)
Virtual Workspaces: http://workspace.globus.org

Virtualization and Cloud Computing
- The Virtues of Virtualization
  - Portable environments, enforcement and isolation, fast to deploy, suspend/resume, migration…
- Cloud computing: a nebulous concept…
  - SaaS: software as a service
  - Service: provide me with a workspace
  - Virtualization makes it easy to provide a workspace/VM
- Cloud computing
  - Resource leasing, utility computing, elastic computing
  - Amazon's Elastic Compute Cloud (EC2)
- Is this real? Or is this just a proof-of-concept?
  - Successfully used commercially on a large scale
  - More experience for scientific applications

What is a Cloud?
Two major types of cloud (at least)
- Compute and Data Cloud
  - EC2, Google MapReduce, Science Clouds
  - Provision platform for running science codes
  - Open source infrastructure: workspace, Eucalyptus, hub0
  - Virtualization: providing environments as VMs
- Hosting Cloud
  - Google App Engine
  - High availability, fault tolerance, robustness, etc. for Web capabilities
  - Community example: IU hosting environment (Quarry)

The Science Clouds: A Case Study
- Objectives:
  - Make it easy for scientific projects to experiment with cloud computing
    - You too can run on the cloud! (we can give you cycles)
    - You too can be a cloud provider! (we can give you open source software)
  - Evolve software in response to the needs of scientific projects
- Start with EC2
  - Refine SLAs
  - One-click virtual clusters (contextualization)
  - Lower adoption barriers
  - Miscellaneous useful new features

The Science Clouds
- Powered by workspace tools
- EC2-like interfaces (PKI credential vs credit card)
- More clouds on the way
[Diagram labels: Stratus (University of Florida, 16x4 nodes); Nimbus (University of Chicago, 16x2 nodes); Public IPs; Private IPs (via VPN)]

Who Runs on the Science Clouds?
- Nimbus utilization breakdown since March 4th
- ~30 DNs (a DN represents a community)

STAR
- Motivation for STAR
  - Resources with the right configuration are hard to find
    - Complex environments: correct versions of operating systems, libraries, tools, etc. all have to be installed
    - Require validation
- Virtual Workspace: an OSG STAR cluster
  - OSG cluster
    - OSG CE (headnode), gridmapfiles, host certificates, NFS, PBS
  - STAR worker nodes: SL4 + STAR conf
- Requirements
  - One-click virtual clusters
  - Migration: Nimbus/scientific resources -> EC2
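
The "one-click virtual cluster" requirement can be pictured with a small sketch. This is only an illustration under stated assumptions: the `NodeGroup` class, the `build_hosts_table` function, the image names, and the IP addresses are all hypothetical and are not part of the workspace tools or of STAR's actual setup. It shows the core idea of contextualization, in which roles are declared once and the addresses assigned at deployment are then shared so that head node and workers can find each other.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NodeGroup:
    """A set of identical VMs in a virtual cluster (hypothetical model)."""
    role: str                      # e.g. "headnode" or "worker"
    image: str                     # VM image name; illustrative only
    count: int                     # number of VMs requested for this role
    services: List[str] = field(default_factory=list)


def build_hosts_table(groups: List[NodeGroup],
                      addresses: Dict[str, List[str]]) -> Dict[str, str]:
    """Map generated hostnames to the IPs assigned at deployment time.

    Raises ValueError if a role did not receive exactly the number of
    addresses it asked for, since the cluster would then be incomplete.
    """
    table: Dict[str, str] = {}
    for group in groups:
        ips = addresses.get(group.role, [])
        if len(ips) != group.count:
            raise ValueError(
                f"role {group.role!r}: expected {group.count} IPs, got {len(ips)}"
            )
        for index, ip in enumerate(ips):
            table[f"{group.role}-{index}"] = ip
    return table


# Hypothetical description loosely modeled on the slide: one OSG head node
# plus STAR worker nodes. Names and addresses are made up.
star_cluster = [
    NodeGroup("headnode", "osg-ce", 1, ["gridmapfile", "host-cert", "pbs"]),
    NodeGroup("worker", "sl4-star", 2),
]
assigned = {"headnode": ["10.0.0.1"], "worker": ["10.0.0.2", "10.0.0.3"]}
hosts = build_hosts_table(star_cluster, assigned)
```

In a real deployment the resulting table would be pushed to every VM (for example into a hosts file or a scheduler's node list); that step is omitted here.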

STAR (cont'd)
- From proof-of-concept to production runs
  - ~2 years ago: proof-of-concept
  - Last September: EC2 runs of up to 100 nodes (production scale)
  - Testing for full production deployment
- Performance
  - Within 10% of expected performance for applications
- Work by Jerome Lauret, Doug Olson, Leve Hajdu, Lidia Didenko
- Long-lived community of many
- Similar work for other HEP communities (ALICE and ATLAS), bioinformatics, GeoFEST, and others

Virtual Network Overlays
- Motivation
  - CS research: investigate latency-sensitive apps
- Virtual workspace: ViNE router + app VM
- Requirements: access to distributed resources
- First steps in creating a federated cloud
- Work by Mauricio Tsugawa, Andrea Matsunaga, Jose Fortes and others
- Medium-lived community of a few
[Diagram labels: Stratus, Nimbus, each with a ViNE router]

Scalability Testing
- Motivation
  - Test scalability of various Globus components
  - Test on a different platform
- Workspaces
  - Globus, others
- Requirements
  - Very short-term but flexible access to diverse platforms
- Work by various members of the Globus Toolkit (Tom Howe and John Bresnahan)
- Typically very short-lived communities of one

Users, Communities, Providers
- Resource Providers:
  - Scientific computing providers: Science Clouds
  - Commercial providers: EC2
  - Grid providers?
- Appliance Providers:
  - All communities, large and small
  - Commercial and open marketplaces
  - Appliance management software available
- Appliance Deployment:
  - Appliances -> leased compute resources
  - Coordinating creation of virtual resources
Software layers: an evolving middleware for clouds

Why isn't TeraGrid like this? (science cloud user)

Are there any benefits of this approach that would be relevant to you as a user?
- What do you hate about supercomputers?
- What would convince you to go to the hassle of providing a VM image for your community and giving it a shot?
- What problems does it solve?
- What problems does it create?
- (Are we overall in the black on that?)

Are there any benefits of this approach that would be relevant to you as a provider?
- What would have to happen to convince you to provide a part of your resource to the user community as a VM-serving platform?
- What problems does it solve?
- What problems does it create?
- That balance sheet again?

What Should We Do?
- Establish an interest group to coordinate and communicate TG activities?
  - Evaluate existing software
    - What are the gaps?
    - What are the best solutions for various problems?
  - What are the problems?
    - Is overhead really an issue?
    - What about security?
    - How do you deal with big data?
- What are interesting projects that we can do?