Sky Computing on FutureGrid and Grid5000

Pierre Riteau 1, Mauricio Tsugawa 2, Andrea Matsunaga 2, José Fortes 2, Tim Freeman 3, David LaBissoniere 4, Kate Keahey 3,4

1 Université de Rennes 1, IRISA/INRIA Rennes – Bretagne Atlantique
2 University of Florida
3 Argonne National Labs
4 University of Chicago Computation Institute

Introduction

Sky computing is an emerging computing model where resources from multiple cloud providers are leveraged to create large-scale distributed infrastructures.

This work uses resources across two experimental projects: FutureGrid and Grid5000. This showcases not only the capabilities of the experimental platforms, but also their emerging collaboration. The two platforms are used to create a Sky Computing environment.

To validate our approach in a real-world scenario, we run a MapReduce version of a popular bioinformatics application (BLAST). However, any kind of distributed application can be run on these infrastructures.

Experimental Testbeds

FutureGrid is an experimental testbed for grid and cloud research. It is distributed over 6 sites in the US and offers more than 5,000 cores.

Grid5000 is an experimental testbed for research in large-scale parallel and distributed systems. It is distributed over 9 sites in France and offers more than 5,500 cores.

Architecture

Our Sky Computing deployment makes use of:
- Xen, to minimize platform (hardware and operating system stack) differences
- Nimbus, to offer VM provisioning and contextualization services (contextualization automatically assigns roles and configures VMs)
- ViNe, a virtual network based on an IP overlay, to enable all-to-all communication between virtual machines spread across multiple clouds
- Hadoop, for parallel fault-tolerant execution and dynamic cluster extension

[Figure: architecture diagram. Labels: Cloud A, Cloud B and Cloud C, each running Nimbus; ViNe; ViNe router; VMs; Hadoop MapReduce App (e.g. BLAST); Distributed Application (e.g. MPI BLAST); FutureGrid; Grid5000.]

Scalability

We deployed a Sky Computing infrastructure consisting of 1114 CPU cores (457 VMs) distributed over 3 sites in FutureGrid and 3 sites in Grid5000 (OGF-29 demo, Chicago, IL, June 2010).

[Figure: deployment map. FutureGrid sites: San Diego, University of Florida, University of Chicago. Grid5000 sites: Lille, Rennes, Sophia. Other labels: Queue ViNe Router, Grid5000 firewall.]

VM Image Propagation Mechanisms

To deploy virtual clusters, each VM requires an independent replica of a common VM image. Nimbus transfers a copy of the required VM image to each VM host (a step called propagation), using SCP from a single repository. This propagation scheme does not scale with the number of VMs, as it is limited by the repository's disk or network bandwidth. To overcome this problem, we developed two new propagation mechanisms. The first one leverages the TakTuk and Kastafior tools developed at INRIA to create a broadcast chain used to transfer image data. The second one relies on the Copy-on-Write capabilities of the Xen hypervisor.

[Figure: graph comparing instantiation times of virtual clusters using different propagation mechanisms.]

In the SCP and TakTuk cases, the image is compressed and is 2.2 GB in size (12 GB uncompressed). In the QCOW case, the 12 GB image is pre-propagated on all hypervisors. Propagation then consists of creating a new Copy-on-Write volume and contextualizing the virtual cluster.

Conclusion

The Sky Computing model allows the creation of large-scale infrastructures using resources from multiple cloud providers. These infrastructures are able to run embarrassingly parallel computations with high performance. Our work shows how it is possible to federate multiple infrastructures and improve the speed of virtual cluster creation, using experimental testbeds in the US and in France as an example.

Sponsors and Acknowledgments

This work is supported in part by the National Science Foundation under Grants No. OCI-0910812, IIP-0758596 and CNS-0821622, and in part by the MCS Division subprogram of the Office of Advanced Scientific Computing Research, SciDAC Program, Office of Science, U.S. Department of Energy, under Contract DE-AC02-06CH11357. The authors also acknowledge the support of the BellSouth Foundation. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or BellSouth Foundation.

Experiments were carried out using the Grid'5000 experimental testbed, being developed under the INRIA ALADDIN development action with support from CNRS, RENATER and several universities as well as other funding bodies (see https://www.grid5000.fr).
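As a rough illustration of the scaling argument in the VM Image Propagation Mechanisms section, the sketch below compares a single-repository transfer with a pipelined broadcast chain. It is a toy model under stated assumptions, not a description of how Nimbus, TakTuk or Kastafior behave internally: the function names, the 125 MB/s link speed and the 1 MB chunk size are all hypothetical, and only the 2.2 GB compressed image size comes from the poster.

```python
def repository_time(image_mb, n_hosts, bandwidth_mb_s):
    """Toy model: every host pulls the image from one repository,
    so the hosts share the repository's outgoing bandwidth."""
    return n_hosts * image_mb / bandwidth_mb_s


def chain_time(image_mb, n_hosts, bandwidth_mb_s, chunk_mb=1.0):
    """Toy model: hosts form a chain and forward each chunk to the next
    host while receiving the following one (an assumed pipelining scheme)."""
    first_copy = image_mb / bandwidth_mb_s
    # Each extra host in the chain only adds the delay of one chunk.
    pipeline_fill = (n_hosts - 1) * chunk_mb / bandwidth_mb_s
    return first_copy + pipeline_fill


IMAGE_MB = 2200.0        # compressed image size quoted on the poster (2.2 GB)
BANDWIDTH_MB_S = 125.0   # hypothetical link speed (about 1 Gbit/s)

for n in (1, 10, 100):
    single = repository_time(IMAGE_MB, n, BANDWIDTH_MB_S)
    chain = chain_time(IMAGE_MB, n, BANDWIDTH_MB_S)
    print(f"{n:>3} hosts: repository {single:8.1f} s, chain {chain:6.1f} s")
```

In this model the repository time grows linearly with the number of hosts, while the chain time stays close to the time of a single copy. Real deployments also depend on disk throughput, decompression and contextualization, which the model ignores.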
