
1 FermiCloud Project: Integration, Development, Futures
Gabriele Garzoglio, Associate Head, Grid & Cloud Computing Department, Fermilab
Work supported by the U.S. Department of Energy under contract No. DE-AC02-07CH11359

2 FermiCloud Development Goals
Goal: make virtual machine-based workflows practical for scientific users:
- Cloud bursting: send virtual machines from the private cloud to a commercial cloud when needed.
- Grid bursting: expand grid clusters into the cloud based on demand for batch jobs in the queue.
- Federation: let a set of users operate across different clouds.
- Portability: move virtual machines from desktop to FermiCloud to commercial cloud and back.
- Fabric studies: enable access to hardware capabilities via virtualization (100G, InfiniBand, ...).

3 Overlapping Phases
- Phase 1: "Build and Deploy the Infrastructure"
- Phase 2: "Deploy Management Services, Extend the Infrastructure and Research Capabilities"
- Phase 3: "Establish Production Services and Evolve System Capabilities in Response to User Needs & Requests"
- Phase 4: "Expand the service capabilities to serve more of our user communities"
(Timeline figure: the phases overlap in time; the "today" marker falls in Phase 4.)

4 FermiCloud Phase 4: "Expand the service capabilities to serve more of our user communities" (specifications being finalized)
- Complete the deployment of the true multi-user filesystem on top of a distributed & replicated SAN.
- Demonstrate interoperability and federation:
  - accepting VMs as batch jobs,
  - interoperation with other Fermilab virtualization infrastructures (GPCF, VMware),
  - interoperation with the KISTI cloud, Nimbus, Amazon EC2, and other community and commercial clouds.
- Participate in the Fermilab 100 Gb/s network testbed; we have just taken delivery of 10 Gbit/s cards.
- Perform more "virtualized MPI" benchmarks and run some real-world scientific MPI codes; the priority of this work will depend on finding a scientific stakeholder interested in this capability.
- Re-evaluate available open-source cloud computing stacks, including OpenStack as well as the latest versions of Eucalyptus, Nimbus, and OpenNebula.

5 FermiCloud Phase 5 (specifications under development)
- See the program of work in the (draft) Fermilab-KISTI FermiCloud CRADA.
- This phase will also incorporate any work or changes that arise out of the Scientific Computing Division strategy on clouds and virtualization.
- Additional requests and specifications are also being gathered.

6 Proposed KISTI Joint Project (CRADA)
Partners: Grid and Cloud Computing Department @ FNAL and Global Science experimental Data hub Center @ KISTI.
Project title: "Integration and Commissioning of a Prototype Federated Cloud for Scientific Workflows".
Status:
- Finalizing the CRADA for DOE and KISTI approval.
- Work is intertwined with all goals of Phase 4 (and potentially future phases).
- Three major work items:
  1. Virtual Infrastructure Automation and Provisioning,
  2. Interoperability and Federation of Cloud Resources,
  3. High-Throughput Fabric Virtualization.
Note that this is potentially a multi-year program: the work during FY13 is a "proof of principle".

7 Virtual Infrastructure Automation and Provisioning [CRADA Work Item 1]
- Accepting virtual machines as batch jobs via cloud APIs.
- Grid and cloud bursting.
- Launching pre-defined user virtual machines based on workload demand.
- Completion of idle machine detection and resource reclamation:
  - detect when a VM is idle - DONE Sep '12,
  - convey VM status and apply policy to affect the state of the cloud - TODO.

8 Virtual Machines as Jobs
- OpenNebula (like all other open-source IaaS stacks) provides an emulation of the Amazon EC2 API.
- The Condor team has added code to their "Amazon EC2" universe to support the X.509-authenticated protocol.
- Planned use case: GlideinWMS running Monte Carlo on clouds, both public and private.
- The feature already exists; this is a testing/integration task only (illustrative sketch below).
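A minimal sketch of what "a VM as a batch job" looks like through HTCondor's EC2 grid universe pointed at an EC2-emulating private cloud endpoint. The endpoint URL, AMI id, and credential file paths are invented placeholders, and the standard access/secret-key form is shown rather than the X.509-authenticated variant the slide mentions, whose submit syntax is not given here.

#!/usr/bin/env python
"""Illustrative sketch only: submit a VM as a batch job through HTCondor's
"Amazon EC2" grid universe, aimed at an EC2-emulating service such as
OpenNebula's econe front end. All endpoint, image and credential values are
placeholders, not FermiCloud settings."""

import subprocess
import tempfile

SUBMIT_TEMPLATE = """
universe              = grid
grid_resource         = ec2 https://cloud.example.org:8443/
executable            = vm-worker-node            # used only as a job label in the EC2 universe
ec2_ami_id            = ami-00000001
ec2_instance_type     = m1.small
ec2_access_key_id     = /home/user/.ec2/access_key
ec2_secret_access_key = /home/user/.ec2/secret_key
queue
"""

def submit_vm_job():
    # Write the submit description to a file and hand it to condor_submit.
    with tempfile.NamedTemporaryFile(mode="w", suffix=".sub", delete=False) as f:
        f.write(SUBMIT_TEMPLATE)
        path = f.name
    subprocess.check_call(["condor_submit", path])

if __name__ == "__main__":
    submit_vm_job()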

9 Grid Bursting
- Seo-Young Noh, KISTI visitor @ FNAL, showed a proof of principle of "vCluster" in summer 2011:
  - look ahead at the Condor batch queue,
  - submit worker-node virtual machines of the various VOs to FermiCloud or Amazon EC2 based on user demand,
  - machines join the grid cluster and run grid jobs from the matching VO.
- Need to strengthen the proof of principle, then make cloud slots available to FermiGrid.
- Several other institutions have expressed interest in extending vCluster to other batch systems such as Grid Engine.
- KISTI staff have a program of work for the development of vCluster (illustrative sketch below).
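A rough sketch of the "look ahead at the queue, then provision workers" loop described above; this is not the actual vCluster code. The OpenNebula template name, the idle-jobs-per-VM policy knob, and the polling interval are invented placeholders.

#!/usr/bin/env python
"""Illustrative sketch only (not vCluster): count idle jobs in the Condor
queue and instantiate worker-node VMs on the private cloud when demand
builds up. Template name, threshold and interval are made-up values."""

import subprocess
import time

WORKER_TEMPLATE = "grid-worker-node"   # hypothetical OpenNebula template name
IDLE_JOBS_PER_VM = 8                   # policy knob, invented for illustration
POLL_SECONDS = 300

def count_idle_jobs():
    # JobStatus == 1 means "Idle" in HTCondor.
    out = subprocess.check_output(
        ["condor_q", "-constraint", "JobStatus == 1",
         "-format", "%d\n", "ClusterId"])
    return len(out.splitlines())

def launch_worker_vms(n):
    for _ in range(n):
        subprocess.check_call(["onetemplate", "instantiate", WORKER_TEMPLATE])

def main():
    while True:
        idle = count_idle_jobs()
        wanted = idle // IDLE_JOBS_PER_VM
        if wanted > 0:
            launch_worker_vms(wanted)
        time.sleep(POLL_SECONDS)

if __name__ == "__main__":
    main()

A production version would additionally track which VO each idle job belongs to and cap the number of running worker VMs, as the vCluster proof of principle does.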

10 vCluster at SC2012 (figure)

11 Cloud Bursting
- OpenNebula already has a built-in "cloud bursting" feature to send machines to Amazon EC2 when the OpenNebula private cloud is full.
- Need to evaluate and test it, and see whether it meets our technical and business requirements or whether something else is necessary.
- Need to test interoperability against other stacks.

12 True Idle VM Detection
- In times of resource need, we want the ability to suspend or "shelve" idle VMs in order to free up resources for higher-priority usage. This is especially important in the event of constrained resources (e.g. during a building or network failure).
- Shelving "9x5" and "opportunistic" VMs allows us to use FermiCloud resources for grid worker-node VMs during nights and weekends; this is part of the draft economic model.
- Giovanni Franzini (an Italian co-op student) has written (extensible) code for an "Idle VM Probe" that can be used to detect idle virtual machines based on CPU, disk I/O, and network I/O (illustrative sketch below).
- This is the biggest pure coding task left in the FermiCloud project; if the KISTI joint project is approved, it is a good candidate for a 3-month consultant.
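A minimal sketch of what such an idle probe could look like, sampling the three metrics the slide names from Linux /proc counters; it is not Giovanni Franzini's actual probe, and all thresholds are invented.

#!/usr/bin/env python
"""Illustrative sketch only, not the real Idle VM Probe: sample CPU, disk I/O
and network I/O over an interval and call the VM idle when all three stay
below (invented) thresholds. Runs inside the VM and reads Linux /proc."""

import time

CPU_THRESHOLD = 0.05      # fraction of one core, placeholder value
DISK_THRESHOLD = 1 << 20  # bytes/s of sector traffic, placeholder value
NET_THRESHOLD = 1 << 17   # bytes/s, placeholder value
INTERVAL = 60             # seconds between samples

def read_cpu_jiffies():
    with open("/proc/stat") as f:
        fields = f.readline().split()[1:]
    # Sum all CPU time fields except the "idle" column.
    return sum(int(x) for x in fields) - int(fields[3])

def read_disk_bytes():
    total = 0
    with open("/proc/diskstats") as f:
        for line in f:
            parts = line.split()
            # fields 6 and 10 are sectors read/written (512 bytes each)
            total += (int(parts[5]) + int(parts[9])) * 512
    return total

def read_net_bytes():
    total = 0
    with open("/proc/net/dev") as f:
        for line in f.readlines()[2:]:
            cols = line.split(":")[1].split()
            total += int(cols[0]) + int(cols[8])  # rx_bytes + tx_bytes
    return total

def vm_is_idle():
    cpu0, disk0, net0 = read_cpu_jiffies(), read_disk_bytes(), read_net_bytes()
    time.sleep(INTERVAL)
    cpu1, disk1, net1 = read_cpu_jiffies(), read_disk_bytes(), read_net_bytes()
    cpu_load = (cpu1 - cpu0) / (INTERVAL * 100.0)   # assumes 100 Hz jiffies
    return (cpu_load < CPU_THRESHOLD and
            (disk1 - disk0) / INTERVAL < DISK_THRESHOLD and
            (net1 - net0) / INTERVAL < NET_THRESHOLD)

if __name__ == "__main__":
    print("idle" if vm_is_idle() else "busy")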

13 Idle VM Information Flow
(Flow diagram: VMs report into a Raw VM State DB; an Idle VM Collector feeds the Idle VM Logic, which produces an Idle VM List; the Idle VM Management Process then uses an Idle VM Trigger to drive Idle VM Shutdown.)

14 Interoperability and Federation [CRADA Work Item 2]
- Driver: global scientific collaborations such as the LHC experiments will have to interoperate across facilities with heterogeneous cloud infrastructure.
- European efforts:
  - EGI Cloud Federation Task Force: several institutional clouds (OpenNebula, OpenStack, StratusLab),
  - HelixNebula: a federation of commercial cloud providers.
- Our goals:
  - show a proof of principle: a federation including FermiCloud + the KISTI "G Cloud" + one or more commercial cloud providers + other research-institution community clouds if possible,
  - participate in existing federations if possible.
- Core competency: the FermiCloud project can contribute to these cloud federations given our expertise in X.509 authentication and authorization and our long experience in grid federation.

15 Virtual Image Formats
- Different clouds use different virtual machine image formats; the differences involve the file system, partition table, LVM volumes, and kernel handling.
- We have to identify the differences and find tools to convert between formats where necessary (illustrative sketch below).
- This is an integration/testing issue, not expected to involve new coding.
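A small sketch of the kind of off-the-shelf conversion this survey would evaluate, using qemu-img; the paths are placeholders and partition-table, LVM and kernel differences are out of scope for the example.

#!/usr/bin/env python
"""Illustrative sketch only: convert a VM disk image between formats with
qemu-img. Format names follow qemu-img conventions; the paths are invented."""

import subprocess

def detect_format(image_path):
    # 'qemu-img info' prints a line such as "file format: qcow2".
    out = subprocess.check_output(["qemu-img", "info", image_path])
    for line in out.decode().splitlines():
        if line.startswith("file format:"):
            return line.split(":", 1)[1].strip()
    raise RuntimeError("could not determine image format")

def convert_image(src, dst, dst_format="qcow2"):
    src_format = detect_format(src)
    subprocess.check_call(
        ["qemu-img", "convert", "-f", src_format, "-O", dst_format, src, dst])

if __name__ == "__main__":
    # e.g. a raw OpenNebula image converted to qcow2 for a cloud that expects qcow2
    convert_image("/var/tmp/worker-node.img", "/var/tmp/worker-node.qcow2")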

16 Interoperability/Compatibility of APIs
- The Amazon EC2 API is not open source; it is a moving target that changes frequently.
- Open-source emulations have varying feature levels and accuracy of implementation:
  - compare and contrast OpenNebula, OpenStack, and commercial clouds,
  - identify the lowest common denominator(s) that work on all of them.

17 VM Image Distribution
- Investigate existing image marketplaces (HEPiX, U. of Victoria).
- Investigate whether we need an Amazon S3-like storage/distribution method for OS images; OpenNebula doesn't have one at present.
- A GridFTP "door" to the OpenNebula VM library is a possibility; this could be integrated with an automatic security-scan workflow using the existing Fermilab NESSUS infrastructure (illustrative sketch below).
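A sketch of the proposed workflow, assuming a hypothetical GridFTP door URL in front of the VM library; the scan step is left as a stub because the slides only propose the integration, they do not define it.

#!/usr/bin/env python
"""Illustrative sketch only: pull a VM image through a hypothetical GridFTP
"door" in front of the OpenNebula VM library, then hand it to a security-scan
step. The gsiftp URL is a made-up placeholder and the scan hook is a stub."""

import subprocess

IMAGE_DOOR = "gsiftp://fcl-image-door.example.gov/vmlib"   # placeholder URL

def fetch_image(image_name, local_path):
    # globus-url-copy performs the GridFTP transfer using the caller's
    # X.509 proxy for authentication.
    subprocess.check_call(
        ["globus-url-copy", "%s/%s" % (IMAGE_DOOR, image_name),
         "file://%s" % local_path])

def request_security_scan(local_path):
    # Stub: in the proposed workflow this would hand the image to the
    # existing Fermilab NESSUS infrastructure for an automatic scan.
    print("would submit %s for a NESSUS scan here" % local_path)

if __name__ == "__main__":
    fetch_image("sl6-worker.qcow2", "/var/tmp/sl6-worker.qcow2")
    request_security_scan("/var/tmp/sl6-worker.qcow2")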

18 High-Throughput Fabric Virtualization [CRADA Work Item 3]
- Follow up on the earlier virtualized MPI work:
  - use it in real scientific workflows,
  - example: simulation of data acquisition systems (the existing FermiCloud InfiniBand fabric has already been used for this).
- Will also use FermiCloud machines on the 100 Gbit Ethernet test bed:
  - evaluate and optimize virtualization of 10G NICs for the use case of HEP data management applications,
  - compare and contrast against InfiniBand.

19 Long Term Vision [another look at some work in the FermiCloud project phases]
- FY10-12: batch system support (CDF) at KISTI; proof of principle of "grid bursting" via vCluster on FermiCloud; demonstration of virtualized InfiniBand.
- FY13: demonstration of VM image transfer across clouds; proof-of-principle launch of a VM as a job; investigation of options for federation interoperations.
- FY14: integration of vCluster for dynamic resource provisioning; infrastructure development for federation interoperations; broadening of supported fabric utilization.
- FY15: transition of the federated infrastructure to production; full support of established workflow fabric utilization; fully automated provisioning of resources for scientific workflows.
(Timeline figure with a "today" marker.)

20 High-level Architecture
(Architecture diagram: workflows are submitted to a job/VM queue; vCluster (FY14) provisions VMs through multiple cloud APIs behind a federated authorization layer and cloud federation; supporting components include an image repository, idle VM detection with resource reclamation, 100 Gbps networking, and virtualized InfiniBand.)

21 OpenNebula Authentication
- OpenNebula came with "pluggable" authentication, but few plugins were initially available.
- OpenNebula 2.0 web services by default used an "access key" / "secret key" mechanism similar to Amazon EC2's; no https was available.
- Four ways to access OpenNebula:
  - command line tools,
  - Sunstone web GUI,
  - "ECONE" web service emulation of the Amazon RESTful (Query) API,
  - OCCI web service.
- The FermiCloud project wrote X.509-based authentication plugins:
  - patches to OpenNebula to support this were developed at Fermilab and submitted back to the OpenNebula project in Fall 2011 (generally available in OpenNebula V3.2 onwards),
  - X.509 plugins are available for command line and for web services authentication.

22 Grid AuthZ Interoperability Profile
- Uses XACML 2.0 to specify DN, CA, hostname, FQAN, FQAN signing entity, and more.
- Developed in 2007; (still) used in the Open Science Grid and in EGEE; currently in the final stage of OGF standardization.
- Java and C bindings are available for authorization clients; the most commonly used C binding is LCMAPS, used to talk to GUMS, SAZ, and SCAS.
- Allows one user to be part of different Virtual Organizations and have different groups and roles.
- For cloud authorization we will configure GUMS to map back to individual user names, one per person; each personal account in OpenNebula is created in advance.

23 X.509 Authorization
- The OpenNebula authorization plugins are written in Ruby.
- They use existing grid routines to call out to external GUMS and SAZ authorization servers:
  - a Ruby-C binding to call the C-based routines of LCMAPS, or
  - a Ruby-Java bridge to call the Java-based routines from the Privilege project.
- GUMS returns a uid/gid; SAZ returns yes/no (illustrative sketch of the decision flow below).
- Works with the OpenNebula command line and non-interactive web services.
- Tricky part: how to load a user credential with extended attributes into a web browser? It can be done, but with a high degree of difficulty (PKCS12 conversion of a VOMS proxy with the full certificate chain included); we have a side project to make this easier/automated.
- Currently we have a proof of principle running on our demo system, command line only.
- In frequent communication with the LCMAPS developers @ NIKHEF and with the OpenNebula developers.
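A sketch of the decision flow described above, in Python rather than the Ruby used by the real plugins. The SAZ and GUMS calls are stubs standing in for the actual XACML callouts, and the account check assumes the pre-created personal OpenNebula accounts mentioned on the previous slide.

#!/usr/bin/env python
"""Illustrative sketch only (the real plugins are Ruby and use LCMAPS /
Privilege-project bindings): the GUMS/SAZ authorization decision flow."""

def call_saz(dn, fqan):
    # Stub for the SAZ callout: site-wide banning service, returns yes/no.
    raise NotImplementedError("replace with the real SAZ client call")

def call_gums(dn, fqan):
    # Stub for the GUMS callout: returns the mapped local user name
    # (one personal account per person for cloud authorization).
    raise NotImplementedError("replace with the real GUMS client call")

def authorize(dn, fqan, opennebula_users):
    """Return the pre-created OpenNebula account name, or None if denied."""
    if not call_saz(dn, fqan):          # SAZ says no -> reject outright
        return None
    username = call_gums(dn, fqan)      # GUMS maps DN/FQAN to a local user
    if username not in opennebula_users:
        return None                     # account must already exist
    return username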

24 Virtualized Storage Service Investigation
Motivation:
- General-purpose systems from various vendors are being used as file servers.
- These systems can have many more cores than are needed to perform the file service.
- Unused cores => inefficient use of power, space, and cooling.
- Custom configurations => complicated sparing issues.
Question: can virtualization help here? What (if any) is the virtualization penalty?

25 Virtualized Storage Server Test Procedure
Evaluation: use IOzone and real physics ROOT-based analysis code (illustrative IOzone driver sketch below).
Phase 1:
- install the candidate filesystem on a "bare metal" server,
- evaluate performance using a combination of bare-metal and virtualized clients (varying their number),
- also run client processes on the "bare metal" server,
- determine the "bare metal" filesystem performance.
Phase 2:
- install the candidate filesystem on a virtual machine server,
- evaluate performance using a combination of bare-metal and virtualized clients (varying their number),
- also use client virtual machines hosted on the same physical machine as the virtual machine server,
- determine the virtual machine filesystem performance.
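A small sketch of how the IOzone part of such an evaluation could be driven, scaling the number of client processes; the mount point, file size, and record size are placeholders, not the review's actual test parameters.

#!/usr/bin/env python
"""Illustrative sketch only: run IOzone in throughput mode with a varying
number of client processes against a mounted candidate filesystem, saving
the raw output for later bare-metal vs. virtualized comparison."""

import subprocess

MOUNT_POINT = "/mnt/candidate-fs"   # placeholder mount of the FS under test
FILE_SIZE = "4g"
RECORD_SIZE = "1024k"

def run_iozone(num_clients):
    files = ["%s/iozone.%d" % (MOUNT_POINT, i) for i in range(num_clients)]
    cmd = ["iozone",
           "-i", "0",            # test 0: write/rewrite
           "-i", "1",            # test 1: read/reread
           "-s", FILE_SIZE,
           "-r", RECORD_SIZE,
           "-t", str(num_clients),
           "-F"] + files
    out = subprocess.check_output(cmd)
    with open("iozone-%dclients.out" % num_clients, "wb") as f:
        f.write(out)

if __name__ == "__main__":
    for n in (1, 3, 7, 21):     # e.g. scale toward the 21-VM client setup
        run_iozone(n)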

26 FermiCloud Test Bed - "Bare Metal" Server
(Diagram of the test setup; the recoverable details follow.)
Server ("bare metal"):
- Lustre 1.8.3 server on SL5, kernel 2.6.18-164.11.1; 3 OSS (different striping); FCL: 3 data & 1 name node; Dom0: 8 CPU, 24 GB RAM.
- CPU: dual quad-core Xeon E5640 @ 2.67 GHz with 12 MB cache; 24 GB RAM.
- Disk: 6 SATA disks in RAID 5 for 2 TB, plus 2 system disks (hdparm: 376.94 MB/s).
- Network: 1 GbE plus InfiniBand cards; BlueArc mounted over Ethernet.
External clients (ITB / FCL, 7 nodes - 21 VMs):
- CPU: dual quad-core Xeon X5355 @ 2.66 GHz with 4 MB cache; 16 GB RAM.
- 3 Xen SL5 VMs per machine; 2 cores / 2 GB RAM each.

27 FermiCloud Test Bed - Virtualized Server
(Diagram of the test setup; the recoverable details follow.)
Server host:
- 2 TB on 6 disks; FCL: 3 data & 1 name node; Dom0: 8 CPU, 24 GB RAM; BlueArc mounted over Ethernet.
- Storage server VM plus 7 on-board client VMs; 8 KVM VMs per machine, 1 core / 2 GB RAM each.
External clients (ITB / FCL, 7 nodes - 21 VMs):
- CPU: dual quad-core Xeon X5355 @ 2.66 GHz with 4 MB cache; 16 GB RAM.
- 3 Xen SL5 VMs per machine; 2 cores / 2 GB RAM each.

28 Virtualized File Service Results Summary
See the ISGC '11 talk and proceedings for the details:
http://indico3.twgrid.org/indico/getFile.py/access?contribId=32&sessionId=36&resId=0&materialId=slides&confId=44
- Lustre: IOZone read 350 MB/s; write 250 MB/s bare metal, 70 MB/s VM (significant write penalty when the FS is on a VM). ROOT-based read 12.6 MB/s.
- Hadoop: IOZone read 50-240 MB/s; write 80-300 MB/s (varies with the number of replicas; FUSE does not export a full POSIX FS). ROOT-based read 7.9 MB/s.
- OrangeFS: IOZone read 150-330 MB/s; write 220-350 MB/s (varies with the number of name nodes). ROOT-based read 8.1 MB/s.
- BlueArc: IOZone read 300 MB/s; write 330 MB/s (VM: n/a; varies with system conditions). ROOT-based read 8.4 MB/s.

29 Summary
- Collaboration: leveraging external work as much as possible, and contributing our work back to external collaborations.
- Using (and, if necessary, extending) existing standards: AuthZ, OGF UR, Gratia, etc.
- A relatively small amount of development FTEs has placed FermiCloud in a potential leadership position!
- CRADA with KISTI: could be the beginning of a beautiful relationship...

30 Thank You! Any Questions?

31 X.509 Authentication: how it works
Command line:
- The user creates an X.509-based token using the "oneuser login" command.
- This makes a base64 hash of the user's proxy and certificate chain, combined with a username:expiration date and signed with the user's private key (illustrative sketch below).
Web services:
- The web services daemon contacts the OpenNebula XML-RPC core on the user's behalf, using the host certificate to sign the authentication token.
- Apache mod_ssl or gLite's GridSite is used to pass the grid certificate DN (and optionally the FQAN) to the web services.
Limitations:
- With web services, one DN can map to only one user.
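A rough sketch of building a token along the lines described above. The exact field layout and separators used by OpenNebula's real x509 driver ("oneuser login") are not specified here, so the packaging shown is an assumption; the username and the proxy/key paths are placeholders.

#!/usr/bin/env python
"""Illustrative sketch only: assemble a login token from the user's proxy and
certificate chain, a username:expiration string, and a signature made with the
user's private key, then base64-encode the result. The packaging is assumed,
not OpenNebula's actual format."""

import base64
import subprocess
import time

def sign_with_private_key(data, key_path):
    # Use openssl to produce an RSA/SHA1 signature over the token text.
    p = subprocess.Popen(
        ["openssl", "dgst", "-sha1", "-sign", key_path],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    signature, _ = p.communicate(data)
    return signature

def make_login_token(username, proxy_path, key_path, lifetime_s=36000):
    expiration = int(time.time()) + lifetime_s
    text = ("%s:%d" % (username, expiration)).encode()
    with open(proxy_path, "rb") as f:
        chain = f.read()               # proxy certificate plus issuing chain
    signature = sign_with_private_key(text, key_path)
    # Assumed packaging: concatenate the pieces and base64-encode the result.
    return base64.b64encode(chain + b"\n" + text + b"\n" + signature)

if __name__ == "__main__":
    # A grid proxy file normally contains the proxy private key as well.
    token = make_login_token("alice", "/tmp/x509up_u500", "/tmp/x509up_u500")
    print(token.decode()[:60] + "...")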

32 "Authorization" in OpenNebula
- Note: OpenNebula has pluggable "authorization" modules as well.
- These control access ACLs, namely which user can launch a virtual machine, create a network, store an image, etc.; they focus on privilege management rather than the typical grid-based notion of access authorization.
- Therefore, we make our "authorization" additions to the authentication routines of OpenNebula.

