FermiCloud: A Private Cloud to Support Fermilab Scientific Users
S. Timm, K. Chadwick, D. Yocum, G. Garzoglio, H. Kim, P. Mhashilkar, T. Levshina
Work supported by the U.S. Department of Energy under contract No. DE-AC02-07CH11359

Monitoring and Metrics
(panel of monitoring plots; only the title survives in this transcript)

What is FermiCloud?
- Infrastructure-as-a-Service private cloud for the Fermilab scientific program.
- Integrated into the Fermilab site security structure.
- Virtual machines have full access to the existing Fermilab network and mass storage devices.
- Scientific stakeholders get on-demand access to virtual machines without system administrator intervention.
- Virtual machines are created by users and destroyed or suspended when no longer needed.
- Testbed for developers and integrators to evaluate new grid and storage applications on behalf of scientific stakeholders.
- Ongoing project to build and expand the facility:
  Phase I: technology evaluation, requirements, deployment.
  Phase II: scalability, monitoring, performance improvement.
  Phase III: high availability and reliability.

X.509 Authentication
- Uses the OpenNebula pluggable authentication feature.
- Wrote an X.509 authentication plugin and contributed it back to OpenNebula; it is included in OpenNebula 3.
- X.509 authentication is integrated into the command-line tools, the EC2 Query API, the OCCI API, and the SunStone management GUI.
- Contributing to standards bodies to make the authorization callout to external services, similar to Grid authentication.
- (A user-registration sketch using this plugin follows the benchmark notes below.)

FermiCloud Architecture Diagrams
- Each head node runs OpenNebula with the SunStone web GUI, the master scheduler, the EC2 Query API, the OCCI API, and the command-line interface (CLI), backed by an image repository.
- Grid Computing Center: head node fcl002 and VM hosts fcl003, fcl004, fcl005, fcl006, connected by a private network with InfiniBand, a SAN, and the public network.
- Feynman Computing Center: head node fcl301 and VM hosts fcl302, fcl303, fcl304, fcl305, connected by a private network with InfiniBand, a SAN, and the public network.
- Auxiliary services: web page, secrets repository, RSV, Nagios monitoring, Ganglia, NIS server, syslog forwarding.

FermiCloud Operations
- Stock virtual machine images are provided for new users.
- Active virtual machines get security patches from the site patching services.
- Dormant virtual machines are woken up periodically to receive their patches.
- New virtual machines are scanned by the site anti-virus and vulnerability scanners and do not get network access until they pass.
- Three levels of service:
  - 24x7 high availability, which can have a fixed IP number;
  - 9x5 development/integration, which uses one of a pool of fixed IPs;
  - Opportunistic, which can be pre-empted if idle or if higher-priority users need the cloud.

Virtualization and MPI
HPL benchmark results (cells whose numbers did not survive the transcript are left blank):

  Configuration                       #Host Systems  #VM/host  #CPU     Total Physical CPU  HPL Benchmark (Gflops)
  Bare Metal without pinning
  Bare Metal with pinning (Note 2)
  VM no pinning (Notes 2,3)           2              8         1 vCPU   16                  8.2
  VM with pinning (Notes 2,3)         2              8         1 vCPU
  VM+SRIOV with pinning (Notes 2,4)   2              7         2 vCPU

Notes:
(1) Work performed by Dr. Hyunwoo Kim of KISTI in collaboration with Dr. Steven Timm of Fermilab.
(2) Process/virtual machine "pinned" to a CPU and its associated NUMA memory via numactl.
(3) Software-bridged virtual network using IP over InfiniBand (seen by the virtual machine as a virtual Ethernet device).
(4) The SRIOV driver presents native InfiniBand to the virtual machine(s); a second virtual CPU is required to start SRIOV, but it is only a virtual CPU, not an actual physical CPU.
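Note (2) describes pinning each process or virtual machine to a CPU and its associated NUMA memory with numactl. A minimal sketch of that kind of launch follows; the NUMA node number and the benchmark command are stand-ins, not the actual FermiCloud invocation:

    import subprocess

    def run_pinned(command, numa_node=0):
        """Run a command with its CPUs and memory confined to one NUMA node."""
        wrapper = [
            "numactl",
            f"--cpunodebind={numa_node}",  # schedule only on this node's cores
            f"--membind={numa_node}",      # allocate only this node's memory
        ]
        return subprocess.run(wrapper + list(command), check=True)

    # Example: run_pinned(["xhpl"], numa_node=0)  # benchmark binary name is illustrative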
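As a companion to the X.509 Authentication panel above, here is a minimal sketch of registering a user against the X.509 plugin. It assumes the OpenNebula "oneuser" CLI is available on the head node and that the x509 auth driver described on the poster is enabled; the username and DN are illustrative, not real FermiCloud accounts:

    import subprocess

    def register_x509_user(username: str, dn: str) -> None:
        """Create an OpenNebula account that authenticates by certificate DN."""
        subprocess.run(
            ["oneuser", "create", username, dn, "--driver", "x509"],
            check=True,
        )

    if __name__ == "__main__":
        # The DN must match the subject of the user's grid certificate,
        # e.g. as printed by: openssl x509 -in usercert.pem -noout -subject
        register_x509_user(
            "jdoe",
            "/DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/CN=Jane Doe",
        )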
FermiCloud Capacity (# of units; the oversubscription percentages and all but the last unit count were lost in this transcript)
- Nominal (1 physical core = 1 VM)
- % over subscription
- % over subscription (1 HT core = 1 VM)
- % over subscription: 552
Note: FermiGrid production services are operated at 100% to 200% "oversubscription". (A worked oversubscription calculation appears at the end of this transcript.)

FermiCloud Target VM states as reported by "virsh list"
(chart; only the title survives in this transcript)

Accounting
(accounting plots; only the panel title survives in this transcript)

Grid Cluster On Demand
- Define policy-based expressions for "idle".
- Detect idle virtual machines.
- Suspend idle virtual machines (see the libvirt sketch at the end of this transcript).
- Use the vCluster package:
  - look ahead at the batch queue;
  - submit the correct virtual machine to FermiCloud;
  - submit to Amazon EC2 if extra capacity is needed (also sketched at the end of this transcript).
- vCluster is a collaboration between Fermilab and KISTI.

High Availability
- Machines in two different buildings.
- Mirrored SAN between the buildings.
- Global shared file system between all nodes.
- Copies of all VMs available in both buildings.
- Network routable from each building.
- Pre-emptive live migration for scheduled outages (see the migration sketch below).
- Restart of VMs after an unscheduled building failure.
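The pre-emptive live migration mentioned under High Availability can be sketched with the libvirt Python bindings. The connection URIs borrow two host names from the architecture panel, but the qemu+ssh transport and the bare VIR_MIGRATE_LIVE flag are assumptions about the setup, not the documented FermiCloud procedure:

    import libvirt

    def drain_host(src_uri="qemu+ssh://fcl003/system",
                   dst_uri="qemu+ssh://fcl303/system"):
        """Live-migrate every running VM from one host to a host in the other building."""
        src = libvirt.open(src_uri)
        dst = libvirt.open(dst_uri)
        for dom in src.listAllDomains():
            if dom.state()[0] == libvirt.VIR_DOMAIN_RUNNING:
                # VIR_MIGRATE_LIVE keeps the guest running while memory is copied.
                dom.migrate(dst, libvirt.VIR_MIGRATE_LIVE, None, None, 0)
        src.close()
        dst.close()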
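The Grid Cluster On Demand panel lists "define policy-based expressions for idle, detect idle virtual machines, suspend idle virtual machines." A minimal sketch of that loop with the libvirt Python bindings is shown below; the CPU-time threshold and sampling window are placeholder policy values, not FermiCloud's real idleness expressions:

    import time
    import libvirt

    IDLE_CPU_SECONDS = 1.0   # assumed policy: under 1 s of CPU used in the window
    WINDOW_SECONDS = 300     # assumed sampling window

    def suspend_idle_vms(uri="qemu:///system"):
        """Suspend running VMs whose CPU usage over the window is below the threshold."""
        conn = libvirt.open(uri)
        running = [d for d in conn.listAllDomains()
                   if d.state()[0] == libvirt.VIR_DOMAIN_RUNNING]
        cpu_before = {d.name(): d.info()[4] for d in running}  # cpuTime in nanoseconds
        time.sleep(WINDOW_SECONDS)
        for dom in running:
            used_seconds = (dom.info()[4] - cpu_before[dom.name()]) / 1e9
            if used_seconds < IDLE_CPU_SECONDS:
                dom.suspend()  # a suspended VM can be resumed later when work arrives
        conn.close()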
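vCluster submits the correct virtual machine to FermiCloud and overflows to Amazon EC2 when extra capacity is needed. Because FermiCloud exposes an EC2 Query API, both targets can in principle be driven with the same client; the sketch below uses boto3, and the endpoint URL, image ID, and instance type are placeholders rather than real FermiCloud values:

    import boto3

    def start_worker(burst_to_amazon=False):
        """Launch one worker VM, either on the private cloud endpoint or on Amazon EC2."""
        endpoint = None if burst_to_amazon else \
            "https://fermicloud-head.example.gov:8773/services/Cloud"  # placeholder endpoint
        ec2 = boto3.client("ec2", region_name="us-east-1", endpoint_url=endpoint)
        resp = ec2.run_instances(
            ImageId="ami-00000000",     # placeholder worker-node image
            InstanceType="m1.large",    # placeholder instance type
            MinCount=1, MaxCount=1,
        )
        return resp["Instances"][0]["InstanceId"]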
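The capacity table above lost most of its unit counts, but the arithmetic behind such a table is simple: n% oversubscription multiplies the nominal one-VM-per-physical-core count by (1 + n/100), and counting each hyper-threaded core as a VM doubles the nominal count (assuming two hardware threads per core). The host and core counts below are purely illustrative, not the FermiCloud inventory:

    HOSTS = 10            # illustrative number of VM host systems
    CORES_PER_HOST = 8    # illustrative physical cores per host

    nominal = HOSTS * CORES_PER_HOST        # 1 physical core = 1 VM
    fifty_over = int(nominal * 1.5)         # 50% oversubscription
    ht_nominal = nominal * 2                # 1 hyper-threaded core = 1 VM
    two_hundred_over = nominal * 3          # 200% oversubscription

    for label, units in [("nominal", nominal), ("50% over", fifty_over),
                         ("HT nominal", ht_nominal), ("200% over", two_hundred_over)]:
        print(f"{label:12s} {units:4d} VM units")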