1 VMs at a Tier-1 site EGEE’09, 21-09-2009 Sander Klous, Nikhef

2 Contents
Introduction
–Who are we?
Motivation
–Why are we interested in VMs?
–What are we going to do with VMs?
Status
–How do we approach this issue?
–Where do we stand?
Challenges
03-09-2009 BIG Grid - Virtualization working group 2

3 Introduction
Collaboration between
–NCF: national computing facilities
–Nikhef: national institute for subatomic physics
–NBIC: national bioinformatics center
Participation from Philips, SARA, etc.
Goal: “Enables access to grid infrastructures for scientific research in the Netherlands”

4 Motivation: Why Virtual Machines?
Site perspective
–Resource flexibility (e.g. SL4 / SL5)
–Resource management: scheduling / multi-core / sandboxing
User perspective
–Isolation from the environment
–Identical environment on multiple sites
–Identical environment on a local machine

5 Different VM classes
Class 1: Site-generated Virtual Machines
–No additional trust issues
–Benefits for system administration
Class 2: Certified Virtual Machines
–Inspection and certification to establish trust
–Requirements for monitoring / integration
Class 3: User-generated Virtual Machines
–No trust relation
–Requires appropriate security measures

6 Typical use case: Class 1 VM
[Diagram: site infrastructure for resource management — Torque/PBS feeds a job queue and a VM queue; a Virtual Machine Manager provisions Box 1 (“Normal WN”), Box 2 (“8 Virtual SL4 WNs”) and Box 3 (“8 Virtual SL5 WNs”)]
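The two-queue split in the diagram can be illustrated with a toy dispatcher. Everything here (class and queue names, the request format) is a hypothetical sketch; in the real setup the job queue is Torque/PBS and the VM queue belongs to the Virtual Machine Manager.

```python
# Toy sketch of the two-queue idea: ordinary jobs go to the batch system's
# job queue, while requests for SL4/SL5 virtual worker nodes go to a
# separate VM queue handled by the VM manager. All names are illustrative.

from collections import deque

class Site:
    def __init__(self):
        self.job_queue = deque()   # Torque/PBS in the real setup
        self.vm_queue = deque()    # the Virtual Machine Manager's queue

    def submit(self, request):
        # A request asking for a specific (virtual) OS becomes a VM request;
        # anything else is a plain batch job for a normal worker node.
        if request.get("os") in ("SL4", "SL5"):
            self.vm_queue.append(request)
        else:
            self.job_queue.append(request)

site = Site()
site.submit({"name": "analysis-1"})               # normal WN job
site.submit({"name": "legacy-sim", "os": "SL4"})  # needs an SL4 virtual WN
site.submit({"name": "new-reco", "os": "SL5"})    # needs an SL5 virtual WN

print(len(site.job_queue), len(site.vm_queue))    # 1 2
```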

7 Typical use case: Class 2 VM
Analysis on Virtual Machines
Run a minimal analysis on a desktop/laptop
–Access to grid services
Run the full analysis on the grid
–Identical environment
–Identical access to grid services
No interest in becoming a system administrator
–Standard experiment software is sufficient

8 Typical use case: Class 3 VM
Identification and classification of GPCRs
Requires a very specific software set
–Blast 2.2.16
–HMMER 2.3.2
–BioPython 1.50
Even non-x86 (binary) applications!
Specific software for this user; no common experiment software

9 Project status
Working group: virtualization of worker nodes
https://wiki.nbic.nl/index.php/BigGrid_virtualisatie
Kick-off meeting July 6th 2009
–System administrators, user support, management
Phase 1 (3 months)
–Collect site and user requirements
–Identify other ongoing efforts in Europe
–First design
Phase 2 (3 months)
–Design and implement a proof of concept

10 Active working group topics
Policies/security issues for Class 2/3 VMs
Technology study
–Managing Virtual Machines
–Distributing VM images
–Interfacing the VM infrastructure with ‘the grid’
Identify missing functionality and alternatives
–Accounting and fair share, image management, authentication/authorization, etc.

11 The Amazon identity crisis
The three most confronting questions:
1. What is the difference between a job and a VM?
2. Why can I do it at Amazon, but not on the grid?
3. What is the added value of grids over clouds?
“We don’t want to compete with Amazon!”

12 Policy and security issues
E-science services and functionality
–Data integrity, confidentiality and privacy
–Non-repudiation of user actions
System administrator point of view
–Trust user intentions, not their implementations
–Incident response is more costly than certification
–Forensics is time consuming

13 Security 101 = attack surface
A compromised user space is often already enough trouble

14 Available policies
Grid Security Policy, version 5.7a
VO Portal Policy, version 1.0 (draft)
Big Grid Security Policy, version 2009-025
–Grid Acceptable Use Policy, version 3.1
–Grid Site Operations Policy, version 1.4a
–LCG/EGEE Incident Handling and Response Guide, version 2.1
–Grid Security Traceability and Logging Policy, version 2.0
VO-Box Security Recommendations and Questionnaire, version 0.6 (draft, not ratified)

15 Relevant policy statements
Network security is covered by site-local security policies and practices
A VO Box is part of the trusted network fabric; privileged access is limited to resource administrators
Software deployed on the grid must include sufficient and relevant site-central logging

16 First compromise
Certified package repository
–Base templates
–Certified packages
Separate user disk
–User-specific data
–Permanent storage
At run time
–No privileged access
–Comparable to a VO box
Open question: licenses?

17 Second compromise
Make a separate grid DMZ for Class 3 VMs
Comparable to “guest networks”
–Only outbound connectivity
Detection of compromised guests
–Extended security monitoring: packet inspection, netflows (SNORT, nfsen), honeypots, etc.
Simple policy: one warning and you’re out
Needs approval (network policy) from the OST (Operations Steering Team)

18 TECHNOLOGY STUDY

19 Managing VMs
[Diagram: Torque/PBS feeds a job queue and a VM queue; OpenNebula with the Haizea lease scheduler provisions Box 1 (“Normal WN”), Box 2 (“8 Virtual WNs”) and Box 3 (“8 Class 2/3 VMs”)]

20 Distributing VM images
[Diagram: Class 2/3 upload solution — images are uploaded to a repository (SAN) and distributed via iSCSI/LVM to Box 1 (“Normal WN”), Box 2 (“8 Virtual WNs”) and Box 3 (“8 Class 2/3 VMs”)]

21 Cached copy-on-write
[Diagram: a central repository holds the base image; each box keeps a cached copy of that image and runs each VM from its own copy-on-write (COW) overlay on top of the cache]
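The cache-plus-overlay scheme can be sketched as a small planner that only emits the commands a hypervisor would run: fetch the base image into the per-box cache once, then create one copy-on-write overlay per VM. The paths, repository address and transfer tool are invented for illustration; only the `qemu-img create -b` backing-file mechanism is the real COW primitive.

```python
# Plans (does not execute) the provisioning steps for the cached
# copy-on-write scheme: one shared base image per box, one qcow2 overlay
# per VM. Paths and the repository location are hypothetical examples.

import os

def plan_provisioning(vm_ids, cache_dir="/var/cache/vmimages",
                      repo="repo.example.org:/images/base.img"):
    base = os.path.join(cache_dir, "base.img")
    cmds = []
    # Fetch the base image once per box, not once per VM.
    if not os.path.exists(base):
        cmds.append(f"scp {repo} {base}")
    for vm in vm_ids:
        overlay = os.path.join(cache_dir, f"{vm}.qcow2")
        # qemu-img creates a copy-on-write overlay backed by the cached
        # image (-b backing file, -F its format; raw is assumed here).
        cmds.append(f"qemu-img create -f qcow2 -b {base} -F raw {overlay}")
    return cmds

cmds = plan_provisioning(["vm01", "vm02"])
print(cmds[0])   # one fetch for the shared cache, then one overlay per VM
```

Because overlays only store blocks that diverge from the base, starting eight VMs on a box costs one image transfer plus eight cheap overlay creations, which is the point of the diagram.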

22 Interfacing VMs with ‘the grid’
[Diagram: globus-job-run contacts the globus-gatekeeper; the contact-string selects a globus-job-manager — jm-pbs-long submits via qsub to Torque/PBS, jm-opennebula submits to OpenNebula; Class 2 images come from the repository (SAN), the Class 3 upload solution is still under discussion, with Nimbus/OCCI as alternatives]
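The contact-string routing in the diagram amounts to a dispatch on the job-manager suffix. The parser below is a simplified stand-in, not the actual Globus gatekeeper logic; the hostnames are made up, and only the `jm-pbs-long` / `jm-opennebula` names come from the slide.

```python
# Toy dispatch on a Globus-style contact-string: the job-manager suffix
# decides whether the request becomes a qsub submission to Torque/PBS or
# a VM request to OpenNebula. Simplified illustration, not real middleware.

def route(contact_string):
    # e.g. "ce.example.org:2119/jm-pbs-long" -> job manager "jm-pbs-long"
    _, _, manager = contact_string.partition("/")
    if manager.startswith("jm-pbs"):
        return "qsub"          # hand the job to Torque/PBS
    if manager == "jm-opennebula":
        return "opennebula"    # hand the VM request to OpenNebula
    raise ValueError(f"unknown job manager: {manager}")

print(route("ce.example.org:2119/jm-pbs-long"))    # qsub
print(route("ce.example.org:2119/jm-opennebula"))  # opennebula
```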

23 VM contact-string
User management mapping
–Mapping to OpenNebula users
Authentication / authorization
–Access to different VM images
Grid middleware components involved:
–CREAM-CE, BLAHp, glexec
–Execution Environment Service https://edms.cern.ch/document/1018216/1
–Authorization Service Design https://edms.cern.ch/document/944192/1
Coffee table discussion: the parameter passing issue
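The mapping and authorization questions on this slide can be made concrete with a sketch: a grid identity (certificate DN) maps to an OpenNebula user, and an access list decides which VM images that user may instantiate. The DNs, user names and image names below are invented examples; a real site would drive this from components like glexec and the authorization service rather than static tables.

```python
# Sketch of grid-identity -> OpenNebula-user mapping plus per-user image
# authorization. All entries are hypothetical illustrations.

GRID_TO_ONE = {
    "/DC=org/DC=example/CN=Alice": "one-atlas",
    "/DC=org/DC=example/CN=Bob": "one-biomed",
}

IMAGE_ACL = {
    "one-atlas": {"sl5-atlas-analysis"},   # a certified Class 2 image
    "one-biomed": {"gpcr-pipeline"},       # a user-provided Class 3 image
}

def may_instantiate(dn, image):
    # Unknown DNs map to no user and are denied outright.
    user = GRID_TO_ONE.get(dn)
    return user is not None and image in IMAGE_ACL.get(user, set())

print(may_instantiate("/DC=org/DC=example/CN=Alice", "sl5-atlas-analysis"))  # True
print(may_instantiate("/DC=org/DC=example/CN=Alice", "gpcr-pipeline"))       # False
```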

24 Monitoring/Performance testing

25 Performance
Small cluster
–4 dual-CPU quad-core machines
–Image server with 2 TB storage
Integration with the experimental testbed
–Existing CREAM-CE / Torque
Testing
–Network I/O: is NAT feasible?
–File I/O: what is the COW overhead?
–Realistic jobs

26 Other challenges
Accounting and scheduling based on fair share
Scalability!
Rapidly changing landscape
–New projects every week
–New versions every month
So many alternatives
–VMware, SGE, Eucalyptus, Enomaly
–iSCSI, NFS, GFS, Hadoop
–Monitoring and security tools
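The fair-share accounting challenge can be sketched with one common scheme: a group that has consumed more than its allocated share gets a lower priority for the next job or VM lease. The formula and the numbers are illustrative only; batch systems implement far more elaborate (e.g. time-decayed) variants.

```python
# Illustrative fair-share ordering: priority = allocated share divided by
# the group's fraction of recent usage. Groups below their share are
# favoured. Shares and usage figures are made-up examples.

def fair_share_priority(share, usage, total_usage):
    used_fraction = usage / total_usage if total_usage else 0.0
    # A group with no recorded usage gets the highest possible priority.
    return share / used_fraction if used_fraction else float("inf")

groups = {"atlas": (0.6, 700), "biomed": (0.4, 300)}  # (share, recent usage)
total = sum(usage for _, usage in groups.values())
order = sorted(groups,
               key=lambda g: fair_share_priority(*groups[g], total),
               reverse=True)
print(order)  # biomed has used less than its share, so it is scheduled first
```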

27 Conclusions
Maintainability: no home-grown scripting
–Each solution should be part of a product
–Validation procedure with each upgrade
Deployment: gradually move VM functionality into production
1. Introduce VM worker nodes
2. Add a virtual machine endpoint in the grid middleware
3. Test with a few specific Class 2/3 VMs
4. Scaling and performance tuning
