Slide 1: Grid provisioning from cloned golden boot images. Alan G. Yoder, Ph.D., Network Appliance Inc. © 2007 Open Grid Forum

Slide 2: Outline
- Types of grids
- Storage provisioning in various grid types
- Case study: performance and stability

Slide 3: Grid types
- Cycle-scavenging grids
- Clusters
- Data center grids

Slide 4: Cycle-scavenging grids
- Widely distributed as a rule: campus- or department-wide, or global grids
- Typically for collaborative science and resource scavenging
- Main focus to date of GGF, OGF, Globus, et al.
- Category includes "grid of grids"

Slide 5: Clusters
- Grid-like systems: good scale-out, cluster-wide namespace
- Especially attractive in HPC settings
- Many concepts in common with cycle-scavenging systems, but proprietary infrastructure and no management standards yet

Slide 6: Data center grids
- Focus of this talk
- Typically fairly homogeneous: standard compute-node hardware, two or three OS possibilities
- Two variants:
  - Nodes have disks: topologically homomorphic to cycle-scavenging grids; may use cycle-scavenging grid technology
  - Nodes are diskless: storage becomes much more important (storage grids)

Slide 7: Storage technology adoption curves
[Chart: market adoption cycles for direct-attached storage, networked storage, storage grids, and a global storage network. The enterprise storage market today sits at networked storage; grid frameworks today sit near storage grids, the focus of this talk.]

Slide 8: Diskless compute farms
- Connected to storage grids; boot over iSCSI or FCP
- The OS is provisioned in a boot LUN on a storage array; applications can be provisioned as well
- Key benefit: nodes can be repurposed at any time from a different boot image
- Key benefit: smart storage and provisioning technology can use LUN cloning to deliver storage efficiencies through block sharing
- Key benefit: no rotating rust in compute nodes, so reduced power and cooling requirements and no OS/applications to provision on new nodes

Slide 9: Local fabric technologies
[Diagram: servers boot over an iSCSI or FC SAN; storage server(s) maintain the golden image plus clones. Example products: ShadowImage, FlexClone.]

Slide 10: Global deployment technologies
[Diagram: long-haul replication over the WAN from a central data center to local centers. Example products: SnapMirror, TrueCopy.]

Slide 11: Diskless booting
Terminology:
- LU: Logical Unit
- LUN: Logical Unit Number
- Mapping: LUNs to initiator ports
- Masking: initiators to LUNs (views)
Repurposing sequence:
- Node shuts down
- Storage maps the desired image to LUN 0 for the zone (FCP) or initiator group (iSCSI) the node is in
- Node restarts, boots from LUN 0, mounts scratch storage space if also provided, and starts up the grid-enabled application
- Node proceeds to compute until done or repurposed

Slide 12: Example
[Diagram: a compute grid (gridsrv1, gridsrv2, gridsrv3) backed by a storage grid. /vol/vol1/geotherm2 is mapped as LUN 0 to gridsrv1; /vol/vol1/mysql_on_linux is mapped as LUN 0 to gridsrv2 and to gridsrv3.]

Slide 13: What makes this magic happen?
[Same diagram as slide 12, with an SGME sitting between the compute grid and the storage grid.]

Slide 14: SGME
- SGME: Storage Grid Management Entity, a component of the overall GME in the OGF reference model
- The GME is the collection of software that assembles the components of a grid into a grid: provisioning, monitoring, etc.
- Many GME products exist: Condor et al.; also Stork, Lustre, Qlusters
- Current storage-grid incarnations are often home-rolled scripts

Slide 15: Provisioning a diskless node
- Add HBAs to the white box if necessary
- Fiddle with the CMOS to boot from SAN
- For iSCSI: DHCP supplies the address and node name; the SGME provisions an igroup for the node address, creates an LU for the node, and maps the LU to the igroup
- For FC: zone, mask, map, etc.
Glossary: SGME, grid storage management software; HBA, Host Bus Adapter; CMOS, BIOS settings; DHCP, IP boot management

Slide 16: Provisioning a diskless node, continued
- Add HBAs to the white box if necessary; we used QLogic 4052 adapters
- Fiddle with the CMOS to boot from SAN; get your white-box vendor to do this, and blade server racks are generally easily configurable for this as well

Slide 17: Preparing a gold image
On a client (manual, one-time work):
- Install Windows Server (for example)
- Set up the HBA; e.g. QLogic needs iscli.exe and commands in startup.bat:
  C:\iscli.exe -n 0 KeepAliveTO 180 IP_ARP_Redirect on
- Software initiators must be prevented from paging out (a scripted version is sketched below):
  HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management:DisablePagingExecutive => 1
- Run the Microsoft sysprep setup manager and seal the image
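
The deck sets this registry value by hand; one minimal way to script it on the gold-image client is sketched below. This is an illustrative addition, not part of the original deck; it assumes Python is available on the Windows client and that the script runs with administrator rights before the image is sealed.

    # Illustrative sketch: set DisablePagingExecutive = 1 so the software
    # initiator's code is never paged out (the value named on this slide).
    # Run as Administrator on the gold-image client before sealing it.
    import winreg

    KEY_PATH = r"SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management"

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0, winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, "DisablePagingExecutive", 0, winreg.REG_DWORD, 1)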

Slide 18: Preparing a gold LUN
On the storage server (manual, one-time work):
- Copy the golden image to a new base LUN (over CIFS):
  des> lun show /vol/vol1/gold/win2k3_hs 10g ...
- Create a snapshot of the volume holding the gold LUN:
  des> snap create vol1 windows_lun
- Create an igroup for each initiator:
  des> igroup create -i -t windows kc65b1 \
    iqn com.qlogic:qmc4052.zj1ksw5c
Note: the storage commands shown here are NetApp-specific, for purposes of illustration only.

Slide 19: Preparing cloned LUNs
SGME: for each client (a scripted version is sketched below),
- Create a qtree:
  des> qtree create /vol/vol1/iscsi/
- Create a LUN clone from the gold LUN:
  des> lun create -b \
    /vol/vol1/.snapshot/windows_lun/gold/win2k3_hs \
    /vol/vol1/iscsi/kc65b1
- Map the LUN to the igroup:
  des> lun map /vol/vol1/iscsi/kc65b1 kc65b1 0
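
A minimal sketch of how a home-rolled SGME script might drive these per-client steps. It assumes the filer commands on this slide are issued to the storage server (des) over ssh; the client list and the helper name are illustrative, not from the deck.

    # Minimal SGME-style sketch (illustrative): clone the gold LUN and map one
    # clone per client, reusing the filer commands shown on this slide.
    import subprocess

    FILER = "des"
    GOLD = "/vol/vol1/.snapshot/windows_lun/gold/win2k3_hs"
    CLIENTS = ["kc65b1", "kc65b2", "kc65b3"]   # hypothetical client/igroup names

    def filer(cmd: str) -> None:
        # Assumes password-less ssh access to the storage server.
        subprocess.run(["ssh", FILER, cmd], check=True)

    filer("qtree create /vol/vol1/iscsi")          # one-time qtree for the clones
    for client in CLIENTS:
        clone = f"/vol/vol1/iscsi/{client}"
        filer(f"lun create -b {GOLD} {clone}")     # clone from the gold LUN snapshot
        filer(f"lun map {clone} {client} 0")       # present the clone as LUN 0 to the client's igroup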

Slide 20: Getting clients to switch horses
SGME: for each client (see the sketch below),
- Notify the client to clean up
- Bring down the client (remote power strips / blade controllers)
- Remap the client LUN on storage:
  des> lun offline /vol/vol1/iscsi/kc65b1
  des> lun unmap /vol/vol1/iscsi/kc65b1 kc65b1 0
  des> lun online /vol/vol1/iscsi2/kc65b1
  des> lun map /vol/vol1/iscsi/kc65b1 kc65b1 0
- Bring up the client (DHCP)
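
A companion sketch of the switch-over loop, again assuming ssh access to the filer and reading the two paths on the slide as the old and new boot LUNs. The power-control calls are placeholders because the mechanism (remote power strips or blade controllers) is site-specific.

    # Illustrative repurposing loop: power the client down, swap its boot LUN
    # using the commands on this slide, then power it back up.
    import subprocess, time

    FILER = "des"

    def filer(cmd: str) -> None:
        subprocess.run(["ssh", FILER, cmd], check=True)

    def power_off(client: str) -> None:
        # Placeholder: call the remote power strip or blade controller here.
        pass

    def power_on(client: str) -> None:
        # Placeholder: the client DHCP-boots and then boots from LUN 0.
        pass

    def switch_image(client: str, old_lun: str, new_lun: str) -> None:
        power_off(client)              # after notifying the client to clean up
        time.sleep(30)                 # give the client time to go down
        filer(f"lun offline {old_lun}")
        filer(f"lun unmap {old_lun} {client} 0")
        filer(f"lun online {new_lun}")
        filer(f"lun map {new_lun} {client} 0")
        power_on(client)

    switch_image("kc65b1", "/vol/vol1/iscsi/kc65b1", "/vol/vol1/iscsi2/kc65b1")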

Slide 21: Lab results
Experiment conducted at Network Appliance:
- FAS3050 clustered system
- 224 clients (112 per cluster node): dual-core 3.2 GHz / 2 GB Intel Xeon IBM H20 blades, QLogic QMC4052 adapters, Windows Server 2003 SE SP1
Objectives:
- Determine the robustness and performance characteristics of the configuration under conditions of storage failover and giveback
- Determine the viability of keeping the paging file on central storage

Slide 22: Network configuration. Not your daddy's network.

Slide 23: Client load
- A program generates heavy CPU and paging activity (a 2 GB memory area, lots of reads and writes); a rough sketch follows
- Several instances run per client
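
The deck does not include the load generator itself; the snippet below is a rough stand-in with the behavior the slide describes (a 2 GB memory area hit with random reads and writes), written in Python purely for illustration.

    # Rough stand-in for the slide's load generator: a 2 GB working set touched
    # with random reads and writes to force heavy CPU and paging activity.
    import random

    SIZE = 2 * 1024**3            # 2 GB memory area, as stated on the slide
    STEP = 4096                   # touch roughly one byte per page

    buf = bytearray(SIZE)         # large enough to push the 2 GB clients into paging
    rng = random.Random(0)

    while True:
        offset = rng.randrange(0, SIZE, STEP)
        buf[offset] = (buf[offset] + 1) & 0xFF   # read-modify-write keeps pages dirty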

Slide 24: Client load, continued
Approximately 400 pages/sec.

Slide 25: Load on storage
Near 100% disk utilization on the storage system in takeover mode.
  des (takeover)> sysstat -u 1
[Per-second sysstat samples elided; the columns are CPU, total ops/s, net kB/s in/out, disk kB/s read/write, tape kB/s read/write, cache age, cache hit, CP time, CP type, and disk utilization, with disk utilization at or near 100% throughout the takeover period.]

Slide 26: Observations
- Failover and giveback were transparent
- No BSOD when failover times stay within the timeout windows (recall KeepAliveTO = 180); some tuning opportunities here
- Actual failover was < 60 seconds; iscsi stop+start was used to lengthen failover time for testing
- Slower client access during takeover is expected behavior
- Heavy paging activity was not an issue
- A higher number of clients per storage server is an option, depending on application behavior

Slide 27: Economic analysis
Assume:
- 256 clients per storage server
- 20 W per drive
- $80 per client-side drive (80 GB, 10 GB used per application)
- $3000 per server-side drive (300 GB)
Calculate:
- Server-side actual usage
- Cost of client-side drives vs. cost of server space
- Cost of power and cooling for client-side drives and for server space

Slide 28: Results
Server-side usage:
- 512 clients x 10 GB per application = 5 TB
- Assume 50% usable space on the server: 10 TB at $10/GB = $100,000
- 20 W typical per drive, with a 2.3x multiplier to account for cooling
- Power: 5000 GB * 2 / 300 GB per drive * 20 W per drive, times 2.3 for cooling (kW)
Workstation-side drives:
- Same assumptions (note: power-supply issue)
- Power: 512 drives * 20 W per drive, times 2.3 for cooling (kW)
- Cost: 512 drives * $80 per drive = $40,960
At $0.10/kWh the cost curves cross over in three years; in some scenarios it's less than two years. (The arithmetic is reworked below.)
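
The slide's intermediate kilowatt figures did not survive the transcript, so here is a reworking of the arithmetic under the slide's own assumptions (512 clients, 20 W and a 2.3x cooling multiplier per drive, $10/GB server storage, $80 client drives, $0.10/kWh). Treating the crossover as the point where cumulative capital plus power-and-cooling cost is equal on both sides reproduces the roughly three-year figure; the exact numbers the author used may differ slightly.

    # Rework of the slide's cost comparison under its stated assumptions.
    # Intermediate figures are computed here, not quoted from the original slide.
    CLIENTS = 512
    USED_GB_PER_CLIENT = 10          # 10 GB used per application
    USABLE_FRACTION = 0.5            # 50% usable space on the server
    SERVER_GB_PER_DRIVE = 300
    SERVER_COST_PER_GB = 10          # $3000 / 300 GB drive
    CLIENT_DRIVE_COST = 80
    WATTS_PER_DRIVE = 20
    COOLING = 2.3                    # multiplier to account for cooling
    KWH_PRICE = 0.10

    # Server side: 5 TB of data needs ~10 TB raw at $10/GB (~$100,000)
    raw_gb = CLIENTS * USED_GB_PER_CLIENT / USABLE_FRACTION
    server_drives = raw_gb / SERVER_GB_PER_DRIVE
    server_capex = raw_gb * SERVER_COST_PER_GB
    server_kw = server_drives * WATTS_PER_DRIVE * COOLING / 1000

    # Client side: one drive per client ($40,960)
    client_capex = CLIENTS * CLIENT_DRIVE_COST
    client_kw = CLIENTS * WATTS_PER_DRIVE * COOLING / 1000

    # Crossover: the server-side capital premium is repaid by the power saved
    yearly_savings = (client_kw - server_kw) * 24 * 365 * KWH_PRICE
    years = (server_capex - client_capex) / yearly_savings
    print(f"server: ${server_capex:,.0f} and {server_kw:.1f} kW; "
          f"client: ${client_capex:,.0f} and {client_kw:.1f} kW; "
          f"crossover after about {years:.1f} years")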

Slide 29: Conclusion
- Dynamic provisioning from golden images is here
- Incredibly useful technology in diskless workstation farms: fast turnaround, central control, simple administration, nearly effortless client replacement
- Green!

Slide 30: Questions?
