© 2007 Open Grid Forum

Grid provisioning from cloned golden boot images
Alan G. Yoder, Ph.D.
Network Appliance Inc.
Outline
- Types of grids
- Storage provisioning in various grid types
- Case study: performance and stability
Grid types
- Cycle scavenging
- Clusters
- Data center grids
Cycle scavenging grids
- Widely distributed as a rule: campus- or department-wide, or global grids
- Typically for collaborative science and resource scavenging
- Main focus to date of GGF, OGF, Globus, et al.
- Category includes "grid of grids"
Clusters
- Grid-like systems: good scaleout, cluster-wide namespace
- Especially attractive in HPC settings
- Many concepts in common with cycle-scavenging systems, but proprietary infrastructure and no management standards yet
Data center grids
- Focus of this talk
- Typically fairly homogeneous: standard compute node hardware, two or three OS possibilities
- Two variants:
  - Nodes have disks: topologically homomorphic to cycle scavenging grids; may use cycle scavenging grid technology
  - Nodes are diskless: storage becomes much more important (storage grids)
Storage technology adoption curves
[Figure: market adoption cycles progressing from direct-attached storage, to networked storage, to storage grids, to a global storage network. Markers indicate where the enterprise storage market and grid frameworks stand today; storage grids are the focus of this talk.]
Diskless compute farms
- Connected to storage grids; boot over iSCSI or FCP
- OS is provisioned in a boot LUN on a storage array; applications can be provisioned as well
- Key benefit: nodes can be repurposed at any time from a different boot image
- Key benefit: smart storage and provisioning technology can use LUN cloning to deliver storage efficiencies through block sharing
- Key benefit: no rotating rust in compute nodes, so reduced power and cooling requirements and no OS/applications to provision on new nodes
Local fabric technologies
[Figure: servers boot over an iSCSI or FC SAN; storage server(s) maintain a golden image plus clones. Cloning products include, e.g., ShadowImage and FlexClone.]
Global deployment technologies
- iSAN over the WAN; products e.g. SnapMirror, TrueCopy
- Long-haul replication from central data center to local centers
Diskless booting
Terminology: LU (Logical Unit); LUN (Logical Unit Number); mapping (LUNs :: initiator ports); masking (initiators :: LUNs, i.e. views).
Sequence:
- Node shuts down
- Storage maps the desired image to LUN 0 for the zone (FCP) or initiator group (iSCSI) the node is in
- Node restarts and boots from LUN 0, mounts scratch storage space if also provided, and starts up the grid-enabled application
- Node proceeds to compute until done or repurposed
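The mapping/masking relationship above can be modeled in a few lines. This is a toy sketch, not any particular array's API; class and igroup names are hypothetical:

```python
# Toy model of LUN mapping and masking: each initiator group (igroup)
# maps logical units (LUs) to LUN numbers, and an initiator only
# "sees" the LUNs mapped to a group it belongs to.

class StorageArray:
    def __init__(self):
        self.igroups = {}   # igroup name -> set of initiator names
        self.maps = {}      # igroup name -> {lun_number: lu_path}

    def igroup_create(self, igroup, initiators):
        self.igroups[igroup] = set(initiators)
        self.maps.setdefault(igroup, {})

    def lun_map(self, lu_path, igroup, lun_number):
        self.maps[igroup][lun_number] = lu_path

    def lun_unmap(self, igroup, lun_number):
        self.maps[igroup].pop(lun_number, None)

    def view(self, initiator):
        """The masked view: only LUNs visible to this initiator."""
        visible = {}
        for igroup, members in self.igroups.items():
            if initiator in members:
                visible.update(self.maps[igroup])
        return visible

array = StorageArray()
array.igroup_create("gridsrv1_ig", ["iqn.example:gridsrv1"])
array.lun_map("/vol/vol1/geotherm2", "gridsrv1_ig", 0)
print(array.view("iqn.example:gridsrv1"))  # {0: '/vol/vol1/geotherm2'}
```

Repurposing a node is then just an unmap plus a map of a different image at LUN 0, followed by a reboot.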
Example
[Figure: compute grid and storage grid. /vol/vol1/geotherm2 LUN 0 is mapped to gridsrv1; /vol/vol1/mysql_on_linux LUN 0 is mapped to gridsrv2; /vol/vol1/mysql_on_linux LUN 0 is mapped to gridsrv3.]
What makes this magic happen?
[Figure: the same compute grid / storage grid diagram as the previous slide, with an SGME coordinating the mappings.]
SGME: Storage Grid Management Entity
- A component of the overall GME in the OGF reference model
- The GME is the collection of software that assembles the components of a grid into a grid: provisioning, monitoring, etc.
- Many GME products: Condor et al.; also Stork, Lustre, Qlusters
- Current storage grid incarnations are often home-rolled scripts
Provisioning a diskless node
- Add HBAs to the white box if necessary
- Fiddle with CMOS to boot from SAN
- For iSCSI: DHCP supplies the address and node name; the SGME provisions an igroup for the node address, creates an LU for the node, and maps the LU to the igroup
- For FC: zone, mask, map, etc.
Glossary: SGME (grid storage management software); HBA (host bus adapter); CMOS (BIOS settings); DHCP (IP boot management)
Provisioning a diskless node (cont.)
- Add HBAs to the white box if necessary: we used QLogic 4052 adapters
- Fiddle with CMOS to boot from SAN: get your white box vendor to do this; blade server racks are generally easily configurable for this as well
Preparing a gold image
On a client (manual, one-time work):
- Install Windows Server (e.g.)
- Set up the HBA; e.g. QLogic needs iscli.exe and commands in startup.bat:
  C:\iscli.exe -n 0 KeepAliveTO 180 IP_ARP_Redirect on
- Software initiators must be prevented from paging out:
  HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management:DisablePagingExecutive => 1
- Run the Microsoft sysprep setup manager and seal the image
Preparing a gold LUN
On the storage server (manual, one-time work):
- Copy the golden image to a new base LUN (over CIFS):
  des> lun show
  /vol/vol1/gold/win2k3_hs  10g ...
- Create a snapshot of the volume containing the gold LUN:
  des> snap create vol1 windows_lun
- Create an igroup for each initiator:
  des> igroup create -i -t windows kc65b1 \
      iqn com.qlogic:qmc4052.zj1ksw5c
Note: the storage commands shown in this deck are NetApp-specific, for purposes of illustration only.
Preparing cloned LUNs
SGME: for each client,
- Create a qtree:
  des> qtree create /vol/vol1/iscsi/
- Create a LUN clone from the gold LUN:
  des> lun create -b \
      /vol/vol1/.snapshot/windows_lun/gold/win2k3_hs \
      /vol/vol1/iscsi/kc65b1
- Map the LUN to the igroup:
  des> lun map /vol/vol1/iscsi/kc65b1 kc65b1 0
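The per-client loop above is simple enough that an SGME can just generate the command sequence for each node. A minimal sketch, using the volume paths and snapshot name from the slides (the helper name and defaults are assumptions):

```python
def clone_commands(client, vol="/vol/vol1", snapshot="windows_lun",
                   gold="gold/win2k3_hs"):
    """Hypothetical SGME helper: emit the per-client ONTAP command
    sequence for provisioning a cloned boot LUN, as on the slides."""
    clone = f"{vol}/iscsi/{client}"
    return [
        f"qtree create {vol}/iscsi/",
        f"lun create -b {vol}/.snapshot/{snapshot}/{gold} {clone}",
        f"lun map {clone} {client} 0",
    ]

for cmd in clone_commands("kc65b1"):
    print(cmd)
```

How the commands are delivered to the storage server (ssh, rsh, an API) is left open here; the point is that provisioning hundreds of clones is a mechanical loop over client names.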
Getting clients to switch horses
SGME: for each client,
- Notify the client to clean up
- Bring the client down (remote power strips / blade controllers)
- Remap the client LUN on storage:
  des> lun offline /vol/vol1/iscsi/kc65b1
  des> lun unmap /vol/vol1/iscsi/kc65b1 kc65b1 0
  des> lun online /vol/vol1/iscsi2/kc65b1
  des> lun map /vol/vol1/iscsi2/kc65b1 kc65b1 0
- Bring the client up (DHCP)
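The remap step can be scripted the same way as initial provisioning. A sketch under the same assumptions (hypothetical helper name; the old and new clone trees follow the slide's paths):

```python
def repurpose_commands(client, old_tree="/vol/vol1/iscsi",
                       new_tree="/vol/vol1/iscsi2"):
    """Hypothetical SGME helper: take the old boot clone offline,
    unmap it, and map the client's new clone at LUN 0."""
    old = f"{old_tree}/{client}"
    new = f"{new_tree}/{client}"
    return [
        f"lun offline {old}",
        f"lun unmap {old} {client} 0",
        f"lun online {new}",
        f"lun map {new} {client} 0",
    ]
```

Because the node always boots from LUN 0, nothing on the client changes: it powers up, DHCP supplies its identity, and it finds its new image at the same LUN.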
Lab results
Experiment conducted at Network Appliance:
- FAS3050 clustered system
- 224 clients (112 per cluster node): dual-core 3.2 GHz / 2 GB Intel Xeon IBM HS20 blades with QLogic QMC4052 adapters, running Windows Server 2003 SE SP1
Objectives:
- Determine robustness and performance characteristics of the configuration under conditions of storage failover and giveback
- Determine the viability of keeping the paging file on central storage
Network configuration
Not your daddy's network.
[Figure: network configuration diagram not captured in the transcript.]
Client load
- Program to generate heavy CPU and paging activity (2 GB memory area, lots of reads and writes)
- Several instances per client
Client load (cont.)
- ~400 pages/sec
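The deck doesn't show the load generator's code; a scaled-down sketch along the lines described (random reads and writes across a large region to defeat locality; the original used a 2 GB area) might look like:

```python
import random

def churn(region_mb=16, touches=100_000, seed=0):
    """Touch random offsets across a buffer with paired reads and
    writes. When region_mb exceeds physical memory, the random access
    pattern forces the OS to page heavily; this sketch is scaled down
    from the slide's 2 GB load generator."""
    rng = random.Random(seed)
    buf = bytearray(region_mb * 1024 * 1024)
    checksum = 0
    for _ in range(touches):
        i = rng.randrange(len(buf))
        checksum ^= buf[i]              # read
        buf[i] = (buf[i] + 1) & 0xFF    # write
    return checksum

churn()
```

Running several instances per client, as the slide describes, multiplies both the CPU and the paging pressure against the central paging file.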
Load on storage
- Near 100% disk utilization on the storage system in takeover mode
- Observed with "des (takeover)> sysstat -u 1"; the per-second output (CPU, network and disk throughput, cache age, CP activity, and disk utilization columns) did not survive transcription, but disk utilization held at or near 100% while CPU varied widely
Observations
- Failover and giveback were transparent
- No BSOD when failover times stayed within the timeout window (recall KeepAliveTO = 180); some tuning opportunities here; actual failover was < 60 seconds; iscsi stop/start was used to increase failover time for testing
- Slower client access during takeover: expected behavior
- Heavy paging activity was not an issue
- A higher number of clients per storage server is an option, depending on application behavior
Economic analysis
Assumptions:
- 256 clients per storage server
- 20 W per drive
- $80 per client-side drive (80 GB, 10 GB used per application)
- $3000 per server-side drive (300 GB)
Calculate:
- Server-side actual usage
- Cost of client-side drives vs. cost of server space
- Cost of power and cooling for client-side drives vs. server space
Results
Server side:
- 512 clients x 10 GB per application = 5 TB
- Assume 50% usable space on the server (10 TB raw), 20 W typical per drive, and a 2.3x multiplier to account for cooling
- 5000 GB * 2 / 300 GB/drive at $10/GB is roughly $100,000, drawing about 1.5 KW with cooling
Workstation side:
- Same assumptions (note: power supply issue)
- 512 drives * 20 W/drive, with cooling, is roughly 24 KW
- 512 drives * $80/drive = $40,960
- At $0.10/KWH, the cost curves cross over in about three years; in some scenarios it's less than two years
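The crossover claim can be checked directly. This sketch re-runs the slide's arithmetic (drive counts, prices, the 2.3x cooling multiplier, and $0.10/KWH as above); rounding drive counts up yields a server-side cost slightly above the slide's ~$100,000:

```python
import math

# Assumptions from the slides
CLIENTS = 512
CLIENT_DRIVE_W, CLIENT_DRIVE_COST = 20, 80            # 80 GB client drive
SERVER_DRIVE_W, SERVER_DRIVE_COST, SERVER_DRIVE_GB = 20, 3000, 300
COOLING = 2.3                                         # power+cooling multiplier
KWH_PRICE = 0.10
HOURS_PER_YEAR = 24 * 365

needed_gb = CLIENTS * 10 * 2        # 10 GB/app, 50% usable -> 10 TB raw
server_drives = math.ceil(needed_gb / SERVER_DRIVE_GB)

server_capex = server_drives * SERVER_DRIVE_COST
client_capex = CLIENTS * CLIENT_DRIVE_COST

server_kw = server_drives * SERVER_DRIVE_W * COOLING / 1000
client_kw = CLIENTS * CLIENT_DRIVE_W * COOLING / 1000

# The server side costs more up front but saves power every year
saving_per_year = (client_kw - server_kw) * HOURS_PER_YEAR * KWH_PRICE
crossover_years = (server_capex - client_capex) / saving_per_year

print(server_capex, client_capex, round(crossover_years, 1))
# prints: 105000 40960 3.3
```

Under these assumptions the curves cross at roughly three years, consistent with the slide; cheaper server drives or pricier electricity pull the crossover under two years.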
Conclusion
- Dynamic provisioning from golden images is here
- Incredibly useful technology in diskless workstation farms: fast turnaround, central control, simple administration, nearly effortless client replacement
- Green!
Questions?