Sizing & TCO for bullion


1 Sizing & TCO for bullion
February 2013

2 Agenda
Sizing: methodology, bullion performance numbers, consolidation (scale-out vs. scale-up), Excel tool
TCO: examples

3 Input data
- Inventory of the physical servers and VMs to be replaced by bullions => used to derive the SPECint*rate performance
- Performance requirements (SPECint*rate, SAPS, ...)
- Physical characteristics: number of cores, GHz, sockets, RAM size
- Number of VMs and possible VM consolidation
- ESXi over-commitment ratio for CPU and memory: usually from 1 to 5, according to the performance expected for the applications
- I/Os: bandwidth for vMotion, for the VMs, H.A.
- High Availability: number of nodes in the cluster
- DRS (2 sites): yes or no; 1 or 2 clusters (synchronous/asynchronous)
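These inputs can be captured in a small structure and fed to a sizing rule. A minimal Python sketch, with illustrative names (`SizingInput` and `required_physical_cores` are not from the slides, only the over-commitment rule is):

```python
from dataclasses import dataclass

@dataclass
class SizingInput:
    # One inventory line for a server/VM group to be replaced (field names are illustrative)
    cores: int
    ghz: float
    sockets: int
    ram_gb: int
    specint_rate: float          # SPECint*rate of the machine to replace
    cpu_overcommit: float = 1.0  # ESXi vCPU:pCPU ratio, usually 1 to 5
    mem_overcommit: float = 1.0

def required_physical_cores(vcpus: int, overcommit: float) -> float:
    # Physical cores needed for a given vCPU count at a given over-commitment ratio
    return vcpus / overcommit

# Example: 200 vCPUs at an over-commitment ratio of 4
print(required_physical_cores(200, 4.0))  # 50.0
```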

4 bullion perf. (1): ratios within the E7-4800 series

| Processor | Cores | GHz | TDP | ratio perf (2) E7-xxxx/E7-4870 | perf 4-sock. bullion | perf 8-sock. bullion | perf 12-sock. bullion | perf 16-sock. bullion | ratio perf/price bullion 4s |
| (6-core, H.T.) | 6 | 1.86 | 95 W | 0.461 | 517 | 974 | 1430 | 1896 | 2.241 |
| E7-4820 (H.T.) | 8 | 2.00 | 105 W | 0.672 | 753 | 1419 | 2084 | 2763 | 2.286 |
| E7-4830 (H.T.) | 8 | 2.13 | | 0.727 | 814 | 1533 | 2253 | 2987 | 1.594 |
| E7-8837 (no H.T.) | 8 | 2.66 | 130 W | 0.741 | 829 | 1563 | 2296 | 3044 | 1.679 |
| E7-4850 | 10 | | | 0.853 | 955 | 1800 (3) | 2645 | 3506 | 1.45 |
| E7-8867L | 10 | | | 0.887 | 994 | 1872 | 2750 | 3646 | 0.943 |
| E7-4860 | 10 | 2.26 | | 0.940 | 1052 | 1983 | 2913 | 3862 | 1.102 |
| E7-4870 | 10 | 2.40 | | 1 | 1120 (3) | 2110 (3) | 3100 (4) | 4110 (3) | |

(1) SPECint*rate_base2006 (2) Intel reference (3) published on spec.org (4) estimated
The E7-4870 gives the best performance; the E7-4820 has the best perf/price ratio among 8-core parts; the E7-4850 has the best perf/price ratio among 10-core parts.

5 CPU perf. (native Linux SPECint®_rate2006 with E7-4870)
Near-perfect linearity: scalability ~x4 (measured x3.67)

6 SPECint*rate: BCS or BCS-like vs. glueless 8-socket benchmarks

| Hardware Vendor | System | Result | Baseline | # Cores | # Chips |
| Bull SAS | bullion (160 cores, 4 TB RAM) | 4110 | 3890 | 160 | 16 |
| Hewlett-Packard Company | ProLiant DL980 G7 (2.4 GHz, Intel Xeon E7-4870) | 2180 | 2070 | 80 | 8 |
| Bull SAS | bullion (80 cores, 2 TB RAM) | 2110 | 2000 | 80 | 8 |
| HITACHI | BladeSymphony BS2000 (Intel Xeon E7-8870) | 1920 | 1790 | 80 | 8 |
| HITACHI | Compute Blade 2000 (Intel Xeon E7-8870) | | | 80 | 8 |
| Unisys Corporation | Unisys ES7000 Model 7600R G3 (Intel Xeon E7-8870) | 1910 | 1780 | 80 | 8 |
| NEC Corporation | Express5800/A1080a-E (Intel Xeon E7-8870) | 1900 | | 80 | 8 |
| Fujitsu | PRIMEQUEST 1800E2 (Intel Xeon E7-8870, 2.40 GHz) | 1890 | 1770 | 80 | 8 |
| Fujitsu | PRIMERGY RX900 S2 (Intel Xeon E7-8870, 2.40 GHz) | | | 80 | 8 |
| IBM Corporation | IBM System x3850 X5 (Intel Xeon E7-8870) | | | 80 | 8 |

All systems except the 16-socket bullion are glueless 8-socket servers.

7 Intel Xeon Processor E5 and E7 performance comparison
bullion 4 sockets: the Intel Xeon E7 series is ideal for data-demanding application performance; the Intel Xeon E5 series targets HPC.

8 On-Line Transaction Processing (OLTP) perf.
bullion E7-4870 with VMware:

| Sockets | tpmC (estimation) | tpsE |
| 4 | ~2,800,000 | ~2,700 |
| 8 | ~5,500,000 | ~4,600 |
| 12 | ~7,500,000 | ~7,100 |
| 16 | ~10,000,000 | ~9,500 |

9 Virtualization perf.: SPECvirt benchmark
bullion with VMware, SPECvirt_sc2010:
- 4 sockets (X7560): 28 tiles (1)
- 8 sockets (E7-4870): 85 tiles (2)
- 12 sockets: N/A (3)
- 16 sockets: N/A (3)
(1) published in February 2011 with 512 GB, 32 cores & ESXi 4.1
(2) estimation
(3) ESXi v5 is limited to 512 VMs and 160 logical CPUs

10 ERP performance: bullion with VMware (SAPS)
- 4 sockets (X7560): 41,420 SAPS (1)
- 8 sockets (E7-4870): (2)
- 12 sockets (E7-4870): (2)
- 16 sockets: ~100,000 SAPS (2)
(1) published in May 2010 with 128 GB in a 2-tier SD architecture
(2) estimation in a 3-tier SD architecture

11 CPU load & VMs: comparison scale-out/scale-up

scale-out (2-socket x 8-core servers), i.e. 20x 16-core servers => 20 ESXi:
- in each server: 2 VMs of 8 vCPUs => no vCPU left; or 1 VM of 16 vCPUs; 0 VMs of 32 vCPUs
- VMs limited to 16 vCPUs
- load peaks => servers are 100% full, vMotion impossible

scale-up (16-socket x 10-core bullions), i.e. 2x 160-core bullions => 2 ESXi:
- in each server: 16 VMs of 8 vCPUs; or 8 VMs of 16 vCPUs => 32 vCPUs left; or VMs of 32 vCPUs; or 2 VMs of 64 vCPUs
- no limitation on VM size; 32 free vCPUs per bullion
- load peaks fully managed (without vMotion); vMotion possible for big VMs

Same number of cores (320) on both sides; the bullion side brings better manageability, lower license cost and no limit on VM size.

12 CPU load & VMs: comparison scale-out/scale-up
- scale-out (2-socket x 8-core servers): 32x 16-core servers => 32 ESXi; 256 cores used, 512 cores paid
- scale-up (16-socket x 10-core bullions): 2x 160-core bullions => 2 ESXi; 256 cores used, 320 cores paid
- HW investment: -37.5%

13 CPU load & VMs: comparison scale-out/scale-up
- scale-out (2-socket x 8-core servers): 32x 16-core servers => 32 ESXi; 256 cores used, 512 cores paid
- scale-up (8-socket x 10-core bullions): 5x 80-core bullions => 5 ESXi; 320 cores used, 400 cores paid
- HW investment: -22%; performance: +25%; VMware HA possible

14 CPU load & VMs: comparison scale-out/scale-up
- scale-out (2-socket x 8-core servers), 32x 16-core servers => 32 ESXi: communication between VMs goes through the NICs
- scale-up (16-socket x 10-core bullions), 2x 160-core bullions => 2 ESXi: communication stays internal to the bullion => fewer Ethernet adapters/cables/switches => better performance

15 VMs: size and quantity
- In a 16-socket bullion you can theoretically fit up to 5 VMs of 32 vCPUs each, with one physical core available per vCPU (160 cores), for best performance (no over-commitment)
- On a 4-socket X7560 bullion (64 logical CPUs with H.T.), we could run 168 VMs with a CPU over-commitment of x2.6 and a good QoS (cf. SPECvirt constraints): 28 tiles, each with 6 single-vCPU VMs: 1 DB server, 1 Java application server, 1 mail server, 1 web server, 1 NFS server, and 1 standby server measuring network latency (SPECpoll, 99.5% of requests < 1 s)
- Some consolidation projects make it possible to consolidate VMs inside the same cluster, which reduces the necessary HW (CPU, RAM, I/Os)

16 VDI sizing
For Citrix XenDesktop (running on the ESXi hypervisor):
- 1 VM per user
- 1 physical core for 8 VMs
- 1 GB of memory per VM (no memory over-commitment, to avoid swapping); more precisely, memory varies with the guest OS, from 512 MB for a Windows XP VM to 2 GB for a Windows 7 VM
- Example: for 1,500 concurrent users => ~190 cores (1500/8) & 1.5 TB of memory
The configuration must then be tuned to take into account:
- load and HA considerations (number of ESXi hosts)
- hosting of the other VMs needed by XenApp (XenApp broker, ...) and other Citrix modules
- consolidation of other applications
- etc.
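The rules of thumb above can be sketched as a small calculator; a minimal Python sketch (the function name is illustrative):

```python
import math

def vdi_sizing(users, vms_per_core=8, mem_per_vm_gb=1.0):
    # Rules of thumb from the slide: 1 VM per user, 8 VMs per
    # physical core, 1 GB per VM with no memory over-commitment.
    cores = math.ceil(users / vms_per_core)
    mem_tb = users * mem_per_vm_gb / 1024
    return cores, mem_tb

cores, mem_tb = vdi_sizing(1500)
print(cores, round(mem_tb, 2))  # 188 1.46
```

The slide rounds the same computation up to ~190 cores and ~1.5 TB, leaving a small safety margin.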

17 CPU/memory load & High Availability
Use several bullions (ESXi hosts) in your VMware cluster: if one ESXi/bullion fails, VMware HA restarts its VMs on the other bullions. The minimum is 2 bullions (fail-safe / maintenance).
For no performance degradation (no CPU/memory over-commitment*):
- 50% average load for 2 bullions
- 67% average load for 3 bullions <= best compromise
- 75% average load for 4 bullions
- 80% average load above
* the maximum average load, regardless of the number of bullions, should be up to 80%
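These thresholds follow from one rule: after one node failure, the remaining n-1 nodes must absorb the full load, capped at 80% overall. A minimal Python sketch (the function name is illustrative):

```python
def max_average_load(n_bullions, cap=0.80):
    # After one node failure the remaining n-1 nodes absorb the full
    # load, so the safe average load is (n-1)/n, capped at 80%.
    if n_bullions < 2:
        raise ValueError("HA needs at least 2 bullions")
    return min((n_bullions - 1) / n_bullions, cap)

print([round(max_average_load(n), 2) for n in (2, 3, 4, 5)])  # [0.5, 0.67, 0.75, 0.8]
```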

18 CPU consolidation
- Consolidating an existing fleet of small (1-2 socket) physical servers: by default, consider the average CPU load to be no more than 15%; use Capacity Planner to obtain the exact figure (e.g. XX => 7% CPU for 49 servers)
- Consolidating an existing fleet of small (1-2 socket) virtualized servers: by default, consider the average CPU load to be 50%
- bullion proposition: size the bullions for an average load of up to 80%

19 Memory consolidation
- Get the amount and load of memory of the existing fleet to be consolidated
- The % memory load is either given by an audit tool like Capacity Planner, or use 80% if you don't know
- The sizing rules for memory in bullion are the same as for CPU (50%-50%, 67%-67%-67%, max 80%)
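The CPU and memory consolidation rules reduce to the same formula: actual usage of the existing fleet, resized so the bullion runs at no more than the target load. A minimal Python sketch (function name and the 12-core figure are assumptions, not from the slides):

```python
def bullion_capacity_needed(current_capacity, avg_load, target_load=0.80):
    # Actual usage of the existing fleet = capacity * average load;
    # the bullion is then sized to run at no more than target_load
    # (80% per the sizing rules above).
    return current_capacity * avg_load / target_load

# Hypothetical fleet: 49 small servers of 12 cores each, measured at
# 7% CPU load (the per-server core count is an assumption):
print(round(bullion_capacity_needed(49 * 12, 0.07)))  # 51
```

The same function applies to memory by passing GB instead of cores.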

20 bullion Inputs/Outputs sizing
- I/O: check the capabilities of bullion: 6 PCIe adapters per module (FC 4/8 Gbps, Ethernet 1/10 Gbps), plus 4 internal 1 GigE ports per module
- WARNING: check the bullion limitations with multi-module configurations
- VMs running in the same server (especially a 16-socket one) need fewer Ethernet adapters than on smaller servers, where VMs must communicate outside the server

21 Sizing Ethernet communication
- For applications with heavy I/O between VMs (e.g. the Xerox dematerialisation application): you may reduce your global bullion configuration by up to ~25% (compared to small servers)
- For applications with little I/O between VMs (e.g. VDI): reduce your global bullion configuration by ~5%

22 Max IO configurations for a quadri-module bullion
(smaller configurations are possible by removing adapters)

| Adapter | Max | Activated |
| Kawela (1 GigE) | 4 | 2 |
| MegaRAID (disks) | 1 | |
| LPE12002/1250 (FC) | 7 | |
| I350-T2 (1 GigE) | | |
| I350-T4 (1 GigE) | | |
| X520-SR2/T2 * (10 GigE) | 3 | |

* X520-DA2 can be ordered through SFR

23 Max IO configurations for a tri-module bullion
(smaller configurations are possible by removing adapters)

| Adapter | Max | Activated |
| Kawela (1 GigE) | | |
| MegaRAID (disks) | 1 | |
| LPE12002/1250 (FC) | 6 | 5 |
| I350-T2 (1 GigE) | 2 | |
| I350-T4 (1 GigE) | 4 | |
| X520-SR2/T2 * (10 GigE) | 3 | |

* X520-DA2 can be ordered through SFR

24 Max IO configurations for a bi-module bullion
(smaller configurations are possible by removing adapters)

| Adapter | Max |
| Kawela (1 GigE) | 4 / 2 / 0* / 2** |
| MegaRAID (disks) | 1 |
| LPE12002/1250 (FC) | 3 |
| I350-T2 (1 GigE) | |
| I350-T4 (1 GigE) | |
| X520-SR2/T2 (10 GigE) | |

* the vSphere 5 maximum of 8x 10 Gbps Ethernet ports is respected (SFR only)
** the vSphere 5 maximum of 6x 10 Gbps + 4x 1 Gbps combined Ethernet ports is respected

25 Ethernet network example for a bi-module
- 2x 10 Gb/s links dedicated to vMotion (1 TB can be evacuated in ~20 minutes) + VMware administration
- huge bandwidth for the VMs (6x 10 Gb/s links)
- very high internal inter-module bandwidth (~300 Gb/s)
- Hyper-Threading can be activated (perf %)

26 FC SAN example for a bi-module
4 HBAs (2 per module), 4 boot paths

27 Bullion sizing calculator (Excel file)

28 Sizing exercise
Propose an alternative solution with bullions for a data center with:
- 20 UCS B200 M2 blades (2x 6-core X5690 CPUs), 96 GB of memory per blade
- SPECint*rate of 1 blade = 432

29 Sizing exercise
SPECint*rate of 1 UCS B200 M2 (2x 6-core X5690) = 432
- 20 blades => 8,640 SPECint*rate
- CPU load per blade = 50% => 4,320 SPECint*rate to absorb
3 bi-module bullions provide:
- at 100% CPU load: 101% of the target
- at 2/3 CPU load: 2,851 SPECint*rate, i.e. 68% of the target
4 bi-module bullions provide:
- at 100% CPU load: 134% of the target
- at 2/3 CPU load: 4,256 SPECint*rate, i.e. 101% of the target
A good choice is to propose 4 bi-module bullions.
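The exercise boils down to computing a target and checking the coverage of candidate configurations. A minimal Python sketch; the function name is illustrative, and the target here is the straightforward 50% of 8,640 (the slide's 68%/101% figures imply a slightly different target baseline):

```python
def coverage(provided_specint, target_specint):
    # Fraction of the sizing target covered by a candidate configuration
    return provided_specint / target_specint

target = 20 * 432 * 0.5  # 20 blades x 432 SPECint*rate x 50% load
print(target)                            # 4320.0
print(round(coverage(2851, target), 2))  # 0.66 (3 bi-modules at 2/3 load)
print(round(coverage(4256, target), 2))  # 0.99 (4 bi-modules at 2/3 load)
```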

30 Project example: target architecture

31 Comparison blades vs bullion
blade:
- 4 sockets / 256 GB (32 DIMMs of 8 GB; max 48 DIMMs)
- + 1 chassis + Fabric Extender + 1 Fabric Interconnect switch
- 7U + 2U
- 1,553 watts
1 bullion module:
- 4 sockets / 256 GB (32 DIMMs of 8 GB; max 64 DIMMs)
- 3U
- 900 watts (-42%)

32 Project example: initial proposal
Needs: 752 vCPUs => /5 = 150 cores; 1,504 GB vRAM => x0.7 = 1,052 GB
(clusters Galois and Fermat)
- blades: 4 blades => 4 ESXi; 16 sockets (160 cores); 1,024 GB; 18U; 5,256 watts
- bullion: 4 bullion servers => 4 ESXi; 16 sockets (160 cores); 1,024 GB; 12U; 3,600 watts (-32%)
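The project's conversion rules (vCPUs divided by the over-commitment ratio of 5, vRAM multiplied by 0.7) can be sketched as a small helper; a minimal Python sketch with an illustrative function name:

```python
def physical_needs(vcpus, vram_gb, cpu_overcommit=5.0, mem_factor=0.7):
    # Project rules from the slide: vCPUs divided by the CPU
    # over-commitment ratio (5 here); vRAM multiplied by 0.7 to
    # account for memory sharing between VMs.
    return vcpus / cpu_overcommit, vram_gb * mem_factor

cores, ram_gb = physical_needs(752, 1504)
print(round(cores), round(ram_gb))  # 150 1053
```

The slide truncates the memory figure to 1,052 GB.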

33 Project example: 1st evolution
Needs: 1,799 vCPUs => /5 = 360 cores; vRAM GB => x0.7 = GB
(clusters Galois and Fermat)
- blades: 10 blades => 10 ESXi; 36 sockets (360 cores); 2,592 GB; 32U; vSphere licenses (Enterprise+): 36 sockets x $4,152 = $149,500
- bullion: 4 bullion servers => 4 ESXi; 32 sockets (320 cores); 2,592 GB; 24U; 7,200 watts (-29%); vSphere licenses (Enterprise+): 32 sockets x $4,152 = $132,864 (-12%)

34 Project example: 2nd evolution
Needs: 2,847 vCPUs => /5 = 570 cores; vRAM GB => x0.7 = GB
(clusters Galois and Fermat)
- blades: 16 blades => 16 ESXi; 60 sockets (600 cores); 4,080 GB; 32U; 2 extra chassis (+14U) must be added in order to add more than 2 CPUs; vSphere licenses (Enterprise+): 58 sockets x $4,152 = $240,239
- bullion: 4 bullion servers => 4 ESXi; 48 sockets (480 cores); 4,080 GB; 36U; power -25%; +25% still available for future upgrades without adding servers; vSphere licenses (Enterprise+): 48 sockets x $4,152 = $199,327 (-17%)

35 Example #2
Requirements (split across 2 data centers for DR):
- 428 VMs (spread among 10 application domains)
- 2,576 cores (/40 = 68.9 modules)
- RAM: 9,832 GB
Bullion scenarios:
Scenario MONO:
- total modules without VM consolidation: 77 => 38.5 per DC => 39 actual per DC
- total modules: 78; total cores: 3,120
- total RAM: 9,832 GB; RAM per server: 126 => 128 GB
Scenario QUAD:
- total modules with VM consolidation: 68.9 => 34.45 per DC => 35 actual per DC
- total quad bullions: 18 (9 per DC); total modules: 72; total cores: 2,880
- total RAM: 9,832 => 10,368 GB; RAM per server: 546 => 576 GB
- => 6 modules less than MONO
Scenario QUAD optimized (consolidation -7%):
- total modules: 64 (32 per DC); total servers: 8 per DC
- => 14 modules less than MONO

36 TCO calculation

| TCO 3 years | UCS B200 | bullion |
| Capex: Total hardware | $329,250 | $368,760 |
| Capex: Hardware installation | $3,424 | $1,712 |
| Capex: VMware licenses | $136,178 | $37,718 |
| Opex: Hardware administration | $61,635 | $20,545 |
| Opex: Hardware maintenance subscription | $19,913 | $16,352 |
| Opex: ESXi admin/maintenance | $51,363 | $5,136 |
| Opex: Power supply | $214,287 | $65,619 |
| Opex: Space use | $11,853 | |
| Total | $827,904 | $527,697 |

Savings with bullion = $300,207 (36%)
TCO drivers: quantity of HW to install/maintain; number of licenses (based on the number of sockets); power consumption; data-center space; number of VMware nodes.
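The TCO comparison is a straight sum over line items; a minimal Python sketch using the figures from the table (the bullion "space use" line was lost in the transcript and is reconstructed here by subtraction from the stated total, so it is an assumption):

```python
def tco_total(costs):
    # Sum a 3-year TCO breakdown given as {line item: dollars}
    return sum(costs.values())

ucs = {"hardware": 329250, "installation": 3424, "vmware_licenses": 136178,
       "hw_admin": 61635, "hw_maintenance": 19913, "esxi_admin": 51363,
       "power": 214287, "space": 11853}
bullion = {"hardware": 368760, "installation": 1712, "vmware_licenses": 37718,
           "hw_admin": 20545, "hw_maintenance": 16352, "esxi_admin": 5136,
           "power": 65619, "space": 11855}  # space reconstructed by subtraction

saved = tco_total(ucs) - tco_total(bullion)
print(saved, round(saved / tco_total(ucs) * 100))  # 300206 36
```

The $1 difference from the slide's $300,207 comes from rounding in the published totals.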

37 TCO calculator bullion (Excel file)

38 Summary
- bullion: best performance & capacity (4,110 SPECint*rate, 160 cores, 2 TB) => ideal for consolidation
- Consolidation: HW (sockets, memory, IO adapters) and VMs
- Tools: sizing tool (based on SPECint*rate and the number of cluster nodes); TCO calculator (comparison against the competition: OPEX, CAPEX over 3 & 5 years)
