Hardware Evolution in the Datacenter
Rick Indyke, AMD Business Development Mgr.
2 IT Market Trends: Evolution of the Data Center
Power and Cooling
–High energy costs
–Partially populated racks
–Migration to dense form factors
Compute Density
–Weak performance scaling with additional processors
–Low server utilization / inefficient data center floor space usage
–Grid/distributed computing
Dynamic Datacenters
–Provisioning on demand
–Dynamic workload allocation
Management Costs
–Increasing percentage of TCO
3 Did you know?
California's data centers combined are estimated to require 250 MW – 375 MW of power. That's equivalent to 3,495 – 5,242 barrels of oil a day!
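The slide does not say which conversion factor links megawatts to barrels of oil. A minimal Python sketch, assuming the power draw is constant over 24 hours, backs out the energy per barrel that the slide's own numbers imply and compares it with the commonly used barrel-of-oil-equivalent value of 5.8 million BTU (an assumption, not a figure from the deck). The function name is hypothetical.

```python
# Back out the energy-per-barrel factor implied by the slide's figures.
HOURS_PER_DAY = 24

def implied_mwh_per_barrel(power_mw, barrels_per_day):
    """Daily energy at a constant power draw, divided by barrels per day."""
    return power_mw * HOURS_PER_DAY / barrels_per_day

low = implied_mwh_per_barrel(250, 3495)   # low end of the slide's range
high = implied_mwh_per_barrel(375, 5242)  # high end of the slide's range

# Assumption (not from the slide): 1 barrel of oil equivalent = 5.8 million BTU.
BTU_PER_BOE = 5.8e6
KWH_PER_BTU = 0.000293071
boe_mwh = BTU_PER_BOE * KWH_PER_BTU / 1000

print(f"implied by slide: {low:.3f} to {high:.3f} MWh per barrel")
print(f"barrel of oil equivalent (assumed): {boe_mwh:.3f} MWh")
```

Both ends of the range imply roughly the same factor, so the slide appears to use a single thermal-equivalence conversion close to the assumed one; it does not account for power-plant conversion losses.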
4 Effects of Power in the Data Center
It adds up quickly!
More high-voltage switch equipment requirements $$$
More high-capacity CRAC units (air conditioners) $$$
Lower density / unusable floor space $$$
Data center expansion $$$
More UPS equipment requirements $$$
More back-up power generator requirements $$$
5 Direct Connect Architecture
Enables better overall system performance because everything is directly connected
Processors
Cache
Integrated Memory Controller
System Request Interface
Crossbar
HyperTransport Technology
6 Direct Connect Architecture: Reduces architectural bottlenecks - 2P system comparison
Legacy x86 Architecture
–20-year-old front-side bus architecture
–CPUs, memory, I/O all share a bus
–Traditional front-side bus creates bottleneck to performance
AMD64 Technology with Direct Connect Architecture
–Industry-standard AMD64 technology
–AMD's revolutionary Direct Connect Architecture reduces bottlenecks inherent in traditional FSB architectures
–HyperTransport technology interconnect for high bandwidth and low latency
[Diagram: 2P block diagrams of the two architectures. Labels: USB, PCI, I/O Hub, PCI-E Bridge, SRQ, Crossbar, HT, Mem. Ctrlr, Memory Controller Hub, Dual-Core CPU cores, 8 GB/s links]
7 Direct Connect Architecture: Balanced platform bandwidth – 4P system comparison
Legacy x86 Architecture
–20-year-old front-side bus architecture
–CPUs, memory, I/O all share a bus
–Traditional front-side bus creates bottleneck to performance
AMD64 Technology with Direct Connect Architecture
–Industry-standard AMD64 technology
–AMD's revolutionary Direct Connect Architecture reduces bottlenecks inherent in traditional FSB architectures
–HyperTransport technology interconnect for high bandwidth and low latency
[Diagram: 4P block diagrams of the two architectures. Labels: USB, PCI, I/O Hub, PCI-E Bridge, SRQ, Crossbar, HT, Mem. Ctrlr (four processors), Memory Controller Hub, XMB, CPU cores, 8 GB/s links]
8 Quad-Core AMD Opteron Processors: More than just four cores
Significant CPU Core Enhancements
Significant Cache Enhancements
World-class performance
Native Quad-Core
–Faster data sharing between cores
Enhanced AMD-V
–Nested paging acceleration for virtual environments
Reducing total cost of ownership
Performance/Watt leadership
–Consistent 95W thermal design point
–Low power 68W solutions
Drop-in upgrade
–Socket F compatibility – BIOS upgrade
–Leverage existing platform infrastructure
Common Core Architecture
–One core technology top-to-bottom
–Top-to-bottom platform feature consistency
9 Native Quad-Core Benefit: Faster Data Sharing
Situation: Core 1 needs data in Core 3 cache … How does it get there?
Native Quad-Core AMD Opteron
1. Core 1 probes Core 3 cache, data is copied directly back to Core 1
This happens at processor frequency
Result: Improved Quad-Core Performance
Quad-Core Clovertown
1. Core 1 sends a request to the memory controller, which probes Core 3 cache
2. Core 3 sends data back to the memory controller, which forwards it to Core 1
This happens at front-side bus frequency
Result: Reduced Quad-Core Performance
[Diagram: Native Quad-Core AMD Opteron with Core 1 to Core 4, L2, L3, System Request Queue, Crossbar, HyperTransport, Memory Controller; Quad-Core Clovertown with Core 1 to Core 4, L2, Front-Side Bus, Northbridge, Memory Controller]
10 Barcelona … Not Just Four Cores
Comprehensive 128-bit SSE Upgrades
Goal: Balanced SSE Execution
Barcelona doubles instruction and data delivery … Intel's pipeline doesn't
–Helps keep our 128-bit SSE pipeline full for optimal performance
Dedicated 36-entry floating-point scheduler helps reduce application latency
–Intel's 32-entry scheduler is shared between floating-point and integer operations
Over 80% performance boost, per core, on target applications!
[Chart: AMD Barcelona vs. Intel Clovertown on 64-bit platforms, on a 1x to 2x scale, for Instruction Fetch Bandwidth, Data Cache Bandwidth and L2/NB Bandwidth]
11 Next-Generation Power Comparison
In 2006, Next-Generation AMD Opteron defined a new standard in performance-per-watt with energy-efficient DDR2 memory and improved AMD PowerNow! capabilities
In mid-2007, we plan to offer Quad-Core AMD Opteron in the same DDR2-based platforms at the same power efficiency
[Chart: Watts from Memory, CPU and Northbridge for Xeon Dempsey, Xeon Woodcrest, Xeon Clovertown and Rev F, grouped as Dual-Core and Quad-Core]
Wattage based on 2P systems with 8 DIMMs at max CPU wattage; wattage for Dempsey, Woodcrest and Clovertown is estimated based on currently publicly available values (see, eg: ) and is subject to change. The examples contained herein are intended for informational purposes only. Other factors will affect real-world power consumption.
12 An Actual View of Power: Idle & Load Measured at the Cord
AMD measured results show the AMD Opteron processor-based system consumes less power even though processor TDP is higher!
Underlying processor architectures can affect overall platform power consumption
AMD PowerNow! technology enables lower power consumption during non-peak workloads, up to 75% savings at IDLE.
[Chart: IDLE and LOAD power bars for systems labeled 80W TDP, 65W TDP, 95W TDP and 68W TDP. Annual cost labels: $456 per year / $227,760 per year; $360 per year / $180,106 per year; $436 per year / $217,949 per year; $343 per year (1 server) / $171,696 per year (500 servers). Comparison labels: 26% More, 21% More, 66% More, 64% More]
Energy estimates include power input & cooling at 60%, power utility cost: $0.10/kWh, based on publicly available processor & chipset specifications and AMD internal estimates. The examples contained herein are intended for informational purposes only; actual results will vary. Other factors will affect real-world power consumption and cost. The system load used was a representative build of SPECint_base2000 for that system. Any SPEC performance metrics referenced are estimates.
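The annual cost figures on this slide can be reproduced from the footnote's inputs. The sketch below is an illustration in Python, not AMD's stated method: it assumes "cooling at 60%" means cooling adds 60% on top of the measured system power, and it assumes continuous operation for 8,760 hours a year. The wattages in the loop are hypothetical values back-computed from the slide's dollar figures; the slide text itself does not list them.

```python
HOURS_PER_YEAR = 24 * 365   # 8,760 hours of continuous operation (assumed)
COOLING_OVERHEAD = 0.60     # slide footnote: "cooling at 60%" (read as +60%)
PRICE_PER_KWH = 0.10        # slide footnote: $0.10 per kWh

def annual_cost(system_watts, servers=1):
    """Yearly electricity cost in dollars for power input plus cooling."""
    kilowatts = system_watts * (1 + COOLING_OVERHEAD) / 1000
    return kilowatts * HOURS_PER_YEAR * PRICE_PER_KWH * servers

# Hypothetical wall-power values, back-computed from the slide's dollar labels.
for watts in (245, 257, 311, 325):
    one = round(annual_cost(watts))
    fleet = round(annual_cost(watts, servers=500))
    print(f"{watts} W: ${one:,} per year (1 server), ${fleet:,} per year (500 servers)")
```

Under these assumptions the four wattages yield the slide's four cost pairs, which suggests the per-server and 500-server labels come from the same formula scaled by server count.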
13 Improving Processor Power Management with Enhanced AMD PowerNow! Technology
Opteron (Rev F): MHz is locked to the P-state of the most highly utilized core.
Barcelona: MHz is adjusted independently per core according to utilization.
Native Quad-Core technology enables enhanced power management across all four cores
[Diagram: per-core IDLE-to-MHz gauges labeled GOOD and GREAT. Utilization labels: 75%, 35%, 10% and 1% for Core 0 to Core 3; 75% and 35% for Core 0 and Core 1]
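The difference between the two policies can be sketched in a few lines of Python. This is only an illustration: the P-state table, its thresholds and clock speeds are hypothetical, and the function names are invented for the example. The utilization values are the ones shown on the slide.

```python
# Hypothetical P-state table: (minimum utilization %, clock in MHz), highest first.
P_STATES = [(50, 2000), (25, 1600), (5, 1200), (0, 1000)]

def pstate_mhz(utilization):
    """Pick the clock for one core from the hypothetical table."""
    for threshold, mhz in P_STATES:
        if utilization >= threshold:
            return mhz
    return P_STATES[-1][1]

def shared_clock(utilizations):
    """Locked policy: every core runs at the P-state of the busiest core."""
    mhz = pstate_mhz(max(utilizations))
    return [mhz] * len(utilizations)

def per_core_clock(utilizations):
    """Independent policy: each core gets a clock matching its own load."""
    return [pstate_mhz(u) for u in utilizations]

loads = [75, 35, 10, 1]               # utilization labels from the slide
shared = shared_clock(loads)          # all cores follow the 75% core
independent = per_core_clock(loads)   # lightly loaded cores clock down

print("locked to busiest core:", shared)
print("independent per core:  ", independent)
```

With a locked clock, the three lightly loaded cores run as fast as the busiest one; with independent clocks, each drops to a lower P-state, which is where the claimed power savings would come from. Applying the locked policy to four cores is for comparison only, since the slide shows it on a two-core part.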
14 AMD Opteron Processor Summary
Evolving Direct Connect Architecture
–For continued winning in the enterprise
–Torrenza for Application Acceleration
Advancing Performance-per-Watt leadership
–Low-power, high-performing DDR2 memory
–Consistent 95W standard power roadmap
Reducing Total Cost of Ownership (TCO)
–One transition to your next stable platform
–Seamless Dual-Core to Quad-Core upgrade in same 95W infrastructure
Extending our Lead in x86 Virtualization
–Founded on Direct Connect Architecture
–AMD Virtualization improves business functionality and flexibility
15 The Smarter Choice for IT
In 1999, AMD introduced a long-term solution that customers could grow with.
In 2003, AMD permanently changed the IT landscape with the introduction of the AMD Opteron processor.
In 2005, AMD showed the industry how to make the transition from single-core to native dual-core.
In 2007, the launch of Barcelona will have an even greater impact…
[Diagram labels: x64, AMD64, Direct Connect Architecture, Integrated Memory Controller, HT, AMD Opteron, AMD Athlon 64, AMD Turion 64, Native Dual-Core, Native Quad-Core, Performance per watt, Virtualization, Accelerated computing (Torrenza & Stream), Trinity, Raiden, Fusion]