AMD Microprocessor Technologies Ben Sander AMD Principal Member of Technical Staff 06/21/06 2006.

1 AMD Microprocessor Technologies Ben Sander AMD Principal Member of Technical Staff 06/21/

2 Motivation: PC Jargon Demystified
•“AMD Athlon™ * dual-core processor with 64-bit platform, Direct Connect Architecture and HyperTransport™ Technology for increased multitasking performance; improved security with Enhanced Virus Protection**; Cool'n'Quiet™ Technology to minimize heat and noise”

3 Talk Outline
•Motivation
•Recent innovations
  –Dual-core processors
  –Direct Connect Architecture™ and HyperTransport™
  –Power-efficient design (and Cool’n’Quiet™)
  –AMD64 Architecture
•What’s next?
  –Direct Connect Architecture™ enhancements
  –HTX “Accelerators”
  –Core enhancements
  –Virtualization and AMD-V
•Summary and Conclusion

4 Dual-Core AMD Opteron™ Processor Design
[Block diagram: CPU0 and CPU1, each with a 1MB L2 Cache; System Request Interface; Crossbar Switch; Memory Controller; HyperTransport™ 0, 1, 2. Label: “Existing AMD Opteron™ Processor Design”.]
•Two AMD Opteron™ processor cores on a single die
  –Each with 1MB L2 cache
•Shared Northbridge
  –Three HyperTransport™ technology links
  –Dual-channel (128 bit) DDR interface
•AMD Opteron processor designed as CMP from the start
  –2nd port on SRI, request management, 2 APICs, clocking microcode
•Two complete CPUs
  –Symmetric multiprocessor programming (SMP) model
  –Simpler, less restrictive programming model than ‘virtual CPU’ approach

5 MPF AMD Dual-Core Processor
Chip Integration:
•Two 64-bit CPU cores
•2MB L2 cache
•On-chip Northbridge & Memory Controller
Bandwidth:
•Dedicated 64-bit L2 busses for each core
•Dual channel DDR (128-bit) memory bus
•3 HT links (16-bit each x 2 GT/sec x 2)
Usability and Scalability:
•Socket compatible: Platform and TDP!
•Glueless SMP up to 4 sockets
•Memory capacity & BW scale w/ CPUs
Power Efficiency:
•PowerNow! Optimized power management
•Leadership system level power attributes
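The HyperTransport figure on this slide ("16-bit each x 2 GT/sec x 2") can be turned into bytes per second with a little arithmetic. The sketch below assumes the trailing "x 2" counts the two directions of a link, which the slide does not spell out; the function name `ht_link_bandwidth_gbytes` is made up for illustration.

```python
def ht_link_bandwidth_gbytes(width_bits=16, gtransfers_per_sec=2.0, directions=2):
    """Aggregate bandwidth of one link in GB/s.

    Assumption: the slide's trailing "x 2" means two directions per link.
    """
    bytes_per_transfer = width_bits / 8          # 16 bits -> 2 bytes
    return bytes_per_transfer * gtransfers_per_sec * directions


per_link = ht_link_bandwidth_gbytes()            # 2 bytes * 2 GT/s * 2 directions
total = 3 * per_link                             # the slide lists 3 HT links
print(f"per link: {per_link:g} GB/s, three links: {total:g} GB/s")
```

Under that reading each link carries 8 GB/s, which appears to agree with the "8 GB/S" label in the system diagrams later in the talk.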

6 AMD64 Dual-Core Physical Design
•90nm
  –Approximately same die size as 130nm single-core AMD Opteron™ processor
  –~205 million transistors
•68/95 watt power envelope
  –Fits into 90nm power infrastructure
•939/940 Socket compatible
  –Fits into existing sockets

7 Dual-Core: Customer Value
•What is it?
  –Two processing cores on the same die
•AMD: Clean single-core to multi-core upgrade path
  –Same pinout
  –Same power envelope!
•Server customers
  –Server apps scale extremely well with increasing processors
    Transaction processing, web serving
  –Doubles compute density
    More compute power from the same motherboard
    More compute power in a server rack
  –More efficient software licensing
•Consumers
  –Efficiently run multiple programs at the same time
    Operating system + background application
    Virus checker + photo-editing software
  –Significantly improves performance of threaded applications
    Video editing, MP3 encoding

8 Dual-Core AMD Opteron™ Processor Design
[Block diagram: CPU0 and CPU1, each with a 1MB L2 Cache; System Request Interface; Crossbar Switch; Memory Controller; HyperTransport™ 0, 1, 2. Label: “Existing AMD Opteron™ Processor Design”.]
•Two AMD Opteron™ processor cores on a single die
  –Each with 1MB L2 cache
•Shared Northbridge
  –Three HyperTransport™ technology links
  –Dual-channel (128 bit) DDR interface
•AMD Opteron processor designed as CMP from the start
  –2nd port on SRI, request management, 2 APICs, clocking microcode
•Two complete CPUs
  –Symmetric multiprocessor programming (SMP) model
  –Simpler, less restrictive programming model than ‘virtual CPU’ approach
•AMD Direct Connect Architecture
  –Everything connected directly to CPU
  –Reduces system architecture bottlenecks
  –Further reduces latency by directly connecting two cores on same die

9 Direct Connect: Advantages of good plumbing
[Diagrams: legacy x86 front-side-bus system versus AMD64 Direct Connect system. Labels include I/O Hub, USB, PCI, PCIe™ Bridge, PCI-E Bridge, XMB, Memory Controller Hub, MCP, Chip X, 8 GB/S and, for each AMD64 processor, SRQ, Crossbar, HT, Mem.Ctrlr.]
Legacy x86 Architecture
•20-year old front-side bus (FSB) architecture
•CPUs, Memory, I/O all share a bus
•Major bottleneck to performance
•Faster CPUs or more cores ≠ performance
AMD64’s Direct Connect Architecture
•Industry-standard technology
•Direct Connect eliminates the FSB bottleneck
•HyperTransport™ interconnect offers scalable high bandwidth and low latency

10 AMD Direct Connect: Customer Value
•What is it?
  –Direct connection of CPU to the DRAM/memory
  –And CPU-to-CPU for multi-processor systems.
•Increased performance
  –Reduced memory latency
  –Reduced chip communication latency
•Reduced power
  –Reduced chip-count in system
  –Reduced external pin switching
•Scalability
  –Unlocks the potential of faster CPUs and additional cores

11 What’s Consuming all the Power?
•Computer Room Air Conditioner power consumption: 23% - 54%
•Battery Backup power consumption: 6% - 13%
•Lighting power consumption: 1% - 2%
•Server power consumption: 38% - 63%
Server Power Consumption Impacts Power throughout the Datacenter

12 System-level Power Consumption – Present Day
[Diagrams: the same two four-processor systems as on the Direct Connect slide (I/O Hub, USB, PCI, PCIe™ Bridge, PCI-E Bridge, XMB, Memory Controller Hub, MCP, Chip X, SRQ, Crossbar, HT, Mem.Ctrlr, 8 GB/S), annotated with power labels: 692 watts, 14 watts, 8.5 watts (four times), 380 watts.]
Dual-Core Packages with legacy technology: 740 watts (95% More Power)
•692 watts for processors (173w each)
•48 watts for external memory controller
Dual-Core AMD Opteron™ processors: 380 watts
•380 watts for processors (95w each)
•Integrated memory controllers
Source: Mixture of publicly available data sheets and AMD internal estimates. Actual system power measurements may vary based on configuration and components used.
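The "95% More Power" figure follows from the per-processor numbers on this slide. A minimal sketch of the arithmetic, assuming four processor packages in each system (692 / 173 = 4 and 380 / 95 = 4); the helper names are hypothetical.

```python
def system_power_watts(num_cpus, watts_per_cpu, external_controller_watts=0):
    """Processor power plus any external memory-controller power."""
    return num_cpus * watts_per_cpu + external_controller_watts


def percent_more(a, b):
    """How much larger a is than b, in percent."""
    return (a / b - 1.0) * 100.0


legacy = system_power_watts(4, 173, external_controller_watts=48)   # 692 + 48
opteron = system_power_watts(4, 95)                                  # integrated controllers
print(legacy, opteron, f"{percent_more(legacy, opteron):.1f}% more")
```

The ratio works out to a little under 95%, which the slide rounds up.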

13 Reducing Power and Cooling Requirements with Processor Performance States
P-State  Frequency  Voltage  Power
P0       2600MHz    1.40V    ~95watts
P1       2400MHz    1.35V    ~90watts
P2       2200MHz    1.30V    ~76watts
P3       2000MHz    1.25V    ~65watts
P4       1800MHz    1.20V    ~55watts
P5       1000MHz    1.10V    ~32watts
Processor utilization runs from HIGH (P0) to LOW (P5). Up to 75% power savings!
[Chart: Average CPU Core Power (measured at CPU), Power (W), PowerNow! DISABLED vs. PowerNow! ENABLED. Workloads: Connections (~62% CPU Utilization), 5000 Connections (~40% CPU Utilization), Idle (in OS). Labeled reductions: -33%, -62%, -75%.]
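The P-state figures on this slide can be explored numerically. The sketch below copies the frequency, voltage and power values from the slide, computes the saving of each state relative to P0, and compares the listed power against a textbook estimate in which dynamic power scales with f × V². That scaling rule is an assumption brought in for illustration, not something the slide states, and the function names are hypothetical.

```python
# (name, MHz, volts, approximate watts), copied from the slide
P_STATES = [
    ("P0", 2600, 1.40, 95),
    ("P1", 2400, 1.35, 90),
    ("P2", 2200, 1.30, 76),
    ("P3", 2000, 1.25, 65),
    ("P4", 1800, 1.20, 55),
    ("P5", 1000, 1.10, 32),
]


def savings_vs_p0(watts):
    """Fractional power saving relative to the P0 entry."""
    return 1.0 - watts / P_STATES[0][3]


def dynamic_power_estimate(mhz, volts):
    """Scale P0 power by f * V^2 (assumed CMOS dynamic-power approximation)."""
    _, f0, v0, w0 = P_STATES[0]
    return w0 * (mhz / f0) * (volts / v0) ** 2


for name, mhz, volts, watts in P_STATES:
    est = dynamic_power_estimate(mhz, volts)
    print(f"{name}: listed ~{watts} W, f*V^2 estimate {est:.1f} W, "
          f"saving vs P0 {savings_vs_p0(watts):.0%}")
```

For P5 the listed value is higher than the f × V² estimate, which would be consistent with some power not scaling with frequency and voltage, though the slide gives no breakdown.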

14 Power-efficient design: Customer Value
•What is it?
  –PowerNow! Technology changes frequency in response to workload
    At lower frequencies, voltage is reduced as well
  –Power efficiency “designed-in”
    Appropriate frequency targets
    Integrate external chipset logic (aka Direct Connect)
    “Fine gating” and other design-for-power techniques
•Customer value
  –Server: Save $$$ on server power and air conditioning
  –Desktop: Quieter operation via “Cool’n’Quiet™” technology
  –Notebook: Longer battery life

15 AMD64: Evolutionary 64-bit ISA
•What is it?
  –Evolutionary extension to support “64-bits” on x86 processors
  –Now an industry standard supported by other processor vendors
•Why 64 bits?
  –Driven by apps needing large amounts of memory
    CAD tools, large databases, simulations
  –64-bit integer arithmetic
    Security and encryption applications
•Why extend x86 to 64 bits?
  –x86 is the most widely installed instruction set in the world
  –Delivers 64-bit advantages while providing full x86 compatibility
  –Doesn’t require a completely new tool chain
•User benefits from 64 bits:
  –Large-memory applications
    Some applications see 10x speedup from additional memory.
    64-bit flat programming model massively easier for software developers
  –Some performance improvement from additional registers and wider data operations
  –AMD64: Backwards compatibility allows migration on customer’s timeframe

16 Design Goals for AMD64 Technology
•Processor is fully compatible with existing x86 modes
•Straightforward extensions for 64 bits
  –Minimize architectural divergences
    Maintain consistency with existing architecture
  –Minimize instruction set encoding changes
  –Straightforward implementation & verification
•Double the number of Integer and SSE registers
•Architectural support for 64 bits of virtual address space and 52 bits of physical address space
  –Implementations may support less
•64-bit integer operations
•Eliminate unused/underutilized arcane x86 features within the context of 64-bit mode
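To put the address-width figures on this slide in perspective, the sketch below converts a bit width into an addressable size. The 64-bit and 52-bit widths come from the slide; the 48-bit and 32-bit widths are included only as examples of narrower widths, and the function name is hypothetical.

```python
def address_space_size(bits):
    """Human-readable size of a byte-addressed space that is `bits` wide."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB"]
    size = 1 << bits
    idx = 0
    # Divide by 1024 until the value fits the current unit
    while size >= 1024 and idx < len(units) - 1:
        size //= 1024
        idx += 1
    return f"{size} {units[idx]}"


for bits in (32, 48, 52, 64):
    print(f"{bits}-bit: {address_space_size(bits)}")
```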

17 AMD64 Programmer’s Model
[Figure: register diagram; surviving label: RAX]

18 REX prefix byte
•Additional registers encoded without altering existing instruction format
•Optional REX prefix specifies 64-bit operation size override
  –Plus 3 additional register encoding bits
•REX is actually a family of 16 prefixes (40-4F)
•Average instruction length in 64-bit mode increased by 0.4 bytes
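The REX bullets describe a one-byte prefix in the range 40-4F. A minimal sketch of how such a byte is assembled, using the AMD64 bit layout `0100WRXB`: W selects 64-bit operand size, and R, X, B are the three extra register-encoding bits. The helper names `rex_prefix` and `mov_reg_reg64` are hypothetical, and only the register-to-register form of `MOV r/m64, r64` (opcode 89) is covered.

```python
def rex_prefix(w=False, r=False, x=False, b=False):
    """Build a REX prefix byte: fixed high nibble 0100, then W, R, X, B."""
    return 0x40 | (int(w) << 3) | (int(r) << 2) | (int(x) << 1) | int(b)


def mov_reg_reg64(dst, src):
    """Encode MOV dst, src for 64-bit registers numbered 0-15 (rax=0 ... r15=15)."""
    rex = rex_prefix(w=True, r=src >= 8, b=dst >= 8)   # bit 3 of each register goes in REX
    modrm = 0xC0 | ((src & 7) << 3) | (dst & 7)        # mod=11: register-direct
    return bytes([rex, 0x89, modrm])


RAX, RBX, R8, R15 = 0, 3, 8, 15
print(mov_reg_reg64(RAX, RBX).hex(" "))   # mov rax, rbx
print(mov_reg_reg64(RAX, R8).hex(" "))    # mov rax, r8
print(mov_reg_reg64(R15, RAX).hex(" "))   # mov r15, rax
```

Because the four flag bits are independent, the function can produce exactly 16 distinct bytes, matching the "family of 16 prefixes" on the slide.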

19 Talk Outline
•Motivation
•Recent innovations
  –Dual-core processors
  –Direct Connect Architecture™ and HyperTransport™
  –Power-efficient design (and Cool’n’Quiet™)
  –AMD64 Architecture
•What’s next?
  –Direct Connect Architecture™ enhancements
  –HTX “Accelerators”
  –Core enhancements
  –Virtualization and AMD-V
•Summary and Conclusion

20 Co-processors and Accelerators
Promising Concept
•Excellent way to get power-efficient performance boosts
  –Special-purpose, tuned solutions for common functions
  –Drop to low-power states when not in use
  –Enabled by modern APIs
•Aligns with modularity imperative
  –Co-processor becomes another (optional) “IP block”
  –Micro-architecture: Command delivery, Synchronization, Streaming
•Many possible opportunities now, and/or in the future
  –Media processing
  –JVM/CLR runtime hosting
  –NIC integration (TOE, XML, SSL, etc)

21 HyperTransport HTX™ Enables System-level Coprocessing Today

22 AMD’s Next Generation Processor Technology
•Scalable performance and balance
  Faster HyperTransport links (up to 5.2 GT/sec)
  Additional bandwidth enhancements
  On-chip shared L3 cache
•Maintain performance per watt leadership
  Independent NB and CPU power management
  Independent CPU P-state and C-state controls
•Performance on diverse workloads
  Enhanced IPC CPU core; >2X FPU performance
  48-bit virtual and physical address space
  1GB large page support
  Platform support for co-processors
•Compatibility
  DDR2 memory support with migration to DDR3
  FBDIMM Gen1 and Gen2 at the appropriate time
  HT-1 backwards compatibility
•Enhanced Virtualization
  I/O Virtualization
  Nested paging support
•Enhanced RAS
  Memory mirroring
  Data poisoning support
  HT retry protocol support

23 AMD’s Next Generation Processor Technology
•Native quad core die
•Optimized for 65nm SOI and beyond
•Expandable shared L3 cache
•IPC enhanced CPU cores
•32B instruction fetch
•Improved branch prediction
•Out-of-order load execution
•Up to 4 DP FLOPS/cycle
•Dual 128-bit SSE dataflow
•Dual 128-bit loads per cycle
•Improved core and Northbridge prefetchers
•Bit Manipulation extensions (LZCNT/POPCNT)
•SSE extensions (EXTRQ/INSERTQ, MOVNTSD/MOVNTSS)
•Enhanced Direct Connect Architecture and Northbridge
•HT-3 links (5.2GT/sec)
•Enhanced crossbar
•DDR2 with migration path to DDR3
•FBDIMM when appropriate
•Enhanced power management
•Enhanced RAS
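The bit-manipulation extensions named on this slide, LZCNT and POPCNT, have simple semantics that can be modeled in a few lines. This is a software sketch of what the instructions compute, not of how the hardware implements them; the function names and the explicit `width` parameter are choices made for illustration.

```python
def popcnt(value):
    """Number of set bits in a non-negative integer, as POPCNT returns."""
    return bin(value).count("1")


def lzcnt(value, width=64):
    """Leading zero bits of `value` in a `width`-bit operand, as LZCNT returns.

    A zero input yields `width`, since every bit is a leading zero.
    """
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the operand width")
    return width - value.bit_length()


print(popcnt(0b1011))        # three bits set
print(lzcnt(1, width=64))    # only bit 0 set
print(lzcnt(0, width=32))    # all bits are leading zeros
```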

24 Virtualization is the pooling and abstraction of resources in a way that masks the physical nature and boundaries of those resources from the resource users.

25 Virtualization: Customer Value
•What is it?
  –Allows a single computer to efficiently run multiple guest Operating Systems and associated applications
  –AMD-V provides hardware acceleration for virtualization
    And simplifies the development process.
•Benefits:
  –Consolidation
    More efficient use of compute resources
    Eliminate “single-application” servers
    Consolidate old unsupported servers onto newer hardware
  –Migration/reliability
    If a server fails, can easily move app to another server
  –Allows developers to easily test multiple OS environments on a single machine.
  –Upgrades can be tested on hardware before deployment

26 Virtualization Methods
•Software-only virtualization
  –Software acts as a translator between OS and hardware
  –No need to modify the operating system
  –Available today
  –Can be slow
•OS-enabled virtualization
  –Host OS and virtualization software tightly integrated
    Offers improved performance
    But requires changes to OS
•Processor-supported virtualization
  –Processor protects memory locations so that only virtualization software can access them
  –Processor provides hooks on all system-level instructions
  –Accelerated performance and better security

27 AMD-V: Overview
•Virtualization is being used in several server scenarios today
•AMD expects that virtualization will prove valuable for PC clients too
•There are ways to modify the x86 architecture so that virtualization is easier to accomplish, performs better, and provides more security
•AMD’s AMD-V technology is being developed for future AMD64 CPUs for servers and clients
•Key technologies include adding new instructions, supporting different methods of handling page tables, handling host and guest interrupts (including SMI/SMM), and providing DMA protection
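On a Linux system, one way to see whether a processor reports AMD-V is to look for the `svm` flag in `/proc/cpuinfo`. The sketch below only parses text in that format; it is Linux-specific, the function names are hypothetical, and the sample string is invented for illustration rather than taken from a real machine.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def has_amd_v(cpuinfo_text):
    """True if the 'svm' flag, which Linux uses for AMD-V, is present."""
    return "svm" in cpu_flags(cpuinfo_text)


# Invented sample in /proc/cpuinfo format; on a real system read the file instead.
SAMPLE = "processor\t: 0\nflags\t\t: fpu vme lm svm\n"
print(has_amd_v(SAMPLE))
```

To check a real machine, pass the contents of `/proc/cpuinfo` to `has_amd_v` instead of the sample string.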

28 Summary and Conclusion
•AMD is focused on customer-centric innovation and value
  –Dual-core processors
  –Direct Connect Architecture and HyperTransport
  –Power-efficient design
  –AMD64 Architecture
  –And more!
•AMD is investing heavily in extending our leadership
  –Next generation Direct Connect Architecture technology
  –Next generation CPU technology
  –AMD-V and hardware virtualization
  –Developing a fundamental understanding of important emerging trends

29 Thank you!
© 2006 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow, AMD Athlon, AMD Opteron and combinations thereof, are trademarks of Advanced Micro Devices, Inc. HyperTransport is a trademark of the HyperTransport Consortium. PCI-X, PCIe and PCI Express are trademarks of PCI-SIG. Other names used in this presentation are for informational purposes only and may be trademarks of their respective owners.

30 Backup

31 AMD Architectural Generations
Now
•AMD64 Architecture
•Dual Core Architecture
•Direct Connect Architecture
•Enhanced Virus Protection
•HyperTransport™ v1.0, v2.0
•DDR, DDR2
•AMD PowerNow!™ Technology
•High Reliability RAS
•System Performance
Coming Soon
•Extensions to AMD64
•Multi-core Architecture
•Scalable SMP Architecture
•AMD-V Virtualization
•HyperTransport v3.0
•DDR3, FBDIMM
•Partitioned PowerNow!
•Mainframe-class reliability
•System Perf. / Watt
Future
•FPU Extensions to AMD64
•Throughput Architecture
•On-chip Coprocessors
•Secure Execution
•HyperTransport v4.0
•DDR4, FBD2
•System Resource Mgmnt
•Best-in-class Reliability
•Throughput / Watt / $$

