
INTEL HYPER THREADING TECHNOLOGY


1 INTEL HYPER THREADING TECHNOLOGY
By Suhas Gowda S, MS Embedded Systems, Manipal University

2 Outline
Introduction: Background, Performance vs. Cost, Multithreading
Hyper-Threading Technology: Duplication by HT, HT Goals, Trace cache miss and hit, HT Execution, OS support for HT
Applications
Conclusion

3 Background The amazing growth of the Internet and telecommunications is powered by ever-faster systems that demand increasingly higher levels of processor performance. The microarchitecture techniques used to achieve past processor performance improvements have made microprocessors increasingly complex, with more transistors and higher power consumption. However, transistor counts and power are increasing at rates greater than processor performance.

4 Single-stream performance vs. cost

5 Why Thread Your Application?
Increased responsiveness and worker productivity: application responsiveness improves when different tasks run in parallel.
Improved performance in parallel environments, when running computations on multiple processors.
More computation per cubic foot of data center: web-based applications are often multi-threaded by nature.
Performance and responsiveness make it easier to add new features.
Taking full advantage of multi-core hardware requires multi-threaded software.
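As an illustration of these points, here is a minimal sketch (not from the original slides) of splitting a computation across two threads with standard C++ std::thread; the data size and the two-way split are arbitrary choices.

```cpp
#include <iostream>
#include <numeric>
#include <thread>
#include <vector>

int main() {
    std::vector<int> data(1000000, 1);   // arbitrary workload
    long long lo = 0, hi = 0;
    auto mid = data.begin() + data.size() / 2;

    // Each thread sums one half of the vector independently,
    // so the two halves can run in parallel on separate (logical) processors.
    std::thread t1([&] { lo = std::accumulate(data.begin(), mid, 0LL); });
    std::thread t2([&] { hi = std::accumulate(mid, data.end(), 0LL); });
    t1.join();
    t2.join();

    std::cout << "total = " << (lo + hi) << '\n';
}
```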

6 Multithreading Because processors wasted time running single tasks that were waiting for certain events to complete, software developers began wondering whether the processor could be doing some other work at the same time. To arrive at a solution, software architects began writing operating systems that supported running pieces of programs, called threads. Threads are small tasks that can run independently. Each thread gets its own time slice, so each thread represents one basic unit of processor utilization. Threads are organized into processes, which are composed of one or more threads. All threads in a process share access to the process's resources.

7 These multithreaded operating systems made it possible for one thread to run while another was waiting for something to happen. To benefit from multithreading, programs need to possess executable sections that can run in parallel.
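A minimal sketch of that idea, with the blocking event simulated by a sleep: one thread waits while the main thread keeps computing instead of sitting idle.

```cpp
#include <chrono>
#include <iostream>
#include <thread>

int main() {
    // Worker that blocks, standing in for a thread waiting on disk or network I/O.
    std::thread waiter([] {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
        std::cout << "event arrived\n";
    });

    // Meanwhile the main thread keeps doing useful work.
    long long busy = 0;
    for (int i = 0; i < 10000000; ++i) busy += i;
    std::cout << "work done: " << busy << '\n';

    waiter.join();
}
```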

8 Hyper Threading Technology
Presents software with two logical processors even though only one physical processor is present, effectively doubling the number of CPUs seen by the OS. The OS is tricked into seeing two processors because an HT processor has two sets of architectural state resources. It is still technically a single processor because the compute resources (execution units) are not doubled.
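A quick way to observe this from software (a sketch relying only on standard C++) is to ask how many hardware threads the OS reports; on a single-core Hyper-Threading processor this typically prints 2.

```cpp
#include <iostream>
#include <thread>

int main() {
    // Number of concurrent hardware threads the system reports; this includes
    // logical processors created by Hyper-Threading (may return 0 if unknown).
    std::cout << "logical processors: "
              << std::thread::hardware_concurrency() << '\n';
}
```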

9 Each logical processor:
Has its own architecture state
Executes its own code stream concurrently
Can be interrupted and halted independently
The two logical processors share:
The execution engine and the caches
Firmware and the system bus interface
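To make the independence of the two logical processors concrete, here is a hedged, Linux-specific sketch (it assumes glibc's pthread_setaffinity_np and that logical CPUs 0 and 1 are the two logical processors of one HT core) that pins one code stream to each logical processor.

```cpp
#include <pthread.h>   // pthread_setaffinity_np (GNU extension)
#include <sched.h>     // cpu_set_t, CPU_ZERO, CPU_SET
#include <iostream>
#include <thread>

// Pin the calling thread to a single logical CPU (Linux/glibc only).
static void pin_self_to_cpu(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

int main() {
    // Two independent code streams, each pinned to one (assumed) logical processor.
    std::thread a([] { pin_self_to_cpu(0); std::cout << "stream on logical CPU 0\n"; });
    std::thread b([] { pin_self_to_cpu(1); std::cout << "stream on logical CPU 1\n"; });
    a.join();
    b.join();
}
```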

10

11 Duplication by HT: Replicated Resources
Control registers (architectural registers, AR)
8 general-purpose registers (AR)
Machine state registers (AR)
Debug registers (AR)
Instruction pointers (IP)
Register renaming tables (RNT)
Return stack predictor (RSP)

12 Replicated Resources ARs are used by the operating system and application code to control program behavior and store data for computations. The IP and RNT are replicated to simultaneously track execution and state changes of the two logical processors. The RSP is replicated to improve branch prediction of return instructions.

13 Partitioned Resources
Partitioning provides operational fairness and allows operations from one logical processor to bypass operations of the other logical processor when it has stalled. For example, a cache miss, a branch misprediction, or instruction dependencies may prevent a logical processor from making forward progress for some number of cycles. The partitioning prevents the stalled logical processor from blocking the other's forward progress.

14 Shared Resources
Caches: trace cache, L1, L2, L3
Execution units
These are fully shared to improve dynamic utilization of the resources.

15 Hyper-Threading Goals
Minimize the die area cost of implementation
Ensure forward progress for at least one logical processor
Maintain single-threaded performance

16 Trace Cache Hit

17 Trace Cache Miss

18 ITLB and Branch Prediction (I)
If there is a trace cache (TC) miss, instruction bytes must be loaded from the L2 cache and decoded into the TC
The ITLB receives the instruction-delivery request
The ITLB translates the next instruction-pointer address to a physical address
The ITLBs are duplicated, one per logical processor
L2 cache access is arbitrated on a first-come, first-served basis, while always reserving at least one slot for each logical processor
Branch prediction structures are either duplicated or shared
If shared, they must include owner tags

19 Hyper-Threaded Execution

20 Allocator
Allocates many of the key machine buffers:
126 re-order buffer entries
128 integer and 128 floating-point registers
48 load and 24 store buffer entries
Resources are shared equally between the logical processors
Limiting each logical processor's use of these key resources enforces fairness and prevents deadlocks
Every clock cycle, the allocator alternates between the two uop queues
If one logical processor is stalled or has executed HALT, there is no need to alternate between processors

21 Register Rename
Maps register names onto the shared physical registers for each logical processor
Each logical processor has its own Register Alias Table (RAT)
Renamed uops are stored in two different queues:
Memory Instruction Queue (loads/stores)
General Instruction Queue (everything else)
Both queues are partitioned between the logical processors

22 Instruction Scheduling
Schedulers are at the heart of the out-of-order execution engine
There are five schedulers, with queues of 8-12 entries each
A scheduler is oblivious to ownership when fetching and dispatching uops:
It ignores which logical processor a uop belongs to
It only considers whether the uop's inputs are ready
It can dispatch uops from both logical processors at the same time
To provide fairness and prevent deadlock, some queue entries are always reserved for each logical processor

23 Execution Units & Retirement
The execution units are oblivious to ownership when receiving and executing uops
Because source and destination registers were renamed earlier, accessing the physical registers is sufficient during and after execution
After execution, uops are placed in the re-order buffer, which decouples the execution stage from the retirement stage
The re-order buffer is partitioned between the two logical processors
Uop retirement commits the architecture state in program order
Once stores have retired, the store data still needs to be written into the L1 data cache

24 Performance

25 Operating Systems Support for HT
Native HT support: Windows XP Professional Edition, Windows XP Home Edition, Windows 7, Linux 2.4.x and later
Compatible with HT: Windows 2000 (all versions), Windows NT 4.0 (limited driver support)
No HT support: Windows ME, Windows 98 and earlier versions
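As a hedged illustration of how software and the OS can tell that Hyper-Threading is present, this sketch uses GCC/Clang's <cpuid.h> on x86: CPUID leaf 1 sets bit 28 of EDX when the package supports multiple logical processors, and EBX[23:16] reports how many.

```cpp
#include <cpuid.h>
#include <iostream>

int main() {
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    // CPUID leaf 1: EDX bit 28 is the HTT flag; EBX[23:16] is the number of
    // addressable logical processors in the physical package.
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) && (edx & (1u << 28))) {
        unsigned logical = (ebx >> 16) & 0xFF;
        std::cout << "HTT supported, " << logical
                  << " logical processors per package\n";
    } else {
        std::cout << "HTT flag not set\n";
    }
}
```

Note that later multi-core processors also set this flag, so it indicates support for multiple logical processors per package rather than Hyper-Threading specifically.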

26 Applications
Used in Xeon processors to build data centers
Used to run multithreaded applications
Used in server processors to handle large workloads
HT technology gives better gaming performance
Also used to increase the speed of communication

27 Conclusion Intel’s Hyper-Threading Technology brings the concept of simultaneous multi-threading to the Intel Architecture. It will become increasingly important going forward as it adds a new technique for obtaining additional performance for lower transistor and power costs. The goal was to implement the technology at minimum cost while ensuring forward progress on logical processors, even if the other is stalled, and to deliver full performance even when there is only one active logical processor. 3/28/2011 INTEL-HT

28 References Break Away with Intel Atom Processors, by Lori M. Matassa and Max Domeika

29 Thank You. Questions?

