
1 COMP25212 CPU Multi Threading
Learning Outcomes: to be able to:
–Describe the motivation for multithread support in CPU hardware
–Distinguish the benefits and implementations of coarse-grain, fine-grain and simultaneous multithreading
–Explain when multithreading is inappropriate
–Describe a multithreading implementation
–Estimate the performance of these implementations
–State the important assumptions of this performance model

2 Revision: Increasing CPU Performance
[Diagram: five-stage pipeline – Fetch Logic, Decode Logic, Exec Logic, Mem Logic, Write Logic – with an Inst Cache feeding Fetch and a Data Cache serving Mem, all driven by a common Clock; labels a–f mark the options listed on the next slide.]
How can throughput be increased?

3 Increasing CPU Performance
a) By increasing clock frequency
b) By increasing Instructions per Clock
c) Minimising memory access impact – data cache
d) Maximising inst issue rate – branch prediction
e) Maximising inst issue rate – superscalar
f) Maximising pipeline utilisation – avoid instruction dependencies – out-of-order execution
g) (What does lengthening the pipeline do?)

4 Increasing Program Parallelism
–Keep issuing instructions after a branch?
–Keep processing instructions after a cache miss?
–Process instructions in parallel?
–Write a register while a previous write is pending?
Where can we find additional independent instructions?
–In a different program!

5 Revision – Process States
States: New; Ready (waiting for a CPU); Running (on a CPU); Blocked (waiting for an event); Terminated.
Transitions: New → Ready; Ready → Running on dispatch (scheduler); Running → Blocked when it needs to wait (e.g. I/O); Blocked → Ready when the I/O occurs; Running → Ready when pre-empted (e.g. timer); Running → Terminated on exit.
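The state diagram above can be sketched as a small transition table. This is an illustration only: the state and event names are assumptions chosen to match the slide, not any real OS API.

```python
from enum import Enum, auto

class State(Enum):
    NEW = auto()
    READY = auto()
    RUNNING = auto()
    BLOCKED = auto()
    TERMINATED = auto()

# Legal transitions from the slide's diagram, keyed by (state, event).
TRANSITIONS = {
    (State.NEW, "admit"): State.READY,
    (State.READY, "dispatch"): State.RUNNING,      # scheduler picks it
    (State.RUNNING, "needs_wait"): State.BLOCKED,  # e.g. starts I/O
    (State.BLOCKED, "io_occurs"): State.READY,
    (State.RUNNING, "preempt"): State.READY,       # e.g. timer interrupt
    (State.RUNNING, "exit"): State.TERMINATED,
}

def step(state, event):
    """Apply one event; raise KeyError if the transition is not in the diagram."""
    return TRANSITIONS[(state, event)]
```

Anything not in the table (e.g. Blocked → Running directly) raises, matching the diagram's rule that a blocked process must become Ready before it can be dispatched.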

6 Revision – Process Control Block
–Process ID
–Process State
–PC
–Stack Pointer
–General Registers
–Memory Management Info
–Open File List, with positions
–Network Connections
–CPU time used
–Parent Process ID

7 Revision: CPU Switch
[Diagram: Process P0 runs; the Operating System saves state into PCB0 and loads state from PCB1; Process P1 runs; later the OS saves state into PCB1 and loads state from PCB0, and P0 resumes.]
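A minimal sketch of the save/load sequence above, using plain dictionaries as stand-ins for the CPU context and the PCBs. The field names follow the previous slide; the register values below are invented for illustration.

```python
def save_state(cpu, pcb):
    # Copy the CPU context of the outgoing process into its PCB.
    pcb["pc"] = cpu["pc"]
    pcb["sp"] = cpu["sp"]
    pcb["gprs"] = list(cpu["gprs"])

def load_state(cpu, pcb):
    # Restore the CPU context of the incoming process from its PCB.
    cpu["pc"] = pcb["pc"]
    cpu["sp"] = pcb["sp"]
    cpu["gprs"] = list(pcb["gprs"])

def context_switch(cpu, pcb_out, pcb_in):
    save_state(cpu, pcb_out)   # "save state into PCB0"
    load_state(cpu, pcb_in)    # "load state from PCB1"
```

Only the volatile CPU context (PC, stack pointer, registers) moves on every switch; fields like the open-file list or CPU time used live in the PCB and need no copying.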

8 What does the CPU load on dispatch?
–Process ID
–Process State
–PC
–Stack Pointer
–General Registers
–Memory Management Info
–Open File List, with positions
–Network Connections
–CPU time used
–Parent Process ID

9 What does the CPU need to store on deschedule?
–Process ID
–Process State
–PC
–Stack Pointer
–General Registers
–Memory Management Info
–Open File List, with positions
–Network Connections
–CPU time used
–Parent Process ID

10 CPU Support for Multithreading
[Diagram: the pipeline from slide 2, extended with duplicated per-thread state – PC A / PC B, GPRs A / GPRs B, and VA Mapping A / VA Mapping B feeding a shared Address Translation unit; the Inst Cache and Data Cache are shared.]

11 How Should the OS View an Extra Hardware Thread?
–A variety of solutions
–Simplest is probably to declare it as an extra CPU
–Needs a multiprocessor-aware OS

12 CPU Support for Multithreading
[Diagram: the same duplicated-state pipeline as slide 10 – PC A / PC B, GPRs A / GPRs B, VA Mapping A / VA Mapping B with shared Address Translation, Inst Cache and Data Cache.]
Design issue: when to switch threads

13 Coarse-Grain Multithreading
Switch thread on an "expensive" operation:
–E.g. I-cache miss
–E.g. D-cache miss
Some are easier than others!

14 Switch Threads on I-cache miss

Cycle:   1    2    3        4    5    6    7
Inst a   IF   ID   EX       MEM  WB
Inst b        IF   ID       EX   MEM  WB
Inst c             IF MISS  ---- ---- ----
Inst X                      IF   ID   EX   MEM
Inst Y                           IF   ID   EX
Inst Z                                IF   ID

(When instruction c misses in the I-cache, fetch switches to the other thread: instructions X, Y, Z are issued in place of d, e, f while the miss is serviced.)

15 Performance of Coarse Grain
Assume (conservatively):
–1 GHz clock (1 ns clock tick!), 20 ns memory (= 20 clocks)
–1 I-cache miss per 100 instructions
–1 instruction per clock otherwise
Then, time to execute 100 instructions without multithreading:
–100 + 20 clock cycles
–Instructions per Clock = 100 / 120 = 0.83
With multithreading, time to execute 100 instructions:
–100 [+ 1] clock cycles (the miss is hidden; [+ 1] is the thread-switch cost)
–Instructions per Clock = 100 / 101 ≈ 0.99
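The slide's arithmetic can be checked directly. The numbers mirror the stated assumptions (20-cycle miss penalty, 1 miss per 100 instructions, 1-cycle thread switch); the constant names are mine.

```python
INSTRUCTIONS = 100
MISS_PENALTY = 20   # 20 ns memory at 1 GHz = 20 clocks
SWITCH_COST = 1     # assumed cost of switching to the other thread

# Without multithreading: the pipeline stalls for the full miss penalty.
cycles_single = INSTRUCTIONS + MISS_PENALTY
ipc_single = INSTRUCTIONS / cycles_single          # 100/120 ≈ 0.83

# With coarse-grain multithreading: the other thread runs during the
# miss, leaving only the switch overhead visible.
cycles_mt = INSTRUCTIONS + SWITCH_COST
ipc_mt = INSTRUCTIONS / cycles_mt                  # 100/101 ≈ 0.99
```

Note the hidden assumption the slide asks you to state: the other thread always has 20 cycles of independent, miss-free work available to cover the penalty.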

16 Switch Threads on D-cache miss

Cycle:   1    2    3    4        5    6    7
Inst a   IF   ID   EX   M-MISS   ---  ---  ---
Inst b        IF   ID   EX       (abort these)
Inst c             IF   ID       (abort these)
Inst d                  IF       (abort these)
Inst X                           IF   ID   EX
Inst Y                                IF   ID

(Instruction a misses in the D-cache at its MEM stage; the younger instructions b, c, d already in the pipeline are aborted and the other thread's X, Y take over.)
Performance: similar calculation (STATE ASSUMPTIONS!)
Where to restart after the memory cycle? I suggest instruction "a" – why?
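The "similar calculation" the slide invites might look like this. The miss rate and the number of aborted instructions are illustrative assumptions that must be stated, exactly as the slide warns; they are not given in the slides.

```python
INSTRUCTIONS = 100
MISS_PENALTY = 20    # clocks, as on the previous slide
DCACHE_MISSES = 1    # assumed: 1 D-cache miss per 100 instructions
ABORTED = 3          # assumed: b, c, d squashed and later re-issued

# Without multithreading: stall for the full penalty on each miss.
ipc_single = INSTRUCTIONS / (INSTRUCTIONS + DCACHE_MISSES * MISS_PENALTY)

# With multithreading: the other thread covers the miss itself, but
# restarting from instruction "a" means the aborted instructions must
# be fetched and executed again, so their cycles are the cost we pay.
ipc_mt = INSTRUCTIONS / (INSTRUCTIONS + DCACHE_MISSES * ABORTED)
```

Under these assumptions the multithreaded IPC is 100/103 ≈ 0.97, slightly worse than the I-cache case because a D-cache miss is detected late (in MEM), after younger instructions have already entered the pipeline.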
