A quick review…
Pipelining: Divide the datapath into nearly equal tasks that are performed serially and require non-overlapping resources.
In the ideal case, one instruction completes per clock period once the pipeline is full.
Throughput = number of instructions completed per clock period = 1
CPI = 1

Instruction level parallelism (ILP)
Pipelining exploits the potential parallelism among instructions. This parallelism is called instruction-level parallelism (ILP).
Two methods to increase ILP:
a) Increase the depth of the pipeline.
b) Replicate the internal components of the computer so that it can launch multiple instructions in every pipeline stage.
Consider the automobile assembly line example taught in class.

* Figures are borrowed from Dr. Vishwani Agrawal’s slides for the sake of continuity.

Automobile Assembly Line
Task 1 (Mechanical): 1 hour
Task 2 (Electrical): 1 hour
Task 3 (Painting): 1 hour
Task 4 (Testing): 1 hour
First car assembled in 4 hours (pipeline latency), thereafter 1 car per hour.
21 cars on the first day, thereafter 24 cars per day.
717 cars per month.
8,637 cars per year.

Longer Assembly Line
Task 1, Mechanical (Chassis): 1/2 hour
Task 2, Mechanical (Seats): 1/2 hour
Task 3, Electrical (Lighting): 1/2 hour
Task 4, Electrical (Battery): 1/2 hour
Task 5, Painting (Body): 1/2 hour
Task 6, Painting (Rust proof): 1/2 hour
Task 7, Testing (Electrical): 1/2 hour
Task 8, Testing (Mechanical): 1/2 hour
First car assembled in 4 hours (pipeline latency), thereafter 1 car per half hour.
41 cars on the first day, thereafter 48 cars per day.
1,433 cars per month.
17,273 cars per year.

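The production figures on the two assembly-line slides can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function name `cars_built` is hypothetical, and the 30-day month and 360-day year are assumptions inferred from the slide totals, not stated in the slides.

```python
def cars_built(hours, stages, stage_hours):
    """Cars finished after `hours` on one ideal assembly line.

    Assumes every stage takes `stage_hours`, the line never stalls,
    and a new car enters the line every `stage_hours`.
    """
    latency = stages * stage_hours  # time to finish the first car
    if hours < latency:
        return 0
    # One car at `latency`, then one more for every further stage time.
    return 1 + int((hours - latency) / stage_hours)


DAY, MONTH, YEAR = 24, 30 * 24, 360 * 24  # hours; month/year lengths are assumed

for label, stages, stage_hours in [("4 stages x 1 hour", 4, 1.0),
                                   ("8 stages x 1/2 hour", 8, 0.5)]:
    counts = [cars_built(t, stages, stage_hours) for t in (DAY, MONTH, YEAR)]
    print(label, counts)
```

Both lines have the same 4-hour latency; halving the stage time doubles the steady-state rate, which is why the deeper line roughly doubles every total.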
Multiple Assembly Line
Two identical lines run in parallel, each with:
Task 1 (Mechanical): 1 hour
Task 2 (Electrical): 1 hour
Task 3 (Painting): 1 hour
Task 4 (Testing): 1 hour
Two cars are assembled in 4 hours (pipeline latency), thereafter 2 cars per hour.
42 cars on the first day, thereafter 48 cars per day.
1,434 cars per month.
17,274 cars per year.

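Throughput: Multiple Assembly Line
[Figure: timing diagram of two parallel lines with stages Mechanical, Electrical, Painting, Testing plotted against time (T). Cars 1 and 2 complete together, then cars 3 and 4 complete together.]
Throughput = m(1 - (n - 1)/T) cars per unit time
Throughput → m as T → ∞
Here m is the number of parallel lines, n is the number of stages per line, and T is the elapsed time in units of one stage time.
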
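As a quick numerical check of the throughput formula, the sketch below evaluates m(1 - (n - 1)/T) for growing T. The function name `throughput` is hypothetical, and the reading of n as the number of stages and T as elapsed stage times is an assumption taken from the assembly-line example (two lines, four one-hour stages).

```python
def throughput(m, n, T):
    """Average cars per time unit over T time units: m * (1 - (n - 1) / T).

    m: parallel lines, n: stages per line, T: elapsed time in stage times.
    Before the first cars come off the line (T < n) nothing is finished.
    """
    if T < n:
        return 0.0
    return m * (1 - (n - 1) / T)


for T in (4, 24, 720, 8640):
    rate = throughput(m=2, n=4, T=T)
    print(f"T = {T:5d} h  throughput = {rate:.4f} cars/h  total = {rate * T:.0f}")
```

With m = 2 and n = 4 the rate starts at 0.5 cars per hour at T = 4 and approaches 2 as T grows, matching the limit stated on the slide.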
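Comparison
Throughput of single cycle: 1/n
Throughput of pipelining: 1
Throughput of superscalars: m
Throughput is measured in instructions per time unit, where the time unit is the pipeline clock period.
n: number of time units (clock periods) taken by one single-cycle instruction, i.e. the number of pipeline stages.
m: number of parallel datapaths.
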
Comparison
Single cycle: CPI = 1
Multi-cycle: CPI > 1
Pipelining: CPI = 1
Multiple-issue pipelines: CPI < 1
– Sometimes instructions per cycle (IPC = 1/CPI) is used instead.
– Today’s high-end processors attempt to issue 3 to 8 instructions in every clock cycle.

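To see how a multiple-issue pipeline drives CPI below 1, the following sketch counts clock cycles for an ideal pipeline. It assumes no hazards or stalls and that every cycle issues a full group of instructions; the function names and the 5-stage, 1000-instruction example are illustrative choices, not taken from the slides.

```python
import math


def pipeline_cycles(num_instr, stages, issue_width=1):
    """Clock cycles for an ideal pipeline that issues `issue_width` per cycle."""
    if num_instr == 0:
        return 0
    groups = math.ceil(num_instr / issue_width)  # issue groups entering the pipe
    return stages - 1 + groups                   # fill time plus one cycle per group


def effective_cpi(num_instr, stages, issue_width=1):
    """Average cycles per instruction, including pipeline fill."""
    return pipeline_cycles(num_instr, stages, issue_width) / num_instr


N, STAGES = 1000, 5  # assumed workload size and pipeline depth
for width in (1, 3, 4, 8):
    cpi = effective_cpi(N, STAGES, width)
    print(f"issue width {width}: CPI = {cpi:.3f}, IPC = {1 / cpi:.3f}")
```

For long instruction streams the fill time becomes negligible, so CPI tends to 1/m and IPC tends to m for an m-wide machine. Real processors fall short of this bound because of the hazards discussed next.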
Issues
1. How does the processor determine how many instructions, and which instructions, to execute in parallel?
   ex:
   – lw $t0, 1200($t1)
     sw $t2, 1200($t1)
2. Dealing with data and control hazards.

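One way to picture the first issue is a check that decides whether two instructions may issue in the same cycle. The sketch below is a simplified illustration, not how any particular processor does it: the `Instr` record and `conflicts` function are hypothetical, and memory addresses are compared only syntactically (same offset and base register), so aliasing through different registers is not detected.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    op: str
    writes: frozenset                        # registers written
    reads: frozenset                         # registers read
    mem: Optional[Tuple[int, str]] = None    # (offset, base register) for lw/sw


def conflicts(first, second):
    """Reasons why `second` should not issue in the same cycle as `first`."""
    reasons = []
    if first.writes & second.reads:
        reasons.append("RAW on registers")
    if first.writes & second.writes:
        reasons.append("WAW on registers")
    if first.reads & second.writes:
        reasons.append("WAR on registers")
    if first.mem and second.mem:
        both_loads = first.op == "lw" and second.op == "lw"
        if first.mem == second.mem and not both_loads:
            reasons.append("same memory address, at least one store")
    return reasons


# The pair from the slide: no register dependence, but the same address.
lw = Instr("lw", frozenset({"$t0"}), frozenset({"$t1"}), (1200, "$t1"))
sw = Instr("sw", frozenset(), frozenset({"$t2", "$t1"}), (1200, "$t1"))
print(conflicts(lw, sw))
```

The registers of the two instructions do not conflict, yet the load must still observe memory before the store changes it, so the pair cannot be freely reordered or overlapped.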
Definitions
Multiple issue:
– A scheme whereby multiple instructions are launched in one clock cycle.
Static multiple issue:
– An approach to implementing a multiple-issue processor where many decisions are made by the compiler before execution.
Dynamic multiple issue:
– An approach to implementing a multiple-issue processor where many decisions are made during execution by the processor. Such processors are also called superscalars.
Although these are considered two distinct approaches, in reality techniques from one approach are often borrowed by the other.

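The kind of decision a compiler makes under static multiple issue can be sketched as grouping instructions into issue packets ahead of time. The example below is a minimal greedy, in-order packer under simplifying assumptions: only register dependences are checked, the issue width is 2, and the function name `pack_issue_slots` and the four-instruction program are hypothetical. A dynamic (superscalar) design would make a comparable check in hardware at run time.

```python
def pack_issue_slots(instrs, width=2):
    """Greedily pack (text, writes, reads) tuples, in order, into issue packets.

    A packet is closed when it is full or when the next instruction has a
    register dependence (RAW, WAW or WAR) on an instruction already in it.
    """
    packets, current = [], []
    for text, writes, reads in instrs:
        depends = any(
            (w & reads) or (w & writes) or (r & writes)
            for _, w, r in current
        )
        if current and (depends or len(current) == width):
            packets.append([t for t, _, _ in current])
            current = []
        current.append((text, writes, reads))
    if current:
        packets.append([t for t, _, _ in current])
    return packets


program = [
    ("add $t0, $t1, $t2", {"$t0"}, {"$t1", "$t2"}),
    ("sub $t3, $t4, $t5", {"$t3"}, {"$t4", "$t5"}),
    ("and $t6, $t0, $t3", {"$t6"}, {"$t0", "$t3"}),
    ("or  $t7, $t6, $t1", {"$t7"}, {"$t6", "$t1"}),
]
for cycle, packet in enumerate(pack_issue_slots(program), start=1):
    print(cycle, packet)
```

Here the independent `add` and `sub` share a packet, while `and` and `or` each wait for an earlier result, so the four instructions issue in three cycles instead of four.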