CSCI-365 Computer Organization Lecture Note: Some slides and/or pictures in the following are adapted from: Computer Organization and Design, Patterson.


1 CSCI-365 Computer Organization Lecture Note: Some slides and/or pictures in the following are adapted from: Computer Organization and Design, Patterson & Hennessy, ©2005 25

2 Recap: Single Cycle Datapath
Uses the clock cycle inefficiently
–the clock cycle must be timed to accommodate the slowest instruction
–especially problematic for more complex instructions like floating point multiply
[Figure: clock timing diagram with lw in Cycle 1 and sw in Cycle 2; the unused part of the cycle is marked "Waste"]

3 Recap: Multicycle Datapath
[Figure: multicycle datapath. Components: PC, Memory (instruction or data), IR, MDR, Register File, A, B, ALU, ALUout, Sign Extend, Shift left 2, ALU control, Control. Control signals: IRWrite, MemtoReg, MemWrite, MemRead, IorD, PCWrite, PCWriteCond, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALUOp, PCSource. Steps: IF, ID, EX, MEM, WB]

4 Single Cycle vs. Multiple Cycle
[Figure: timing diagrams for lw, sw, and an R-type instruction.
Single Cycle Implementation: lw in Cycle 1, sw in Cycle 2, with the unused part of the cycle marked "Waste".
Multiple Cycle Implementation: lw in Cycles 1-5 (IF ID EX MEM WB), sw in Cycles 6-9 (IF ID EX MEM), R-type starting with IF in Cycle 10]

5 Gotta Do Laundry
Michael, Conan, Jimmy, Pat each have one load of clothes to wash, dry, fold, and put away
–Washer takes 30 minutes
–Dryer takes 30 minutes
–“Folder” takes 30 minutes
–“Stasher” takes 30 minutes to put clothes into drawers

6 Sequential Laundry
Sequential laundry takes 8 hours for 4 loads
[Figure: task order M, C, J, P against time from 6 PM to 2 AM in 30-minute steps; each load starts only after the previous one is put away]

7 Pipelined Laundry
Pipelined laundry takes 3.5 hours for 4 loads!
[Figure: task order M, C, J, P against time from 6 PM in 30-minute steps; the loads overlap, each one stage behind the previous]

8 General Definitions
Latency: time to completely execute a certain task
–E.g., time to read a sector from disk is disk access time or disk latency
Throughput: amount of work that can be done over a period of time

9 Pipelining Lessons
Pipelining doesn’t help the latency of a single task; it helps the throughput of the entire workload
Multiple tasks operate simultaneously using different resources
Potential speedup = number of pipe stages
Time to “fill” the pipeline and time to “drain” it reduces speedup: 2.3X vs. 4X in this example
[Figure: pipelined laundry, task order M, C, J, P against a time axis starting at 6 PM in 30-minute steps]
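The 2.3X vs. 4X figures can be checked with a short calculation. This is a minimal sketch, assuming four balanced 30-minute stages as in the laundry example; the function names are made up for illustration.

```python
def sequential_minutes(loads, stages, stage_minutes):
    # Each load runs through every stage before the next load starts.
    return loads * stages * stage_minutes


def pipelined_minutes(loads, stages, stage_minutes):
    # The first load needs `stages` steps to fill the pipeline;
    # every later load finishes one step after the one before it.
    return (stages + loads - 1) * stage_minutes


def speedup(loads, stages=4, stage_minutes=30):
    return (sequential_minutes(loads, stages, stage_minutes)
            / pipelined_minutes(loads, stages, stage_minutes))


for loads in (4, 40, 400):
    print(loads, "loads: speedup", round(speedup(loads), 2))
```

With 4 loads the result is 480 / 210, about 2.3X. As the number of loads grows, the fill and drain time matters less and the speedup approaches the number of stages (4X).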

10 Suppose the new Washer takes 20 minutes and the new Stasher takes 20 minutes. How much faster is the pipeline?
–Pipeline rate is limited by the slowest pipeline stage
–Unbalanced lengths of pipe stages reduce speedup
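One way to explore the question is to simulate the loads moving through the four machines. The sketch below is an assumption-laden model, not the in-class answer: it assumes each machine handles one load at a time, a load may wait between machines, and every step starts as early as possible. The function name is hypothetical.

```python
def pipelined_finish_minutes(stage_minutes, loads):
    """Minutes until the last load leaves the last stage."""
    free_at = [0] * len(stage_minutes)  # when each machine is next free
    finish = 0
    for _ in range(loads):
        ready = 0  # when this load is ready for its next stage
        for s, minutes in enumerate(stage_minutes):
            start = max(ready, free_at[s])  # wait for the load and the machine
            ready = start + minutes
            free_at[s] = ready
        finish = ready
    return finish


old = [30, 30, 30, 30]   # washer, dryer, folder, stasher
new = [20, 30, 30, 20]   # faster washer and stasher

print(pipelined_finish_minutes(old, 4))  # 210
print(pipelined_finish_minutes(new, 4))  # 190
```

Under these assumptions the four loads finish only 20 minutes sooner, and once the pipeline is full a load still completes every 30 minutes, because the dryer and folder set the rate. In a strictly clocked pipeline, where every stage is given the same 30-minute slot, there would be no improvement at all.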

11 A Pipelined MIPS Processor
Start the next instruction before the current one has completed
–improves throughput
–instruction latency is not reduced
–clock cycle (pipeline stage time) limited by slowest stage
–for some instructions, some stages are wasted cycles
[Figure: lw, sw, and R-type each pass through IF ID EX MEM WB in overlapping cycles, on a Cycle 1 to Cycle 8 axis]
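The overlap can be drawn as a stage-by-cycle table. This is a small illustrative sketch, assuming an ideal five-stage pipeline with one instruction issued per cycle and no stalls; the helper names are made up.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(instructions):
    """Return (name, cells) rows; cells[c] is the stage occupied in cycle c+1."""
    total_cycles = len(STAGES) + len(instructions) - 1
    rows = []
    for i, name in enumerate(instructions):
        cells = [""] * total_cycles
        for s, stage in enumerate(STAGES):
            cells[i + s] = stage  # instruction i enters IF in cycle i+1
        rows.append((name, cells))
    return rows


def format_diagram(rows):
    header = " " * 8 + "".join(f"C{c + 1:<4}" for c in range(len(rows[0][1])))
    lines = [header]
    for name, cells in rows:
        lines.append(f"{name:<8}" + "".join(f"{cell:<5}" for cell in cells))
    return "\n".join(lines)


print(format_diagram(pipeline_diagram(["lw", "sw", "R-type"])))
```

Every instruction occupies all five stages here, including the ones it does not need (for example, sw has nothing to do in WB), which is the "wasted cycles" point above.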

12 Single Cycle vs. Multiple Cycle vs. Pipelined
[Figure: timing diagrams for lw, sw, and an R-type instruction.
Single Cycle Implementation: lw in Cycle 1, sw in Cycle 2, with the unused part of the cycle marked "Waste".
Multiple Cycle Implementation: lw in Cycles 1-5 (IF ID EX MEM WB), sw in Cycles 6-9 (IF ID EX MEM), R-type starting with IF in Cycle 10.
Pipeline Implementation: lw, sw, and R-type each pass through IF ID EX MEM WB, overlapped]

13 Single Cycle vs. Pipelined
Example: Compare the average time between lw instructions of a single cycle implementation to a pipelined implementation. Assume the following operation times for the major functional units
–200 ps for memory access
–200 ps for ALU operation
–100 ps for register file read or write
(DONE IN CLASS)
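The in-class solution is not part of this transcript. The following is a sketch of one way to work the example, assuming lw uses all five steps and that each step maps to one functional unit as listed in LW_STEPS; that mapping and all names are assumptions, not taken from the slides.

```python
UNIT_PS = {"memory": 200, "alu": 200, "regfile": 100}

# Assumed mapping of lw's five steps to functional units.
LW_STEPS = [
    ("IF", "memory"),    # instruction fetch
    ("ID", "regfile"),   # register read
    ("EX", "alu"),       # address calculation
    ("MEM", "memory"),   # data memory access
    ("WB", "regfile"),   # register write
]


def single_cycle_clock_ps():
    # One long cycle must cover every step of the instruction.
    return sum(UNIT_PS[unit] for _, unit in LW_STEPS)


def pipelined_clock_ps():
    # Every stage gets the same cycle, set by the slowest stage.
    return max(UNIT_PS[unit] for _, unit in LW_STEPS)


print("single cycle:", single_cycle_clock_ps(), "ps between lw instructions")
print("pipelined:   ", pipelined_clock_ps(), "ps between lw instructions")
```

Under these assumptions a new lw completes every 800 ps in the single cycle design and every 200 ps in the pipelined design once the pipeline is full. The improvement is 4X rather than 5X because the stages are not balanced: the 100 ps register file stages are stretched to the 200 ps clock.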

14 Simplified MIPS Pipelined Datapath
Can you foresee any problems with these right-to-left flows?
Why are we duplicating some functional units?

15 Pipeline registers
Need registers between stages
–To hold information produced in previous cycle

16 IF

17 ID

18 EX for Load

19 MEM for Load

20 WB for Load
There is a BUG here: wrong register number

21 Corrected Datapath for Load
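The "wrong register number" bug and its fix can be illustrated with a toy model. This sketch assumes the usual textbook explanation: in the buggy datapath the register file's write register number comes from the instruction currently in ID, while the corrected datapath carries the load's own register number along through the pipeline registers to WB. The instruction stream and function name are hypothetical.

```python
# Hypothetical instruction stream: (mnemonic, destination register number).
PROGRAM = [("lw", 10), ("sub", 11), ("add", 12), ("lw", 13), ("add", 14)]

# An instruction in WB is three stages ahead of the one in ID (ID, EX, MEM, WB).
ID_TO_WB_DISTANCE = 3


def write_register_at_wb(program, k, corrected):
    """Register number presented to the register file while instruction k is in WB."""
    if corrected:
        # Corrected: the number travelled with instruction k to the last stage.
        return program[k][1]
    # Buggy: the number is taken from whatever instruction is being decoded now.
    in_decode = k + ID_TO_WB_DISTANCE
    if in_decode < len(program):
        return program[in_decode][1]
    return None  # no instruction in ID in this toy model


print("buggy:    ", write_register_at_wb(PROGRAM, 0, corrected=False))
print("corrected:", write_register_at_wb(PROGRAM, 0, corrected=True))
```

In the buggy case the first lw would write its loaded value into register 13, the destination of an instruction three places later, instead of register 10.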

