Lecture 14: Processors CS 2011 Fall 2014, Dr. Rozier.


1 Lecture 14: Processors CS 2011 Fall 2014, Dr. Rozier

2 BOMB LAB STATUS

3 MP2

4 Lab Phases: Recursive
  Phase 1 – Factorial
  Phase 2 – Fibonacci

5 Lab Phases: Arrays
  Phase 4 – Sum Array
  Phase 5 – Find Item
  Phase 6 – Bubble Sort

6 Lab Phases: Trees
  Array representation: [1,2,3,4,5,6,7,0,0,0,0,0,0,0,0]
  Phase 7 – Tree Height
  Phase 8 – Tree Traversal
  [1,2,5,0,0,4,0,0,3,6,0,0,7,0,0]
  (Figure: tree with nodes 1 through 7)
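
As a rough illustration of the array representation, the sketch below computes a tree height in Python rather than MIPS. It assumes a heap-style layout (children of index i at 2i+1 and 2i+2) with 0 marking an empty slot; the slide does not state the layout, so treat both as assumptions. The name `tree_height` is illustrative.

```python
def tree_height(nodes, i=0):
    """Height in levels of the subtree rooted at index i.

    Assumed layout (not stated on the slide): children of index i live at
    2*i + 1 and 2*i + 2, and the value 0 marks an empty slot.
    """
    if i >= len(nodes) or nodes[i] == 0:
        return 0
    left = tree_height(nodes, 2 * i + 1)
    right = tree_height(nodes, 2 * i + 2)
    return 1 + max(left, right)


tree = [1, 2, 3, 4, 5, 6, 7, 0, 0, 0, 0, 0, 0, 0, 0]
print(tree_height(tree))  # 3 under the assumed layout
```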

7 PROCESSORS

8 What needs to be done to “Process” an Instruction?
  Check the PC
  Fetch the instruction from memory
  Decode the instruction and set control lines appropriately
  Execute the instruction
    – Use ALU
    – Access Memory
    – Branch
  Store Results
  PC = PC + 4, or PC = branch target
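
The steps on this slide can be sketched as a small interpreter loop. This is only a toy model: instructions are hypothetical Python tuples rather than real MIPS encodings, the program is indexed by PC / 4, and only two operations (`add` and `beq`) are modeled.

```python
def run(program, registers, max_steps=100):
    """Toy fetch/decode/execute loop (hypothetical tuple-encoded instructions)."""
    pc = 0
    for _ in range(max_steps):
        index = pc // 4                      # check the PC
        if index >= len(program):
            break                            # ran off the end of the program
        op, a, b, c = program[index]         # fetch and decode
        next_pc = pc + 4                     # default: PC = PC + 4
        if op == "add":                      # execute with the ALU, store result
            registers[a] = registers[b] + registers[c]
        elif op == "beq":                    # branch: PC = branch target
            if registers[a] == registers[b]:
                next_pc = c
        else:
            raise ValueError(f"unknown op: {op}")
        pc = next_pc
    return registers, pc


program = [
    ("add", "t0", "t1", "t2"),   # address 0
    ("beq", "t0", "t0", 12),     # address 4: always taken, skips address 8
    ("add", "t0", "t0", "t0"),   # address 8
    ("add", "t3", "t0", "t1"),   # address 12
]
regs, final_pc = run(program, {"t0": 0, "t1": 2, "t2": 3, "t3": 0})
print(regs["t0"], regs["t3"], final_pc)  # 5 7 16
```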

9 CPU Overview

10 Can’t just join wires together
  Use multiplexers
  (Chapter 4, The Processor)

11 CPU + Control

12 Logic Design Basics
  Information encoded in binary
    – Low voltage = 0, High voltage = 1
    – One wire per bit
    – Multi-bit data encoded on multi-wire buses
  Combinational element
    – Operate on data
    – Output is a function of input
  State (sequential) elements
    – Store information

13 Combinational Elements
  AND-gate: Y = A & B
  Multiplexer: Y = S ? I1 : I0
  Adder: Y = A + B
  Arithmetic/Logic Unit: Y = F(A, B)
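
Because a combinational element's output depends only on its current inputs, each one can be modeled as a pure function. The sketch below assumes 32-bit buses, and the ALU function codes ("and", "or", "add", "sub") are hypothetical names chosen for illustration, not the control encoding of any real design.

```python
MASK32 = 0xFFFFFFFF  # assumed bus width: 32 bits


def and_gate(a, b):
    return a & b                      # Y = A & B


def mux(s, i0, i1):
    return i1 if s else i0            # Y = S ? I1 : I0


def adder(a, b):
    return (a + b) & MASK32           # Y = A + B, wrapped to 32 bits


def alu(f, a, b):
    """Y = F(A, B); the function codes here are illustrative only."""
    if f == "and":
        return a & b
    if f == "or":
        return a | b
    if f == "add":
        return adder(a, b)
    if f == "sub":
        return (a - b) & MASK32
    raise ValueError(f"unknown ALU function: {f}")


print(mux(0, 5, 9), mux(1, 5, 9))     # 5 9
print(hex(adder(0xFFFFFFFF, 1)))      # 0x0
print(hex(alu("sub", 3, 5)))          # 0xfffffffe
```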

14 Storing Data?

15 S-R Latch S – set R – reset Feedback keeps the bit “trapped”.

16 S-R Latch
  Characteristic Table
    S  R  Q_next  Action
    0  0  Q       hold
    0  1  0       reset
    1  0  1       set
    1  1  X       N/A
  Excitation Table
    Q  Q_next  S  R
    0  0       0  X
    0  1       1  0
    1  0       0  1
    1  1       X  0
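
The excitation table follows mechanically from the characteristic table. As a minimal sketch, the code below encodes the characteristic behavior and then collects, for every (Q, Q_next) transition, which (S, R) inputs produce it. The function names are illustrative, and the S = R = 1 case is returned as `None` to mirror the X / N/A entry.

```python
def sr_next(s, r, q):
    """Next state of an S-R latch per the characteristic table."""
    if s and r:
        return None        # S = R = 1: not a valid input (X / N/A)
    if s:
        return 1           # set
    if r:
        return 0           # reset
    return q               # hold


def excitation():
    """Map each (Q, Q_next) transition to the set of (S, R) inputs causing it."""
    table = {}
    for q in (0, 1):
        for s in (0, 1):
            for r in (0, 1):
                q_next = sr_next(s, r, q)
                if q_next is None:
                    continue
                table.setdefault((q, q_next), set()).add((s, r))
    return table


for transition, inputs in sorted(excitation().items()):
    print(transition, sorted(inputs))
```

A transition with two input pairs corresponds to a don't-care (X) in the table: for example, 0 to 0 is produced by both (0, 0) and (0, 1), so S = 0 and R = X.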

17 D Flip-Flop We can note in the S-R Latch that S is the complement of R in state changes

18 D Flip-Flop Feed D and ~D to a gated S-R Latch to create a one input synchronous SR-Latch We’ll call it a D Flip-Flop, just to be difficult.

19 D Flip-Flop
  D – input signal
  E – enable signal, sometimes called clock or control
    E/C  D  Q       ~Q       Notes
    0    X  Q_prev  ~Q_prev
    1    0  0       1
    1    1  1       0

20 D Flip-Flop
  D – input signal
  E – enable signal, sometimes called clock or control
    E/C  D  Q       ~Q       Notes
    0    X  Q_prev  ~Q_prev  No change
    1    0  0       1        Reset
    1    1  1       0        Set
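
The table can be checked against the construction described earlier: D and ~D, each gated by E, drive the S and R inputs of an S-R latch. The class below is a behavioral sketch of that idea, not a gate-level model, and the class and method names are illustrative.

```python
class GatedDLatch:
    """Behavioral model: S = E & D and R = E & ~D feed an S-R latch."""

    def __init__(self, q=0):
        self.q = q

    def update(self, e, d):
        s = e & d            # set only when enabled and D = 1
        r = e & (1 - d)      # reset only when enabled and D = 0
        if s:
            self.q = 1
        elif r:
            self.q = 0
        # s == r == 0 (E = 0): hold the previous value
        return self.q, 1 - self.q   # (Q, ~Q)


latch = GatedDLatch(q=1)
print(latch.update(0, 0))  # (1, 0)  E = 0: no change
print(latch.update(1, 0))  # (0, 1)  reset
print(latch.update(1, 1))  # (1, 0)  set
```

Because S and R are derived from D and its complement, they can never both be 1, which removes the invalid input case of the plain S-R latch.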

21 Adding the Clock

22 More Realistic

23 Register File

24 Sequential Elements
  Register: stores data in a circuit
    – Uses a clock signal to determine when to update the stored value
    – Edge-triggered: update when Clk changes from 0 to 1
  (Figure: signals D, Clk, Q)

25 Sequential Elements
  Register with write control
    – Only updates on clock edge when write control input is 1
    – Used when stored value is required later
  (Figure: signals D, Clk, Write, Q)
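
A small behavioral sketch of an edge-triggered register with write control may help here. It samples the inputs once per call and updates only when Clk goes from 0 to 1 while Write is 1. The class name and the `tick` method are illustrative assumptions, not part of the lecture.

```python
class Register:
    """Edge-triggered register with a write-control input (behavioral sketch)."""

    def __init__(self, value=0):
        self.q = value
        self._prev_clk = 0

    def tick(self, clk, d, write=1):
        rising_edge = self._prev_clk == 0 and clk == 1
        if rising_edge and write:
            self.q = d                 # capture D only on a 0 -> 1 clock edge
        self._prev_clk = clk
        return self.q


r = Register()
print(r.tick(1, 7))            # 7: rising edge, write enabled
print(r.tick(1, 9))            # 7: clock stays high, no edge
print(r.tick(0, 9))            # 7: falling edge is ignored
print(r.tick(1, 9, write=0))   # 7: rising edge, but write control is 0
```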

26 Clocking Methodology
  Combinational logic transforms data during clock cycles
    – Between clock edges
    – Input from state elements, output to state element
    – Longest delay determines clock period

27 Building a Datapath
  Datapath
    – Elements that process data and addresses in the CPU
      Registers, ALUs, mux’s, memories, …
  We will build a MIPS datapath incrementally
    – Refining the overview design

28 Pipeline
  (Figure labels: Fetch, Decode, Issue, Integer, Multiply, Floating Point, Load Store, Write Back)

29 Instruction Fetch
  32-bit register
  Increment by 4 for next instruction

30 ALU
  Read two register operands
  Perform arithmetic/logical operation
  Write register result

31 Load/Store Instructions
  Read register operands
  Calculate address
  Load: Read memory and update register
  Store: Write register value to memory
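
For the "calculate address" step, MIPS lw and sw add a sign-extended 16-bit offset to the value of a base register. The sketch below shows that arithmetic on plain integers. The function names are illustrative, and memory and the register file are left out.

```python
def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def effective_address(base, imm16):
    """Address used by lw/sw: base register value plus sign-extended offset."""
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF


print(hex(effective_address(0x1000, 8)))        # 0x1008
print(hex(effective_address(0x1000, 0xFFFC)))   # 0xffc (offset of -4)
```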

32 Branch Instructions?

33 Datapath With Control

34 ALU Instruction

35 Load Instruction

36 Branch-on-Equal Instruction

37 Performance Issues
  Longest delay determines clock period
    – Critical path: load instruction
    – Instruction memory → register file → ALU → data memory → register file
  Not feasible to vary period for different instructions
  Violates design principle
    – Making the common case fast
  We will improve performance by pipelining

38 Pipelining Analogy (§4.5 An Overview of Pipelining)
  Pipelined laundry: overlapping execution
    – Parallelism improves performance
  Four loads:
    – Speedup = 8/3.5 = 2.3
  Non-stop:
    – Speedup = 2n/(0.5n + 1.5) ≈ 4 = number of stages
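
The two speedup figures can be reproduced with a few lines of arithmetic. The sketch below assumes, as the numbers on the slide imply, four laundry stages of 0.5 hours each, so a single load takes 2 hours on its own. The constant and function names are illustrative.

```python
STAGE_HOURS = 0.5   # assumed: each laundry stage takes half an hour
NUM_STAGES = 4      # assumed: four stages per load


def sequential_hours(n):
    # Each load runs start to finish before the next begins: 2n hours.
    return n * NUM_STAGES * STAGE_HOURS


def pipelined_hours(n):
    # The first load takes all four stages; every later load finishes
    # one stage time after the previous one: 0.5n + 1.5 hours.
    return (NUM_STAGES + (n - 1)) * STAGE_HOURS


def speedup(n):
    return sequential_hours(n) / pipelined_hours(n)


print(round(speedup(4), 1))        # 2.3
print(round(speedup(10**6), 3))    # 4.0
```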

39 MIPS Pipeline
  Five stages, one step per stage
    1. IF: Instruction fetch from memory
    2. ID: Instruction decode & register read
    3. EX: Execute operation or calculate address
    4. MEM: Access memory operand
    5. WB: Write result back to register

40 Pipeline Performance
  Assume time for stages is
    – 100ps for register read or write
    – 200ps for other stages
  Compare pipelined datapath with single-cycle datapath
    Instr     Instr fetch  Register read  ALU op  Memory access  Register write  Total time
    lw        200ps        100ps          200ps   200ps          100ps           800ps
    sw        200ps        100ps          200ps   200ps                          700ps
    R-format  200ps        100ps          200ps                  100ps           600ps
    beq       200ps        100ps          200ps                                  500ps
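
The total times follow directly from the stage latencies and from which stages each instruction class uses. As a minimal sketch, the code below recomputes the totals and picks the single-cycle clock period as the longest one. The dictionary keys are illustrative names for the five steps.

```python
# Stage latencies in picoseconds, as stated on the slide.
STAGE_PS = {
    "fetch": 200,
    "reg_read": 100,
    "alu": 200,
    "mem": 200,
    "reg_write": 100,
}

# Which steps each instruction class uses.
USES = {
    "lw": ["fetch", "reg_read", "alu", "mem", "reg_write"],
    "sw": ["fetch", "reg_read", "alu", "mem"],
    "R-format": ["fetch", "reg_read", "alu", "reg_write"],
    "beq": ["fetch", "reg_read", "alu"],
}


def total_time(instr):
    return sum(STAGE_PS[step] for step in USES[instr])


totals = {instr: total_time(instr) for instr in USES}
single_cycle_period = max(totals.values())   # the slowest instruction sets the clock

print(totals)                # {'lw': 800, 'sw': 700, 'R-format': 600, 'beq': 500}
print(single_cycle_period)   # 800
```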

41 Pipeline Performance
  Single-cycle (T_c = 800ps)
  Pipelined (T_c = 200ps)

42 Pipeline Speedup
  If all stages are balanced
    – i.e., all take the same time
    – Time between instructions (pipelined) = Time between instructions (nonpipelined) / Number of stages
  If not balanced, speedup is less
  Speedup due to increased throughput
    – Latency (time for each instruction) does not decrease
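
To see why unbalanced stages give less than a fivefold speedup, the sketch below compares total execution time for n instructions using the clock periods from the preceding slide (800ps single-cycle, 200ps pipelined). It assumes an ideal five-stage pipeline with no stalls, so n instructions take (5 + n - 1) cycles. The function names are illustrative.

```python
def single_cycle_time(n, period_ps=800):
    return n * period_ps                      # one long cycle per instruction


def pipelined_time(n, stages=5, period_ps=200):
    # Assumed ideal pipeline: fill it once, then finish one instruction per cycle.
    return (stages + n - 1) * period_ps


def speedup(n):
    return single_cycle_time(n) / pipelined_time(n)


print(single_cycle_time(3), pipelined_time(3))   # 2400 1400
print(round(speedup(1_000_000), 3))              # 4.0
```

The limit is 800 / 200 = 4 rather than 5 because the register read and write stages need only 100ps but still occupy a full 200ps cycle. The latency of a single instruction rises from 800ps to 5 × 200ps = 1000ps, so the gain comes entirely from throughput.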

43 WRAP UP

44 For next time
  Homework Exercises: 3.4.2, 3.4.4, 3.10.1 – 3.10.5
  Due Tuesday 11/4
  Read Chapter 4.1-4.4

