Lecture 14: Processors CS 2011 Fall 2014, Dr. Rozier.

BOMB LAB STATUS

MP2

Lab Phases: Recursive Phase 1 – Factorial Phase 2 – Fibonacci

Lab Phases: Arrays Phase 4 – Sum Array Phase 5 – Find Item Phase 6 – Bubble Sort

Lab Phases: Trees Array representation: [1,2,3,4,5,6,7,0,0,0,0,0,0,0,0] Phase 7 – Tree Height Phase 8 – Tree Traversal [1,2,5,0,0,4,0,0,3,6,0,0,7,0,0]
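For Phase 7, here is a minimal sketch (not the official lab solution) of how the height could be computed from the level-order array shown above, assuming index 0 is the root, the children of index i sit at 2i+1 and 2i+2, and 0 marks an empty slot:

```c
/* Hypothetical helper for Phase 7 "Tree Height"; layout assumptions as above. */
#include <stdio.h>

/* Height counted in nodes along the longest root-to-leaf path;
 * an empty subtree has height 0. */
static int tree_height(const int *tree, int len, int i)
{
    if (i >= len || tree[i] == 0)          /* out of range or empty slot */
        return 0;

    int left  = tree_height(tree, len, 2 * i + 1);
    int right = tree_height(tree, len, 2 * i + 2);
    return 1 + (left > right ? left : right);
}

int main(void)
{
    int tree[] = {1, 2, 3, 4, 5, 6, 7, 0, 0, 0, 0, 0, 0, 0, 0};
    int len = sizeof tree / sizeof tree[0];

    printf("height = %d\n", tree_height(tree, len, 0));   /* prints 3 */
    return 0;
}
```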

PROCESSORS

What needs to be done to “Process” an Instruction?
– Check the PC
– Fetch the instruction from memory
– Decode the instruction and set control lines appropriately
– Execute the instruction: use the ALU, access memory, or branch
– Store results
– PC = PC + 4, or PC = branch target
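As a rough illustration only, the same steps can be written as a software loop. The tiny three-instruction encoding below is invented for this sketch and is not the MIPS format used later in the lecture:

```c
/* A minimal fetch / decode / execute / write-back / PC-update loop.
 * The "ISA" (ADD, LOAD, BEQZ, HALT) and its field layout are hypothetical. */
#include <stdint.h>
#include <stdio.h>

enum { OP_ADD = 0, OP_LOAD = 1, OP_BEQZ = 2, OP_HALT = 3 };

int main(void)
{
    uint32_t imem[4] = {
        /* op in bits 31-24, rd 23-16, rs 15-8, imm/rt 7-0 (hypothetical) */
        (OP_ADD  << 24) | (1 << 16) | (1 << 8) | 2,   /* r1 = r1 + r2    */
        (OP_LOAD << 24) | (3 << 16) | (0 << 8) | 4,   /* r3 = mem[r0+4]  */
        (OP_BEQZ << 24) | (0 << 16) | (3 << 8) | 0,   /* if r3==0 goto 0 */
        (OP_HALT << 24),
    };
    uint32_t dmem[16] = { [4] = 42 };
    uint32_t reg[8]   = { 0, 5, 7 };
    uint32_t pc = 0;

    for (;;) {
        uint32_t inst = imem[pc / 4];                 /* 1. check PC, fetch      */
        uint32_t op = inst >> 24, rd = (inst >> 16) & 0xff,
                 rs = (inst >> 8) & 0xff, imm = inst & 0xff;   /* 2. decode       */
        uint32_t next_pc = pc + 4;                    /* default next PC         */

        switch (op) {                                 /* 3. execute, 4. store    */
        case OP_ADD:  reg[rd] = reg[rs] + reg[imm];        break;
        case OP_LOAD: reg[rd] = dmem[reg[rs] + imm];       break;
        case OP_BEQZ: if (reg[rs] == 0) next_pc = imm;     break;
        case OP_HALT:
            printf("r1=%u r3=%u\n", (unsigned)reg[1], (unsigned)reg[3]);
            return 0;
        }
        pc = next_pc;                                 /* 5. PC+4 or branch target */
    }
}
```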

CPU Overview

Can’t just join wires together; use multiplexers

CPU + Control

Logic Design Basics
Information encoded in binary
– Low voltage = 0, high voltage = 1
– One wire per bit
– Multi-bit data encoded on multi-wire buses
Combinational elements
– Operate on data
– Output is a function of the inputs
State (sequential) elements
– Store information

Combinational Elements
– AND gate: Y = A & B
– Multiplexer: Y = S ? I1 : I0
– Adder: Y = A + B
– Arithmetic/Logic Unit: Y = F(A, B)
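A minimal sketch of these elements as pure C functions, to stress that a combinational output depends only on the current inputs; the ALU operation codes here are an assumption for illustration, not the MIPS ALU control encoding:

```c
#include <stdint.h>
#include <stdio.h>

static uint32_t and_gate(uint32_t a, uint32_t b) { return a & b; }

static uint32_t mux2(uint32_t i0, uint32_t i1, int s) { return s ? i1 : i0; }

static uint32_t adder(uint32_t a, uint32_t b) { return a + b; }

/* Y = F(A, B): F selects the operation (codes are made up for the sketch). */
enum alu_op { ALU_AND, ALU_OR, ALU_ADD, ALU_SUB };

static uint32_t alu(uint32_t a, uint32_t b, enum alu_op f)
{
    switch (f) {
    case ALU_AND: return a & b;
    case ALU_OR:  return a | b;
    case ALU_ADD: return a + b;
    case ALU_SUB: return a - b;
    }
    return 0;
}

int main(void)
{
    printf("AND  : %u\n", (unsigned)and_gate(0xC, 0xA));   /* 8  */
    printf("MUX  : %u\n", (unsigned)mux2(10, 20, 1));      /* 20 */
    printf("ADDER: %u\n", (unsigned)adder(3, 4));          /* 7  */
    printf("ALU  : %u\n", (unsigned)alu(7, 5, ALU_SUB));   /* 2  */
    return 0;
}
```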

Storing Data?

S-R Latch
S – set, R – reset
Feedback keeps the bit “trapped”.

S-R Latch
Characteristic table:
S R | Q_next | Action
0 0 | Q      | hold
0 1 | 0      | reset
1 0 | 1      | set
1 1 | X      | N/A (not allowed)

Excitation table:
Q Q_next | S R
0 0      | 0 X
0 1      | 1 0
1 0      | 0 1
1 1      | X 0
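A small simulation sketch of a NOR-based S-R latch, with the cross-coupled feedback modeled by re-evaluating both gates until the outputs settle; the printed rows should match the characteristic table above (the disallowed S = R = 1 input is skipped):

```c
#include <stdio.h>

static void sr_latch(int s, int r, int *q, int *qbar)
{
    for (;;) {                           /* settle the feedback loop */
        int q_new    = !(r | *qbar);     /* NOR gate driving Q       */
        int qbar_new = !(s | *q);        /* NOR gate driving ~Q      */
        if (q_new == *q && qbar_new == *qbar)
            break;
        *q = q_new;
        *qbar = qbar_new;
    }
}

int main(void)
{
    int q = 0, qbar = 1;                 /* start in the reset state */

    int inputs[][2] = { {1,0}, {0,0}, {0,1}, {0,0} };  /* set, hold, reset, hold */
    for (int i = 0; i < 4; i++) {
        sr_latch(inputs[i][0], inputs[i][1], &q, &qbar);
        printf("S=%d R=%d -> Q=%d ~Q=%d\n", inputs[i][0], inputs[i][1], q, qbar);
    }
    return 0;
}
```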

D Flip-Flop Note that in the S-R latch, S is the complement of R whenever the state changes.

D Flip-Flop Feed D and ~D into a gated S-R latch to create a one-input synchronous latch. We’ll call it a D flip-flop, just to be difficult.

D Flip-Flop
D – input signal
E – enable signal, sometimes called clock or control

E/C D | Q      ~Q
0   X | Q_prev ~Q_prev

D Flip-Flop
D – input signal
E – enable signal, sometimes called clock or control

E/C D | Q      ~Q      | Notes
0   X | Q_prev ~Q_prev | No change
1   0 | 0      1       | Reset
1   1 | 1      0       | Set
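The same behavior table, sketched as a tiny C model of the gated D latch, assuming the enable E acts as described above (transparent when E = 1, holding when E = 0):

```c
#include <stdio.h>

struct d_latch { int q; };

static void d_latch_tick(struct d_latch *l, int e, int d)
{
    if (e)              /* E=1: Q follows D (set if D=1, reset if D=0) */
        l->q = d;
    /* E=0: no change, Q keeps its previous value */
}

int main(void)
{
    struct d_latch l = { 0 };

    d_latch_tick(&l, 1, 1); printf("E=1 D=1 -> Q=%d\n", l.q);  /* set: Q=1   */
    d_latch_tick(&l, 0, 0); printf("E=0 D=0 -> Q=%d\n", l.q);  /* hold: Q=1  */
    d_latch_tick(&l, 1, 0); printf("E=1 D=0 -> Q=%d\n", l.q);  /* reset: Q=0 */
    return 0;
}
```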

Adding the Clock

More Realistic

Register File

Sequential Elements
Register: stores data in a circuit
– Uses a clock signal to determine when to update the stored value
– Edge-triggered: updates when Clk changes from 0 to 1

Sequential Elements
Register with write control
– Only updates on the clock edge when the write control input is 1
– Used when the stored value is required later
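A minimal C sketch of an edge-triggered register with write control, assuming a software model where the previous clock level is remembered so a 0 to 1 transition can be detected:

```c
#include <stdint.h>
#include <stdio.h>

struct reg32 {
    uint32_t q;        /* stored value (the register output)       */
    int      prev_clk; /* previous clock level, to detect the edge */
};

static void reg32_tick(struct reg32 *r, int clk, int write, uint32_t d)
{
    if (clk == 1 && r->prev_clk == 0 && write)  /* rising edge + write enabled */
        r->q = d;
    r->prev_clk = clk;
}

int main(void)
{
    struct reg32 r = { 0, 0 };

    reg32_tick(&r, 1, 1, 0xABCD); printf("Q = 0x%X\n", (unsigned)r.q); /* captured: ABCD */
    reg32_tick(&r, 0, 1, 0x1111);                                      /* falling edge: ignored */
    reg32_tick(&r, 1, 0, 0x1111); printf("Q = 0x%X\n", (unsigned)r.q); /* write=0: still ABCD   */
    return 0;
}
```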

Clocking Methodology
Combinational logic transforms data during clock cycles
– Between clock edges
– Input from state elements, output to state elements
– Longest delay determines the clock period

Building a Datapath
Datapath – the elements that process data and addresses in the CPU: registers, ALUs, muxes, memories, …
We will build a MIPS datapath incrementally, refining the overview design

Pipeline: Fetch, Decode, Issue, Integer, Multiply, Floating Point, Load/Store, Write Back

Instruction Fetch The PC is a 32-bit register, incremented by 4 for the next instruction

ALU Read two register operands Perform arithmetic/logical operation Write register result

Load/Store Instructions Read register operands Calculate address Load: Read memory and update register Store: Write register value to memory
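A small sketch of the lw/sw address calculation under the usual assumption that the 16-bit offset is sign-extended before being added to the base register; the toy word-addressed memory here is just for illustration:

```c
#include <stdint.h>
#include <stdio.h>

static uint32_t dmem[64];                 /* toy data memory, indexed by word */

static uint32_t eff_addr(uint32_t base, uint16_t offset16)
{
    int32_t offset = (int16_t)offset16;   /* sign-extend 16 -> 32 bits */
    return base + (uint32_t)offset;
}

int main(void)
{
    uint32_t base = 32;                   /* value held in the base register */

    /* sw: write a register value to memory at base + 8 */
    dmem[eff_addr(base, 8) / 4] = 99;

    /* lw: read memory at base + 8 back into a register */
    uint32_t r = dmem[eff_addr(base, 8) / 4];
    printf("loaded %u from address %u\n", (unsigned)r, (unsigned)eff_addr(base, 8));

    /* a negative offset also works thanks to sign extension */
    printf("base - 4 = %u\n", (unsigned)eff_addr(base, 0xFFFC));   /* prints 28 */
    return 0;
}
```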

Branch Instructions?

Datapath With Control

ALU Instruction

Load Instruction

Branch-on-Equal Instruction

Performance Issues
Longest delay determines the clock period
– Critical path: the load instruction
– Instruction memory → register file → ALU → data memory → register file
Not feasible to vary the period for different instructions
Violates the design principle of making the common case fast
We will improve performance by pipelining

Pipelining Analogy
Pipelined laundry: overlapping execution – parallelism improves performance
Four loads: speedup = 8/3.5 ≈ 2.3
Non-stop: speedup = 2n/(0.5n + 1.5) ≈ 4 = number of stages
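A quick numeric check of the laundry numbers, assuming the textbook's figures of 2 hours per load done serially and four 0.5-hour steps when pipelined:

```c
#include <stdio.h>

int main(void)
{
    int n_values[] = { 4, 100, 1000000 };

    for (int i = 0; i < 3; i++) {
        double n = n_values[i];
        double serial    = 2.0 * n;          /* hours without pipelining */
        double pipelined = 0.5 * n + 1.5;    /* hours with pipelining    */
        printf("n = %7.0f: speedup = %.2f\n", n, serial / pipelined);
    }
    /* n=4 gives 8/3.5 = 2.29; as n grows the speedup approaches 4,
       the number of stages. */
    return 0;
}
```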

MIPS Pipeline
Five stages, one step per stage:
1. IF: Instruction fetch from memory
2. ID: Instruction decode & register read
3. EX: Execute operation or calculate address
4. MEM: Access memory operand
5. WB: Write result back to register

Pipeline Performance
Assume the stage times are
– 100 ps for register read or write
– 200 ps for the other stages
Compare the pipelined datapath with the single-cycle datapath:

Instr    | Instr fetch | Register read | ALU op | Memory access | Register write | Total time
lw       | 200 ps      | 100 ps        | 200 ps | 200 ps        | 100 ps         | 800 ps
sw       | 200 ps      | 100 ps        | 200 ps | 200 ps        |                | 700 ps
R-format | 200 ps      | 100 ps        | 200 ps |               | 100 ps         | 600 ps
beq      | 200 ps      | 100 ps        | 200 ps |               |                | 500 ps
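The totals in the table can be rebuilt from the assumed stage delays; a short sketch:

```c
#include <stdio.h>

/* Stage delays from the slide: 200 ps for fetch, ALU, and memory access;
 * 100 ps for register read or write. */
enum { IF = 200, REG_READ = 100, ALU = 200, MEM = 200, REG_WRITE = 100 };

int main(void)
{
    int lw  = IF + REG_READ + ALU + MEM + REG_WRITE;  /* 800 ps */
    int sw  = IF + REG_READ + ALU + MEM;              /* 700 ps */
    int rfm = IF + REG_READ + ALU + REG_WRITE;        /* 600 ps */
    int beq = IF + REG_READ + ALU;                    /* 500 ps */

    printf("lw %dps  sw %dps  R-format %dps  beq %dps\n", lw, sw, rfm, beq);

    /* The single-cycle clock must fit the slowest instruction (lw), so
       Tc = 800 ps; a pipelined clock only needs the slowest stage, 200 ps. */
    return 0;
}
```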

Pipeline Performance
Single-cycle: Tc = 800 ps; pipelined: Tc = 200 ps

Pipeline Speedup
If all stages are balanced (i.e., all take the same time):
Time between instructions (pipelined) = Time between instructions (non-pipelined) / Number of stages
If the stages are not balanced, the speedup is less
The speedup comes from increased throughput – latency (the time for each instruction) does not decrease
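A quick check of this formula with the numbers from the previous slides (800 ps single-cycle time, 5 stages, 200 ps slowest stage):

```c
#include <stdio.h>

int main(void)
{
    double t_nonpipelined = 800.0;   /* ps between instructions, single cycle */
    int    stages         = 5;
    double t_balanced     = t_nonpipelined / stages;   /* 160 ps ideal        */
    double t_actual       = 200.0;                     /* slowest stage wins  */

    printf("ideal  speedup = %.2f\n", t_nonpipelined / t_balanced); /* 5.00 */
    printf("actual speedup = %.2f\n", t_nonpipelined / t_actual);   /* 4.00 */
    /* Latency per instruction does not drop (it actually grows to
       5 * 200 = 1000 ps); the win is throughput. */
    return 0;
}
```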

WRAP UP

For next time Homework Exercises: 3.4.2, – Due Tuesday 11/4 Read Chapter