Lecturer PSOE Dan Garcia

Slides:



Advertisements
Similar presentations
CS152 Lec9.1 CS152 Computer Architecture and Engineering Lecture 9 Designing Single Cycle Control.
Advertisements

CS61C L15 Representations of Combinatorial Logic Circuits (1) Garcia, Fall 2005 © UCB Lecturer PSOE, new dad Dan Garcia
361 datapath Computer Architecture Lecture 8: Designing a Single Cycle Datapath.
CS61C L19 CPU Design : Designing a Single-Cycle CPU (1) Beamer, Summer 2007 © UCB Scott Beamer Instructor inst.eecs.berkeley.edu/~cs61c CS61C : Machine.
CS61C L26 Single Cycle CPU Datapath II (1) Garcia © UCB Lecturer PSOE Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C : Machine.
CS61C L20 Single-Cycle CPU Control (1) Beamer, Summer 2007 © UCB Scott Beamer Instructor inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture.
CS152 / Kubiatowicz Lec8.1 9/26/01©UCB Fall 2001 CS152 Computer Architecture and Engineering Lecture 8 Designing Single Cycle Control September 26, 2001.
CS61C L17 Single Cycle CPU Datapath (1) Garcia, Fall 2005 © UCB Lecturer PSOE, new dad Dan Garcia inst.eecs.berkeley.edu/~cs61c.
CS61C L26 CPU Design : Designing a Single-Cycle CPU II (1) Garcia, Spring 2007 © UCB 3.6 TB DVDs? Maybe!  Researchers at Harvard have found a way to use.
Savio Chau Single Cycle Controller Design Last Time: Discussed the Designing of a Single Cycle Datapath Control Datapath Memory Processor (CPU) Input Output.
Processor II CPSC 321 Andreas Klappenecker. Midterm 1 Tuesday, October 5 Thursday, October 7 Advantage: less material Disadvantage: less preparation time.
Ceg3420 control.1 ©UCB, DAP’ 97 CEG3420 Computer Design Lecture 9.2: Designing Single Cycle Control.
CS61C L17 Combinatorial Logic Blocks (1) Beamer, Summer 2007 © UCB Scott Beamer, Instructor inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture.
Inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 25 CPU design (of a single-cycle CPU) Sat Google in Mountain.
CS 61C L35 Single Cycle CPU Control II (1) Garcia, Spring 2004 © UCB Lecturer PSOE Dan Garcia inst.eecs.berkeley.edu/~cs61c.
CS 61C L34 Single Cycle CPU Control I (1) Garcia, Spring 2004 © UCB Lecturer PSOE Dan Garcia inst.eecs.berkeley.edu/~cs61c.
Microprocessor Design
CS61C L25 Single Cycle CPU Datapath (1) Garcia © UCB Lecturer PSOE Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C : Machine.
EECC250 - Shaaban #1 lec #22 Winter The Von-Neumann Computer Model Partitioning of the computing engine into components: –Central Processing.
CS61C L18 Combinational Logic Blocks, Latches (1) Chae, Summer 2008 © UCB Albert Chae, Instructor inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures.
CS61CL L10 CPU II: Control & Pipeline (1) Huddleston, Summer 2009 © UCB Jeremy Huddleston inst.eecs.berkeley.edu/~cs61c CS61CL : Machine Structures Lecture.
ECE 232 L13. Control.1 ©UCB, DAP’ 97 ECE 232 Hardware Organization and Design Lecture 13 Control Design
CS152 / Kubiatowicz Lec8.1 2/22/99©UCB Spring 1999 CS152 Computer Architecture and Engineering Lecture 8 Designing Single Cycle Control Feb 22, 1999 John.
CS61C L25 CPU Design : Designing a Single-Cycle CPU (1) Garcia, Fall 2006 © UCB T-Mobile’s Wi-Fi / Cell phone  T-mobile just announced a new phone that.
CS 61C L17 Control (1) A Carle, Summer 2006 © UCB inst.eecs.berkeley.edu/~cs61c/su06 CS61C : Machine Structures Lecture #17: CPU Design II – Control
CS151B Computer Systems Architecture Winter 2002 TuTh 2-4pm BH Instructor: Prof. Jason Cong Lecture 8 Designing a Single Cycle Control.
CS61C L26 CPU Design : Designing a Single-Cycle CPU II (1) Garcia, Fall 2006 © UCB Lecturer SOE Dan Garcia inst.eecs.berkeley.edu/~cs61c.
Inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 25 CPU design (of a single-cycle CPU) Intel is prototyping circuits that.
CS61C L25 CPU Design : Designing a Single-Cycle CPU (1) Garcia, Spring 2007 © UCB Google Summer of Code  Student applications are now open (through );
CS 61C L29 Single Cycle CPU Control II (1) Garcia, Fall 2004 © UCB Andrew Schultz inst.eecs.berkeley.edu/~cs61c-tb inst.eecs.berkeley.edu/~cs61c CS61C.
EECC550 - Shaaban #1 Lec # 4 Winter Major CPU Design Steps 1Using independent RTN, write the micro- operations required for all target.
CS 61C discussion 11 (1) Jaein Jeong 2002 Draw the data path: ADD or SUB Clk 555 RwRaRb bit Registers Extender Clk WrEn Adr Data Memory ALU Instruction.
EEM 486: Computer Architecture Lecture 3 Designing a Single Cycle Datapath.
CS61C L27 Single-Cycle CPU Control (1) Garcia, Spring 2010 © UCB inst.eecs.berkeley.edu/~cs61c UC Berkeley CS61C : Machine Structures Lecture 27 Single-cycle.
CS 61C L16 Datapath (1) A Carle, Summer 2004 © UCB inst.eecs.berkeley.edu/~cs61c/su05 CS61C : Machine Structures Lecture #16 – Datapath Andy.
CS61C L20 Single Cycle Datapath, Control (1) Chae, Summer 2008 © UCB Albert Chae, Instructor inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture.
361 control Computer Architecture Lecture 9: Designing Single Cycle Control.
CS61C L26 CPU Design : Designing a Single-Cycle CPU II (1) Garcia, Spring 2010 © UCB inst.eecs.berkeley.edu/~cs61c UC Berkeley CS61C : Machine Structures.
ECE 232 L12.Datapath.1 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers ECE 232 Hardware Organization and Design Lecture 12 Datapath.
ELEN 350 Single Cycle Datapath Adapted from the lecture notes of John Kubiatowicz(UCB) and Hank Walker (TAMU)
CS 61C L28 Single Cycle CPU Control I (1) Garcia, Fall 2004 © UCB Lecturer PSOE Dan Garcia inst.eecs.berkeley.edu/~cs61c.
CS61C L27 Single Cycle CPU Control (1) Garcia, Fall 2006 © UCB Wireless High Definition?  Several companies will be working on a “WirelessHD” standard,
CS3350B Computer Architecture Winter 2015 Lecture 5.6: Single-Cycle CPU: Datapath Control (Part 1) Marc Moreno Maza [Adapted.
Instructor: Sagar Karandikar
Computer Organization CS224 Fall 2012 Lesson 26. Summary of Control Signals addsuborilwswbeqj RegDst ALUSrc MemtoReg RegWrite MemWrite Branch Jump ExtOp.
EEM 486: Computer Architecture Designing Single Cycle Control.
Computer Organization CS224 Fall 2012 Lesson 22. The Big Picture  The Five Classic Components of a Computer  Chapter 4 Topic: Processor Design Control.
Designing a Single Cycle Datapath In this lecture, slides from lectures 3, 8 and 9 from the course Computer Architecture ECE 201 by Professor Mike Schulte.
EEM 486: Computer Architecture Designing a Single Cycle Datapath.
CPE 442 single-cycle datapath.1 Intro. To Computer Architecture CpE242 Computer Architecture and Engineering Designing a Single Cycle Datapath.
CS3350B Computer Architecture Winter 2015 Lecture 5.7: Single-Cycle CPU: Datapath Control (Part 2) Marc Moreno Maza [Adapted.
Computer Organization CS224 Chapter 4 Part a The Processor Spring 2011 With thanks to M.J. Irwin, T. Fountain, D. Patterson, and J. Hennessy for some lecture.
Csci 136 Computer Architecture II –Single-Cycle Datapath Xiuzhen Cheng
EEM 486: Computer Architecture Lecture 3 Designing Single Cycle Control.
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Single-Cycle CPU Datapath & Control Part 2 Instructors: Krste Asanovic & Vladimir Stojanovic.
CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU  Previously: built and ALU.  Today: Actually build a CPU Questions on CS140 ? Computer Arithmetic ?
Single Cycle Controller Design
CS 110 Computer Architecture Lecture 11: Single-Cycle CPU Datapath & Control Instructor: Sören Schwertfeger School of Information.
Revisão de Circuitos Lógicos PARTE III. Review Use this table and techniques we learned to transform from 1 to another.
(Chapter 5: Hennessy and Patterson) Winter Quarter 1998 Chris Myers
There is one handout today at the front and back of the room!
Exhausted TA Ben Sussman
Lecturer PSOE Dan Garcia
Inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 20 CPU Design: Control II & Pipelining I TA Noah Johnson Greet class.
Instructors: Randy H. Katz David A. Patterson
CS152 Computer Architecture and Engineering Lecture 8 Designing a Single Cycle Datapath Start: X:40.
COMS 361 Computer Organization
Instructors: Randy H. Katz David A. Patterson
COMS 361 Computer Organization
What You Will Learn In Next Few Sets of Lectures
Presentation transcript:

Lecturer PSOE Dan Garcia www.cs.berkeley.edu/~ddgarcia inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 28 – Single Cycle CPU Control II Lecturer PSOE Dan Garcia www.cs.berkeley.edu/~ddgarcia DARPA $s drying up…ouch!  Greet class “I'm worried and depressed,” – David Patterson, president of the ACM. There is a significant shift of $ from “Blue Sky” research to military contractors. This is a significant shift, mostly for the worse. www.nytimes.com/2005/04/02/technology/02darpa.html?

Review: Single cycle datapath 5 steps to design a processor 1. Analyze instruction set => datapath requirements 2. Select set of datapath components & establish clock methodology 3. Assemble datapath meeting the requirements 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer. 5. Assemble the control logic Control is the hard part MIPS makes that easier Instructions same size Source registers always in same place Immediates same size, location Operations always on registers/immediates Processor Input Control Memory Datapath Output

A Summary of the Control Signals (2/2) See func 10 0000 10 0010 We Don’t Care :-) Appendix A op 00 0000 00 0000 00 1101 10 0011 10 1011 00 0100 00 0010 add sub ori lw sw beq jump RegDst ALUSrc MemtoReg RegWrite MemWrite nPCsel Jump ExtOp ALUctr<2:0> 1 x Add Subtract Or Here is a table summarizing the control signals setting for the seven (add, sub, ...) instructions we have looked at. Instead of showing you the exact bit values for the ALU control (ALUctr), I have used the symbolic values here. The first two columns are unique in the sense that they are R-type instrucions and in order to uniquely identify them, we need to look at BOTH the op field as well as the func fiels. Ori, lw, sw, and branch on equal are I-type instructions and Jump is J-type. They all can be uniquely idetified by looking at the opcode field alone. Now let’s take a more careful look at the first two columns. Notice that they are identical except the last row. So we can combine these two rows here if we can “delay” the generation of ALUctr signals. This lead us to something call “local decoding.” +3 = 42 min. (Y:22) op target address rs rt rd shamt funct 6 11 16 21 26 31 immediate R-type I-type J-type add, sub ori, lw, sw, beq jump

The Single Cycle Datapath during Jump op target address 26 31 J-type jump 25 New PC = { PC[31..28], target address, 00 } Jump= Instruction<31:0> Instruction Fetch Unit nPC_sel= Rd Rt <21:25> <16:20> <11:15> <0:15> <0:25> RegDst = Clk 1 Mux Rs Rt ALUctr = Rt Rs Rd Imm16 TA26 RegWr = 5 5 5 MemtoReg = busA Zero MemWr = Rw Ra Rb busW 32 32 32-bit Registers ALU 32 busB 32 So how does the branch instruction work? As far as the main datapath is concerned, it needs to calculate the branch condition. That is, it subtracts the register specified in the Rt field from the register specified in the Rs field and set the condition Zero accordingly. In order to place the register values on busA and busB, we need to feed the Rs and Rt fields of the instruction to the Ra and Rb ports of the register file and set ALUSrc to 0. Then we have to instruction the ALU to perform the subtract (ALUctr = sub) operation and set the Zero bit accordingly. The Zero bit is sent to the Instruction Fetch Unit. I will show you the internal of the Instruction Fetch Unit in a second. But before we leave this slide, I want you to notice that ExtOp, MemtoReg, and RegDst are don’t cares but RegWr and MemWr have to be ZERO to prevent any write to occur. And finally, the controller needs to set the Branch signal to 1 so the Instruction Fetch Unit knows what to do. So now let’s take a look at the Instruction Fetch Unit. +2 = 33 min. (Y:13) Clk 32 Mux Mux 32 WrEn Adr 1 1 Data In 32 Data Memory imm16 Extender 32 16 Clk ALUSrc = ExtOp =

The Single Cycle Datapath during Jump op target address 26 31 J-type jump 25 New PC = { PC[31..28], target address, 00 } Jump=1 Instruction<31:0> Instruction Fetch Unit nPC_sel=0 Rd Rt <21:25> <16:20> <11:15> <0:15> <0:25> RegDst = x Clk 1 Mux Rs Rt ALUctr =x Rt Rs Rd Imm16 TA26 RegWr = 0 5 5 5 MemtoReg = x busA Zero MemWr = 0 Rw Ra Rb busW 32 32 32-bit Registers 32 ALU busB 32 So how does the branch instruction work? As far as the main datapath is concerned, it needs to calculate the branch condition. That is, it subtracts the register specified in the Rt field from the register specified in the Rs field and set the condition Zero accordingly. In order to place the register values on busA and busB, we need to feed the Rs and Rt fields of the instruction to the Ra and Rb ports of the register file and set ALUSrc to 0. Then we have to instruction the ALU to perform the subtract (ALUctr = sub) operation and set the Zero bit accordingly. The Zero bit is sent to the Instruction Fetch Unit. I will show you the internal of the Instruction Fetch Unit in a second. But before we leave this slide, I want you to notice that ExtOp, MemtoReg, and RegDst are don’t cares but RegWr and MemWr have to be ZERO to prevent any write to occur. And finally, the controller needs to set the Branch signal to 1 so the Instruction Fetch Unit knows what to do. So now let’s take a look at the Instruction Fetch Unit. +2 = 33 min. (Y:13) Clk 32 Mux Mux 32 WrEn Adr 1 1 Data In 32 Data Memory imm16 Extender 32 16 Clk ALUSrc = x ExtOp = x

Instruction Fetch Unit at the End of Jump op target address 26 31 J-type jump 25 New PC = { PC[31..28], target address, 00 } Adr Inst Memory Jump Instruction<31:0> nPC_sel Zero nPC_MUX_sel 4 How do we modify this to account for jumps? Adder Let’s look at the interesting case where the branch condition Zero is true (Zero = 1). Well, if Zero is not asserted, we will have our boring case where PC + 1 is selected. Anyway, with Branch = 1 and Zero = 1, the output of the second adder will be selected. That is, we will add the seqential address, that is output of the first adder, to the sign extended version of the immediate field, to form the branch target address (output of 2nd adder). With the control signal Jump set to zero, this branch target address will be written into the Program Counter register (PC) at the end of the clock cycle. +2 = 35 min. (Y:15) Mux 00 PC Adder 1 Clk imm16

Instruction Fetch Unit at the End of Jump op target address 26 31 J-type jump 25 New PC = { PC[31..28], target address, 00 } Adr Inst Memory Jump Instruction<31:0> nPC_sel Zero Query Can Zero still get asserted? Does nPC_sel need to be 0? If not, what? nPC_MUX_sel 00 4 Mux Adder TA26 1 00 Let’s look at the interesting case where the branch condition Zero is true (Zero = 1). Well, if Zero is not asserted, we will have our boring case where PC + 1 is selected. Anyway, with Branch = 1 and Zero = 1, the output of the second adder will be selected. That is, we will add the seqential address, that is output of the first adder, to the sign extended version of the immediate field, to form the branch target address (output of 2nd adder). With the control signal Jump set to zero, this branch target address will be written into the Program Counter register (PC) at the end of the clock cycle. +2 = 35 min. (Y:15) Mux PC 4 (MSBs) Adder Clk 1 imm16

Build CL to implement Jump on paper now Inst31 Inst30 Inst29 Inst28 Inst27 Inst26 Inst25 Jump Inst01 Inst00

Build CL to implement Jump on paper now Inst31 Inst30 Inst29 A Inst28 Inst27 Inst26 2-input 6-bit-wide XNOR Inst25 6-input AND Jump B Ai Bi XNOR 1 1 Inst01 Inst00

Administrivia HW7 out today, due in a week

Review: Finite State Machine (FSM) States represent possible output values. Transitions represent changes between states based on inputs. Implement with CL and clocked register feedback.

Finite State Machines extremely useful! They define How output signals respond to input signals and previous state. How we change states depending on input signals and previous state The output signals could be our familiar control signals Some control signals may only depend on CL, not on state at all… We could implement very detailed FSMs w/Programmable Logic Arrays

Taking advantage of sum-of-products Since sum-of-products is a convenient notation and way to think about design, offer hardware building blocks that match that notation One example is Programmable Logic Arrays (PLAs) Designed so that can select (program) ands, ors, complements after you get the chip Late in design process, fix errors, figure out what to do later, …

Programmable Logic Arrays Pre-fabricated building block of many AND/OR gates “Programmed” or “Personalized" by making or breaking connections among gates Programmable array block diagram for sum of products form Or Programming: How to combine product terms? How many outputs? • • • inputs AND array • • • outputs OR array product terms And Programming: How many inputs? How to combine inputs? How many product terms?

Shared product terms among outputs Enabling Concept Shared product terms among outputs F0 = A + B' C' F1 = A C' + A B F2 = B' C' + A B F3 = B' C + A example: input side: 3 inputs 1 = uncomplemented in term 0 = complemented in term – = does not participate personality matrix Product inputs outputs term A B C F0 F1 F2 F3 AB 1 1 – 0 1 1 0 B'C – 0 1 0 0 0 1 AC' 1 – 0 0 1 0 0 B'C' – 0 0 1 0 1 0 A 1 – – 1 0 0 1 output side: 4 outputs 1 = term connected to output 0 = no connection to output reuse of terms; 5 product terms

Before Programming All possible connections available before “programming”

After Programming Unwanted connections are "blown" Fuse (normally connected, break unwanted ones) Anti-fuse (normally disconnected, make wanted connections) A B C F1 F2 F3 F0 AB B'C AC' B'C'

Alternate Representation Short-hand notation--don't have to draw all the wires X Signifies a connection is present and perpendicular signal is an input to gate notation for implementing F0 = A B + A' B' F1 = C D' + C' D A B C D AB AB+A'B' A'B' CD' CD'+C'D C'D

Peer Instruction ABC 1: SRF 2: SRT 3: SEF 4: SET 5: BRF 6: BRT 7: BEF 32 ALUctr Clk busW RegWr busA busB 5 Rw Ra Rb 32 32-bit Registers Rs Rt Rd RegDst Extender Mux 16 imm16 ALUSrc ExtOp MemtoReg Data In WrEn Adr Data Memory MemWr ALU Instruction Fetch Unit Zero Instruction<31:0> 1 <21:25> <16:20> <11:15> <0:15> Imm16 nPC_sel ABC 1: SRF 2: SRT 3: SEF 4: SET 5: BRF 6: BRT 7: BEF 8: BET MemToReg=‘x’ & ALUctr=‘sub’. SUB or BEQ? ALUctr=‘add’. Which 1 signal is different for all 3 of: ADD, LW, & SW? RegDst or ExtOp? “Don’t Care” signals are useful because we can simplify our PLA personality matrix. F / T? Answer: [correct=5, TFF] Individual (6, 3, 6, 3, 34, 37, 7, 4)% & Team (9, 4, 4, 4, 39, 34, 0, 7)% A: T [18/82 & 25/75] (The game breaks up very quickly into separate, independent games -- big win to solve separated & put together) B: F [80/20 & 86/14] (The game tree grows exponentially! A better static evaluator can make all the difference!) C: F [53/57 & 52/48] (If you're just looking for the best move, just keep theBestMove around [ala memoizing max])

And in Conclusion… Single cycle control 5 steps to design a processor 1. Analyze instruction set => datapath requirements 2. Select set of datapath components & establish clock methodology 3. Assemble datapath meeting the requirements 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer. 5. Assemble the control logic Control is the hard part MIPS makes that easier Instructions same size Source registers always in same place Immediates same size, location Operations always on registers/immediates Processor Input Control Memory Datapath Output