John Lazzaro (www.cs.berkeley.edu/~lazzaro)

Presentation transcript:

CS152 – Computer Architecture and Engineering
Lecture 7 – (Design Notebook+) Single Cycle Control
2004-09-21
John Lazzaro (www.cs.berkeley.edu/~lazzaro)
Dave Patterson (www.cs.berkeley.edu/~patterson)
www-inst.eecs.berkeley.edu/~cs152/
Greet class

Review: 5 steps to design a processor
1. Analyze instruction set => datapath requirements
2. Select set of datapath components & establish clock methodology
3. Assemble datapath meeting the requirements
4. Analyze implementation of each instruction to determine the setting of control points that effects the register transfer
5. Assemble the control logic (This Lecture)
MIPS makes it easier
  Instructions same size
  Source registers, immediates always in same place
  Operations always on registers/immediates
Single cycle datapath => CPI = 1, clock cycle time (CCT) => long

Why should you keep a design notebook?
Keep track of the design decisions and the reasons behind them
  Otherwise, it will be hard to debug and/or refine the design
  Write it down so that you can remember in a long project: 2 weeks -> 2 yrs
  Others can review the notebook to see what happened
Record insights you have on certain aspects of the design as they come up
Record the different design & debug experiments
  Memory can fail when very tired
Industry practice: learn from others' mistakes
The goal of this part of the lecture is to convince EACH of you to keep your OWN design notebook. Why? First of all, you need to keep track of all the design decisions you make and, maybe more importantly, the reasons behind them. This may not be that important when your project's life span is only a few weeks, but after you graduate you will work on projects that last for 2 to 3 years. If you don't write things down, you may not remember how and why you did certain things, and you may find it very hard to debug and refine your design. Also, while you are working on one part of the design, you may suddenly get an insight into another part. You may not have time to follow it up immediately, and if you don't write it down, you may never be able to reconstruct it later when you do have time. Finally, it is very important to write down everything you see in the tests or experiments you run while debugging your design. +2 = 59 min. (Y:39)

Why do we keep it on-line?
You need to force yourself to take notes!
  Open a window and leave an editor running while you work
    1) Acts as reminder to take notes
    2) Makes it easy to take notes
  1) + 2) => will actually do it
Take advantage of the window system's "cut and paste" features
It is much easier to read your typing than your writing
Also, paper log books have problems
  Limited capacity => end up with many books
  May not have right book with you at time vs. networked screens
  Can use computer to search files/index files to find what you are looking for
The next question some of you may want to ask is: OK, I will keep a notebook, but why should I keep it on-line? Let's be honest with ourselves. All of us need a little reminder to force ourselves to take notes while we work. One of the best reminders I have found is the window system of a modern PC. Keeping an extra window open with an editor running makes taking notes very easy, and the editor also serves as a constant reminder to take them. By keeping your notebook on-line, you can also use the window system's cut and paste feature to drop important "print outs" into your notebook. Finally, although you may be able to read your own handwriting much better than anybody else can, it is still easier to read your own typing than your own writing. +2 = 61 min. (Y:41)

How should you do it?
Keep it simple
  DON'T make it so elaborate that you won't use it (fonts, layout, ...)
Separate the entries by dates
  Type the "date" command in another window and cut&paste
Start day with the problems you are going to work on today
Record output of simulation into log with cut&paste; add date
  May help sort out which version of simulation did what
Record key email with cut&paste
Record of what works & doesn't helps team decide what went wrong after you left
Index: write a one-line summary of what you did at end of each day
How should you keep your on-line notebook? By all means, Keep It Simple. The on-line notebook should help you track down and solve your problems. It should NOT become one of your problems. To keep the notebook easy to read, you should separate your entries by dates. Furthermore, before you sign off each day, you should write a one-line summary of what you did; this will serve as the index to your notebook. Let me show you some examples. +2 = 63 min. (Y:43)

On-line Notebook Example
Refer to the handout "Example of On-Line Log Book" on the CS 152 home page:
http://www-inst.eecs.berkeley.edu/~cs152/handouts/online_notebook_example.html
Spend 10 minutes on the notebook example: 6 minutes per page. +12 = 75 min. (Y:55)

Recap: Putting it All Together: 1 Cycle Datapath
[Figure: single-cycle datapath. The instruction fetch unit (PC, "+4" adder, PC Ext and branch adder, mux selected by PCSrc, instruction memory) supplies Instruction<31:0>, which is split into Rs <21:25>, Rt <16:20>, Rd <11:15> and Imm16 <0:15>. These feed the register file of 32 32-bit registers (Rw, Ra, Rb, busA, busB, busW), the Extender, the ALU (with Zero output) and the Data Memory, under the control signals RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr and MemtoReg.]
So here is the single cycle datapath we just built. If you look inside the Instruction Fetch Unit, you will see the last slide showing the PC, the next-address logic, and the Instruction Memory. Here I have shown how we can get the Rt, Rs, Rd, and Imm16 fields out of the 32-bit instruction word. The Rt, Rs, and Rd fields go to the register file as register specifiers, while the Imm16 field goes to the Extender, where it is either zero- or sign-extended to 32 bits. The signals ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg, RegDst, RegWr, Branch, and Jump are control signals, and I will show you how to generate them on Friday. +2 = 80 min. (Z:00)

Recap: The MIPS-lite Subset
ADD and subtract
  add rd, rs, rt
  sub rd, rs, rt
OR Imm:
  ori rt, rs, imm16
LOAD and STORE
  lw rt, rs, imm16
  sw rt, rs, imm16
BRANCH:
  beq rs, rt, imm16
R-type format: op <31:26> (6 bits) | rs <25:21> (5 bits) | rt <20:16> (5 bits) | rd <15:11> (5 bits) | shamt <10:6> (5 bits) | funct <5:0> (6 bits)
I-type format: op <31:26> (6 bits) | rs <25:21> (5 bits) | rt <20:16> (5 bits) | immediate <15:0> (16 bits)
+3 = 5 min. (X:45)
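The two instruction formats above can be pulled apart with plain bit selects. The following is only a sketch: the module name field_decode and its port names are assumptions made for illustration and do not come from the lecture.

```verilog
// Sketch: split a 32-bit MIPS-lite instruction word into its fields.
// Module and port names are illustrative, not from the lecture.
module field_decode (
    input  wire [31:0] instruction,
    output wire [5:0]  op,      // opcode, all formats
    output wire [4:0]  rs,      // first source register
    output wire [4:0]  rt,      // second source (R-type) or destination (I-type)
    output wire [4:0]  rd,      // destination register, R-type only
    output wire [4:0]  shamt,   // shift amount, R-type only
    output wire [5:0]  funct,   // function code, R-type only
    output wire [15:0] imm16    // immediate, I-type only
);
    assign op    = instruction[31:26];
    assign rs    = instruction[25:21];
    assign rt    = instruction[20:16];
    assign rd    = instruction[15:11];
    assign shamt = instruction[10:6];
    assign funct = instruction[5:0];
    assign imm16 = instruction[15:0];  // overlaps rd/shamt/funct by design
endmodule
```

Because the fields always sit in the same place, every field can be extracted before the opcode is known; the control signals then decide which of them are actually used.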

Meaning of the Control Signals
MemWr: 1 => write memory
MemtoReg: 0 => ALU; 1 => Mem
RegDst: 0 => "rt"; 1 => "rd"
RegWr: 1 => write register
ExtOp: "zero", "sign"
ALUsrc: 0 => regB; 1 => immed
ALUctr: "add", "sub", "or"
[Figure: datapath (register file, Extender, ALU, Data Memory) with the control signals RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr and MemtoReg marked.]

Two equivalent ways to specify control (rotate about the 45-degree axis)
Book does left version (Fig 5.18, p. 308)
Book combines all ALU instructions as "R-format" vs. separate instructions (add …)
Good news: the lecture gives a different view than the book
We'll do right by committee, 1 at a time

Setting PC Source Control Signal
PCSrc: 0 => PC <= PC + 4
       1 => PC <= PC + 4 + {SignExt(Im16), 2'b00}
Later in lecture: higher-level connection between mux and branch cond
[Figure: instruction fetch unit: PC, instruction memory, "+4" adder, branch-target adder fed by PC Ext (imm16), and the mux selected by PCSrc.]
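As a rough illustration of the two PCSrc settings, the next-PC selection could be written as below. This is a sketch under assumptions: the module name next_pc and the port names are invented here, and the PC register itself (clocked on Clk) is left out.

```verilog
// Sketch of the next-PC mux; names are illustrative, not from the lecture.
module next_pc (
    input  wire [31:0] pc,
    input  wire [15:0] imm16,
    input  wire        pc_src,   // 0: PC + 4, 1: PC + 4 + branch offset
    output wire [31:0] pc_next
);
    wire [31:0] pc_plus4 = pc + 32'd4;

    // {SignExt(imm16), 2'b00}: sign-extend to 30 bits, then append two
    // zero bits, which multiplies the word offset by 4.
    wire [31:0] offset = {{14{imm16[15]}}, imm16, 2'b00};

    assign pc_next = pc_src ? (pc_plus4 + offset) : pc_plus4;
endmodule
```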

Meaning of the Control Signals
MemWr: 1 => write memory
MemtoReg: 0 => ALU; 1 => Mem
RegDst: 0 => "rt"; 1 => "rd"
RegWr: 1 => write register
ExtOp: 0 => "zero"; 1 => "sign"
ALUsrc: 0 => regB; 1 => immed
ALUctr: "add", "sub", "or"
[Figure: datapath (register file, Extender, ALU, Data Memory) with the control signals RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr and MemtoReg marked.]
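Three of these signals are simply 2-to-1 mux selects. A minimal sketch of those muxes follows; the module name datapath_muxes and the wire names alu_b, imm_ext, alu_out and mem_out are assumptions for illustration, while Rw, busB and busW follow the labels in the datapath figure.

```verilog
// Sketch of the three datapath muxes driven by RegDst, ALUSrc, MemtoReg.
// Module name and most wire names are illustrative assumptions.
module datapath_muxes (
    input  wire        RegDst,    // 0: write register is rt, 1: rd
    input  wire        ALUSrc,    // 0: ALU B input is busB, 1: extended immediate
    input  wire        MemtoReg,  // 0: write back ALU result, 1: memory data
    input  wire [4:0]  rt,
    input  wire [4:0]  rd,
    input  wire [31:0] busB,
    input  wire [31:0] imm_ext,
    input  wire [31:0] alu_out,
    input  wire [31:0] mem_out,
    output wire [4:0]  Rw,
    output wire [31:0] alu_b,
    output wire [31:0] busW
);
    assign Rw    = RegDst   ? rd      : rt;
    assign alu_b = ALUSrc   ? imm_ext : busB;
    assign busW  = MemtoReg ? mem_out : alu_out;
endmodule
```

RegWr and MemWr are write enables rather than selects, which is why a don't-care is acceptable for a mux select but never for them.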

Specify ALU source mux Control
ALUsrc: 0 => reg as ALU B input; 1 => immediate as ALU B input
Quiz: which answer row (1 to 9, the last being "None of the above") gives the correct ALUSrc values for AddU, SubU, ORI, LW and BEQ?
[Figure: register file, Extender and the ALU B-input mux selected by ALUSrc, with ExtOp on the Extender.]

Administrivia
COD Reading for next lecture: Sections 5.5 "Multicycle", 5.6 "Microprogramming" (on CD), 5.10 "Fallacies and Pitfalls"
Start Homework #2
Lab 2 Verilog simulation Friday
Find bugs in COD 3rd Edition?
  $1 reward to first person to report a bug
  Send email to cod3bugs@mkp.com
  Include page number, line number on page, BEFORE with bug, AFTER fix, why it's a bug

Specify Immediate Extender Op Control
ExtOp: 0 => "zero extend immediate"; 1 => "sign extend imm."
[Figure: register file, Extender (imm16 in, 32 bits out, controlled by ExtOp) and the ALU B-input mux selected by ALUSrc.]
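One compact way to express the Extender, offered here only as a sketch, is to replicate a bit that is the immediate's sign bit when ExtOp is 1 and zero otherwise. The module name extender and the port name imm32 are assumptions.

```verilog
// Sketch of the immediate Extender; names are illustrative assumptions.
module extender (
    input  wire [15:0] imm16,
    input  wire        ExtOp,   // 0: zero extend, 1: sign extend
    output wire [31:0] imm32
);
    // The fill bit is imm16[15] for sign extension and 0 for zero extension.
    assign imm32 = {{16{ExtOp & imm16[15]}}, imm16};
endmodule
```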

Specify Register Write Control
RegWr: 1 => write register
[Figure: register file write port (Rw, busW) with RegWr and the RegDst mux, plus the Extender and the ALU B-input mux with ExtOp and ALUSrc.]

Specify Register Destination Control
RegDst: 0 => "rt"; 1 => "rd"
R-type format: op | rs | rt | rd | shamt | funct
I-type format: op | rs | rt | immediate
[Figure: the RegDst mux choosing between Rd and Rt for the register file's Rw port, plus the Extender and the ALU B-input mux with ExtOp and ALUSrc.]

Specify the Memory Write Control Signal
MemWr: 1 => write memory
[Figure: full datapath (register file, Extender, ALU, Data Memory with WrEn, Adr and Data In) with MemWr driving the Data Memory write enable.]

Specify Memory To Register File Mux Control
MemtoReg: 0 => ALU; 1 => Mem
[Figure: full datapath (register file, Extender, ALU, Data Memory) with the MemtoReg mux selecting what is driven onto busW.]

Specify the ALU Control Signals
ALUctr: 0 => "add", 1 => "sub", 2 => "or"
[Figure: full datapath (register file, Extender, ALU, Data Memory) with ALUctr selecting the ALU operation and Zero as the ALU's condition output.]
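A behavioral ALU matching this encoding might look like the sketch below. It assumes a 2-bit ALUctr, which is enough for the three operations listed here (the summary table later writes ALUctr<2:0>); the module and port names are also assumptions.

```verilog
// Sketch of a three-operation ALU; names and the 2-bit ALUctr width
// are illustrative assumptions.
module alu (
    input  wire [31:0] a,       // busA
    input  wire [31:0] b,       // busB or the extended immediate
    input  wire [1:0]  ALUctr,  // 0: add, 1: sub, 2: or
    output reg  [31:0] result,
    output wire        Zero     // 1 when the result is all zeros
);
    always @(*) begin
        case (ALUctr)
            2'd0:    result = a + b;
            2'd1:    result = a - b;
            2'd2:    result = a | b;
            default: result = 32'b0;  // unused encoding
        endcase
    end

    assign Zero = (result == 32'b0);
endmodule
```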

The Add Instruction
add rd, rs, rt
R-type format: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
mem[PC]                  Fetch the instruction from memory
R[rd] <= R[rs] + R[rt]   The actual operation
PC <= PC + 4             Calculate the next instruction's address
OK, let's get on with today's lecture by looking at the simple add instruction. In terms of Register Transfer Language, this is what the Add instruction needs to do. First, you need to fetch the instruction from memory. Then you perform the actual add operation. More specifically: (a) you add the contents of the registers specified by the Rs and Rt fields of the instruction; (b) then you write the result to the register specified by the Rd field. Finally, you need to update the program counter to point to the next instruction. Now, let's take a detailed look at the datapath during the various phases of this instruction. +2 = 10 min. (X:50)

Instruction Fetch Unit at the Beginning of Add
Fetch the instruction from Instruction memory: Instruction <= mem[PC]
  This is the same for all instructions
[Figure: instruction fetch unit: PC, instruction memory producing Instruction<31:0>, "+4" adder, branch-target adder fed by PC Ext (imm16), and the mux selected by PCSrc.]

Instruction Fetch Unit at the End of Branch
I-type format: op | rs | rt | immediate
if (Zero == 1) PC = PC + 4 + {SignExt[imm16], 2'b00}; else PC = PC + 4
What is encoding of PCSrc?
  Direct MUX select?
  Branch / not branch
Let's choose second option
[Figure: instruction fetch unit with inputs PCSrc and Zero: PC, instruction memory, "+4" adder, branch-target adder fed by PC Ext (imm16), and the next-PC mux.]
Let's look at the interesting case where the branch condition Zero is true (Zero = 1). If Zero is not asserted, we have the boring case where PC + 4 is selected. With Branch = 1 and Zero = 1, the output of the second adder is selected. That is, we add the sequential address (the output of the first adder) to the sign-extended version of the immediate field to form the branch target address (the output of the second adder). With the control signal Jump set to zero, this branch target address is written into the Program Counter register (PC) at the end of the clock cycle. +2 = 35 min. (Y:15)
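With the "branch / not branch" encoding, the actual mux select has to be formed inside the fetch unit from the control's branch indication and the ALU's Zero output. A one-line sketch, assuming the hypothetical names pc_src_logic, Branch and mux_sel:

```verilog
// Sketch: derive the next-PC mux select from "branch / not branch".
// Module and signal names are illustrative assumptions.
module pc_src_logic (
    input  wire Branch,   // 1 when the instruction is beq
    input  wire Zero,     // ALU result was zero, i.e. R[rs] == R[rt]
    output wire mux_sel   // 0: PC + 4, 1: branch target
);
    // Take the branch only if this is a branch AND the operands matched.
    assign mux_sel = Branch & Zero;
endmodule
```

The benefit of this choice is that the control block no longer depends on Zero; it only needs to look at the instruction.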

The Single Cycle Datapath during Load
I-type format: op | rs | rt | immediate
R[rt] <= Data Memory [R[rs] + SignExt[imm16]]
Control settings:
  PCSrc <= +4
  RegDst <= 0
  ALUctr <= Add
  RegWr <= 1
  MemtoReg <= 1
  MemWr <= 0
  ALUSrc <= 1
  ExtOp <= 1
[Figure: single-cycle datapath (instruction fetch unit, register file, Extender, ALU, Data Memory) annotated with the settings above.]
Let's continue our lecture with the load instruction. What does the load instruction do? It first adds the contents of the register specified by the Rs field to the sign-extended version of the immediate field to form the memory address. Then it uses this memory address to access the memory and writes the data back to the register specified by the Rt field of the instruction. Here is how the datapath works: first the Rs field is fed to the Register File's Ra address port to place the register onto busA. Then the ExtOp signal is set to 1 so that the immediate field is sign-extended, and we place this value (the output of the Extender) onto the ALU input by setting ALUSrc to 1. The ALU then adds (ALUctr = add) the two together to form the memory address, which is placed onto the Data Memory's address port. In order to place the Data Memory's output bus onto the Register File's input bus (busW), the control needs to set MemtoReg to 1. Similar to the OR immediate instruction I showed you earlier, the destination register here is specified by the Rt field. Therefore RegDst must be set to 0. Finally, RegWr must be set to 1 to complete the register write operation. It should be obvious to you by now that we need to set Branch and Jump to 0 to make sure the Instruction Fetch Unit updates the Program Counter correctly. +3 = 28 min. (Y:08)

The Single Cycle Datapath during Store
I-type format: op | rs | rt | immediate
Data Memory [R[rs] + SignExt[imm16]] <= R[rt]
Control settings (to be filled in):
  PCSrc <=
  RegDst <=
  ALUctr <=
  RegWr <=
  MemtoReg <=
  MemWr <=
  ALUSrc <=
  ExtOp <=
[Figure: single-cycle datapath (instruction fetch unit, register file, Extender, ALU, Data Memory) with the control settings left blank.]

The Single Cycle Datapath during Store
I-type format: op | rs | rt | immediate
Data Memory [R[rs] + SignExt[imm16]] <= R[rt]
Control settings:
  PCSrc <= +4
  RegDst <= x
  ALUctr <= Add
  RegWr <= 0
  MemtoReg <= x
  MemWr <= 1
  ALUSrc <= 1
  ExtOp <= 1
[Figure: single-cycle datapath (instruction fetch unit, register file, Extender, ALU, Data Memory) annotated with the settings above.]
The store instruction performs the inverse function of the load. Instead of loading data from memory, the store instruction sends the contents of the register specified by Rt to data memory. Similar to the load instruction, the store instruction needs to read the contents of register Rs (fed to the Ra port) and add it to the sign-extended version of the immediate field (Imm16, ExtOp = 1, ALUSrc = 1) to form the data memory address (ALUctr = add). However, unlike the load instruction, where busB is not used, the store instruction uses busB to send the data to the Data Memory. Consequently, the Rt field of the instruction has to be fed to the Rb port of the register file. In order to write the Data Memory properly, the MemWr signal has to be set to 1. Notice that the store instruction does not update the register file. Therefore, RegWr must be set to zero, and consequently the control signals RegDst and MemtoReg are don't cares. Once again we need to set the control signals Branch and Jump to zero to ensure the Program Counter is updated properly. By now you are probably tired of this boring stuff where Branch and Jump are zero, so let's look at something different: the branch instruction. +3 = 31 min. (Y:11)

The Single Cycle Datapath during Branch
I-type format: op | rs | rt | immediate
if (R[rs] - R[rt] == 0) Zero <= 1; else Zero <= 0
Control settings:
  PCSrc <= "Br"
  RegDst <= x
  ALUctr <= Sub
  RegWr <= 0
  MemtoReg <= x
  MemWr <= 0
  ALUSrc <= 0
  ExtOp <= x
[Figure: single-cycle datapath (instruction fetch unit, register file, Extender, ALU, Data Memory) annotated with the settings above.]
So how does the branch instruction work? As far as the main datapath is concerned, it needs to calculate the branch condition. That is, it subtracts the register specified in the Rt field from the register specified in the Rs field and sets the condition Zero accordingly. In order to place the register values on busA and busB, we need to feed the Rs and Rt fields of the instruction to the Ra and Rb ports of the register file and set ALUSrc to 0. Then we have to instruct the ALU to perform the subtract (ALUctr = sub) operation and set the Zero bit accordingly. The Zero bit is sent to the Instruction Fetch Unit. I will show you the internals of the Instruction Fetch Unit in a second. But before we leave this slide, I want you to notice that ExtOp, MemtoReg, and RegDst are don't cares, but RegWr and MemWr have to be ZERO to prevent any write from occurring. Finally, the controller needs to set the Branch signal to 1 so the Instruction Fetch Unit knows what to do. So now let's take a look at the Instruction Fetch Unit. +2 = 33 min. (Y:13)

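Step 4: Given Datapath: RTL -> Control
[Figure: the instruction memory (addressed by Adr) outputs Instruction<31:0>, which is split into Op, Fun, Rt, Rs, Rd and Imm16. The Control block takes Op and Fun and drives PCSrc, RegWr, RegDst, ExtOp, ALUSrc, ALUctr, MemWr and MemtoReg into the DATA PATH, which returns Zero.]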

A Summary of Control Signals
inst    Register Transfer
ADD     R[rd] <= R[rs] + R[rt]; PC <= PC + 4
        ALUsrc = RegB, ALUctr = "add", RegDst = rd, RegWr, PCSrc = "+4"
SUB     R[rd] <= R[rs] – R[rt]; PC <= PC + 4
        ALUsrc = RegB, ALUctr = "sub", RegDst = rd, RegWr, PCSrc = "+4"
ORi     R[rt] <= R[rs] or zero_ext(Imm16); PC <= PC + 4
        ALUsrc = Im, Extop = "Z", ALUctr = "or", RegDst = rt, RegWr, PCSrc = "+4"
LOAD    R[rt] <= MEM[ R[rs] + sign_ext(Imm16) ]; PC <= PC + 4
        ALUsrc = Im, Extop = "Sn", ALUctr = "add", MemtoReg, RegDst = rt, RegWr, PCSrc = "+4"
STORE   MEM[ R[rs] + sign_ext(Imm16) ] <= R[rt]; PC <= PC + 4
        ALUsrc = Im, Extop = "Sn", ALUctr = "add", MemWr, PCSrc = "+4"
BEQ     if ( R[rs] == R[rt] ) then PC <= PC + 4 + {sign_ext(Imm16), 2'b00} else PC <= PC + 4
        PCSrc = "Br", ALUctr = "sub"

A Summary of the Control Signals
See Appendix A for the op and func encodings.
             add       sub       ori       lw        sw        beq
op           00 0000   00 0000   00 1101   10 0011   10 1011   00 0100
func         10 0000   10 0010   We Don't Care :-)
RegDst       1         1         0         0         x         x
ALUSrc       0         0         1         1         1         0
MemtoReg     0         0         0         1         x         x
RegWrite     1         1         1         1         0         0
MemWrite     0         0         0         0         1         0
PCSrc        0         0         0         0         0         1
ExtOp        x         x         0         1         1         x
ALUctr<2:0>  Add       Subtract  Or        Add       Add       Subtract
Instruction formats: R-type (op | rs | rt | rd | shamt | funct): add, sub. I-type (op | rs | rt | immediate): ori, lw, sw, beq. J-type (op | target address): jump.
Here is a table summarizing the control signal settings for the seven (add, sub, ...) instructions we have looked at. Instead of showing you the exact bit values for the ALU control (ALUctr), I have used symbolic values here. The first two columns are unique in the sense that they are R-type instructions, and in order to identify them uniquely we need to look at BOTH the op field and the func field. Ori, lw, sw, and branch on Zero are I-type instructions, and Jump is J-type. They can all be uniquely identified by looking at the opcode field alone. Now let's take a more careful look at the first two columns. Notice that they are identical except for the last row. So we can combine these two columns if we can "delay" the generation of the ALUctr signals. This leads us to something called "local decoding." +3 = 42 min. (Y:22)

How to implement control in Verilog?
Need to set control lines based on instruction
Which statement in Verilog is good for doing different operations depending on the value in a field of a word?
case (selector)
    item {, item} : statement;
    item {, item} : statement;
    default : statement
endcase

The Single Cycle Datapath during Or Immediate
I-type format: op | rs | rt | immediate
R[rt] <= R[rs] or ZeroExt[Imm16]
Control settings:
  PCSrc <= +4
  RegDst <= 0
  ALUctr <= Or
  RegWr <= 1
  MemtoReg <= 0
  MemWr <= 0
  ALUSrc <= 1
  ExtOp <= 0
[Figure: single-cycle datapath (instruction fetch unit, register file, Extender, ALU, Data Memory) annotated with the settings above.]
Now let's look at the control signal settings for the Or immediate instruction. The OR immediate instruction ORs the contents of the register specified by the Rs field with the zero-extended immediate field and writes the result to the register specified by Rt. This is how it works in the datapath. The Rs field is fed to the Ra address port to cause the contents of register Rs to be placed on busA. The other operand for the ALU comes from the immediate field. In order to do this, the controller needs to set ExtOp to 0 to instruct the Extender to perform a zero-extend operation. Furthermore, ALUSrc must be set to 1 so that the MUX blocks off busB from the register file and sends the zero-extended version of the immediate field to the ALU. Of course, ALUctr has to be set to OR so the ALU performs an OR operation. The rest of the control signals (MemWr, MemtoReg, Branch, and Jump) are the same as for the Add and Subtract instructions. One big difference is the RegDst signal. In this case, the destination register is specified by the instruction's Rt field, NOT the Rd field, because there is no Rd field here. Consequently, RegDst must be set to 0 to place Rt onto the Register File's Rw address port. Finally, in order to accomplish the register write, RegWr must be set to 1. +3 = 20 min. (X:60)

Example for OR immediate case
    case (Instruction[31:26])
        13 /* ORi */ : begin
            RegDst = 0; ALUSrc = 1; MemtoReg = 0; RegWrite = 1;
            MemWrite = 0; PCSrc = 0; ExtOp = 0; ALUctr = 2'b10;
        end
        …
        default : statement
    endcase

Specify all control in one assignment
    case (Instruction[31:26])
        13 /* ORi */ :
            {RegDst, ALUSrc, MemtoReg, RegWrite, MemWrite, PCSrc, Jump, ExtOp, ALUctr} =
                {1'b0, 1'b1, 1'b0, 1'b1, 1'b0, 1'b0, 1'b0, 1'b0, 2'b10};
        …
        default : statement
    endcase

Better way than specifying control as 0s and 1s
Raw 0s and 1s are hard to read and understand, and it is easy to make mistakes.
Alternative? Associate names with control values!
    // Sized values, so each name has the right width inside a concatenation
    parameter RegDstRt   = 1'b0, RegDstRd   = 1'b1,
              ALUSrcBReg = 1'b0, ALUSrcBImm = 1'b1,
              RegValALU  = 1'b0, RegValMem  = 1'b1,
              RegWr      = 1'b1, NoRegWr    = 1'b0,
              MemWr      = 1'b1, NoMemWr    = 1'b0,
              PCSrc4     = 1'b0, PCSrcBr    = 1'b1,
              ZeroExt    = 1'b0, SignExt    = 1'b1,
              Add = 2'd0, Sub = 2'd1, Or = 2'd2;

Specify all control symbolically
    case (Instruction[31:26])
        13 /* ORi */ :
            {RegDst, ALUSrc, MemtoReg, RegWrite, MemWrite, PCSrc, Jump, ExtOp, ALUctr} =
                {RegDstRt, ALUSrcBImm, RegValALU, RegWr, NoMemWr, PCSrc4, 1'b0 /* no jump */, ZeroExt, Or};
        …
        default : statement
    endcase
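As a rough sketch, the symbolic case statement could sit inside a complete combinational module like the one below. The module name, the port list, the NoJump/DoJump constants, the OP_ORI name, and the default branch are all assumptions made for illustration; only the ori row comes from the slides.

```verilog
// Illustrative sketch only: module name, ports, NoJump/DoJump, OP_ORI,
// and the default branch are assumptions, not part of the slides.
module main_control_sketch (
    input  wire [31:0] Instruction,
    output reg         RegDst, ALUSrc, MemtoReg, RegWrite,
    output reg         MemWrite, PCSrc, Jump, ExtOp,
    output reg  [1:0]  ALUctr
);
    // Explicit widths so each name contributes the intended number of
    // bits when it appears inside a concatenation.
    localparam RegDstRt   = 1'b0, RegDstRd   = 1'b1,
               ALUSrcBReg = 1'b0, ALUSrcBImm = 1'b1,
               RegValALU  = 1'b0, RegValMem  = 1'b1,
               NoRegWr    = 1'b0, RegWr      = 1'b1,
               NoMemWr    = 1'b0, MemWr      = 1'b1,
               PCSrc4     = 1'b0, PCSrcBr    = 1'b1,
               NoJump     = 1'b0, DoJump     = 1'b1,  // assumed names
               ZeroExt    = 1'b0, SignExt    = 1'b1,
               Add = 2'd0, Sub = 2'd1, Or = 2'd2;

    localparam OP_ORI = 6'd13;  // ori opcode, 00 1101

    always @(*) begin
        case (Instruction[31:26])
            OP_ORI:
                {RegDst, ALUSrc, MemtoReg, RegWrite, MemWrite,
                 PCSrc, Jump, ExtOp, ALUctr} =
                {RegDstRt, ALUSrcBImm, RegValALU, RegWr, NoMemWr,
                 PCSrc4, NoJump, ZeroExt, Or};
            // Other opcodes would be added here in the same style.
            default:
                // Assumed safe default: write nothing, continue at PC+4.
                {RegDst, ALUSrc, MemtoReg, RegWrite, MemWrite,
                 PCSrc, Jump, ExtOp, ALUctr} =
                {RegDstRt, ALUSrcBReg, RegValALU, NoRegWr, NoMemWr,
                 PCSrc4, NoJump, ZeroExt, Add};
        endcase
    end
endmodule
```

Giving every constant an explicit width matters here because an unsized constant would contribute 32 bits to the right-hand concatenation, and the assignment would then no longer line up signal by signal.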

Local Decoding: R-type v. Add + Sub
op           00 0000    00 1101   10 0011   10 1011   00 0100
             R-type     ori       lw        sw        beq
RegDst       1          0         0         x         x
ALUSrc       0          1         1         1         0
MemtoReg     0          0         1         x         x
RegWrite     1          1         1         0         0
MemWrite     0          0         0         1         0
Branch       0          0         0         0         1
Jump         0          0         0         0         0
ExtOp        x          0         1         1         x
ALUop<N:0>   “R-type”   Or        Add       Add       Subtract
[Figure: Main Control takes op (6 bits) and produces ALUop (N bits); ALU Control (Local) takes func (6 bits) and ALUop and produces ALUctr (3 bits) for the ALU]
That is, instead of asking the Main Control to generate the ALUctr signals directly (see the diagram with the ALU), the Main Control will generate a set of signals called ALUop. For all I and J type instructions, ALUop will tell the ALU Control exactly what the ALU needs to do (Add, Subtract, ...). But whenever the Main Control sees an R-type instruction, it simply throws its hands up and says: “Wow, I don’t know what the ALU has to do but I know it is an R-type instruction,” and lets the Local Control Block, the ALU Control, take care of the rest. Notice that this saves us one column from the table we had on the last slide. But let’s be honest, if one column is the ONLY thing we save, we probably will not do it. But when you have to design for the entire MIPS instruction set, this column will be used for ALL R-type instructions, which is more than just the Add and Subtract I showed you here. Another advantage of this table over the last one, besides being smaller, is that we can uniquely identify each column by looking at the Op field only. Therefore, as I will show you later, the Main Control ONLY needs to look at the Opcode field. How many bits do we need for ALUop?

The Encoding of ALUop
[Figure: Main Control takes op (6 bits) and produces ALUop (N bits); ALU Control (Local) takes func and ALUop and produces ALUctr (3 bits)]
In this exercise, ALUop has to be 2 bits wide to represent:
  (1) “R-type” instructions
  “I-type” instructions that require the ALU to perform: (2) Or, (3) Add, and (4) Subtract
To implement more of the MIPS ISA, ALUop has to be bigger to represent more (4 bits in book to add NOR): (2) Or, (3) Add, (4) Subtract, and (5) Nor (Example: nor)
                   R-type     ori   lw    sw    beq
ALUop (Symbolic)   “R-type”   Or    Add   Add   Subtract
ALUop<1:0>         11         10    00    00    01
Well, the answer is 2 because we only need to represent 4 things: “R-type,” the Or operation, the Add operation, and the Subtract operation. If you are implementing the entire MIPS instruction set, then ALUop has to be 3 bits wide because we will need to represent 5 things: R-type, Or, Add, Subtract, and AND. Here I show you the bit assignment I made for the 3-bit ALUop. With this bit assignment in mind, let’s figure out what the local control, ALU Control, has to do.

Drawback of this Single Cycle Processor
Long cycle time: the cycle time must be long enough for the load instruction:
  PC’s Clock-to-Q
  + Instruction Memory Access Time
  + Register File Access Time
  + ALU Delay (address calculation)
  + Data Memory Access Time
  + Register File Setup Time
  + Clock Skew
The cycle time for load is much longer than needed for all other instructions.
Well, the last slide pretty much illustrates one of the biggest disadvantages of the single cycle implementation: it has a long cycle time. More specifically, the cycle time must be long enough for the load instruction, which has the following components: Clock-to-Q time of the PC, .... Having a long cycle time is a big problem but not the only problem. Another problem of this single cycle implementation is that this cycle time, which is long enough for the load instruction, is too long for all other instructions. We will show you why this is bad and what we can do about it in the next few lectures. That’s all for today.
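To make the load path concrete, here is a small worked sum. The delay values are made up purely for illustration (they are not from the slides or from any real process); only the list of terms comes from the slide.

```latex
% Load critical path, terms as listed on the slide:
T_{\text{cycle}} \;\ge\; t_{\text{clk-to-Q}} + t_{\text{IMem}} + t_{\text{RegRead}}
                  + t_{\text{ALU}} + t_{\text{DMem}} + t_{\text{RegSetup}} + t_{\text{skew}}

% Hypothetical delays (ns): 0.2, 2.0, 1.0, 2.0, 2.0, 0.2, 0.1
T_{\text{load}} = 0.2 + 2.0 + 1.0 + 2.0 + 2.0 + 0.2 + 0.1 = 7.5\ \text{ns}

% An R-type instruction skips the data memory access:
T_{\text{R-type}} = 7.5 - 2.0 = 5.5\ \text{ns}
```

Under these assumed numbers every instruction still pays the full 7.5 ns, so an R-type instruction leaves 2.0 ns of each cycle unused.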

Preview of Next Time: MultiCycle Data Path
CPI ≠ 1, cycle time much shorter (~1/5 of the single-cycle time)

Summary
Single cycle datapath => CPI = 1, CCT => long
5 steps to design a processor:
  1. Analyze instruction set => datapath requirements
  2. Select set of datapath components & establish clock methodology
  3. Assemble datapath meeting the requirements
  4. Analyze implementation of each instruction to determine the setting of control points that effect the register transfer
  5. Assemble the control logic
Control is the hard part
MIPS makes control easier:
  Instructions same size
  Source registers always in same place
  Immediates same size, location
  Operations always on registers/immediates
[Figure: the five components of a computer: Control, Datapath, Memory, Input, Output]

Where to get more information?
Chapter 5.1 to 5.4 of your text book: David Patterson and John Hennessy, “Computer Organization & Design: The Hardware / Software Interface,” Third Edition, Morgan Kaufmann Publishers, San Mateo, California, 2003.
One of the best PhD theses on processor design: Manolis Katevenis, “Reduced Instruction Set Computer Architecture for VLSI,” PhD Dissertation, EECS, U C Berkeley, 1982.
For a reference on the MIPS architecture: Gerry Kane, Joe Heinrich, “MIPS RISC Architecture,” Prentice Hall, 2nd edition, 1992.
If you want to find out more information on this topic, you should read Section 5.1 to 5.3 of your text book. One of the best books on RISC processor design is Manolis’ PhD thesis and you should be able to get a copy from the CS department here. Finally, if you want the official reference on the MIPS architecture, here is the book. OK, see you guys next Friday and good luck on the mid-term.

Bonus Slides
The following slides show how to go from tables that describe inputs and control lines to gates.
This can be done very efficiently by CAD tools, instead of by hand, for 2-level logic equations.

The Truth Table for ALUctr
                   R-type     ori   lw    sw    beq
ALUop (Symbolic)   “R-type”   Or    Add   Add   Subtract
ALUop<2:0>         100        010   000   000   001
funct<3:0>   Instruction Op.
0000         add
0010         subtract
0100         and
0101         or
1010         set-on-less-than
ALUop<2:0>   func<3:0>   ALU Operation   ALUctr<2:0>
0 0 0        x x x x     Add             0 1 0
0 x 1        x x x x     Subtract        1 1 0
0 1 x        x x x x     Or              0 0 1
1 x x        0 0 0 0     Add             0 1 0
1 x x        0 0 1 0     Subtract        1 1 0
1 x x        0 1 0 0     And             0 0 0
1 x x        0 1 0 1     Or              0 0 1
1 x x        1 0 1 0     Set on <        1 1 1
That is, whenever ALUop is 000, we don’t care anything about the func field because we know we need the ALU to do an ADD operation (point to Add column). Whenever ALUop bit<2> is 0 and bit<0> is 1, we know we want the ALU to perform a Subtract regardless of what the func field is. Bit<1> is a don’t care because, for our encoding here, ALUop<1> will never be equal to 1 whenever bit<0> is 1 and bit<2> is 0. Similarly, whenever ALUop bit<2> is 0 and bit<1> is 1, we need the ALU to perform Or. The tricky part occurs when ALUop bit<2> equals 1. In that case, we have an R-type instruction and we need to look at the func field. In any case, once we have this Symbolic column, we can get the actual bit columns by referring to our ALU table on the last slide (use the last slide if time permits).
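A truth table with don’t cares like this one maps fairly directly onto a Verilog casez statement. The sketch below is one possible behavioral description, not the course’s reference solution. The module name, port names, and ALU_* constant names are assumptions; the ALUctr encodings (And = 000, Or = 001, Add = 010, Subtract = 110, Set on < = 111) are the ones implied by the logic equations for ALUctr<2>, ALUctr<1>, and ALUctr<0> on the following slides.

```verilog
// Behavioral sketch of the local ALU Control block.
// Names are illustrative; ALUop is the 3-bit encoding used in this table.
module alu_control_sketch (
    input  wire [2:0] ALUop,
    input  wire [3:0] func,     // low four bits of the funct field
    output reg  [2:0] ALUctr
);
    localparam ALU_AND = 3'b000,
               ALU_OR  = 3'b001,
               ALU_ADD = 3'b010,
               ALU_SUB = 3'b110,
               ALU_SLT = 3'b111;

    always @(*) begin
        // '?' marks a don't-care bit; items are tried top to bottom.
        casez ({ALUop, func})
            7'b000_????: ALUctr = ALU_ADD;  // lw, sw
            7'b0?1_????: ALUctr = ALU_SUB;  // beq
            7'b01?_????: ALUctr = ALU_OR;   // ori
            7'b1??_0000: ALUctr = ALU_ADD;  // R-type add
            7'b1??_0010: ALUctr = ALU_SUB;  // R-type subtract
            7'b1??_0100: ALUctr = ALU_AND;  // R-type and
            7'b1??_0101: ALUctr = ALU_OR;   // R-type or
            7'b1??_1010: ALUctr = ALU_SLT;  // R-type set-on-less-than
            default:     ALUctr = 3'bxxx;   // assumption: other funct codes unused
        endcase
    end
endmodule
```

The patterns 0?1 and 01? overlap only on ALUop = 011, which this encoding never generates, so the order of those two items does not affect any instruction in the subset.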

The Logic Equation for ALUctr<2>
ALUop<2:0>   func<3:0>   ALUctr<2>
0 x 1        x x x x     1
1 x x        0 0 1 0     1
1 x x        1 0 1 0     1
The last two rows make func<3> a don’t care.
ALUctr<2> = !ALUop<2> & ALUop<0> + ALUop<2> & !func<2> & func<1> & !func<0>
From the truth table we had before the break, we can derive the logic equation for ALUctr bit 2 by collecting all the rows that have ALUctr bit 2 equal to 1, and this table is the result. Each row becomes a product term and we need to OR the product terms together. Notice that the last two rows are identical except for bit<3> of the func field. One is zero and the other is one. Together, they make bit<3> a don’t care term. With all these don’t care terms, the logic equation is rather simple. The first product term is: not ALUop<2> and ALUop<0>. The second product term, after making func<3> a don’t care, becomes ...

The Logic Equation for ALUctr<1>
ALUop<2:0>   func<3:0>   ALUctr<1>
0 0 0        x x x x     1
0 x 1        x x x x     1
1 x x        0 0 0 0     1
1 x x        0 0 1 0     1
1 x x        1 0 1 0     1
ALUctr<1> = !ALUop<2> & !ALUop<1> + ALUop<2> & !func<2> & !func<0>
Here is the truth table when we collect all the rows where ALUctr bit<1> equals 1. Once again, we can simplify the table by noticing that the first two rows differ only at the ALUop bit<0> position. We can make ALUop bit<0> into a don’t care. Similarly, the last three rows can be combined to make func bit<3> and bit<1> into don’t cares. Consequently, the logic equation for ALUctr bit<1> becomes ...

The Logic Equation for ALUctr<0>
ALUop<2:0>   func<3:0>   ALUctr<0>
0 1 x        x x x x     1
1 x x        0 1 0 1     1
1 x x        1 0 1 0     1
ALUctr<0> = !ALUop<2> & ALUop<1>
          + ALUop<2> & !func<3> & func<2> & !func<1> & func<0>
          + ALUop<2> & func<3> & !func<2> & func<1> & !func<0>
Finally, after we gather all the rows where ALUctr bit 0 is 1, we have this truth table. Well, we are out of luck here. I don’t see any simple way to simplify these product terms by just looking at them. There may be some if you draw out the 7-dimensional K-map but I am not going to try it. So I just write down the logic equation as it is.

The ALU Control Block
[Figure: ALU Control (Local) with inputs func (6 bits) and ALUop (3 bits) and output ALUctr (3 bits)]
ALUctr<2> = !ALUop<2> & ALUop<0> + ALUop<2> & !func<2> & func<1> & !func<0>
ALUctr<1> = !ALUop<2> & !ALUop<1> + ALUop<2> & !func<2> & !func<0>
ALUctr<0> = !ALUop<2> & ALUop<1>
          + ALUop<2> & !func<3> & func<2> & !func<1> & func<0>
          + ALUop<2> & func<3> & !func<2> & func<1> & !func<0>
With all the logic equations available, you should be able to implement this logic block without any problem. In your next homework assignment, all your control logic will be done in VHDL: you just describe your control logic as if you are writing a C program. It will be much easier and less error prone than what I show you here. Your TA will have a VHDL tutorial ready for you and it is very easy to learn.

Step 5: Logic for each control signal
    PCSrc    <= (OP == `BEQ) ? `Br : `plus4;
    // beq compares two registers, so it also takes busB as the ALU operand
    ALUsrc   <= ((OP == `Rtype) || (OP == `BEQ)) ? `regB : `immed;
    ALUctr   <= (OP == `Rtype) ? funct :
                (OP == `ORi)   ? `ORfunction :
                (OP == `BEQ)   ? `SUBfunction : `ADDfunction;
    ExtOp    <= _____________
    MemWr    <= _____________
    MemtoReg <= _____________
    RegWr    <= _____________
    RegDst   <= _____________

Step 5: Logic for each control signal
    PCSrc    <= (OP == `BEQ) ? `Br : `plus4;
    // beq compares two registers, so it also takes busB as the ALU operand
    ALUsrc   <= ((OP == `Rtype) || (OP == `BEQ)) ? `regB : `immed;
    ALUctr   <= (OP == `Rtype) ? funct :
                (OP == `ORi)   ? `ORfunction :
                (OP == `BEQ)   ? `SUBfunction : `ADDfunction;
    ExtOp    <= (OP == `ORi) ? `ZEROextend : `SIGNextend;
    MemWr    <= (OP == `Store) ? 1 : 0;
    MemtoReg <= (OP == `Load) ? 1 : 0;
    RegWr    <= ((OP == `Store) || (OP == `BEQ)) ? 0 : 1;
    RegDst   <= ((OP == `Load) || (OP == `ORi)) ? 0 : 1;
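For readers who want to try the Step 5 logic in a simulator, here is a minimal sketch that wraps it in a module with continuous assignments. The opcode values come from the control tables in these slides. Everything else is an assumption for illustration: the module and port names, the one-bit encodings chosen for the symbolic values, and the choice to reuse the MIPS funct codes (add = 100000, sub = 100010, or = 100101) as ALUctr values. Note that beq needs busB as the second ALU operand, so ALUsrc selects the register for beq as well as for R-type, which matches the ALUSrc row of the main control truth table.

```verilog
// Opcodes, as listed in the main control tables.
`define Rtype 6'b000000
`define ORi   6'b001101
`define Load  6'b100011
`define Store 6'b101011
`define BEQ   6'b000100

// Assumed encodings for the symbolic values (not given on the slide).
`define plus4      1'b0
`define Br         1'b1
`define regB       1'b0
`define immed      1'b1
`define ZEROextend 1'b0
`define SIGNextend 1'b1

// Assumption: ALUctr reuses the MIPS funct codes of add, sub, and or.
`define ADDfunction 6'b100000
`define SUBfunction 6'b100010
`define ORfunction  6'b100101

// Sketch only: module and port names are illustrative.
module step5_control_sketch (
    input  wire [5:0] OP, funct,
    output wire       PCSrc, ALUsrc, ExtOp, MemWr, MemtoReg, RegWr, RegDst,
    output wire [5:0] ALUctr
);
    assign PCSrc    = (OP == `BEQ) ? `Br : `plus4;
    assign ALUsrc   = ((OP == `Rtype) || (OP == `BEQ)) ? `regB : `immed;
    assign ALUctr   = (OP == `Rtype) ? funct :
                      (OP == `ORi)   ? `ORfunction :
                      (OP == `BEQ)   ? `SUBfunction : `ADDfunction;
    assign ExtOp    = (OP == `ORi) ? `ZEROextend : `SIGNextend;
    assign MemWr    = (OP == `Store) ? 1'b1 : 1'b0;
    assign MemtoReg = (OP == `Load)  ? 1'b1 : 1'b0;
    assign RegWr    = ((OP == `Store) || (OP == `BEQ)) ? 1'b0 : 1'b1;
    assign RegDst   = ((OP == `Load)  || (OP == `ORi)) ? 1'b0 : 1'b1;
endmodule
```

Continuous assignments are used instead of `<=` because the control block here is purely combinational: each output is a function of the current opcode only.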

The “Truth Table” for the Main Control
[Figure: Main Control takes op (6 bits) and produces RegDst, ALUSrc, ... and ALUop; ALU Control (Local) takes func and ALUop and produces ALUctr (3 bits)]
op                 00 0000    00 1101   10 0011   10 1011   00 0100    00 0010
                   R-type     ori       lw        sw        beq        jump
RegDst             1          0         0         x         x          x
ALUSrc             0          1         1         1         0          x
MemtoReg           0          0         1         x         x          x
RegWrite           1          1         1         0         0          0
MemWrite           0          0         0         1         0          0
PCSrc              0          0         0         0         1          0
Jump               0          0         0         0         0          1
ExtOp              x          0         1         1         x          x
ALUop (Symbolic)   “R-type”   Or        Add       Add       Subtract   xxx
ALUop<2>           1          0         0         0         0          x
ALUop<1>           0          1         0         0         0          x
ALUop<0>           0          0         0         0         1          x
Now that we have taken care of the Local Control (ALU Control), let’s refocus our attention on the Main Controller. The job of the Main Control is to look at the Opcode field of the instruction and generate these control signals for the datapath (RegDst, ... ExtOp) as well as the 3-bit ALUop field for the ALU Control. Here, I have shown you the symbolic value of the ALUop field as well as the actual bit assignment. For example, the R-type ALUop is encoded as 100 and the Add operation is encoded as 000. This is called a quote “Truth Table” unquote because, if you think about it, this is like having the truth table rotated 90 degrees. Let me show you what I mean by that.

The “Truth Table” for RegWrite
op         00 0000   00 1101   10 0011   10 1011   00 0100   00 0010
           R-type    ori       lw        sw        beq       jump
RegWrite   1         1         1         0         0         0
RegWrite = R-type + ori + lw
         = !op<5> & !op<4> & !op<3> & !op<2> & !op<1> & !op<0>   (R-type)
         + !op<5> & !op<4> & op<3> & op<2> & !op<1> & op<0>      (ori)
         + op<5> & !op<4> & !op<3> & !op<2> & op<1> & op<0>      (lw)
[Figure: op<5> .. op<0> feed one AND gate per column (R-type, ori, lw, sw, beq, jump); the R-type, ori, and lw terms are ORed to form RegWrite]
For example, consider the control signal RegWrite. If we treat all the don’t cares as zeros, this row here means RegWrite has to be equal to one whenever we have an R-type, an OR immediate, or a load instruction. Since we can determine whether we have any of these instructions (point to the column headers) by looking at the bits in the “OP” field, we can transform this symbolic equation into this binary logic equation. For example, the first product term here says we have an R-type instruction whenever all the bits in the “OP” field are zeros. So each of these big AND gates implements one of the columns (R-type, ori, ...) in our table. Or in more technical terms, each AND gate implements a product term. In order to finish implementing this logic equation, we have to OR the proper terms together. In the case of the RegWrite signal, we need to OR the R-type, ORi, and load terms together.

PLA Implementation of the Main Control
[Figure: op<5> .. op<0> feed an array of AND gates, one per column (R-type, ori, lw, sw, beq, jump), followed by an array of OR gates producing RegWrite, ALUSrc, RegDst, MemtoReg, MemWrite, Branch, Jump, ExtOp, ALUop<2>, ALUop<1>, ALUop<0>]
Similarly, for ALUSrc, we need to OR the ori, load, and store terms together because we need to assert the ALUSrc signal whenever we have the Ori, load, or store instructions. The RegDst, MemtoReg, MemWrite, Branch, and Jump signals are very simple. They don’t need to OR any product terms together because each is asserted for only one instruction. For example, RegDst is asserted ONLY for R-type instructions and MemtoReg is asserted ONLY for the load instruction. ExtOp, on the other hand, needs to be set to 1 for both the load and store instructions so the immediate field is sign extended properly. Therefore, we need to OR the load and store terms together to form the signal ExtOp. Finally, we have the ALUop signals. By clever encoding of the ALUop field, we are able to keep them simple so that no OR gates are needed. If you don’t already know, this regular structure with an array of AND gates followed by another array of OR gates is called a Programmable Logic Array, or PLA for short. It is one of the most common ways to implement logic functions and there are a lot of CAD tools available to simplify them.
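The AND-plane / OR-plane structure described here can be written down almost literally in Verilog. The following is a sketch under a few assumptions: the module, port, and is_* wire names are illustrative, don’t-care entries of the truth table are treated as 0, and ALUop uses the 3-bit encoding from the main control truth table (R-type = 100, Or = 010, Add = 000, Subtract = 001).

```verilog
// PLA-style sketch of the Main Control: product terms, then sums.
// Names are illustrative; don't cares in the truth table are taken as 0.
module main_control_pla_sketch (
    input  wire [5:0] op,
    output wire       RegDst, ALUSrc, MemtoReg, RegWrite,
    output wire       MemWrite, Branch, Jump, ExtOp,
    output wire [2:0] ALUop
);
    // AND plane: one product term per instruction column.
    wire is_rtype = ~op[5] & ~op[4] & ~op[3] & ~op[2] & ~op[1] & ~op[0]; // 00 0000
    wire is_ori   = ~op[5] & ~op[4] &  op[3] &  op[2] & ~op[1] &  op[0]; // 00 1101
    wire is_lw    =  op[5] & ~op[4] & ~op[3] & ~op[2] &  op[1] &  op[0]; // 10 0011
    wire is_sw    =  op[5] & ~op[4] &  op[3] & ~op[2] &  op[1] &  op[0]; // 10 1011
    wire is_beq   = ~op[5] & ~op[4] & ~op[3] &  op[2] & ~op[1] & ~op[0]; // 00 0100
    wire is_jump  = ~op[5] & ~op[4] & ~op[3] & ~op[2] &  op[1] & ~op[0]; // 00 0010

    // OR plane: each control signal ORs the columns where it is 1.
    assign RegWrite = is_rtype | is_ori | is_lw;
    assign ALUSrc   = is_ori | is_lw | is_sw;
    assign ExtOp    = is_lw | is_sw;

    // Signals asserted for a single instruction need no OR gate.
    assign RegDst   = is_rtype;
    assign MemtoReg = is_lw;
    assign MemWrite = is_sw;
    assign Branch   = is_beq;
    assign Jump     = is_jump;

    // ALUop: each bit is a single product term under this encoding.
    assign ALUop    = {is_rtype, is_ori, is_beq};
endmodule
```

A synthesis tool is free to minimize these terms further; writing them out in full mainly keeps the correspondence with the truth table easy to check by eye.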

A Real MIPS Datapath (CNS T0) So that’s all for today. See you guys Friday.

Putting it All Together: A Single Cycle Processor
[Figure: complete single-cycle processor. Main Control takes op = Instr<31:26> (6 bits) and drives RegDst, ALUSrc, MemtoReg, RegWr, MemWr, PCSrc, ExtOp, and ALUop (3 bits); ALU Control takes func = Instr<5:0> (6 bits) and ALUop and drives ALUctr (3 bits). The datapath contains the Instruction Fetch Unit, the 32 32-bit registers (ports Ra, Rb, Rw; buses busA, busB, busW), the Extender fed by imm16 = Instr<15:0>, the ALU with its Zero output, the Data Memory, and the RegDst, ALUSrc, and MemtoReg muxes]
OK, now that we have the Main Control implemented, we have everything we need for the single cycle processor and here it is. The Instruction Fetch Unit gives us the instruction. The OP field is fed to the Main Control for decoding and the Func field is fed to the ALU Control for local decoding. The Rt, Rs, Rd, and Imm16 fields of the instruction are fed to the datapath. Based on the OP field of the instruction, the Main Control will set the control signals RegDst, ALUSrc, ... etc. properly, as I showed you earlier using separate slides. Furthermore, the ALU Control uses the ALUop from the Main Control and the func field of the instruction to generate the ALUctr signals that ask the ALU to do the right thing: Add, Subtract, Or, and so on. This processor will execute each of the MIPS instructions in the subset in one cycle.