ECE 353 ECE 353 Fall 2009 Lab C Pipeline Simulator October 22, 2009.

Slides:



Advertisements
Similar presentations
CMPE 421 Advanced Parallel Computer Architecture Pipeline datapath and Control.
Advertisements

Mehmet Can Vuran, Instructor University of Nebraska-Lincoln Acknowledgement: Overheads adapted from those provided by the authors of the textbook.
COMP 4211 Seminar Presentation Based On: Computer Architecture A Quantitative Approach by Hennessey and Patterson Presenter : Feri Danes.
CIS 314 Fall 2005 MIPS Datapath (Single Cycle and Multi-Cycle)
Lecture Objectives: 1)Define pipelining 2)Calculate the speedup achieved by pipelining for a given number of instructions. 3)Define how pipelining improves.
CMPT 334 Computer Organization
1 COMS 361 Computer Organization Title: Instructions Date: 9/28/2004 Lecture Number: 10.
ECE 353 Lab C Pipeline Simulator. Aims Further experience in C programming Handling strings Further experience in the use of assertions Reinforce concepts.
Instruction-Level Parallelism (ILP)
1 A few words about the quiz Closed book, but you may bring in a page of handwritten notes. –You need to know what the “core” MIPS instructions do. –I.
ECE 353 ECE 353 Fall 2011 Lab C Pipeline Simulator October 20, 2011.
Chapter 12 Pipelining Strategies Performance Hazards.
EECS 470 Pipeline Hazards Lecture 4 Coverage: Appendix A.
CSCE 121, Sec 200, 507, 508 Fall 2010 Prof. Jennifer L. Welch.
Computer Architecture CPSC 321 E. J. Kim. Overview Logical Instructions Shifts.
Fall 2007 MIPS Datapath (Single Cycle and Multi-Cycle)
Pipeline Exceptions & ControlCSCE430/830 Pipeline: Exceptions & Control CSCE430/830 Computer Architecture Lecturer: Prof. Hong Jiang Courtesy of Yifeng.
CSCE 212 Quiz 9 – 3/30/11 1.What is the clock cycle time based on for single-cycle and for pipelining? 2.What two actions can be done to resolve data hazards?
Computer ArchitectureFall 2007 © October 3rd, 2007 Majd F. Sakr CS-447– Computer Architecture.
Computer Structure - Datapath and Control Goal: Design a Datapath  We will design the datapath of a processor that includes a subset of the MIPS instruction.
The Processor 2 Andreas Klappenecker CPSC321 Computer Architecture.
Chapter 12 CPU Structure and Function. Example Register Organizations.
Appendix A Pipelining: Basic and Intermediate Concepts
Processor I CPSC 321 Andreas Klappenecker. Midterm 1 Thursday, October 7, during the regular class time Covers all material up to that point History MIPS.
ECE 353 ECE 353 Fall 2007 Lab 3 Machine Simulator November 1, 2007.
The Processor Andreas Klappenecker CPSC321 Computer Architecture.
Group 5 Alain J. Percial Paula A. Ortiz Francis X. Ruiz.
MIPS Instruction Set Advantages
Chapter 2 Summary Classification of architectures Features that are relatively independent of instruction sets “Different” Processors –DSP and media processors.
CMPE 421 Parallel Computer Architecture
ECE 353 Lab 2 Pipeline Simulator. Aims Further experience in C programming Handling strings Further experience in the use of assertions Reinforce concepts.
ECE 353 Lab 2 Pipeline Simulator Additional Material.
Microarchitecture. Outline Architecture vs. Microarchitecture Components MIPS Datapath 1.
5/13/99 Ashish Sabharwal1 Pipelining and Hazards n Hazards occur because –Don’t have enough resources (ALU’s, memory,…) Structural Hazard –Need a value.
LECTURE 7 Pipelining. DATAPATH AND CONTROL We started with the single-cycle implementation, in which a single instruction is executed over a single cycle.
Simple ALU How to perform this C language integer operation in the computer C=A+B; ? The arithmetic/logic unit (ALU) of a processor performs integer arithmetic.
PROCESSOR PIPELINING YASSER MOHAMMAD. SINGLE DATAPATH DESIGN.
COM181 Computer Hardware Lecture 6: The MIPs CPU.
LECTURE 10 Pipelining: Advanced ILP. EXCEPTIONS An exception, or interrupt, is an event other than regular transfers of control (branches, jumps, calls,
MIPS Processor.
Simulator Outline of MIPS Simulator project  Write a simulator for the MIPS five-stage pipeline that does the following: Implements a subset of.
Pipelining: Implementation CPSC 252 Computer Organization Ellen Walker, Hiram College.
Interstage Buffers 1 Computer Organization II © McQuain Pipeline Timing Issues Consider executing: add $t2, $t1, $t0 sub $t3, $t1, $t0 or.
CS161 – Design and Architecture of Computer Systems
ECE 353 Lab 3 Pipeline Simulator
MIPS Instruction Set Advantages
CSCI206 - Computer Organization & Programming
ECE 353 Lab 3 Pipeline Simulator
MIPS Processor.
CS/COE0447 Computer Organization & Assembly Language
Design of the Control Unit for Single-Cycle Instruction Execution
Pipelining: Advanced ILP
Single-cycle datapath, slightly rearranged
Single-Cycle CPU DataPath.
Current Design.
Design of the Control Unit for One-cycle Instruction Execution
CSCE Fall 2013 Prof. Jennifer L. Welch.
MIPS Processor.
CSCI206 - Computer Organization & Programming
Control unit extension for data hazards
Guest Lecturer TA: Shreyas Chand
Instruction Execution Cycle
COSC 2021: Computer Organization Instructor: Dr. Amir Asif
Computer Architecture
CSCE Fall 2012 Prof. Jennifer L. Welch.
ECE 353 Lab 3 Pipeline Simulator
Control unit extension for data hazards
MIPS assembly.
Control unit extension for data hazards
MIPS Processor.
Presentation transcript:

ECE 353 ECE 353 Fall 2009 Lab C Pipeline Simulator October 22, 2009

2 ECE 353 Aims of Lab C  Reinforce your understanding of pipelining  Provide additional experience in C programming Managing queues  Introduce you to time-driven simulation

3 ECE 353 Outline of Lab  Write a simulator for the MIPS five-stage pipeline (covered in ECE 232) that does the following: Implements a subset of the instruction set (LW, SW, BEQ, ADD, SUB, MUL, ADDI) Reads from a file an assembly language program Simulates, cycle by cycle, the activity in all registers associated with that program Displays the values of all the integer registers (except for register 0) and the PC Gives the user the option of stepping through the simulation, cycle by cycle, or just checking the registers at the end of the program and the utilization of each stage

4 ECE 353 Types of Simulator  Event-Driven Simulator: Identifies events of interest in the system being simulated, orders them in time, and moves down the ordered list of events, simulating the machine as it goes. Example: Queuing network simulator for a computer network.  Time-Driven Simulator: Has a central clock and moves from one clock cycle to the next, simulating all the activity of interest in that clock cycle. Example: Architecture simulator

5 ECE 353 Five-Stage Pipeline  IF Stage: Fetches instructions If a conditional branch is encountered, the fetch unit stops fetching until that branch is resolved. The IF stage places fetched instructions in an instruction queue, to be consumed by the ID stage. The instruction queue can hold up to q instructions, where q is an input to the simulator. Assume the PC has its own dedicated adder to increment the byte address by 4 (or word address by 1).  ID Stage: Decodes the instructions in the instruction queue, one by one. Immediate operands are sign-extended to 32 bits Makes operands available to the EX stage Generates control signals

6 ECE 353 Five-Stage Pipeline (contd.)  EX Stage: Executes arithmetic/logical instructions BEQ instructions are resolved in this stage  MEM Stage: Carries out access to the data memory  WB Stage: Writes back into the register file (if necessary)

7 ECE 353 Simplifying Assumptions  Separate instruction and data memories; all memory accesses are hits. No need to model cache misses.  There is just one ALU and it is NOT pipelined. It takes m cycles for MUL n cycles for all other arithmetic and logic operations m and n are input parameters to the simulator Integer operations only: FP is not implemented  No forwarding is available in this pipeline  Ignore all interrupts  Register writes are done in the first half of a clock cycle; register reads in the second

8 ECE 353 Interaction between IF and ID stages  IF must not overrun the ID unit: if the instruction queue fills up, IF must stop fetching until there is at least one empty slot in the queue  ID must not overrun the IF unit

9 ECE 353 Parsing Instructions  The assembly language program must be read by the simulator You can use a case statement associated with each possible instruction, which tells the simulator what to do with that instruction.  When an instruction reaches the ID stage, controls governing the rest of its activity will be generated.

10 ECE 353 A Straightforward Approach  Mimic within the simulator what happens in the MIPS pipeline Generate the control signals associated with each instruction in the ID stage Pass these signals along, stage by stage, along with the instruction. Don’t forget to check for data hazards Use a branch_pending signal to guide the IF unit.  Every clock tick, mimic what is supposed to happen in each of the five stages and collect statistics on which stages are doing anything useful

11 ECE 353 Maintain Data Memory State  Keep track of the contents of the data memory  Each data memory access takes c cycles  Some data (as specified by the lab handout) will already be in the data memory when the program starts Make any reasonable assumption about the addresses of your data  No virtual memory: assume all addresses are physical  Data and instruction address spaces are separate

12 ECE 353 Instruction Memory  The program starts at location 0 of the instruction memory  The PC will point to this location at the beginning of the simulation  All addresses are physical: no virtual addresses (no need for TLBs or page tables)  Each memory access takes c cycles

13 ECE 353 Test Program 1  Multiplication of two 10X10 matrices: C=A X B.  Starting address of matrices A, B, and C are already in registers $1, $2, and $3 when execution begins (since you don’t have a virtual memory management system, don’t worry about how these addresses were loaded into these registers): pick any appropriate starting data addresses  The values of A and B for the test program are: A[i][j] = B[i][j] = i*2+j. Initialize your simulator by loading these values into the data memory: should be a separate function called d_initialize.

14 ECE 353 Simplifications  To make it easier for you to read in the assembly language program: The assembly language is redefined to do away with commas; spaces will do instead. No register names will be used: register numbers only. Example: add means add register 1 contents to those of register 2 and put the result in register 3. The SW and LW instructions are redefined to remove their offset field Example: lw 5 3 means to load into register 5 from the memory at the location pointed to by register 3.  For 10% extra credit, remove all these limitations and enable the simulator to accept the standard MIPS assembly language format.

15 ECE 353 Simplifications (contd.)  Don’t use labels in your program: instead the beq instruction will identify its target by an offset. Remember from ECE 232 that the offset is with respect to the instruction following beq and is in units of words, not bytes Example: beq means that if the contents of registers 4 and 5 are identical, we branch to the instruction immediately preceding beq.  For 5% extra credit, endow the simulator with the ability to handle instruction labels.

16 ECE 353 Reading the Assembly Language Program  C provides a number of ways in which to read input. For example, fscanf(fp, “%s %d %d %d”, opcode, &field2, &field3, &field4); where fp is a file pointer, opcode is a character array, and field2, field3, and field4 are integers

17 ECE 353 Parsing the Assembly Language Program  Some useful library functions (remember to #include ): strcpy: Copies one string into another strcmp: Compares two strings. Note that this will give you an output of 0 if the two strings match.

18 ECE 353 Test Program 2  Use quasi-instructions: these don’t do anything other than take up time in the pipeline stages.  A quasi-instruction: Consumes one cycle to decode and takes one cycle to pass through each of the MEM and WB stages. Takes a random amount of time in the EX stage: picks any of the integers in {a, a+1, …, b} with equal probability and uses that as its execution time. Has no impact on the register contents.

19 ECE 353 Intermediate Checkpoints Show a TA the following items:  Code for reading the assembly program (by November 5 – carries deadline-sensitive 5% credit): Write assembly code to multiply two 10X10 matrices Demonstrate that your code for reading the assembly code from a file works correctly  Code for Instruction Queue Management: (by November 10 – carries deadline-sensitive 5% credit): Interaction between IF and ID units Handling unresolved branches Demonstrate that your code works correctly

20 ECE 353 Simulator Output  The simulator can be set to operate in either of two modes: single-step or batch  Output consists of Contents of each of the 31 registers $1 to $31 and that of the PC (in decimal) The number of clock cycles that have elapsed Utilization of each stage IF, ID, EX, MEM, WB. In single-step mode, this will be the utilization so far; in batch mode, this will be the utilization for the entire program The utilization of a stage is the number of cycles during which that stage did anything useful divided by the total number of cycles

21 ECE 353 Lab Report  Description of the structure of your simulator: Include How the IF stage manages branches and how it knows that branches have been resolved How hazards are handled How instructions are prevented from progressing if there’s a data or structural hazard How you implement single-step mode  Source code for your simulator Fully document your code Keep the code as well-structured as possible  Matrix multiplication program: For the parameter values specified in the lab document, plot the following: Assembly language code Total execution time Fraction of fetch buffer utilized Utilization of each pipeline stage  Your strategy for testing the simulator for correctness.

ECE 353 Doug Frazer Tips from 2008 TA

23 ECE 353 Tips from TA (Doug Frazer)  Start with the queue: Write enqueue and dequeue functions Test them thoroughly against a controlled set of instructions Be sure to test things such as: Calling dequeue on an empty queue Handling NULL next and prev pointers on dequeue and enqueue  Handle dependencies on queue  Handle different types of MIPS commands  Test with MIPS

24 ECE 353 Tips from TA  The assembly and C are relatively independent One person can work on each  If you need to test the C and do not have working assembly, try running it against MIPS from your hardware book.  Debug with printf() statements.

25 ECE 353 Intro to Structures and enums  Generic data structure: struct Define necessary information to be grouped together  Precompiler ‘labels’: enum Makes the code easier to read and debug by using BEQ instead of an integer to represent BEQ Gets compiled to an integer, first element is 0 and it increments by 1 unless you specify a different value  Useful for this lab: Define a struct for each instruction Define an enum for registers Define an enum for instruction type

26 ECE 353 Suggested struct  Suggested struct for this lab struct instruction { enum OPCODE opcode; // What instruction is being run enum REGISTERS field2; // Register to be written to enum REGISTERS field3; // First register being referenced enum REGISTERS field4; // Second register being referenced enum STAGE stage; // Current stage of execution of this instruction int resolved = 0; // Flag for “waiting for dependency” struct instruction * next; // Pointer to the next struct in queue struct instruction * prev; // Pointer to previous struct in queue };  MUL $t1 $t2 $t3