Presentation on theme: "ECE 353 Lab C Pipeline Simulator. Aims Further experience in C programming Handling strings Further experience in the use of assertions Reinforce concepts."— Presentation transcript:
Aims Further experience in C programming Handling strings Further experience in the use of assertions Reinforce concepts from pipelining Techniques for simulating parallel operations with a sequential program
Statement of Work Write a C program to simulate the register contents of a MIPS machine running an abbreviated instruction set Should be cycle-accurate with respect to register contents. Simulates a machine consisting of add, sub, addi, mul, lw, sw, beq instructions only. Will need to provide some error detection wrt the assembly program input.
Simulation Approaches State change in simulation Event-driven Simulation: Follows events as they occur. Time-driven Simulation: Driven by a clock; does something every clock transition. Computational Platform Sequential: Simpler to program and debug; slower. Parallel: More complicated program (need to take interactions into account); much higher throughput (potentially). Lab C involves a sequential, time-driven simulation.
Inputs to the Simulator MIPS assembly language program which will be in a file read by the simulator. Parameters of the simulated machine: Memory access time: c cycles. Multiply time: m cycles. All other EX operations: n cycles. ID and WB stages take one cycle to complete their tasks.
Memory Separate instruction and data memories (separate address spaces). Instruction Memory: Addressed by the PC. Size: 2K bytes (512 words). Data Memory: Accessible to the program. Size: 2K bytes (512 words). Implement both as linear arrays.
Processor Five stages: IF, ID, EX, MEM, WB Each stage follows the rules you learned in ECE 232, except that IF and EX can take multiple cycles to complete an operation.
IF Fetches from the Instruction Memory. Will have to freeze if there is an unresolved branch. Branches are detected in the ID stage. When a branch is detected, no further instructions are decoded until the branch is resolved. Branches are resolved in the EX stage where a subtraction operation is carried out and the result used to determine if the branch is taken or not.
ID Decodes instructions (takes 1 cycle to do this). Checks for data (RAW) and control hazards. Holds up further processing until these hazards have been resolved (no forwarding in this machine). Provides the EX stage with the operands it needs.
EX Executes the specified operation on the operands it receives from ID. Can take multiple cycles depending on the operation. EX is not itself pipelined; only one instruction can be in EX at any time. Use a counter to keep track of ongoing operations.
MEM Is only used by the lw and sw instructions. Receives the data memory address from EX and carries out the data memory read/write operation. Can take multiple cycles to do this (c cycles).
WB Writes back into the register file. Will have nothing to do if the instruction does not do a register write. Register reads/writes follow the approach in Hennessy & Patterson.
Simulator Modules progScanner(): Reads from the file provided by the user, containing the assembly language program. Note that this will have to flexible in dealing with empty spaces. parser(): From the instruction passed to it by progscanner(), break it up into its constituent fields (opcode, source registers, immediate field, destination register). Detect a certain subset of errors in the program : Illegal opcode Unrecognized register name Register number out of bounds Immediate field that is too large
Simulator Modules (contd.) Pipeline stage modules: IF ID EX MEM WB Pipeline stages will communicate via latches separating them. A latch can only hold information relating to a single instruction. Deal with structural hazards. Use valid/empty bits to specify if (a) the contents of a latch are valid for use by the stage to its right and (b) the latch is available to be written into by the stage to its left.
Simulator Execution We need to simulate parallel operations with a sequential program. How? Maintain a counter to represent the clock value. Each time the counter increments (representing an advance in time by one cycle), call the stages in reverse order: WB, MEM, EX, ID, IF. Why should this work?
Assertions Use assertions to help with correctness. Assertions should have been thought through and contribute meaningfully to the reliability of the program. Preconditions, postconditions, invariants can all be checked. For a good introduction, see http://ptolemy.eecs.berkeley.edu/~johnr/tutorials/assertions.html http://ptolemy.eecs.berkeley.edu/~johnr/tutorials/assertions.html
Some Design Issues Coordinating producer and consumer operation with respect to the latch between them. Multicycle instruction handling Reading assembly instruction lines and parsing them Information that needs to be carried along with the instruction down the pipeline Dealing with hazards: Structural: Can only move from a stage if the next stage is free. Data: RAW. Control: Pending branch freezes all subsequent ops until it is resolved.
Producer/Consumer Latch Management Rules for correct functioning: Producer stage should not load into a latch before the consumer has consumed the previous item it placed there. Consumer stage should not read from a latch before the producer has placed a new item there. Solution: Use a validItem bit, stored in the latch itself. Set by the producer when it places a new item in the latch Cleared by the consumer Must be done in the same cycle in which the consumer places its result in its output latch or writes back to the register file. This approach is only safe since there is no race condition between producer and consumer stages in the simulation. Do not try this if doing a hardware implementation.
Multicycle Operations IF takes c cycles to access instruction memory EX takes m cycles for multiply and n for other operations MEM takes c cycles to access data memory for lw and sw instructions Implementation: Use a counter initialized appropriately when an operation starts in a stage. The counter will be decremented each cycle.
Reading Assembly Code Need to find a suitable function from the standard C library for this. fscanf() is not the most convenient option: why? There are other options, e.g., char *fgets(char *string, int n, FILE *stream);
Parsing Assembly Code Break the code down into the opcode and other fields. Opcode: Is delimited to its right by at least one empty space. Other fields: Delimited by a comma. Empty spaces can be ignored. Programming Approaches: Walk down the string returned by fgets() looking for delimiters. Use a library function strtok(). Caution: strtok() modifies its first argument… Some useful functions: strcmp(); atoi(); isdigit(). Make sure you know what these do before using them!
Some error detection Illegal opcodes: Match the opcode field against all legal opcodes (the MIPS subset plus haltSimulation ). Register not recognized or number out of bounds: Registers can be specified either symbolically (e.g., $s0, $t5) or numerically (e.g., $8): check first character beyond $ to see which it is. If numerically specified, registers must be numbered 0-31. If symbolically specified, the character string must match a legal register name.
Error detection (continued) Immediate field must be representable in 16 bits (two’s complement). Check if the numerical quantity is within bounds. Memory accesses must be aligned to integer boundaries Data addresses must be a multiple of 4. Memory alignment problems are detected by MEM; other errors are detected by ID. When an error is detected, the simulator should print an error message specifying the error type and then stop.
Carrying information down the pipeline Since control signals are not required to be simulated, you can pass the entire instruction down the pipeline. To handle beq, carry the PC as well. exResult will go from EX to MEM and then to WB.
Hazards: Structural Each stage can be occupied by no more than one instruction at any time. A pipeline latch will be cleared by the consuming stage only in the cycle in which it fills the latch to its right, if any. A completed operation with an uncleared latch to its right has to wait until that latch has been cleared (by setting validItem=0 ) by the next stage. This can cause backpressure in the pipeline.
Hazards: Control beq is the only instruction in this set that causes a control hazard. When beq is decoded in ID: A branchPending flag is set by ID. branchPending=1 causes the instruction following the one decoded to be removed. IF idles while a branch is pending; it can only resume in the cycle after branchPending has been reset (be careful about how you implement this since you are calling IF() after EX() in your simulation!). The branch is resolved in EX. If branch is taken, PC is set to point to the target instruction, else to the next instruction immediately following the beq. EX resets branchPending.
Hazards: Data Three types of data hazard: RAW, WAR, WAW. Due to the linear nature of the simple MIPS pipeline, MIPS is only subject to RAW hazards. Detected in the ID stage. ID checks to see if any latch in front of it has an instruction producing data that will be consumed by the instruction it is decoding. If so, hold that instruction in ID until all such hazards resolve.
Statistics Stage Utilization: Fraction of cycles for which the stage is doing useful work. Useful work does NOT include waiting for some hazard to clear. Keep a counter in each function which counts the number of cycles the stage has been busy. Global. Static. Remember to initialize all counters!
Implementation: IF Stage Takes the PC value and fetches the appropriate instruction from the Instruction Memory (takes c cycles for this). Sends the instruction along with its PC value to the next stage. Freezes whenever instructed to (IF_ID latch full or branchPending set). Increments PC and starts fetching next instruction unless it is made to freeze. Increments usefulCyclesIF counter whenever it is doing useful work.
Implementation: ID Stage When IF_ID latch has a valid element, decodes the instruction. Detects all data hazards. If the instruction is beq, sets branchPending flag. Waits for any structural hazards to clear. Fetches the operands for use by the EX stage and places them in the ID_EX latch.
Implementation: EX Stage Does not need to worry about data or control hazards; these have been resolved by the time this instruction has got to EX. Calculates the result of the requested operation: this takes m cycles for MULT and n cycles for all other operations. If the EX_MEM latch is free, places the result in this latch and clears the ID_EX latch for use by ID; else waits for the EX_MEM latch to clear.
Implementation: MEM Stage All non-memory instructions simply pass through this stage. lw and sw instructions involve memory access which takes c cycles. sw instructions complete at MEM; the result for lw is passed to the WB stage along with the instruction. Does MEM have to worry about structural hazards?
Implementation: WB Stage Has no latch in front of it; hence no structural hazards can impede it. Takes just one cycle. Does nothing for beq and sw instructions.
Submission Summary sheet on Moodle Authorship/testing information. Call graph of code Testing procedures used Assertions used Upload program code on quark. Only a single file should be uploaded: no zipfiles.