DMY 16-bit RISC Microprocessor Cecilia Florescu Mojdeh Makabi Daniel Yee December 2, 2002 CS M152B.


1 DMY 16-bit RISC Microprocessor
Cecilia Florescu, Mojdeh Makabi, Daniel Yee
December 2, 2002
CS M152B

2 DMY Overview
- Purpose: Design a pipelined RISC microprocessor
- Design Platform: Xilinx ISE 4.1, ModelSim 5.6, Visual C++ 6.0, Windows 2000 Professional

3 DMY Pipelining
It acts like an assembly line
[Figure: Ford's auto assembly line with Stations 1 to 4, comparing sequential auto production with pipelined auto production over time]

4 DMY Pipelined RISC
RISC is an acronym for Reduced Instruction Set Computer
- It has a reduced and simple instruction set
- It has a large number of general-purpose registers
In our Pipelined RISC Processor:
- Each instruction takes 1 clock cycle for each stage
- The processor can accept 1 new instruction per clock
- Instructions are processed in stages as they pass down the pipeline
- Multiple instructions are in some phase of execution concurrently
- Pipelining doesn't improve the latency of instructions (each instruction still requires the same amount of time to complete)
- It does improve the overall throughput
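The latency versus throughput point can be illustrated with a small cycle count. This is a minimal sketch assuming an ideal five-stage pipeline (IF, ID, EX, MEM, WB, as on slides 6 to 10) with no stalls or hazards; the function names are hypothetical and not part of the DMY design.

```python
def sequential_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles when each instruction finishes all stages before the next starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles for an ideal pipeline: fill once, then retire one instruction per cycle."""
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


STAGES = 5  # assumption: IF, ID, EX, MEM, WB

for n in (1, 10, 1000):
    seq = sequential_cycles(n, STAGES)
    pipe = pipelined_cycles(n, STAGES)
    print(f"{n:5d} instructions: sequential={seq:5d} cycles, pipelined={pipe:5d} cycles")
```

A single instruction still takes 5 cycles either way (same latency), while 1000 instructions take 5000 cycles sequentially versus 1004 cycles pipelined (higher throughput).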

5 DMY Pipelined RISC Design

6 DMY Instruction Fetch Stage

7 DMY Instruction Decode Stage

8 DMY Execution Stage

9 DMY Memory Access Stage

10 DMY Write Back Stage

11 DMY Modified Pipelined RISC Design
- 16-bit ISA
  - 16-bit fixed-length instructions, 16 registers
  - no funct field for R-type, only op field
  - limited number of operations
  - 4-bit opcode field => maximum 16 operations
Instruction formats:
  R-type: opcode | rs | rt | rd
  I-type: opcode | rs | rt | address
  J-type: opcode | target address
  Suggested R-type: opcode | rs | rt | rd | funct
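The R-type and J-type formats can be sketched as bit packing. This is only an illustration: the slide gives the field names and the 4-bit opcode, while the 4-bit register fields follow from having 16 registers; the exact bit positions (opcode in the top nibble, a 12-bit J-type target) and the opcode values used below are assumptions.

```python
def _check_field(name: str, value: int, width: int) -> None:
    """Reject values that do not fit in an unsigned field of the given width."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"{name}={value} does not fit in {width} bits")


def encode_r(opcode: int, rs: int, rt: int, rd: int) -> int:
    """Pack an R-type word: opcode | rs | rt | rd, 4 bits each (assumed layout)."""
    for name, value in (("opcode", opcode), ("rs", rs), ("rt", rt), ("rd", rd)):
        _check_field(name, value, 4)
    return (opcode << 12) | (rs << 8) | (rt << 4) | rd


def decode_r(word: int) -> tuple:
    """Unpack a 16-bit R-type word into (opcode, rs, rt, rd)."""
    return ((word >> 12) & 0xF, (word >> 8) & 0xF, (word >> 4) & 0xF, word & 0xF)


def encode_j(opcode: int, target: int) -> int:
    """Pack a J-type word: 4-bit opcode and an assumed 12-bit target address."""
    _check_field("opcode", opcode, 4)
    _check_field("target", target, 12)
    return (opcode << 12) | target


# Hypothetical opcode value 0x2, registers $1, $2, $3.
word = encode_r(0x2, 1, 2, 3)
print(f"R-type word: 0x{word:04X}, fields: {decode_r(word)}")
```

With no funct field, every operation consumes one of the 16 opcode values, which is the limitation the slide points out.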

12 DMY Multiplier Algorithms
- Pencil-and-paper method
  - requires M cycles for one NxM multiplication
  - implemented with AND, adder, and shift register
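The pencil-and-paper method can be sketched in software as a shift-and-add loop, one iteration per multiplier bit. This is a minimal sketch for unsigned operands; the function name and the `m_bits` parameter are hypothetical.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, m_bits: int) -> int:
    """Unsigned shift-and-add multiplication, one loop iteration per 'cycle'."""
    product = 0
    for cycle in range(m_bits):
        # AND step: select the multiplicand only if this multiplier bit is 1.
        if (multiplier >> cycle) & 1:
            # Shift and add: accumulate the multiplicand at this bit's weight.
            product += multiplicand << cycle
    return product


print(shift_add_multiply(13, 11, 4))
```

The loop runs `m_bits` times regardless of the operand values, which mirrors the M cycles the slide mentions.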

13 DMY Multiplier Algorithms
- Array Multiplier

14 DMY Multiplier Algorithms
- Modified Booth Encoding (MBE)
  - reduces number of partial products by N/2 for MxN multiplication
  - performs parallel encoding vs. serial encoding in original Booth
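A minimal sketch of radix-4 (modified) Booth recoding follows, assuming an even-width two's complement multiplier. Each overlapping group of three bits becomes one signed digit in {-2, -1, 0, 1, 2}, so an N-bit multiplier yields N/2 partial products. The function names are hypothetical, and this models the arithmetic only, not the DMY hardware.

```python
def booth_radix4_digits(multiplier: int, n_bits: int) -> list:
    """Recode an n_bits two's complement multiplier into n_bits/2 signed digits."""
    if n_bits % 2:
        raise ValueError("n_bits must be even for radix-4 recoding")
    y = multiplier & ((1 << n_bits) - 1)  # raw two's complement bit pattern
    digits = []
    prev = 0  # implicit bit y[-1] = 0
    for i in range(0, n_bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        # digit = y[i-1] + y[i] - 2*y[i+1]; each digit depends only on its own
        # three bits, so all digits can be encoded in parallel in hardware.
        digits.append(prev + b0 - 2 * b1)
        prev = b1
    return digits


def booth_multiply(multiplicand: int, multiplier: int, n_bits: int) -> int:
    """Sum the n_bits/2 partial products, each weighted by a power of 4."""
    digits = booth_radix4_digits(multiplier, n_bits)
    return sum(d * multiplicand * 4 ** i for i, d in enumerate(digits))


print(booth_radix4_digits(7, 8))
print(booth_multiply(-5, -3, 8))
```

For an 8-bit multiplier the recoding produces 4 digits, so 4 partial products are summed instead of 8.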

15 DMY Multiplier Algorithms
- Wallace Tree
  - increases speed of summing by increased parallelism
  - all bits of PP in each column are added independently and simultaneously
  - x-2 compressor composed of CSAs; x := the number of PPs in column
[Figure: 9-2 Compressor for column j, built from 3-2 and 4-2 compressors; inputs P0j to P8j and carries-in c1(j-1) to c6(j-1); intermediate carries-out c1j to c6j; outputs Carry[j] and Sum[j]]
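The carry-save idea behind the compressors can be sketched at word level. This is an illustration under stated assumptions: it applies 3-2 compressors (CSAs) to whole rows of unsigned partial products rather than building the per-column 9-2 compressor from the figure, and the function names are hypothetical.

```python
def csa(a: int, b: int, c: int) -> tuple:
    """3-2 compressor on whole words: three addends in, (sum, carry) out."""
    s = a ^ b ^ c                                # bitwise sum without carries
    carry = ((a & b) | (a & c) | (b & c)) << 1   # carries move one column left
    return s, carry


def wallace_reduce(partial_products: list) -> int:
    """Compress rows three at a time until two remain, then do one final add."""
    rows = list(partial_products)
    while len(rows) > 2:
        next_rows = []
        # Groups of three rows are independent, so hardware handles them in parallel.
        for i in range(0, len(rows) - 2, 3):
            next_rows.extend(csa(rows[i], rows[i + 1], rows[i + 2]))
        # Rows that did not fill a group of three pass through to the next level.
        next_rows.extend(rows[len(rows) - len(rows) % 3:])
        rows = next_rows
    return sum(rows)  # stands in for the final carry-propagate adder


def wallace_multiply(a: int, b: int, n_bits: int = 8) -> int:
    """Unsigned multiply: AND-generated partial products reduced by the tree."""
    pps = [(a << i) if (b >> i) & 1 else 0 for i in range(n_bits)]
    return wallace_reduce(pps)


print(wallace_multiply(200, 123))
```

Each CSA level shrinks the row count by roughly a third without propagating carries across the word, which is where the speedup in summing comes from.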

16 DMY Multiplier Design
- Issues and Solutions
  - limited opcode size
    - made NOP instruction ADD $0, $0, $0 => freed one opcode
    - ADD instruction doesn't change register $0 (constant zero value)
  - latency v. simplicity
    - multiplier lies in critical path; must calculate product in one cycle
    - algorithms trade simplicity of control and/or wiring for faster speed
    - multiplier latency not detrimental if n is small enough => 8x8 multiplier
  - negative and positive integer multiplication
    - 8 LSB of 16-bit operand taken as a two's complement number
    - sign detection unit detects signs of operands and sets product sign
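The signed 8x8 behavior described above can be sketched as follows. This assumes the sign detection unit works by multiplying magnitudes and then negating the result when the operand signs differ; the slides do not spell out the mechanism, so treat this as one plausible reading. The function names are hypothetical.

```python
def to_signed8(operand16: int) -> int:
    """Interpret the 8 least significant bits of a 16-bit operand as two's complement."""
    low = operand16 & 0xFF
    return low - 0x100 if low & 0x80 else low


def signed_multiply_8x8(a16: int, b16: int) -> int:
    """Multiply the low bytes as signed values; return the 16-bit product pattern."""
    a = to_signed8(a16)
    b = to_signed8(b16)
    # Sign detection (assumed scheme): product is negative when the signs differ.
    negative = (a < 0) != (b < 0)
    magnitude = abs(a) * abs(b)  # unsigned magnitude multiply, at most 128 * 128
    product = -magnitude if negative else magnitude
    return product & 0xFFFF      # 16-bit two's complement encoding


print(hex(signed_multiply_8x8(0x00FB, 0x0003)))
```

The upper byte of each operand is ignored, so `0x1280` and `0x0080` are both treated as -128. The largest magnitude, 128 x 128 = 16384, still fits in a signed 16-bit result.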

17 DMY Exception Managing Hardware
- Pipeline Modifications
  - EPC register tracks the problematic instruction
  - EPC_2 register to hold the instruction to return to, if allowed
  - Expansion of control unit to detect overflow signal and handle exception
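A rough software model of the overflow signal and the two registers is sketched below. It assumes signed 16-bit addition, and it assumes the return address saved in EPC_2 is simply the next instruction (`pc + 1`); the slides do not state how EPC_2 is computed, and the function names are hypothetical.

```python
MASK16 = 0xFFFF
SIGN16 = 0x8000


def add16_with_overflow(a: int, b: int) -> tuple:
    """Add two 16-bit patterns; flag signed (two's complement) overflow."""
    result = (a + b) & MASK16
    # Overflow occurs when both operands share a sign and the result's sign differs.
    same_sign_inputs = (a & SIGN16) == (b & SIGN16)
    overflow = same_sign_inputs and (result & SIGN16) != (a & SIGN16)
    return result, overflow


def record_overflow(pc: int, a: int, b: int):
    """Return the register values an overflow would latch, or None if there is none."""
    _, overflow = add16_with_overflow(a, b)
    if not overflow:
        return None
    return {
        "EPC": pc,        # address of the problematic instruction
        "EPC_2": pc + 1,  # assumption: return to the following instruction
    }


print(add16_with_overflow(0x7FFF, 0x0001))
print(record_overflow(103, 0x7FFF, 0x0001))
```

Note that `0xFFFF + 0x0001` wraps to zero without a signed overflow, since it represents -1 + 1.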

18 DMY Arithmetic Overflow Handler Software Support
- Assurance that MEM and WB stages of pipeline continue execution

19 DMY Arithmetic Overflow Handler Software Support
- Assurance that MEM and WB stages of pipeline continue execution
- Interruption of program

20 DMY Arithmetic Overflow Handler Software Support
- Assurance that MEM and WB stages of pipeline continue execution
- Interruption of program
- Request to involve the operating system

21 DMY Arithmetic Overflow Handler Software Support
- Assurance that MEM and WB stages of pipeline continue execution
- Interruption of program
- Request to involve the operating system
- Enhancement of ISA
  - MFCO - move from coprocessor
  - JR - jump to address stored in reserved register

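22 DMY Overflow Example
Instruction stored at address 103: [instruction and operand values lost in extraction]
Note: 2^16 = 65536 [comparison garbled in extraction; surviving bound: 65559]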

23 DMY Conclusion
- 16-bit processor, enhanced with a multiplier and able to detect arithmetic overflow
- Harvard Architecture model for memory management
- 14 multipurpose, 2 reserved registers
- Advantages and disadvantages of designed 16-bit ISA


