Divide Calculation Latency

Presentation transcript:

8-bit Divider Design
Chia-Chen Wang, Yin Chun Yeung

Outstanding Features of My Design:
Non-Restoring Divider Architecture
8-bit Carry Select Add/Subtract Unit
8 Cycles of Calculation + 1 Cycle of Initializing the ALU
Logical Effort Techniques in Critical Path Optimization

Summary of main results:
Critical Path Delay: 0.3 ns
Divide Calculation Latency: 8 cycles
Clock Period: 0.6 ns
Overall Delay: ~4.8 ns

Divider Architecture

Introduction: The goal of this project is to design an 8-bit binary divider that minimizes the overall delay of a divide operation. The delay is the number of clock cycles needed to perform a computation multiplied by the clock period: Delay = Ncycles · TClk. The task is therefore to find the compromise between Ncycles and TClk that minimizes the overall delay of the divide operation.

Flowchart:
Start: Place Dividend in Remainder.
1. Subtract the Divisor register from bits 7-14 of the Remainder register and place the result in the Remainder register.
Test Remainder: Remainder >= 0 (branch a) or Remainder < 0 (branch b).
2a. Subtract the Divisor register from bits 7-14 of the Remainder register and place the result in the Remainder register.
2b. Add the Divisor register to bits 7-14 of the Remainder register and place the result in the Remainder register.
3a. Shift the Quotient register, setting the rightmost bit to 1, and shift bits 0-6 of the Remainder to the left, setting the rightmost bit to 0.
3b. Shift the Quotient register, setting the rightmost bit to 0, and shift bits 0-6 of the Remainder to the left, setting the rightmost bit to 0.
nth repetition? No (< n repetitions): repeat. Yes (n repetitions, n = 8 here): DONE.

Design Methodology:
1. Schematic: There are two common division algorithms, Restoring and Non-Restoring, that we could apply to our design. Because we want to minimize the overall delay of a division, we chose the Non-Restoring algorithm, which needs fewer cycles and fewer operations per calculation.

[Block diagram: Divisor, Quotient, Add/Subtract, Adder, Shift Left, Control, Remainder]
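The flowchart above can be illustrated with a short behavioral sketch. This is the textbook non-restoring algorithm for unsigned operands, not the authors' exact datapath: the function name `nonrestoring_divide`, the signed Python integer used for the partial remainder, and the final correction step are assumptions made for illustration, and the register bit ranges from the slide are not modeled.

```python
def nonrestoring_divide(dividend, divisor, n=8):
    """Textbook non-restoring division of two unsigned n-bit integers.

    Returns (quotient, remainder). Hypothetical helper for illustration only.
    """
    if not 0 <= dividend < (1 << n):
        raise ValueError("dividend must fit in n bits")
    if not 0 < divisor < (1 << n):
        raise ValueError("divisor must be positive and fit in n bits")

    remainder = 0  # signed partial remainder
    quotient = 0
    for i in range(n - 1, -1, -1):
        # Shift the partial remainder left and bring in the next dividend bit.
        shifted = 2 * remainder + ((dividend >> i) & 1)
        # Subtract if the previous remainder was non-negative, otherwise add.
        if remainder >= 0:
            remainder = shifted - divisor
        else:
            remainder = shifted + divisor
        # The new quotient bit is 1 when the new remainder is non-negative.
        quotient = (quotient << 1) | (1 if remainder >= 0 else 0)

    # A negative final remainder is corrected once, after the last iteration.
    if remainder < 0:
        remainder += divisor
    return quotient, remainder
```

Each loop iteration performs exactly one add or subtract, which is why the algorithm needs one adder pass per quotient bit (n = 8 passes here), whereas a restoring divider may need an extra restore step after a failed subtraction.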

Divider Schematic

[Schematic blocks: 8 Bit Shift Register w/ Reset (Quotient); 8 Bit Shift Register w/ Reset (Dividend, bits 0-7); 8 Bit Register (Divisor); 1 Bit Register; 8 Bit Register w/ Reset (Remainder, bits 8-15)]

2. Connection Design: Because the divisor is given to be positive, we can save one cycle by connecting bits 7-14 of the remainder to the input of our 8-bit adder. As a result, an 8-bit by 8-bit division takes 8 cycles.

Detailed Schematics

Carry Select Architecture: We use Mirror Adder cells to form a 4-bit adder. Exploiting the inverting property of the mirror adder, we arrange the adder cells as shown in the figure. To speed up the addition, we use carry select to connect 4-bit adders into an 8-bit adder.

3. Adder Design: The Non-Restoring scheme needs both add and subtract operations. To build the 8-bit Add/Subtract Unit, we use three 4-bit adders. The first 4-bit adder, whose inputs pass through XOR gates (making the unit switchable between add and subtract), generates SUM0 to SUM3 and COUT3. The second 4-bit adder generates the upper sums for the case COUT3 = 1, and the third generates them for the case COUT3 = 0. Finally, a MUX selects the output.

[Schematics shown: 8 Bit Shift Register w/ Reset; 4 Bit Ripple Adder; Transmission Gate XOR; Multiplexor; 8 Bit Carry Select Adder]
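As a rough behavioral sketch of the add/subtract unit described above (not a gate-level or transistor-level model), the three 4-bit adders and the output multiplexer might be expressed as follows. The names `ripple_add4` and `carry_select_addsub8` are hypothetical, and the assumption that the subtract control also drives the carry-in of the low adder is the usual two's complement arrangement rather than something stated on the slide.

```python
def ripple_add4(a, b, carry_in):
    """4-bit adder model; returns (4-bit sum, carry out)."""
    total = (a & 0xF) + (b & 0xF) + (carry_in & 1)
    return total & 0xF, total >> 4


def carry_select_addsub8(a, b, subtract):
    """8-bit add/subtract built from three 4-bit adders and a multiplexer.

    Returns (8-bit result, carry out). Illustrative model only.
    """
    a &= 0xFF
    b &= 0xFF
    sub = 1 if subtract else 0
    # XOR stage: pass b unchanged for add, invert every bit for subtract.
    b_eff = b ^ (0xFF * sub)
    # First adder: low nibble. Carry-in = sub completes the two's complement.
    sum_lo, cout3 = ripple_add4(a & 0xF, b_eff & 0xF, sub)
    # Second and third adders: high nibble for both possible values of COUT3.
    hi_if1, cout_if1 = ripple_add4(a >> 4, b_eff >> 4, 1)
    hi_if0, cout_if0 = ripple_add4(a >> 4, b_eff >> 4, 0)
    # Multiplexer: COUT3 selects which precomputed high nibble is used.
    if cout3:
        sum_hi, cout = hi_if1, cout_if1
    else:
        sum_hi, cout = hi_if0, cout_if0
    return (sum_hi << 4) | sum_lo, cout
```

Because both high-nibble results are computed in parallel with the low nibble, the high half only waits for COUT3 and one multiplexer delay instead of a further four-bit carry ripple.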

Functionality Verification

[Simulation waveforms: Quotient and Remainder, with the valid quotient result and valid remainder result marked]

Critical Path

Worst Case Input (consider the 4 LSBs):
Remainder: 0000 0101
Divisor: 0000 0101

We have a subtraction in the first cycle:
0000 0101 - 0000 0101 = 0000 0101 + 1111 1010 + 1 = 0000 0000

[Critical path figure: MUX, ADDER, XOR, MUX, driving a load of 25 Cunit]

4. Critical Path: The critical path lies between registers. We calculate the critical path by using logical effort. First, we assume the logical effort of our XOR is 3 and that of the MUX is 3 for the "reset" input and 2 for the data input. Taking the critical path and separating it into 9 stages (such as Full-Adder, XOR, INV, MUX), we calculate GHF and find that h = 7.63. With a load of 25 unit loads, we find the size of each stage working from the back.
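One plausible reading of why this input is a worst case is that the subtraction makes the carry propagate through every bit position. The sketch below checks this for the operands on the slide; the helper `carry_chain` is hypothetical and models a plain bit-by-bit adder, so in the actual carry-select unit only the ripple through the four LSBs and the multiplexer select would lie on the path.

```python
def carry_chain(a, b, carry_in, n=8):
    """Return the carries c1..cn produced when adding two n-bit values bit by bit."""
    carries = []
    c = carry_in
    for i in range(n):
        x = (a >> i) & 1
        y = (b >> i) & 1
        # Full-adder carry out: generate (x AND y) or propagate (x XOR y) the carry-in.
        c = (x & y) | (c & (x ^ y))
        carries.append(c)
    return carries


remainder = 0b00000101
divisor = 0b00000101
inverted = divisor ^ 0xFF                 # XOR stage set to "subtract"
carries = carry_chain(remainder, inverted, 1)
result = (remainder + inverted + 1) & 0xFF

print(f"{inverted:08b}")  # 11111010
print(carries)            # [1, 1, 1, 1, 1, 1, 1, 1]
print(f"{result:08b}")    # 00000000
```

Every bit pair of 0000 0101 and 1111 1010 differs, so no position generates or kills a carry on its own; the carry-in of 1 has to travel through each stage in turn.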

Critical Path Sizing

Path Logical Effort: G = G1 x G2 x … x Gn = 1 x 1 x 3 x 8 x 6 x 6 x 6 x 3 x 2
Branching Effort: B = B1 x B2 x … x Bn = 9 x 2 x 2 x 2 x 2 x 5
Path Electrical Effort: F = Cout/Cin = 25/1 = 25
Path Effort: H = G x B x F = 559872000
Total Stages: 10
Effective Fanout for each stage: h = H^(1/10) = 7.4956

[Values annotated on the sized critical path in the figure: 7.4956, 6.24275, 15.5978, 7.307, 4.564, 2.851, 0.71235, 0.8899, 6.6705]
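The arithmetic on this slide can be reproduced in a few lines. The per-stage values are copied from the slide; the variable names are illustrative. The individual stage sizes are not recomputed here, because the slide does not say which branching factor belongs to which stage.

```python
from math import prod

# Per-stage logical efforts and branching efforts as listed on the slide.
stage_logical_efforts = [1, 1, 3, 8, 6, 6, 6, 3, 2]
branching_efforts = [9, 2, 2, 2, 2, 5]
c_out, c_in = 25, 1   # load and input capacitance in unit loads
n_stages = 10         # total stage count stated on the slide

G = prod(stage_logical_efforts)   # path logical effort: 31104
B = prod(branching_efforts)       # path branching effort: 720
F = c_out / c_in                  # path electrical effort: 25.0
H = G * B * F                     # path effort: 559872000.0
h = H ** (1 / n_stages)           # effective fanout per stage, about 7.4956
```

Sizing then proceeds from the load backwards: for a stage with logical effort g and branching b driving capacitance C_out, the input capacitance is g · b · C_out / h.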

Other Design Approach & Schematics

[Schematics shown: 2 Adders in a Divider; 4 Adders in a Divider]

5. Extra Designs (Design Improvement): The figures above show two improved design approaches, with 2 and 4 eight-bit adders in the divider. The purpose is to eliminate the register delay between successive additions by performing them within one longer clock period. This method needs a much longer clock period because the adders are stacked, but it reduces the number of clock cycles and the total register delay (treg). At the beginning of the design process, we used only one adder in the divider. We then found that we could do two or more additions in a longer clock cycle. Thus, our first generation had one adder, the second generation had two adders, and the last design had 4 adders in one divider (the design is limited to 4 adder modules).

Conclusion

In this project, we experienced the optimization process of the adder and the divider. We believe that every design can always be improved. We first came up with the 1-adder divider, then the 2-adder divider, and finally the 4-adder divider. However, the critical path of the 4-adder divider cannot be easily analyzed, so we went back to our original design. The 4-adder divider is indeed faster, but it is so much more complicated than the 1-adder divider that we think it is not worth trading simplicity for that small gain in speed. With more time on the project, a better and faster divider design could still be found.