# Floating-Point Division and Square Root Implementation using a Taylor-Series Expansion Algorithm with Reduced Look-up Tables

Floating-Point Division and Square Root Implementation using a Taylor-Series Expansion Algorithm with Reduced Look-up Tables. Taek-Jun Kwon, Jeff Draper, University of Southern California / Information Sciences Institute, Marina del Rey, CA 90292, USA, {tjkwon, draper}@ISI.EDU. VOL. 27, NO. 12, DECEMBER 2008, ©2008 IEEE. Vishesh Kalra, EE800, 11089943

INTRODUCTION Division and square root are often considered infrequent operations in general-purpose applications, but in modern applications such as CAD tools and 3D graphics, DIV/SQRT become performance bottlenecks. A fused floating-point multiply/square root/divide unit is presented. Algorithm used: Taylor-series expansion with reduced look-up tables.

DIVISION ALGORITHM

STAGE DIAGRAM

PROPOSED METHOD FUSED MUL/DIV/SQRT ALGORITHM

STAGE DIAGRAM

STEPS INVOLVED

PIPELINE DIAGRAM

REDUCED LOOK-UP TABLES For Yo, two look-up tables are generally used, selected by whether the exponent of the input operand is even or odd. To generate IEEE 754 standard results, an 8-bit seed is used when the exponent is even and a 9-bit seed when it is odd. In addition, the Yo² values are generated by a small multiplier (9b × 9b).

REDUCED LOOK-UP TABLES As a result, three look-up tables and a small multiplier are used: Xo for division (Xo ≈ 1/b), and Yo_even and Yo_odd for 1/sqrt(b). However, all entries in each look-up table start with the same constant bits (e.g. Xo = 0.1xxxxxxx, Yo_even = 0.1xxxxxxx and Yo_odd = 0.10xxxxxxx), so the table sizes can be reduced by not storing those constant bits. The table sizes are: Xo = 128 × 7b; Yo_even = 128 × 7b; Yo_odd = 256 × 7b.
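
The bit-saving idea for the Xo table can be illustrated with a small Python sketch. This is not the authors' hardware design: the midpoint-based seed generation, the rounding rule, and the names `build_x0_table` and `lookup_x0` are assumptions made for illustration. It only shows that, for b in [1, 2), every 8-bit seed has the form 0.1xxxxxxx, so 128 entries of 7 bits are enough.

```python
INDEX_BITS = 7   # 128 entries, indexed by the top 7 fraction bits of b
ENTRY_BITS = 8   # a full seed 0.1xxxxxxx has 8 fractional bits

def build_x0_table():
    """Build a 128 x 7b table of reciprocal seeds for b in [1, 2).

    Assumption: each seed approximates 1/b at the midpoint of its interval.
    """
    table = []
    for i in range(1 << INDEX_BITS):
        b_mid = 1.0 + (i + 0.5) / (1 << INDEX_BITS)
        full = round((1 << ENTRY_BITS) / b_mid)  # always in 128..255
        table.append(full & 0x7F)                # drop the constant leading 1
    return table

X0_TABLE = build_x0_table()

def lookup_x0(b):
    """Return the seed Xo ~ 1/b for 1 <= b < 2, restoring the implicit bit."""
    index = int((b - 1.0) * (1 << INDEX_BITS))
    full = (1 << (ENTRY_BITS - 1)) | X0_TABLE[index]
    return full / (1 << ENTRY_BITS)
```

For example, `lookup_x0(1.5)` reads a 7-bit entry and prepends the constant leading 1 to recover an 8-bit seed close to 2/3. The Yo_even and Yo_odd tables would follow the same pattern with their own constant prefixes.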

PARALLEL POWERING UNITS (PPU) Let's recall the algorithm. The coefficient 3/8 = 1/8 + 2/8, which means the 2-bit-shifted and 3-bit-shifted values of (1 − bYo²)² are accumulated. For the other term, 5/16 = 4/16 + 1/16, which means the 2-bit-shifted and 4-bit-shifted values are accumulated. The 2-bit shift is common to both terms, so it is accumulated by the PPU.
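
The shift-and-add decomposition of the coefficients can be checked with a short fixed-point sketch in Python. It assumes the standard Taylor expansion 1/sqrt(b) ≈ Yo·(1 + x/2 + (3/8)x² + (5/16)x³) with x = 1 − bYo². The 24-bit fraction width and the names `fx_mul`, `three_eighths`, `five_sixteenths` and `rsqrt_taylor` are illustrative choices, not taken from the presentation, and the sketch models only the arithmetic, not the pipelined PPU hardware.

```python
FRAC = 24            # assumed number of fractional bits
ONE = 1 << FRAC

def to_fx(v):
    """Convert a float to fixed point with FRAC fractional bits."""
    return int(round(v * ONE))

def fx_mul(a, b):
    """Fixed-point multiply, truncating the extra fractional bits."""
    return (a * b) >> FRAC

def three_eighths(v):
    """3/8 = 1/4 + 1/8: a 2-bit shift plus a 3-bit shift."""
    return (v >> 2) + (v >> 3)

def five_sixteenths(v):
    """5/16 = 1/4 + 1/16: a 2-bit shift plus a 4-bit shift."""
    return (v >> 2) + (v >> 4)

def rsqrt_taylor(b, y0):
    """Approximate 1/sqrt(b) from a seed y0 using three Taylor terms."""
    b_fx, y0_fx = to_fx(b), to_fx(y0)
    x = ONE - fx_mul(b_fx, fx_mul(y0_fx, y0_fx))   # x = 1 - b*Yo^2
    x2 = fx_mul(x, x)
    x3 = fx_mul(x2, x)
    series = ONE + (x >> 1) + three_eighths(x2) + five_sixteenths(x3)
    return fx_mul(y0_fx, series) / ONE
```

With a rough seed such as `rsqrt_taylor(2.0, 0.7)`, the result agrees with 1/sqrt(2) to about six decimal places, and both coefficient helpers use the same 2-bit shift, which is the sharing the slide refers to.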

STAGE DIAGRAM

RESULTS: AREA COMPARISON Without using a multiplier to generate constants for the reduced look-up tables.

Conclusion As shown in the results, the additional look-up tables contribute the major part of the area overhead. The pipelined accumulators (mux) also contribute to the area increase. Incorporating a square root function for a modest 20% area increase is desirable, and relative to the entire FPU (floating-point unit) the increase is a mere 10%. The multiplier approach for generating constants was also implemented and resulted in area savings of 2.4%.