Presentation is loading. Please wait.

Presentation is loading. Please wait.

Copyright 2008 Koren ECE666/Koren Part.9b.1 Israel Koren Spring 2008 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer.

Similar presentations


Presentation on theme: "Copyright 2008 Koren ECE666/Koren Part.9b.1 Israel Koren Spring 2008 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer."— Presentation transcript:

1 Copyright 2008 Koren ECE666/Koren Part.9b.1 Israel Koren Spring 2008 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer Arithmetic ECE 666 Part 9b Evaluation of Elementary Functions - II

2 Copyright 2008 Koren ECE666/Koren Part.9b.2 Inverse Trigonometric Functions  Multiplicative normalization  bi=(1+j s i 2 )  where  If g(b i )=  i =arctan(s i 2 ) & y 0 =0  x i complex variable = U i + jV i  Recursive formulas for real and imaginary parts:  (i) U i+1 + j V i+1 = (U i +jV i ) (1+j s i 2 )  (i) U i+1 = U i - s i 2 V i  (i)’ V i+1 = V i + s i 2 U i -i

3 Copyright 2008 Koren ECE666/Koren Part.9b.3 Final Values of Um and Vm  Relationship between U m, V m and y m :  Substituting b i and g(b i )  Replacing x m and x 0 by real and imaginary parts  (U m + jV m ) (cos  -j sin  ) = K (U 0 +jV 0 )  U m cos  +V m sin  =K U 0  -U m sin  +V m cos  =K V 0  Set V 0 =0 & U 0 =1/K : U m =cos  and V m =sin   K constant if s i  {-1, 1}

4 Copyright 2008 Koren ECE666/Koren Part.9b.4 Calculate  =arccos C for given C  Select s i 's - U m  C       obtained from y m = ,  Table of arctan(2 ) needed as for sine and cosine  To calculate  =arcsin D select s i 's so that V m  D and     arctan C: V m  0 - tan  =-V 0 /U 0 &  =-arctan(V 0 /U 0 )  Set V 0 =C, U 0 =1: arctan C=-   In summary  s i in {-1, 1} according to signs of V i and U i so that V i+1 is closer to 0 -i

5 Copyright 2008 Koren ECE666/Koren Part.9b.5 Example: arctan(1.0) in 10-bit Precision  Final result: y 11 =0.1100100100 2  Equal to “exact” result in 10-bit precision

6 Copyright 2008 Koren ECE666/Koren Part.9b.6 Approximation Error: Additive normalization  x 0 n-bit fraction - approaches x 0  Error - |  |  2  Evaluate y=F(x 0 ) by calculating  Taylor series expansion:   is of order 2 - last two terms negligible  Error  in y of order  dF/dx| x 0  Magnitude of error depends on function F(x 0 )  Example - exponential function F(x 0 )=e dF/dx| x 0 =e - thus |  |  2 e =2 for x 0  ln 2  Maximum error double size of error in argument  Can be reduced by increasing number of bits in xi -n -(n-1) ln 2 x0x0 x0x0

7 Copyright 2008 Koren ECE666/Koren Part.9b.7 Approx Error: Multiplicative Normalization  Instead of F(x 0 ) we calculate  Taylor series expansion  Natural logarithm F(x 0 )= ln x 0 Error is  y=F(x 0 )+  +0(  ²)  Error in y: |  |  |  |  2 - satisfactory precision -n-n

8 Copyright 2008 Koren ECE666/Koren Part.9b.8 Speed-up Techniques: Exponential Function  For i>n/2 - approximately s i 2  Smaller ROM & Replace last n/2 steps by a single operation  Previously canceled at least first (i-1) bits in x i :  z k (k=i, i+ 1, … ) - single bit  To cancel remaining nonzero bits in x i (i  n/2) select s k =z k (k>i)  All remaining s k 's can be predicted ahead of time  In last steps calculate  Since s k =z k - last term is (1+x i )  y m = y i (1+x i )  Called Termination Algorithm -i

9 Copyright 2008 Koren ECE666/Koren Part.9b.9 Termination Algorithm  Required - fast multiplier  If not available - can still benefit - perform all additions of products in carry-save manner  Also for F(x 0 )=ln x 0  x i has the form:  or  Formula for x i+1  For x i+1  1 select s k =z k for k  i  In parallel calculate  Based on Taylor series for ln (1+s k 2 ) - terminal approximation for i  n/2 : -k

10 Copyright 2008 Koren ECE666/Koren Part.9b.10 Accelerate Trigonometric Functions  Predict s i 's based on  To obtain constant K - s i restricted to {-1,1}  If replaced by  Deviation from exact value < 2  For i>n/2, may select s k (k  i) from {0,1} and predict s k =z k, where:  Can perform additions for x i, Zi and Wi in carry-save manner  Reexamining series expansion of arctan - may predict s l 's for i  l  3i at once  Corresponding terms added in carry-save adder  For l <n/2 - still use {-1,1} -n

11 Copyright 2008 Koren ECE666/Koren Part.9b.11 Speeding-Up First steps  0  i  n/2  Use radix > 2  Example: g(bi)=1+ s i r (r - power of 2, say r=2 ) s i  {-(r-1),-(r-2),…,-1,0,1,…,r-1}  q bits of x in a single step - but increased complexity of step - s i has larger range  Some improvement in speed is possible  Another method: allow s i to assume values in {-1, 0,1} - modify selection rule so that probability of s i =0 maximized  Similar to variations of SRT i q

12 Copyright 2008 Koren ECE666/Koren Part.9b.12 Other Techniques for Evaluating Elementary Functions  (1) Use rational approximations (commonly used in software)  When fast adder and multiplier available and high precision is required; e.g., extended double-precision IEEE floating-point  Hardware implementation of rational approximations can compete with continued summations and multiplications when execution time and chip area are considered  (2) Use polynomial approximations with look-up table

13 Copyright 2008 Koren ECE666/Koren Part.9b.13 Polynomial Approximations with look-up table  Domain of argument x of f(x) divided into smaller intervals (usually of equal size) - f(x i ) kept in look- up table  Value of f(x) at x calculated based on f(x i ), for x i closest to x, and a polynomial approximation p(x-x i ) for f(x-x i )  Distance (x-x i ) is small - simple polynomial requiring less time than a rational approximation  Overall algorithm has three steps:  (1) Find closest boundary point x i and calculate “reduction transformation” - usually distance d=x-x i  (2) Calculate p(d)  (3) Combine f(x i ) with p(d) to calculate f(x)

14 Copyright 2008 Koren ECE666/Koren Part.9b.14 Example - Calculating 2 in [-1,1]  32 boundary points x i =i/32, i=0,1,…,31  2 precalculated and stored in look-up table  Step 1: search for x i such that |x-m-x i |  1/64, where m=-1,0,1; calculate d=(x-m-x i ) ln 2  d satisfies e =2, and |d|  (ln 2)/64  Step 2: approximation for e -1 using a polynomial  p 2, p 3 … p k precalculated coefficients of polynomial approximation of e -1 on [-(ln 2)/64, (ln 2)/64 ]  Step 3: reconstruct 2 using  Error bounded by 0.556 ulp (ulp=2 ) x d x x-m-x i xi d d -52


Download ppt "Copyright 2008 Koren ECE666/Koren Part.9b.1 Israel Koren Spring 2008 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer."

Similar presentations


Ads by Google