Reconfigurable Computing S. Reda, Brown University Reconfigurable Computing (EN2911X, Fall07) Lecture 16: Application-Driven Hardware Acceleration (1/4)


1 Reconfigurable Computing (EN2911X, Fall07) Lecture 16: Application-Driven Hardware Acceleration (1/4)
Prof. Sherief Reda, Division of Engineering, Brown University
http://ic.engin.brown.edu

2 Fast Fourier transform
One of the most important subroutines in scientific computing.
Used in many applications, including signal and image processing, solution of differential equations, multiplication of polynomials, and data compression.
One of the most widely implemented hardware accelerators.

3 Discrete Fourier transform (DFT)
Maps a set of input points to another set of output points. The operation is reversible.
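The slide's equations are not reproduced in this transcript. One standard way to write the transform pair is sketched below. The symbol W_N and the negative sign in its exponent are assumptions (the common signal-processing convention), not taken from the slide; some texts use the opposite sign.

```latex
X_k = \sum_{n=0}^{N-1} x_n \, W_N^{kn}, \qquad W_N = e^{-2\pi j / N}, \qquad k = 0, 1, \dots, N-1

x_n = \frac{1}{N} \sum_{k=0}^{N-1} X_k \, W_N^{-kn}, \qquad n = 0, 1, \dots, N-1
```

The second line is the inverse transform, which is what makes the operation reversible.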

4 Roots of unity
[Figure: unit circle in the complex plane with real and imaginary axes; the points (1, 0), (0, j), (-1, 0), (0, -j) are marked.]
What are the Nth roots of unity? If N = 8 then we have [equation not preserved]. Define [definition not preserved].
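As a small illustration, the Nth roots of unity are the N complex numbers e^(2*pi*j*k/N) for k = 0, 1, ..., N-1. The sketch below lists them for N = 8; the function name `roots_of_unity` is hypothetical and used only for this example.

```python
import cmath

def roots_of_unity(n):
    """Return the n complex n-th roots of unity, e^(2*pi*j*k/n) for k = 0..n-1."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]

roots = roots_of_unity(8)
for k, w in enumerate(roots):
    # Each root lies on the unit circle at angle 2*pi*k/8.
    print(f"k={k}: {w.real:+.4f} {w.imag:+.4f}j")
```

For N = 8, the roots at k = 0, 2, 4, 6 are the four points marked on the slide's unit circle: (1, 0), (0, j), (-1, 0), (0, -j).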

5 Calculating the DFT
How many arithmetic (+ and *) operations do we need to calculate the DFT?
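One way to answer the question is to count operations in a direct implementation. The sketch below is a minimal example, assuming the twiddle factor e^(-2*pi*j*k*n/N) (the sign convention is an assumption) and counting only complex multiplies and adds; `naive_dft` is a hypothetical name.

```python
import cmath

def naive_dft(x):
    """Direct DFT. Returns (X, complex multiplies, complex adds)."""
    n = len(x)
    mults = 0
    adds = 0
    out = []
    for k in range(n):
        # N products per output point ...
        terms = [x[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n)]
        mults += n
        # ... combined with N - 1 additions.
        out.append(sum(terms))
        adds += n - 1
    return out, mults, adds

X, mults, adds = naive_dft([1, 0, 0, 0, 0, 0, 0, 0])
print(mults, adds)  # 64 56
```

Each of the N outputs needs N multiplies and N - 1 adds, so the direct method costs O(N^2) operations.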

6 Computing the DFT using the FFT
How can we do better? Use the Fast Fourier Transform (FFT).
[Equation: the DFT sum split into a DFT of the even indices and a DFT of the odd indices.]
The N-point DFT sum has been broken into two N/2-point DFTs.
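The even/odd split can be sketched as a short recursive routine. This is an illustrative radix-2 version, assuming N is a power of two and the twiddle factor e^(-2*pi*j*k/N) (the sign convention is an assumption); it is not necessarily the exact form used in the lecture.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])  # N/2-point DFT of the even-indexed inputs
    odd = fft(x[1::2])   # N/2-point DFT of the odd-indexed inputs
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t           # first half of the outputs
        out[k + n // 2] = even[k] - t  # second half reuses the same product
    return out

X = fft([1, 2, 3, 4, 5, 6, 7, 8])
```

Each level of the recursion does O(N) work and there are log2(N) levels, which gives the O(N log N) total.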

7 Example when N=8
Objective: compute X_0, X_1, …, X_7 given x_0, x_1, …, x_7.
[Figure: two "magic box" blocks with inputs x_0, x_2, x_4, x_6 and x_1, x_3, x_5, x_7, and outputs X_0, X_1, …, X_7.]
Note that [equation not preserved].

8 Now let's apply the idea recursively
[Figure: inputs in the order x_0, x_4, x_2, x_6, x_1, x_5, x_3, x_7; outputs X_0, X_1, …, X_7.]

9 One more time
[Figure: inputs in the order x_0, x_4, x_2, x_6, x_1, x_5, x_3, x_7; outputs X_0, X_1, …, X_7.]
How many operations do we need now? What is the execution time on a general-purpose CPU? What is the execution time on an FPGA? How many resources do you need?

10 Another way to visualize FFT computations
How can we determine the order of the first inputs?
[Figure: network of 12 butterfly blocks; inputs in the order x_0, x_4, x_2, x_6, x_1, x_5, x_3, x_7; outputs listed as X_0, X_4, X_1, X_5, X_2, X_6, X_3, X_7.]
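The input order in the figure is the bit-reversal permutation: write each index in log2(N) bits and reverse the bits. The sketch below is a minimal illustration; `bit_reversed_order` is a hypothetical helper name, and N is assumed to be a power of two.

```python
def bit_reversed_order(n):
    """Return the input ordering for an n-point radix-2 FFT (n a power of two)."""
    bits = n.bit_length() - 1  # log2(n)
    order = []
    for i in range(n):
        rev = 0
        for b in range(bits):
            if i & (1 << b):
                rev |= 1 << (bits - 1 - b)  # move bit b to the mirrored position
        order.append(rev)
    return order

order = bit_reversed_order(8)
print(order)  # [0, 4, 2, 6, 1, 5, 3, 7]
```

For N = 8 this reproduces the ordering x_0, x_4, x_2, x_6, x_1, x_5, x_3, x_7 shown on the slide.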

11 Application of FFT: faster multiplication of two polynomials
Suppose we want to evaluate A(x) at x_0. How many operations do we need? Use Horner's rule.
Suppose you have two polynomials represented by their coefficient vectors. How many operations does it take to add these two polynomials? How many operations does it take to multiply these two polynomials?
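The three operations on this slide can be sketched in coefficient form as follows. The function names are hypothetical, and coefficient vectors are assumed to be stored lowest degree first, i.e. [a_0, a_1, ..., a_{N-1}].

```python
def horner(coeffs, x0):
    """Evaluate A(x0) with N - 1 multiplies and N - 1 adds (Horner's rule)."""
    result = 0
    for a in reversed(coeffs):
        result = result * x0 + a
    return result

def poly_add(a, b):
    """Add two polynomials of equal length: O(N) operations."""
    return [ai + bi for ai, bi in zip(a, b)]

def poly_mul(a, b):
    """Multiply two polynomials the ordinary way: O(N^2) operations."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

print(horner([1, 2, 3], 2))            # 17, since 1 + 2*2 + 3*4 = 17
print(poly_add([1, 2, 3], [4, 5, 6]))  # [5, 7, 9]
print(poly_mul([1, 2], [3, 4]))        # [3, 10, 8]
```

Evaluation and addition are linear in N, while ordinary multiplication is quadratic; that quadratic cost is what the rest of the lecture sets out to reduce.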

12 Point-value representation
A point-value representation of a polynomial A(x) of degree-bound N is a set of N point-value pairs {(x_0, y_0), (x_1, y_1), …, (x_{N-1}, y_{N-1})} such that all of the x_k are distinct and y_k = A(x_k) for k = 0, 1, …, N-1.
How many operations do we need to compute the point-value representation of a polynomial? How can we do better?

13 Interpolation of polynomials from point-value representations
Given the point-value representation of a polynomial, how can we invert the evaluation, i.e., determine the coefficient form of the polynomial from its point-value representation? How can we find the a's?

14 Adding and multiplying polynomials in point representation
If polynomial C(x) = A(x) + B(x), then we can get the point representation of C easily.
[Polynomial A and Polynomial B given as sets of point-value pairs.]
How many operations do we need? How about C(x) = A(x) * B(x)?
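A small worked example of point-wise addition and multiplication is sketched below. The polynomials and evaluation points are made up for illustration. Note that the product of two polynomials of degree-bound N has degree-bound 2N - 1, so the product needs that many points to be determined uniquely.

```python
def evaluate(coeffs, x0):
    """Horner evaluation; coeffs are stored lowest degree first."""
    result = 0
    for a in reversed(coeffs):
        result = result * x0 + a
    return result

a = [1, 2]      # A(x) = 1 + 2x
b = [3, 4]      # B(x) = 3 + 4x
xs = [0, 1, 2]  # three distinct points, enough for the degree-2 product

ya = [evaluate(a, x) for x in xs]  # [1, 3, 5]
yb = [evaluate(b, x) for x in xs]  # [3, 7, 11]

# Both operations are O(N) in point-value form: one add or multiply per point.
y_sum = [p + q for p, q in zip(ya, yb)]
y_prod = [p * q for p, q in zip(ya, yb)]

print(y_sum)   # [4, 10, 16]
print(y_prod)  # [3, 21, 55]
```

The values in `y_prod` agree with evaluating the product 3 + 10x + 8x^2 directly at x = 0, 1, 2.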

15 How can we convert a polynomial quickly from coefficient form to point-value form and back?
[Diagram: ordinary multiplication in coefficient form costs O(N^2). The alternative path is: evaluate, O(N^2); point-wise multiplication, O(N); interpolate, O(N^2).]
This path does not make sense yet. How can we evaluate and interpolate faster than O(N^2)? Can we choose the evaluation points smartly?

16 Choosing the evaluation points smartly ...

17 Finally, multiplying polynomials in O(N log N)
[Diagram: ordinary multiplication in coefficient form costs O(N^2). The alternative path is: FFT, O(N log N); point-wise multiplication, O(N); inverse FFT, O(N log N).]
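The three-step path in the diagram can be sketched end to end. This is an illustrative version only: it assumes integer coefficients (so the result can be rounded), pads to the next power of two, and uses a simple recursive FFT with an assumed sign convention; `poly_mul_fft` is a hypothetical name.

```python
import cmath

def fft(x, invert=False):
    """Recursive radix-2 FFT; invert=True flips the exponent sign (unscaled)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], invert)
    odd = fft(x[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def poly_mul_fft(a, b):
    """Multiply integer-coefficient polynomials: FFT, point-wise product, inverse FFT."""
    result_len = len(a) + len(b) - 1
    n = 1
    while n < result_len:  # pad to a power of two that holds the product
        n *= 2
    fa = fft(list(a) + [0] * (n - len(a)))  # evaluate A at the roots of unity
    fb = fft(list(b) + [0] * (n - len(b)))  # evaluate B at the roots of unity
    fc = [p * q for p, q in zip(fa, fb)]    # O(N) point-wise multiplication
    c = fft(fc, invert=True)                # interpolate
    return [round((v / n).real) for v in c[:result_len]]  # apply the 1/n scaling

print(poly_mul_fft([1, 2, 3], [4, 5, 6]))  # [4, 13, 28, 27, 18]
```

Rounding is only safe here because the inputs are small integers; floating-point coefficients would need a tolerance-based comparison instead.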

18 Back to signal processing
[Figure: linear system with impulse response (b_0, b_1, …, b_{N-1}) and input signal (a_0, a_1, …, a_{N-1}).]
T=0: a_0 b_0
T=1: a_0 b_1 + a_1 b_0
T=2: a_0 b_2 + a_1 b_1 + a_2 b_0
…
The response of the system to the input signal at different times is equal to the coefficients of the polynomial produced by multiplying the input-signal polynomial by the impulse-response polynomial. This is commonly known as the convolution of the input and the system's impulse response. How can we find the output response faster than O(N^2)?
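The time-by-time sums listed on this slide can be sketched as a direct convolution. The sample values below are made up for illustration, and `convolve` is a hypothetical name.

```python
def convolve(a, b):
    """Direct convolution: y[t] = sum over i of a[i] * b[t - i]. O(N^2) operations."""
    y = [0] * (len(a) + len(b) - 1)
    for t in range(len(y)):
        for i in range(len(a)):
            if 0 <= t - i < len(b):
                y[t] += a[i] * b[t - i]
    return y

a = [1, 2, 3]  # input signal a_0, a_1, a_2
b = [4, 5, 6]  # impulse response b_0, b_1, b_2
y = convolve(a, b)
print(y)  # [4, 13, 28, 27, 18]
```

Here y[0] = a_0 b_0, y[1] = a_0 b_1 + a_1 b_0, and y[2] = a_0 b_2 + a_1 b_1 + a_2 b_0, matching the T=0, T=1, T=2 lines. These are the coefficients of the product polynomial, so the O(N log N) polynomial multiplication of slide 17 applies to this problem as well.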

19 Summary
The lecture covered one of the most important hardware accelerators: the FFT.
We have seen how it can be parallelized and sped up.
We examined some of its applications.

