Fourier Transformations

Grad Algorithms: Fourier Transformations
Jeff Edmonds, York University, COSC 6111

Outline:
Fourier Transformation (sine)
Fourier Transformation (JPEG)
Change from Time to Polynomial Basis
Evaluating & Interpolating
FFT in n log n Time
Roots of Unity
Same FFT Code & Butterfly
Inverse FFT
Sin and Cos Basis
FFT Butterfly
Polynomial Multiplication
Integer Multiplication

Sum of sine waves gives a square wave.

Changing Basis
Change of Basis: $T([a_1,a_2,\dots,a_d]) = [A_1,A_2,\dots,A_d]$ changes the basis used to describe an object.

The standard basis of a vector space is a tuple $\langle w_1,w_2,\dots,w_d\rangle$ of basis objects that is linearly independent and spans the space uniquely:
$\forall v\ \exists [a_1,a_2,\dots,a_d],\ v = a_1w_1 + a_2w_2 + \dots + a_dw_d$

The new basis of a vector space is a tuple $\langle W_1,W_2,\dots,W_d\rangle$ of basis objects that is linearly independent and spans the space uniquely:
$\forall v\ \exists [A_1,A_2,\dots,A_d],\ v = A_1W_1 + A_2W_2 + \dots + A_dW_d$

Use small letters $a_j$ for the coefficients in the standard basis and capital letters $A_k$ for the coefficients in the new basis.

Changing Basis: a two-dimensional example
Standard basis: $\langle w_1,w_2\rangle = \langle [1,0],[0,1]\rangle$. New basis: $\langle W_1,W_2\rangle = \langle [4/5,-3/5],[3/5,4/5]\rangle$.

The inverse $T^{-1}([A_1,\dots,A_d]) = [a_1,\dots,a_d]$ is the matrix whose columns are the new basis vectors, because $v = A_1W_1 + A_2W_2$:
$$\begin{bmatrix} W_1[1] & W_2[1] \\ W_1[2] & W_2[2] \end{bmatrix} \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}$$
For $v = W_1$: $[A_1,A_2] = [1,0]$ and $[a_1,a_2] = [4/5,-3/5]$.
For $v = W_2$: $[A_1,A_2] = [0,1]$ and $[a_1,a_2] = [3/5,4/5]$.

$T$ itself is the inverse of this matrix:
$$\begin{bmatrix} W_1[1] & W_2[1] \\ W_1[2] & W_2[2] \end{bmatrix}^{-1} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix}$$

If the new basis vectors are orthogonal and of uniform length, $|W_1|^2 = n$, then $W_1 \cdot W_1 = \sum_j W_1[j]W_1[j] = n$ and, since $W_1 \perp W_2$, $W_1 \cdot W_2 = \sum_j W_1[j]W_2[j] = 0$. Hence
$$\begin{bmatrix} W_1[1] & W_1[2] \\ W_2[1] & W_2[2] \end{bmatrix} \begin{bmatrix} W_1[1] & W_2[1] \\ W_1[2] & W_2[2] \end{bmatrix} = \begin{bmatrix} n & 0 \\ 0 & n \end{bmatrix}, \qquad \begin{bmatrix} W_1[1] & W_2[1] \\ W_1[2] & W_2[2] \end{bmatrix}^{-1} = \frac{1}{n} \begin{bmatrix} W_1[1] & W_1[2] \\ W_2[1] & W_2[2] \end{bmatrix}$$
so $T$ needs no matrix inversion:
$$\frac{1}{n} \begin{bmatrix} W_1[1] & W_1[2] \\ W_2[1] & W_2[2] \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix}$$
Example: the basis above has $n = 1$, so $v = [a_1,a_2] = [3,2]$ has $A_1 = 3 \cdot 4/5 + 2 \cdot (-3/5) = 6/5$ and $A_2 = 3 \cdot 3/5 + 2 \cdot 4/5 = 17/5$.

Viewed a different way: $A_1$ is the length of the projection of $v$ onto $W_1$. With $|W_1| = 1$,
$$A_1 = |v|\cos(\theta) = v \cdot W_1 = \sum_j a_j W_1[j], \qquad \cos(\theta) = \frac{v \cdot W_1}{|v||W_1|}$$
This is the correlation between $v$ and $W_1$.
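As a minimal sketch of the orthogonal change of basis described above, the following Python converts a vector to the new basis by taking dot products and then rebuilds it. The function names `to_new_basis` and `to_standard_basis` are hypothetical; the basis vectors are the ones used on the slides.

```python
# Hypothetical helper names; W1 and W2 are the slide's orthonormal basis vectors.

def to_new_basis(v, basis, n=1.0):
    """A_k = (v . W_k) / n, valid when the W_k are orthogonal with |W_k|^2 = n."""
    return [sum(vj * wj for vj, wj in zip(v, W)) / n for W in basis]

def to_standard_basis(A, basis):
    """Rebuild v = A_1 W_1 + A_2 W_2 + ... in standard coordinates."""
    d = len(basis[0])
    return [sum(Ak * W[j] for Ak, W in zip(A, basis)) for j in range(d)]

W1 = [4 / 5, -3 / 5]
W2 = [3 / 5, 4 / 5]

A = to_new_basis([3, 2], [W1, W2])        # coefficients in the new basis
v_back = to_standard_basis(A, [W1, W2])   # should return to [3, 2]
print(A, v_back)
```

Because the basis is orthogonal, no matrix inversion is needed in either direction; only dot products and weighted sums are used.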

Fourier Transformation
A Fourier transformation is a change of basis from the time basis to the sine/cosine basis (used by JPG) or to the polynomial basis.

Applications:
Signal processing
Compressing data (e.g. images with .jpg)
Multiplying integers in n log n loglog n time
…

Purposes:
Some operations on the data are cheaper in the new format.
Some concepts are easier to read from the data in the new format.
Some of the bits of the data in the new format are less significant and hence can be dropped.

Amazingly, once you include complex numbers, the FFT code for sines/cosines and for polynomials is the SAME.

Reference: The Scientist and Engineer's Guide to Digital Signal Processing, by Steven W. Smith, Ph.D.

Sine & Cosine Basis
A continuous periodic function $y(t)$ of time $t$.
Find the contribution of each frequency. Swings, capacitors, and inductors all resonate at a given frequency, which is how a circuit picks out the contribution of a given frequency.
If the dominant musical note has frequency $\omega = 2\pi/T$, then all the other basis functions are its harmonic frequencies: $\omega, 2\omega, 3\omega, 4\omega, 5\omega, 6\omega, \dots$ (notes on the piano shown on the slide: C, G, E).

Sine & Cosine Basis
$y(x) = x$: surely this can't be expressed as a sum of sines and cosines.
On $-\pi < x < \pi$: $y(x) \approx 2\sin(x) - \sin(2x) + \tfrac{2}{3}\sin(3x)$
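To see the sine approximation of y(x) = x at work, here is a small sketch that evaluates partial sums of the series whose first three terms appear above. The function name `sawtooth_partial_sum` is hypothetical, and the general term 2(-1)^(k+1) sin(kx)/k is the standard Fourier series of x on (-π, π), assumed here as the continuation of the slide's three terms.

```python
import math

def sawtooth_partial_sum(x, terms):
    """Sum of the first `terms` sine waves approximating y(x) = x on (-pi, pi)."""
    return sum(2 * (-1) ** (k + 1) / k * math.sin(k * x)
               for k in range(1, terms + 1))

# Three terms reproduce the slide's formula; more terms get closer to x.
for terms in (3, 20, 200):
    print(terms, sawtooth_partial_sum(1.0, terms))
```

The approximation improves slowly near the ends of the interval, where the periodic extension of x jumps.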

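Sine & Cosine Basis
$y(x) = x^2$ is an even function, so on $-\pi < x < \pi$ it is a sum of cosines:
$y(x) \approx \tfrac{\pi^2}{3} - 4\cos(x) + \cos(2x) - \tfrac{4}{9}\cos(3x)$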

Time Domain y: the value y[j] of the signal at each point in time j.
Frequency Domain Y: the amount Y[f] of frequency f in the signal for each frequency f.

Change of Basis
Change of Basis: $T([a_1,a_2,\dots,a_d]) = [A_1,A_2,\dots,A_d]$ changes the basis used to describe an object.

The Time basis of a vector space is a tuple $\langle w_1,w_2,\dots,w_d\rangle$ of basis objects that is linearly independent and spans the space uniquely:
$\forall v\ \exists [a_1,a_2,\dots,a_d],\ v = a_1w_1 + a_2w_2 + \dots + a_dw_d$

The Fourier basis of a vector space is a tuple $\langle W_1,W_2,\dots,W_d\rangle$ of basis objects that is linearly independent and spans the space uniquely:
$\forall v\ \exists [A_1,A_2,\dots,A_d],\ v = A_1W_1 + A_2W_2 + \dots + A_dW_d$

Change of Basis
Change of Basis: $T([y[0],y[1],\dots,y[n-1]]) = [Y_{Re}[0],\dots,Y_{Im}[n/2]]$ changes the basis used to describe a discrete periodic function $y[j]$.

Time basis $\langle I_1,I_2,\dots,I_n\rangle$: the basis vector for time $j$ is one at $j' = j$ and zero everywhere else, and
$\forall y\ \exists [y[0],y[1],\dots,y[n-1]],\ y = y[0]I_1 + y[1]I_2 + \dots + y[n-1]I_n$
The coefficient $y[j]$ is the value of the signal at each point in time $j$ (in the slide's figure, $y[0] = 3$ and $y[1] = 2$).

Fourier basis $\langle c_0,s_0,\dots,c_{n/2},s_{n/2}\rangle$ of cosines and sines:
$y = Y_{Re}[0] \cdot c_0 + Y_{Im}[0] \cdot s_0 + \dots + Y_{Re}[n/2] \cdot c_{n/2} + Y_{Im}[n/2] \cdot s_{n/2}$
The coefficient $Y[f]$ is the amount of frequency $f$ in the signal (the slide's figure labels $Y_{Re}[0] = 11/5$ and $Y_{Im}[0] = 32/5$).

Fourier Transformation as a matrix (shown for two basis vectors $s_1$, $s_2$): the matrix whose columns are the Fourier basis vectors maps $Y$ to $y$, and the transformation is its inverse:
$$\begin{bmatrix} s_1[1] & s_2[1] \\ s_1[2] & s_2[2] \end{bmatrix} \begin{bmatrix} Y[1] \\ Y[2] \end{bmatrix} = \begin{bmatrix} y[1] \\ y[2] \end{bmatrix}, \qquad \begin{bmatrix} s_1[1] & s_2[1] \\ s_1[2] & s_2[2] \end{bmatrix}^{-1} \begin{bmatrix} y[1] \\ y[2] \end{bmatrix} = \begin{bmatrix} Y[1] \\ Y[2] \end{bmatrix}$$

Orthogonal Basis: sines and cosines of different frequencies are orthogonal and of (almost) uniform length:
$$\begin{bmatrix} s_1[1] & s_1[2] \\ s_2[1] & s_2[2] \end{bmatrix} \begin{bmatrix} s_1[1] & s_2[1] \\ s_1[2] & s_2[2] \end{bmatrix} = \begin{bmatrix} n/2 & 0 \\ 0 & n/2 \end{bmatrix}, \qquad \begin{bmatrix} s_1[1] & s_2[1] \\ s_1[2] & s_2[2] \end{bmatrix}^{-1} = \frac{2}{n} \begin{bmatrix} s_1[1] & s_1[2] \\ s_2[1] & s_2[2] \end{bmatrix}$$
so
$$\frac{2}{n} \begin{bmatrix} s_1[1] & s_1[2] \\ s_2[1] & s_2[2] \end{bmatrix} \begin{bmatrix} y[1] \\ y[2] \end{bmatrix} = \begin{bmatrix} Y[1] \\ Y[2] \end{bmatrix}$$
Duality of FT: because the forward and inverse matrices are the same up to transposition and scaling, if Y=FT(y), then y=FT(Y).

Duality of FT
Duality of FT: If Y=FT(y), then y=FT(Y).
Time Domain y -> Frequency Domain Y
Cosine wave (cosine with f=4) -> Delta function (impulse at YRe[4])
Delta function (impulse at y[4]) -> Cosine wave (cosine with f=4)
Square wave -> Sinc function (How do you get these corners?)
Sinc function -> Square wave
Gaussian -> Gaussian

Continuous Functions

FFT Butterfly Fast Fourier Transform takes O(nlogn) time! (See Recursive Slides) O(log(n)) levels

Radio Signals (Time Domain y, Frequency Domain Y)
Sound signal: i.e. how far out the speaker drum is at each point in time. Sound is low frequency; high frequencies are filtered out.
Radio carrier signal: i.e. a wave of magnetic field that can travel far. It is one high-frequency signal.
Modulation: their product $y(i) = y_1(i) \times y_2(i)$. The frequency domain of the product shows the carrier signal, the audio signal (shifted), and the audio signal (shifted & flipped).

Linear Filter
This system takes in a signal x[] and outputs a transformed signal y[].
In order to understand this transformation, we put in a single pulse δ[]. The response h[] to this pulse (the impulse response) identifies the system.
Feed in any signal x[]: the output is the sum of the contributions from each separate pulse.
Computationally, figuring out what this electronic system does to a signal this way takes O(nm) time (for a signal of length n and a response of length m). How can we do it faster?

Convolution
Time Domain y: input x[], impulse response h[], output x[]*h[]. Convolution: $y = x * h$.
Frequency Domain Y: input X[], impulse response H[], output X[]×H[]. Product: $Y = X \times H$.
Oops: a Fourier Transform computed directly takes O(n²) time. The Fast Fourier Transform takes O(n log n) time! The multiplication X[]×H[] takes O(n) time.
In the time domain it is not clear what the system does to the input. In the frequency domain it is: multiplying by an H[] that is zero at the low and high frequencies filters the low and high frequencies out of the input.
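The relationship between convolution in the time domain and pointwise product in the frequency domain can be checked numerically. The sketch below uses hypothetical function names and, to stay short, a direct O(n²) DFT rather than the FFT; swapping in an FFT is what would give the O(n log n) running time.

```python
import cmath

def dft(x, sign=-1):
    """Direct O(n^2) transform; sign=+1 gives the unscaled inverse."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * f * t / n)
                for t in range(n)) for f in range(n)]

def convolve_direct(x, h):
    """Sum of the contributions of each separate pulse: O(nm) time."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def convolve_via_dft(x, h):
    """Transform, multiply pointwise (Y = X H), transform back."""
    n = len(x) + len(h) - 1                  # zero-pad so nothing wraps around
    X = dft(list(x) + [0.0] * (n - len(x)))
    H = dft(list(h) + [0.0] * (n - len(h)))
    Y = [a * b for a, b in zip(X, H)]
    return [v.real / n for v in dft(Y, sign=+1)]

x = [1.0, 2.0, 3.0, 4.0]
h = [1.0, -1.0, 0.5]
print(convolve_direct(x, h))
print(convolve_via_dft(x, h))
```

The zero padding matters: without it the product of transforms corresponds to a circular convolution, in which the tail of the output wraps around onto its start.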

JPG Image Compression
JPEG (image compression) is a two-dimensional Fourier Transform, exactly as done before.

Each 8×8 block of values from the image is encoded separately. It is decomposed as a linear combination of basis functions. Each of the 64 basis functions is a two-dimensional cosine. Each basis function has a coefficient, giving the contribution of this basis function to the image.

The first basis is constant. Its coefficient gives the average value within the block. Because many images have large blocks of the same colour, this one coefficient gives much of the key information!

The second basis "slopes" left to right. Its (positive or negative) coefficient gives whether, left to right, the value tends to increase or decrease. Because many images have a gradual change in colour, this one coefficient gives more key information!

There is a similar basis for top to bottom.

The <0,2> basis: its coefficient gives whether the value tends to be smaller in the middle. This helps display the horizontal lines in images.

As seen, the low frequency components of a signal are more important. Removing 90% of the bits from the high frequency components might remove only 5% of the encoded information.

Polynomial Basis
Have you seen Taylor expansions of a function? They show that a function $f(x)$ can be expressed by specifying the coefficients of a polynomial:
$f(x) = a_0 + a_1x + a_2x^2 + a_3x^3 + \dots$
E.g. $f(x) = 1/(1-x) = 1 + x + x^2 + x^3 + \dots$ (for $|x| < 1$)

Polynomial Basis
Instead of using sines and cosines as the basis, we will now use polynomials.

Polynomial Basis
Change of Basis: $T([y[0],y[1],\dots,y[n-1]]) = [a_0,a_1,\dots,a_{n-1}]$ changes the basis used to describe a discrete function $f$.

Time basis $\langle I_0,I_1,\dots,I_{n-1}\rangle$: $I_j[x]$ is one at $x = x_j$ and zero everywhere else, and
$\forall f\ \exists [y[0],y[1],\dots,y[n-1]],\ f = y_0I_0 + y_1I_1 + \dots + y_{n-1}I_{n-1}$
Here $y[j]$ is the value $f(x_j)$ of the function at $x_j$ (in the slide's figure, $y[0] = 3$ and $y[1] = 2$). These $x_j$ are fixed values $x_0,x_1,\dots,x_{n-1}$. For the FFT, we set $x_j = e^{2\pi i j/n}$.

Fourier (polynomial) basis $\langle 1,x,x^2,x^3,\dots\rangle$:
$\forall f\ \exists [a_0,a_1,a_2,\dots,a_{n-1}],\ f = a_0 + a_1x + a_2x^2 + \dots + a_{n-1}x^{n-1}$
The $a_j$ are the coefficients of the polynomial.

Evaluating & Interpolating
A Fourier Transform is a change in basis. It changes the representation of a function from the coefficients of the polynomial $f(x) = a_0 + a_1x + a_2x^2 + \dots + a_{n-1}x^{n-1}$ to the values $y_i = f(x_i)$ at key values $x_0,x_1,\dots,x_{n-1}$.
Evaluating: from the coefficients to the values of f at these points.
Interpolation: from the values back to the coefficients. Given a set of n points in the plane with distinct x-coordinates, there is exactly one polynomial of degree at most n-1 going through all these points.

Evaluation is multiplication by the Vandermonde matrix, which is invertible if the $x_i$ are distinct:
$$\begin{bmatrix} (x_0)^0 & (x_0)^1 & (x_0)^2 & \cdots & (x_0)^{n-1} \\ (x_1)^0 & (x_1)^1 & (x_1)^2 & \cdots & (x_1)^{n-1} \\ (x_2)^0 & (x_2)^1 & (x_2)^2 & \cdots & (x_2)^{n-1} \\ \vdots & & & & \vdots \\ (x_{n-1})^0 & (x_{n-1})^1 & (x_{n-1})^2 & \cdots & (x_{n-1})^{n-1} \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_{n-1} \end{bmatrix} = \begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_{n-1} \end{bmatrix}$$
Multiplying by this matrix directly takes O(n²) time.
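The evaluation step can be written out directly as a matrix-vector product. This is a minimal sketch with hypothetical function names; it is the O(n²) method that the FFT later improves on.

```python
def vandermonde(xs):
    """Row i holds the powers (x_i)^0, (x_i)^1, ..., (x_i)^(n-1)."""
    n = len(xs)
    return [[x ** j for j in range(n)] for x in xs]

def evaluate(coeffs, xs):
    """y_i = f(x_i) computed as the product V a, one dot product per row."""
    V = vandermonde(xs)
    return [sum(V[i][j] * coeffs[j] for j in range(len(coeffs)))
            for i in range(len(xs))]

# f(x) = 1 + 2x + 3x^2 evaluated at x = 0, 1, 2
ys = evaluate([1, 2, 3], [0, 1, 2])
print(ys)
```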

Fast Fourier Transformation
FFT n log n Time
The Fast Fourier Transform (FFT) is a very efficient algorithm for performing a discrete Fourier transform.
The FFT principle was first used by Gauss in 18?? (but was not interesting without computers).
The FFT algorithm was published by Cooley & Tukey in 1965.
In 1969, the 2048-point analysis of a seismic trace took 13 ½ hours. Using the FFT, the same task on the same machine took 2.4 seconds!

FFT n log n Time
Not only do you get faster speed and in-place memory processing, but, as a final additional bonus, fewer calculations means less round-off error.
[Figure from Smith: round-off error in ppm (axis from 10 to 70) for the DFT versus the FFT.]
Maybe I should take CSE6111 after all!

FFT n log n Time

N       DFT (N^2)     FFT (1.5 N log N)   Times faster
32      1,024         240                 4.3
64      4,096         576                 7.1
128     16,384        1,344               12.2
256     65,536        3,072               21.3
512     262,144       6,912               37.9
1024    1,048,576     15,360              68.3
2048    4,194,304     33,792              124.1
4096    16,777,216    73,728              227.6

Based on experimental times on the most commonly used N values (using a real-valued DFT with sine look-up tables versus a complex FFT), the FFT is 10 to 100 times faster than the DFT, and this performance advantage increases with the number of input samples.
If one considers complex multiplications only (DFT: N², FFT: (N/2) log N), the FFT is faster still: 227 becomes more than 600.
At 150 MHz, a current 32-bit floating point DSP can do 1024 points in 69 µs (Analog Devices ADSP-TS001 TigerSHARC).
The Discrete Fourier Transform is too slow for real time!

FFT n log n Time
Divide & Conquer - Friends - Recursion.
Trust your friends to solve any subinstance, as long as it is smaller and is an instance of the same problem.
[Figure: my instance is split into several of my friends' instances.]

FFT n log n Time
(Start with one x.)
My input: $(a_0,a_1,a_2,\dots,a_{n-1})$ & $x$
My output: $f(x) = a_0 + a_1x + a_2x^2 + \dots + a_{n-1}x^{n-1}$
1st friend's input: $f_{even}$: $(a_0,a_2,a_4,\dots,a_{n-2})$ & ?
2nd friend's input: $f_{odd}$: $(a_1,a_3,a_5,\dots,a_{n-1})$ & ?

Define $f_{even}(z) = a_0 + a_2z + a_4z^2 + a_6z^3 + \dots + a_{n-2}z^{n/2-1}$ and, similarly, $f_{odd}(z) = a_1 + a_3z + a_5z^2 + \dots + a_{n-1}z^{n/2-1}$. Then
$f(x) = a_0 + a_1x + a_2x^2 + \dots + a_{n-1}x^{n-1}$
$= (a_0 + a_2x^2 + a_4x^4 + \dots + a_{n-2}x^{n-2}) + (a_1x + a_3x^3 + a_5x^5 + \dots + a_{n-1}x^{n-1})$
$= (a_0 + a_2x^2 + a_4x^4 + \dots + a_{n-2}x^{n-2}) + x(a_1 + a_3x^2 + a_5x^4 + \dots + a_{n-1}x^{n-2})$
$= f_{even}(x^2) + x f_{odd}(x^2)$

So both friends are given $x^2$:
1st friend's input: $f_{even}$: $(a_0,a_2,a_4,\dots,a_{n-2})$ & $x^2$; output: $y_{even} = f_{even}(x^2)$
2nd friend's input: $f_{odd}$: $(a_1,a_3,a_5,\dots,a_{n-1})$ & $x^2$; output: $y_{odd} = f_{odd}(x^2)$
My output: $f(x) = y_{even} + x\,y_{odd}$
$T(n) = 2T(n/2) + O(1) = O(n)$. OK, so it takes O(n) time to evaluate at one x.
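The even/odd recursion for a single x can be sketched in a few lines of Python. The function name `evaluate_split` is hypothetical; the two recursive calls play the role of the two friends.

```python
def evaluate_split(coeffs, x):
    """Evaluate f(x) = a0 + a1 x + ... using f(x) = f_even(x^2) + x f_odd(x^2)."""
    if len(coeffs) == 1:
        return coeffs[0]                            # a constant polynomial
    y_even = evaluate_split(coeffs[0::2], x * x)    # 1st friend: a0, a2, a4, ...
    y_odd = evaluate_split(coeffs[1::2], x * x)     # 2nd friend: a1, a3, a5, ...
    return y_even + x * y_odd

# f(x) = 1 + 2x + 3x^2 + 4x^3 at x = 2
print(evaluate_split([1, 2, 3, 4], 2))
```

Each call does constant work besides its two half-size subcalls, which matches the recurrence T(n) = 2T(n/2) + O(1) = O(n).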

FFT n log n Time
Now evaluate at n values of x.
My input: $(a_0,a_1,a_2,\dots,a_{n-1})$ and $(x_0,x_1,x_2,\dots,x_{n-1})$
My output: $(y_0,y_1,y_2,\dots,y_{n-1})$ with $y_i = f(x_i)$

First attempt:
1st friend's input: $f_{even}$: $(a_0,a_2,a_4,\dots,a_{n-2})$ and $(x_0^2,x_1^2,x_2^2,\dots,x_{n-1}^2)$; output: $\forall i,\ y_{\langle even,i\rangle} = f_{even}(x_i^2)$
2nd friend's input: $f_{odd}$: $(a_1,a_3,a_5,\dots,a_{n-1})$ and $(x_0^2,x_1^2,x_2^2,\dots,x_{n-1}^2)$; output: $\forall i,\ y_{\langle odd,i\rangle} = f_{odd}(x_i^2)$
My output: $\forall i,\ f(x_i) = y_{\langle even,i\rangle} + x_i\,y_{\langle odd,i\rangle}$
$T(n) = 2T(n/2) + O(n) = O(n \log n)$. Wow! That was easy.

Oops: my input has n coefficients and n values of x, but each friend is given n/2 coefficients and n values of x. This does not meet the precondition!

Second attempt: give each friend only n/2 of the values.
1st friend's input: $f_{even}$: $(a_0,a_2,\dots,a_{n-2})$ and $(x_0^2,x_1^2,\dots,x_{n/2-1}^2)$
2nd friend's input: $f_{odd}$: $(a_1,a_3,\dots,a_{n-1})$ and $(x_0^2,x_1^2,\dots,x_{n/2-1}^2)$
3rd friend's input: $f_{even}$: $(a_0,a_2,\dots,a_{n-2})$ and $(x_{n/2}^2,x_{n/2+1}^2,\dots,x_{n-1}^2)$
4th friend's input: $f_{odd}$: $(a_1,a_3,\dots,a_{n-1})$ and $(x_{n/2}^2,x_{n/2+1}^2,\dots,x_{n-1}^2)$
My output: $\forall i,\ f(x_i) = y_{\langle even,i\rangle} + x_i\,y_{\langle odd,i\rangle}$
$T(n) = 4T(n/2) + O(n) = O(n^2)$. That's no good.

Roots of Unity
The values $(x_0,x_1,x_2,\dots,x_{n-1})$ are said to be special if:
There are n distinct values.
When you square each of them, the set collapses to n/2 distinct values.
These n/2 values are also special.

E.g. …, -3, -2, -1, 1, 2, 3, …
Square each of them: …, 9, 4, 1, 1, 4, 9, …
Collapse the set: 1, 4, 9, … (half as many elements)

With values that collapse like this, each friend is given only n/2 distinct values:
1st friend's input: $f_{even}$: $(a_0,a_2,\dots,a_{n-2})$ and the n/2 distinct values among $(x_0^2,x_1^2,\dots,x_{n-1}^2)$; output: $\forall i,\ y_{\langle even,i\rangle} = f_{even}(x_i^2)$
2nd friend's input: $f_{odd}$: $(a_1,a_3,\dots,a_{n-1})$ and the same n/2 distinct values; output: $\forall i,\ y_{\langle odd,i\rangle} = f_{odd}(x_i^2)$
For example, $f(3)$ and $f(-3)$ both use $f_{even}(9)$ and $f_{odd}(9)$.
My output: $\forall i,\ f(x_i) = y_{\langle even,i\rangle} + x_i\,y_{\langle odd,i\rangle}$
$T(n) = 2T(n/2) + O(n) = O(n \log n)$. That's better.

To meet the friends' precondition, the squared values also need to be special. The integers fail here: squaring 1, 4, 9, … gives 1, 16, 81, …, which does not collapse, so 1, 4, 9, … are not special.

E.g. -i, -1, 1, i
Square each of them: -1, 1, 1, -1
Collapse the set: -1, 1
Square each of them: 1, 1
Collapse the set: 1
$\omega = -i, -1, 1, i$ are said to be 4th roots of unity because $\omega^4 = 1$.

Roots of Unity
$\omega$ is said to be an nth root of unity (in a field) if $\omega^n = 1$. (There should be n solutions of this polynomial.)
Fermat's Little Theorem, $\forall b \neq 0,\ b^{p-1} \equiv 1 \pmod p$, says every nonzero element is an nth root of unity when $n = p-1$.
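Fermat's Little Theorem can be checked directly in the integers mod 17, the field mentioned later for the 16th roots of unity. In this sketch the choice of 3 as the candidate generator is an assumption made for illustration; the code verifies that its powers are all distinct.

```python
p = 17
n = p - 1

# Every nonzero element b satisfies b^(p-1) = 1 (mod p).
all_roots = all(pow(b, n, p) == 1 for b in range(1, p))

# Powers of 3 mod 17: if they are all distinct, 3 generates the 16 roots.
powers = [pow(3, k, p) for k in range(n)]
is_generator = len(set(powers)) == n

# Squaring the 16 roots collapses them to 8 distinct values.
squares = sorted({(b * b) % p for b in range(1, p)})

print(all_roots, is_generator, powers, squares)
```

Working mod a prime gives exact integer arithmetic, in contrast to complex roots of unity, which are subject to floating-point round-off.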

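Roots of Unity
$\omega$ is said to be an nth root of unity (in a field) if $\omega^n = 1$.
$\omega$ is said to be a generator of the field if the numbers $1,\omega,\omega^2,\dots,\omega^{n-1}$ are all distinct.
$1,\omega,\omega^2,\dots,\omega^{n-1}$ are then special (when n is even):
1st half: $\omega^0,\omega^1,\omega^2,\omega^3,\dots,\omega^{n/2-1}$; 2nd half: $\omega^{n/2+0},\omega^{n/2+1},\omega^{n/2+2},\dots,\omega^{n/2+n/2-1}$
Square each of them: $\omega^0,\omega^2,\omega^4,\omega^6,\dots,\omega^{n-2}$ and $\omega^{n+0},\omega^{n+2},\omega^{n+4},\omega^{n+6},\dots,\omega^{n+n-2}$
Use $\omega^n = 1$: $\omega^0,\omega^2,\omega^4,\omega^6,\dots,\omega^{n-2}$ and again $\omega^0,\omega^2,\omega^4,\omega^6,\dots,\omega^{n-2}$
Collapse the set: $\omega^0,\omega^2,\omega^4,\omega^6,\dots,\omega^{n-2}$
We need these to be n/2 special values.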
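The "special" property can be observed numerically by repeatedly squaring and collapsing a set of values. This sketch uses hypothetical function names and rounds to nine digits when comparing complex numbers, which is an assumption made only to absorb floating-point round-off.

```python
import cmath

def roots_of_unity(n):
    """The n complex values 1, w, w^2, ..., w^(n-1) with w = e^(2 pi i / n)."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]

def square_and_collapse(values, digits=9):
    """Square every value and keep one representative of each distinct result."""
    seen = {}
    for v in values:
        s = v * v
        key = (round(s.real, digits), round(s.imag, digits))
        seen.setdefault(key, s)
    return list(seen.values())

values = roots_of_unity(16)
sizes = [len(values)]
while len(values) > 1:
    values = square_and_collapse(values)
    sizes.append(len(values))
print(sizes)          # the set halves at every level

integers = [-3, -2, -1, 1, 2, 3]
once = square_and_collapse(integers)      # 1, 4, 9: half as many
twice = square_and_collapse(once)         # 1, 16, 81: no further collapse
print(len(once), len(twice))
```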

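Roots of Unity
16th roots of unity. These could be in Z mod 17 or complex numbers.
[Figure: the powers $\omega_{16}^0,\omega_{16}^1,\dots,\omega_{16}^{15}$ equally spaced around the unit circle, with $\omega_{16}^0 = \omega_{16}^{16} = 1$ and $\omega_{16}^8 = -1$; $\omega_{16}^k$ sits at angle $k\theta$ with $\theta = 2\pi/n$ and $r = 1$.]
$(\omega^{n/2})^2 = \omega^n = 1$, so $\omega^{n/2} = -1$.
$(\omega^{n/4})^2 = \omega^{n/2} = -1$, so $\omega^{n/4} = i$.
$(\omega^{3n/4})^2 = \omega^{n/2} = -1$, so $\omega^{3n/4} = -i$.

Polar form: $re^{\theta i} = r\cos\theta + ir\sin\theta$.
Let $f(\theta) = re^{\theta i}$ and $g(\theta) = r\cos\theta + ir\sin\theta$. Then
$f'(\theta) = ire^{\theta i}$ and $g'(\theta) = -r\sin\theta + ir\cos\theta$,
$f''(\theta) = -re^{\theta i} = -f(\theta)$ and $g''(\theta) = -r\cos\theta - ir\sin\theta = -g(\theta)$.
Proof 1: Both satisfy the same second-derivative equation $y'' = -y$, with $f(0) = g(0)$ and $f'(0) = g'(0)$.
Proof 2: $e^{\varepsilon} \approx 1 + \varepsilon$, so $e^{i\varepsilon} \approx 1 + i\varepsilon \approx \cos(\varepsilon) + i\sin(\varepsilon)$.

With $r = 1$ and $\theta = 2\pi/n$: $\omega = e^{2\pi i/n}$. Multiplying adds angles, $e^{\theta i} \times e^{\alpha i} = e^{(\theta+\alpha)i}$, so $\omega^2$ is at angle $2\theta$, $\omega^3$ at $3\theta$, and so on.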

Roots of Unity
16th roots of unity $\omega_{16}^0,\omega_{16}^1,\dots,\omega_{16}^{15}$: square each of them and collapse to get $\omega_{16}^0,\omega_{16}^2,\omega_{16}^4,\dots,\omega_{16}^{14}$. Are these special?
These are the 8th roots of unity $\omega_8^0,\omega_8^1,\dots,\omega_8^7$: square each of them and collapse. Are these special?
These are the 4th roots of unity $\omega_4^0,\omega_4^1,\omega_4^2,\omega_4^3$: square each of them and collapse. Are these special?
These are the 2nd roots of unity $\omega_2^0,\omega_2^1$: square each of them and collapse to get $\omega_2^0 = 1$.

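Roots of Unity
My input: $(a_0,a_1,a_2,\dots,a_{n-1})$ (evaluated at the nth roots of unity $\omega_n^i$)
My output: $(y_0,y_1,y_2,\dots,y_{n-1})$ with $y_i = f(\omega_n^i)$
1st friend's input: $f_{even}$: $(a_0,a_2,a_4,\dots,a_{n-2})$ (at the n/2th roots of unity $\omega_{n/2}^i$); output: $\forall i,\ y_{\langle even,i\rangle} = f_{even}(\omega_{n/2}^i)$
2nd friend's input: $f_{odd}$: $(a_1,a_3,a_5,\dots,a_{n-1})$ (at the n/2th roots of unity $\omega_{n/2}^i$); output: $\forall i,\ y_{\langle odd,i\rangle} = f_{odd}(\omega_{n/2}^i)$
My output: $\forall i,\ f(x_i) = y_{\langle even,i\rangle} + x_i\,y_{\langle odd,i\rangle}$
$T(n) = 2T(n/2) + O(n) = O(n \log n)$. Excellent.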

FFT Code
Algorithm FFT(a, ω, n):
  Input:  a = [a0,a1,a2,…,an-1] (Time Domain)
          ω = e^(2πi/n) (nth root of unity)
          n = # of samples (a power of two, n = 2^r)
  Output: y = [y0,y1,y2,…,yn-1] (Frequency Domain)
  % Base case
  If n = 1, Return([a0])
  % Separate even and odd indices
  aeven = [a0,a2,a4,…,an-2]
  aodd = [a1,a3,a5,…,an-1]
  % Recurse
  yeven = FFT(aeven, ω^2, n/2)    (ω^2 = e^(2πi·2/n) is an n/2th root of unity)
  yodd = FFT(aodd, ω^2, n/2)
  % Combine
  For i = 0 to n/2-1
    y[i] = yeven[i] + ω^i ∙ yodd[i]
    y[i+n/2] = yeven[i] + ω^(i+n/2) ∙ yodd[i]
  Return(y)
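The pseudocode above translates almost line for line into Python. This is a sketch rather than a tuned implementation: the function name `fft` and the default choice of ω are assumptions, the base case for n = 1 is made explicit, and the second combining line uses ω^(i+n/2) = -ω^i.

```python
import cmath

def fft(a, omega=None):
    """Evaluate the polynomial with coefficients a at 1, w, w^2, ..., w^(n-1)."""
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    if omega is None:
        omega = cmath.exp(2j * cmath.pi / n)      # nth root of unity
    if n == 1:
        return list(a)                            # f(x) = a0 for every x
    # Recurse on even and odd indices with the (n/2)th root of unity w^2.
    y_even = fft(a[0::2], omega * omega)
    y_odd = fft(a[1::2], omega * omega)
    # Combine: f(x) = f_even(x^2) + x f_odd(x^2).
    y = [0] * n
    w = 1                                         # w holds omega^i
    for i in range(n // 2):
        y[i] = y_even[i] + w * y_odd[i]
        y[i + n // 2] = y_even[i] - w * y_odd[i]  # omega^(i+n/2) = -omega^i
        w *= omega
    return y

# f(x) = 1 + 2x + 3x^2 + 4x^3 at the 4th roots of unity 1, i, -1, -i
print(fft([1, 2, 3, 4]))
```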

Inverse FFT
A Fourier Transform is a change in basis. It changes the representation of a function from the coefficients of the polynomial $f(x) = a_0 + a_1x + a_2x^2 + \dots + a_{n-1}x^{n-1}$ to the values $y_i = f(x_i)$ at key values $x_0,x_1,\dots,x_{n-1}$ (evaluating). The inverse goes from the values back to the coefficients (interpolation).

Evaluation is multiplication by the Vandermonde matrix $V$, which is invertible if the $x_i$ are distinct:
$$\begin{bmatrix} (x_0)^0 & (x_0)^1 & (x_0)^2 & \cdots & (x_0)^{n-1} \\ (x_1)^0 & (x_1)^1 & (x_1)^2 & \cdots & (x_1)^{n-1} \\ (x_2)^0 & (x_2)^1 & (x_2)^2 & \cdots & (x_2)^{n-1} \\ \vdots & & & & \vdots \\ (x_{n-1})^0 & (x_{n-1})^1 & (x_{n-1})^2 & \cdots & (x_{n-1})^{n-1} \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_{n-1} \end{bmatrix} = \begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_{n-1} \end{bmatrix}$$
$V a = y$, so $a = V^{-1} y$.

With $x_i = \omega^i$, the entry in row $i$ and column $j$ is $(\omega^i)^j = \omega^{ij}$:
$$V = \begin{bmatrix} \omega^0 & \omega^0 & \omega^0 & \cdots & \omega^0 \\ \omega^0 & \omega^1 & \omega^2 & \cdots & \omega^{n-1} \\ \omega^0 & \omega^2 & \omega^4 & \cdots & \omega^{2n-2} \\ \omega^0 & \omega^3 & \omega^6 & \cdots & \omega^{3n-3} \\ \vdots & & & & \vdots \\ \omega^0 & \omega^{n-1} & \omega^{2n-2} & \cdots & \omega^{(n-1)(n-1)} \end{bmatrix}$$

The inverse is the same Vandermonde matrix built from $\omega^{-1}$ and scaled by $1/n$, that is, $V^{-1}$ has entries $\omega^{-ij}/n$:
$$\frac{1}{n} \begin{bmatrix} \omega^0 & \omega^0 & \omega^0 & \cdots & \omega^0 \\ \omega^0 & \omega^{-1} & \omega^{-2} & \cdots & \omega^{-(n-1)} \\ \omega^0 & \omega^{-2} & \omega^{-4} & \cdots & \omega^{-(2n-2)} \\ \omega^0 & \omega^{-3} & \omega^{-6} & \cdots & \omega^{-(3n-3)} \\ \vdots & & & & \vdots \\ \omega^0 & \omega^{-(n-1)} & \omega^{-(2n-2)} & \cdots & \omega^{-(n-1)(n-1)} \end{bmatrix} \begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_{n-1} \end{bmatrix} = \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_{n-1} \end{bmatrix}$$

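Inverse FFT
The inverse $\omega^{-1}$ of $\omega$ is $\omega^{n-1}$, because $\omega \cdot \omega^{n-1} = \omega^n = 1$. For the 16th roots of unity, $\omega_{16}^{-1} = \omega_{16}^{15}$.
[Figure: the 16th roots of unity on the unit circle, with $\omega_{16}^0 = \omega_{16}^{16} = 1$ and $\omega_{16}^8 = -1$.]

Cancellation Property: each root $\omega_{16}^k$ has an opposite $\omega_{16}^{k+8} = -\omega_{16}^k$, so $\omega_{16}^0 + \omega_{16}^1 + \omega_{16}^2 + \dots + \omega_{16}^{15} = 0$. In general, $\sum_{k=0}^{n-1} \omega^{kd} = 0$ whenever $\omega^d \neq 1$.

Proof: Let $A = V^{-1}V$, with $V^{-1}$ the matrix with entries $\omega^{-ij}/n$. We want to show that $A = I$, where
$$A[i,j] = \frac{1}{n}\sum_{k=0}^{n-1} \omega^{-ik}\omega^{kj} = \frac{1}{n}\sum_{k=0}^{n-1} \omega^{k(j-i)}$$
If $i = j$, then every term is $\omega^0 = 1$ and $A[i,i] = \frac{1}{n} \cdot n = 1$.
If $i$ and $j$ are different, then the sum is a geometric series, $\sum_{k=0}^{n-1} \omega^{k(j-i)} = \frac{\omega^{n(j-i)} - 1}{\omega^{j-i} - 1} = 0$ because $\omega^n = 1$, so $A[i,j] = 0$.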

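Inverse FFT
The FFT and inverse FFT can use the same hardware.
FFT: Input: $\langle 1, \omega, a_0,a_1,a_2,\dots,a_{n-1}\rangle$; Output: $\langle y_0,y_1,y_2,\dots,y_{n-1}\rangle$
Inverse FFT: Input: $\langle 1/n, \omega^{-1}, y_0,y_1,y_2,\dots,y_{n-1}\rangle$; Output: $\langle a_0,a_1,a_2,\dots,a_{n-1}\rangle$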
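The "same hardware" observation can be sketched in Python by calling one FFT routine twice, once with ω and once with ω⁻¹ followed by scaling by 1/n. The function names are hypothetical, and the `fft` helper is repeated here so the example runs on its own.

```python
import cmath

def fft(a, omega):
    """Evaluate the polynomial with coefficients a at the powers of omega."""
    n = len(a)
    if n == 1:
        return list(a)
    y_even = fft(a[0::2], omega * omega)
    y_odd = fft(a[1::2], omega * omega)
    y = [0] * n
    w = 1
    for i in range(n // 2):
        y[i] = y_even[i] + w * y_odd[i]
        y[i + n // 2] = y_even[i] - w * y_odd[i]
        w *= omega
    return y

def forward(a):
    """FFT with scale 1 and root omega = e^(2 pi i / n)."""
    return fft(a, cmath.exp(2j * cmath.pi / len(a)))

def inverse(y):
    """Same code with scale 1/n and root omega^-1."""
    n = len(y)
    return [v / n for v in fft(y, cmath.exp(-2j * cmath.pi / n))]

a = [1, 2, 3, 4, 5, 6, 7, 8]       # length must be a power of two
recovered = inverse(forward(a))
print([round(v.real, 9) for v in recovered])
```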

Polynomial Basis
Sine and cosine basis: $\sin(2\pi f t/n)$, $\cos(2\pi f t/n)$.
The fth basis "vector" has frequency f. As t = 0..n, t/n = 0..1 and $2\pi f t/n = 0..2\pi f$, so $\cos(2\pi f t/n)$ does f full cycles.

Polynomial basis: $1, x, x^2, x^3, \dots$
The fth basis "vector" is $x^f$. We compute this on $x_0, x_1, x_2, x_3, \dots$; the tth value is $x_t = \omega^t$, where $\omega = e^{2\pi i/n}$ (from $re^{\theta i} = r\cos\theta + ir\sin\theta$ with $r = 1$, $\theta = 2\pi/n$).
The fth basis on the tth value is
$$(x_t)^f = (\omega^t)^f = \omega^{ft} = [e^{2\pi i/n}]^{ft} = e^{2\pi f t i/n} = \cos(2\pi f t/n) + i\sin(2\pi f t/n)$$
So the polynomial basis evaluated at the roots of unity is the sine and cosine basis.

Polynomial Multiplication
$f(x) = a_0 + a_1x + a_2x^2 + \dots + a_{n-1}x^{n-1}$
$g(x) = b_0 + b_1x + b_2x^2 + \dots + b_{n-1}x^{n-1}$
$[f \times g](x) = c_0 + c_1x + c_2x^2 + \dots + c_{2n-2}x^{2n-2}$
The $x^5$ coefficient: $c_5 = a_0 b_5 + a_1 b_4 + a_2 b_3 + \dots + a_5 b_0$. This is a convolution. Computed directly, it takes too much time: O(n²).

Instead, change from the Coefficient Domain ($a_j$) to the Evaluation Domain ($y_i$):
$[a_0,a_1,a_2,\dots,a_{n-1}]$ becomes $y_i = f(x_i)$ and $[b_0,b_1,b_2,\dots,b_{n-1}]$ becomes $z_i = g(x_i)$. The Fast Fourier Transform takes O(n log n) time!
$y_i \times z_i = [g \times f](x_i)$. Multiplying values pointwise takes O(n) time!
The inverse transform then gives the coefficients $[c_0,c_1,c_2,\dots,c_{2n-2}]$. Because the product has 2n-1 coefficients, it must be evaluated at at least 2n-1 points.
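The three steps (evaluate, multiply pointwise, interpolate) can be sketched as follows. The function names are hypothetical, the `fft` helper is repeated so the block is self-contained, and padding to the next power of two is an assumption needed by this particular recursive FFT.

```python
import cmath

def fft(a, omega):
    """Evaluate the polynomial with coefficients a at the powers of omega."""
    n = len(a)
    if n == 1:
        return list(a)
    y_even = fft(a[0::2], omega * omega)
    y_odd = fft(a[1::2], omega * omega)
    y = [0] * n
    w = 1
    for i in range(n // 2):
        y[i] = y_even[i] + w * y_odd[i]
        y[i + n // 2] = y_even[i] - w * y_odd[i]
        w *= omega
    return y

def multiply_polynomials(a, b):
    """Coefficients of f x g via evaluation, pointwise product, interpolation."""
    result_len = len(a) + len(b) - 1
    n = 1
    while n < result_len:                  # enough points for all 2n-1 coefficients
        n *= 2
    omega = cmath.exp(2j * cmath.pi / n)
    ys = fft(list(a) + [0] * (n - len(a)), omega)    # y_i = f(x_i)
    zs = fft(list(b) + [0] * (n - len(b)), omega)    # z_i = g(x_i)
    products = [y * z for y, z in zip(ys, zs)]       # [f x g](x_i)
    cs = fft(products, 1 / omega)                    # inverse: omega^-1, then 1/n
    return [c.real / n for c in cs[:result_len]]

# (1 + 2x + 3x^2)(4 + 5x)
print(multiply_polynomials([1, 2, 3], [4, 5]))
```

With floating-point complex numbers the coefficients come back with small round-off error, so integer results would normally be rounded afterwards.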

Multiplying Big Integers
X = 11… (N bits), Y = 10…, X×Y = 10…
The high school algorithm takes O(N²) bit operations. Can we do it faster? I hope so. See Recursion for one way to do it faster. This is another.

Break X into blocks of m = O(log N) bits each, padding the top block with leading zeros: $X = a_{n-1} \dots a_6\,a_5\,a_4\,a_3\,a_2\,a_1\,a_0$. Each block has value $a_i = O(N)$.

View the blocks as the coefficients of a polynomial:
$f(x) = a_{n-1}x^{n-1} + \dots + a_5x^5 + a_4x^4 + a_3x^3 + a_2x^2 + a_1x + a_0$
$g(x) = b_{n-1}x^{n-1} + \dots + b_5x^5 + b_4x^4 + b_3x^3 + b_2x^2 + b_1x + b_0$
Note $X = f(2^m)$. Same for $Y = g(2^m)$.
Multiply $g \times f$ using the FFT in time O(n log n).
Note $X \times Y = [g \times f](2^m)$.
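The block-splitting and recombination steps can be sketched with exact integer arithmetic. All function names here are hypothetical, the block size m = 8 is an arbitrary choice for illustration, and the coefficient product uses the direct O(n²) convolution as a stand-in for the FFT step so that the result stays exact.

```python
def to_blocks(X, m):
    """Split X into m-bit blocks a0, a1, ... so that X = f(2^m)."""
    mask = (1 << m) - 1
    blocks = []
    while X:
        blocks.append(X & mask)
        X >>= m
    return blocks or [0]

def multiply_coefficients(a, b):
    """Direct convolution; the FFT would replace this step."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

def from_blocks(c, m):
    """Evaluate the polynomial with coefficients c at x = 2^m."""
    return sum(ck << (m * k) for k, ck in enumerate(c))

def multiply(X, Y, m=8):
    a = to_blocks(X, m)
    b = to_blocks(Y, m)
    return from_blocks(multiply_coefficients(a, b), m)

X = 123456789123456789
Y = 987654321987654321
print(multiply(X, Y) == X * Y)
```

The product coefficients can exceed m bits; evaluating at 2^m with shifts and additions takes care of the carries.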

The End

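Roots of Unity
16th roots of unity. These could be in Z mod 17 or complex numbers.
[Figure: the 16th roots of unity on the unit circle, with $\omega_{16}^0 = \omega_{16}^{16} = 1$, $\omega_{16}^8 = -1$, $(\omega^{n/4})^2 = \omega^{n/2} = -1$ so $\omega^{n/4} = i$, $(\omega^{3n/4})^2 = \omega^{n/2} = -1$ so $\omega^{3n/4} = -i$, and $(\omega^{n/2})^2 = 1$.]
$re^{\theta i} = r\cos\theta + ir\sin\theta$
$re^{\theta i} \times se^{\alpha i} = (rs)e^{(\theta+\alpha)i}$

Goal: prove $f(\theta) = g(\theta)$, where $f(\theta) = re^{\theta i}$ and $g(\theta) = r\cos\theta + ir\sin\theta$.
$f(0) = re^{0i} = r$ and $g(0) = r\cos 0 + ir\sin 0 = r$, so $f(0) = g(0)$.
$f'(\theta) = ire^{\theta i}$ and $g'(\theta) = -r\sin\theta + ir\cos\theta$, so $f'(0) = ire^{0i} = ir$ and $g'(0) = -r\sin 0 + ir\cos 0 = ir$, giving $f'(0) = g'(0)$.
$f''(\theta) = -re^{\theta i} = -f(\theta)$ and $g''(\theta) = -r\cos\theta - ir\sin\theta = -g(\theta)$.

Proof by induction (over the reals) that $f(\theta) = g(\theta)$:
For this $\theta$, assume $f(\theta) = g(\theta)$ and $f'(\theta) = g'(\theta)$.
For the next $\theta+\varepsilon$: because the values and slopes agree at $\theta$, $f(\theta+\varepsilon) = g(\theta+\varepsilon)$.
Because $f''(\theta) = -f(\theta) = -g(\theta) = g''(\theta)$, the slopes change by the same amount, so for the next $\theta+\varepsilon$, $f'(\theta+\varepsilon) = g'(\theta+\varepsilon)$.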

Sin & Cos basis
The real DFT calculates the frequency coefficients of real samples, $x[n] \in \mathbb{R}$, with two summations:
$$ReX[k] = \sum_{n=0}^{N-1} x[n]\cos(2\pi kn/N), \qquad ImX[k] = -\sum_{n=0}^{N-1} x[n]\sin(2\pi kn/N), \qquad 0 \le k \le N/2$$
But if you need an algorithm that runs faster (but takes longer to write), then you'll need to use the FFT. It starts by modifying the DFT to accept complex-valued time signals and to return N values of k rather than N/2 (with $N = 2^r$).
Using complex and polar numbers as a shorthand, $r e^{i\theta} = r\cos\theta + i\,r\sin\theta$, changes the two summations into one more compact summation, which at first glance appears "complex" and intimidating:
$$X_k = ReX[k] + i\,ImX[k] = \sum_{n=0}^{N-1} x_n e^{-i2\pi kn/N} = \sum_{n=0}^{N-1} x_n \omega^{kn}, \qquad \omega = e^{-i2\pi/N}$$
The hint to understanding this equation is that $e^{-i2\pi kn/N}$ has a magnitude of 1, so it really only represents a phase (shift) angle of some multiple, kn, of 360º/N.
By replacing $e^{-i2\pi/N}$ by $\omega$ we can temporarily forget that we are dealing with complex numbers. Now if we expand the series, we get something that looks like N equations of polynomials in $\omega$. When we write this information as a matrix, we get the Vandermonde matrix which Jeff has shown us previously.

Sin & Cos basis
1. Convert your N real sampled values to complex numbers by adding 0i to them: $x_n = x_n + 0i$, $0 \le n \le N-1$.
2. Feed this as the input to the FFT.
3. Remove the FFT output's redundant information (i.e. all frequencies above N/2).
Because of its speed, there is still an advantage in using the complex FFT to calculate the DFT for real inputs. The FFT just gives additional redundant information that can be discarded.
[Figure from Kester: ReX has even symmetry about N/2 (fs/2) and ImX has odd symmetry about N/2 (fs/2); the upper half, from N/2 to N-1, holds the "negative" frequencies.]
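The redundancy in step 3 can be seen numerically: for real input, the output at N-k is the complex conjugate of the output at k. This sketch uses a direct O(N²) DFT with the convention ω = e^(-i2π/N) from the text; the function name and the sample values are made up for illustration.

```python
import cmath

def dft(x):
    """Direct DFT, X_k = sum_n x_n e^(-i 2 pi k n / N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

samples = [3.0, 2.0, -1.0, 0.5, 4.0, -2.0, 1.0, 0.0]    # N = 8 real values
X = dft([complex(v, 0.0) for v in samples])              # step 1: add 0i

# Step 3: frequencies above N/2 mirror the ones below, so keep k = 0..N/2.
kept = X[:len(samples) // 2 + 1]
mirror_error = max(abs(X[len(samples) - k] - X[k].conjugate())
                   for k in range(1, len(samples)))
print(len(kept), mirror_error)
```

The real parts of mirrored entries are equal (even symmetry) and the imaginary parts are negated (odd symmetry), which is what the conjugate relation expresses.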

FFT Butterfly

The Ugly Math for the FFT
(0)0 (0)1 (0)2 (0)3 … (0)n-1 (1)0 (1)1 (1)2 (1)3 … (1)n-1 (2)0 (2)1 (2)2 (2)3 … (2)n-1 (3)0 (3)1 (3)2 (3)3 … (3)n-1 (n-1)0 (n-1)1 (n-1)2 (n-1)3 …(n-1)n-1 ... = x0 x1 x2 x3 … xN-1 X0 X1 X2 X3 XN-1 Behold the Vandermonde matrix! matrix and epic struggle between good and evil characters from Edmonds However, we have still not gained anything. But that’s O(N2) !! 113

The Ugly Math for the FFT
0 n-1 2n-2 3n-3 … (n-1)(n-1) 0 0 0 0 … 0 0 1 2 3 … n-1 0 2 4 6 … 2n-2 0 3 6 9 … 3n-3 ... = x0 x1 x2 x3 … xN-1 X0 X1 X2 X3 XN-1 But if I multiply the exponents ... However, we have still not gained anything. But that’s still O(N2) !! 114

Just watch! For example, if N=8 and I use the N roots of unity ...
0 0 0 0 1 2 3 4 5 6 7 02 8 10 12 14 9 15 18 21 0 4 16 20 24 28 05 25 30 35 36 42 0 7 49 Just watch! For example, if N=8 and I use the N roots of unity ... But now by taking advantages of the properties of the complex roots of unity, ω^p we can start to make some progress. p+4 = -p p = p mod 8 6 5 7 4 0 0 = 1 4 = -1 1 3 = 8 = e -i2π/8 2 115

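$$\begin{bmatrix} X_0 \\ X_1 \\ X_2 \\ X_3 \\ X_4 \\ X_5 \\ X_6 \\ X_7 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & \omega & \omega^2 & \omega^3 & -1 & -\omega & -\omega^2 & -\omega^3 \\ 1 & \omega^2 & -1 & -\omega^2 & 1 & \omega^2 & -1 & -\omega^2 \\ 1 & \omega^3 & -\omega^2 & \omega & -1 & -\omega^3 & \omega^2 & -\omega \\ 1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\ 1 & -\omega & \omega^2 & -\omega^3 & -1 & \omega & -\omega^2 & \omega^3 \\ 1 & -\omega^2 & -1 & \omega^2 & 1 & -\omega^2 & -1 & \omega^2 \\ 1 & -\omega^3 & -\omega^2 & -\omega & -1 & \omega^3 & \omega^2 & \omega \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix}$$
Now the 2nd half of each row either equals the 1st half or its negative.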
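The reduction of every power ω^p to plus or minus one of ω^0, ω^1, ω^2, ω^3 for N = 8 can be sketched as follows. The helper name reduce_power is an assumption made for this example; it simply applies ω^p = ω^(p mod 8) and ω^(p+4) = -ω^p.

```python
import cmath
import math

N = 8
w = cmath.exp(-2j * math.pi / N)   # w = e^{-i 2 pi / 8}

def reduce_power(p, n=N):
    """Return (sign, q) with w**p == sign * w**q and 0 <= q < n/2."""
    q = p % n                      # w**p == w**(p mod n)
    if q >= n // 2:                # w**(q + n/2) == -w**q
        return -1, q - n // 2
    return 1, q

# Check the reduction on every exponent k*n of the 8 x 8 matrix.
ok = all(
    abs(w ** (k * n) - reduce_power(k * n)[0] * w ** reduce_power(k * n)[1]) < 1e-9
    for k in range(N) for n in range(N)
)
print(ok, reduce_power(21))
```

For instance, the entry in row 3, column 7 has exponent 21, which reduces to -ω^1.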

[Same matrix equation, with the x0 and x4 columns highlighted.]
x0 and x4 have identical coefficients (ignoring sign), as do:
x1 and x5
x2 and x6
x3 and x7
(Note to self: it is easier to duplicate a slide than to coordinate appear and disappear animations.)


Oh my! Half the columns are gone. What's next?
These observations lead to rewriting the matrix as equations in terms of x0 ± x4, x2 ± x6, x1 ± x5, x3 ± x7:
$$\begin{aligned} X_0 &= (x_0 + x_4) + (x_2 + x_6) + (x_1 + x_5) + (x_3 + x_7) \\ X_1 &= (x_0 - x_4) + \omega^2 (x_2 - x_6) + \omega (x_1 - x_5) + \omega^3 (x_3 - x_7) \\ X_2 &= (x_0 + x_4) - (x_2 + x_6) + \omega^2 (x_1 + x_5) - \omega^2 (x_3 + x_7) \\ X_3 &= (x_0 - x_4) - \omega^2 (x_2 - x_6) + \omega^3 (x_1 - x_5) + \omega (x_3 - x_7) \\ X_4 &= (x_0 + x_4) + (x_2 + x_6) - (x_1 + x_5) - (x_3 + x_7) \\ X_5 &= (x_0 - x_4) + \omega^2 (x_2 - x_6) - \omega (x_1 - x_5) - \omega^3 (x_3 - x_7) \\ X_6 &= (x_0 + x_4) - (x_2 + x_6) - \omega^2 (x_1 + x_5) + \omega^2 (x_3 + x_7) \\ X_7 &= (x_0 - x_4) - \omega^2 (x_2 - x_6) - \omega^3 (x_1 - x_5) - \omega (x_3 - x_7) \end{aligned}$$
End result: by introducing an addition and a subtraction, we have eliminated half of the more expensive multiplications. We could continue to reduce the number of columns (i.e. multiplications) by combining additions and subtractions into larger ones, but from this point on there is an easier way to get to our goal.
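As a sanity check on the eight equations in terms of x0 ± x4, x2 ± x6, x1 ± x5 and x3 ± x7, the sketch below evaluates them for N = 8 with ω = e^{-i2π/8} and compares the result with a direct DFT. The function names are made up for the example.

```python
import cmath
import math

def dft8_by_sums_and_differences(x):
    """Evaluate X0..X7 from the pairwise sums and differences of the inputs."""
    w = cmath.exp(-2j * math.pi / 8)
    w2, w3 = w * w, w * w * w
    s04, d04 = x[0] + x[4], x[0] - x[4]
    s26, d26 = x[2] + x[6], x[2] - x[6]
    s15, d15 = x[1] + x[5], x[1] - x[5]
    s37, d37 = x[3] + x[7], x[3] - x[7]
    return [
        s04 + s26 + s15 + s37,                 # X0
        d04 + w2 * d26 + w * d15 + w3 * d37,   # X1
        s04 - s26 + w2 * s15 - w2 * s37,       # X2
        d04 - w2 * d26 + w3 * d15 + w * d37,   # X3
        s04 + s26 - s15 - s37,                 # X4
        d04 + w2 * d26 - w * d15 - w3 * d37,   # X5
        s04 - s26 - w2 * s15 + w2 * s37,       # X6
        d04 - w2 * d26 - w3 * d15 - w * d37,   # X7
    ]

def dft_direct(x):
    """Direct O(N^2) DFT for comparison."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

x = [1, 2, 3, 4, 5, 6, 7, 8]
agree = all(abs(a - b) < 1e-9
            for a, b in zip(dft8_by_sums_and_differences(x), dft_direct(x)))
print(agree)
```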

(x0 + x4) + (x2 + x6) + (x1 + x5) + (x3 + x7) = X0
Think signal flow and construct the equations using the butterfly operator.
[Butterfly diagram from Smith: inputs x0 and x4; the lower input is multiplied by ω^p; a summing node produces x0 + ω^p x4 and a differencing node produces x0 - ω^p x4.]
Example: for x0 ± x4 we would set ω^p to 1.
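A butterfly is small enough to write as a single function. This is a sketch only; the name butterfly and its argument order are assumptions for illustration.

```python
def butterfly(a, b, w):
    """Return (a + w*b, a - w*b): one multiplication, one add, one subtract."""
    t = w * b
    return a + t, a - t

# With w = 1 the butterfly gives the plain sum and difference x0 + x4, x0 - x4.
x0, x4 = 3, 5
print(butterfly(x0, x4, 1))
```

The product w*b is computed once and reused for both outputs, which is where the saving in multiplications comes from.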

Note - the butterfly has a shorthand notation of:
[Butterfly diagram from Smith, shorthand form: two crossing lines, with the weight ω^p on the lower input and -1 on the branch into the lower output; inputs x0 and x4, outputs x0 + ω^p x4 and x0 - ω^p x4.]

"Using Bit Reverse Order and a tree of butterflies, my Decimation in Time Algorithm can solve this in O(N log N)."
"Damn you and your re"cursed" friends! If you're lying, I'll claim your soul!"
"No friends this time. They'd just be overhead invading my stack space."
Decimation in Time & Bit Reverse Order (rearranging the order of the N samples)
Decimation in time is what we will use to arrange the input to our butterflies, based on what we have seen in the previous equations. Notice that the x0 ± x4, x2 ± x6, x1 ± x5, x3 ± x7 columns of the equations correspond to the order of this last row: x0, x4, x2, x6, x1, x5, x3, x7.
Decimation in time is the process of rearranging a set into two smaller sets, where every 2nd item goes into the second smaller set. The process is then repeated on each of the smaller sets as needed.
In a multistage process, one can do all this in one step by noticing that the final index of an element corresponds to reversing the bits of its original index (hence the name Bit Reverse Order).
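The claim that repeated decimation and bit reversal give the same order can be illustrated for N = 8. The two helpers below, decimate and bit_reverse, are hypothetical names for this sketch.

```python
def decimate(items):
    """Repeatedly split into even-position and odd-position halves."""
    if len(items) <= 1:
        return list(items)
    return decimate(items[0::2]) + decimate(items[1::2])

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the index i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

by_decimation = decimate(list(range(8)))
by_bit_reversal = [bit_reverse(i, 3) for i in range(8)]
print(by_decimation, by_bit_reversal)
```

For example, index 1 is 001 in binary, which reversed is 100 = 4, so x4 lands in position 1.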

FFT Block Diagram
[Block diagram from Kester. Inputs, in bit-reversed order: x0, x4, x2, x6, x1, x5, x3, x7. STAGE 1: four 2-Point Butterflies. STAGE 2: two blocks of 2 combined 2-Point Butterflies. STAGE 3: one block of 4 combined 2-Point Butterflies. Outputs: X0, X1, X2, X3, X4, X5, X6, X7.]
So we start with a block diagram resembling a binary tree structure, with our inputs fed in Bit Reverse Order. Notice that at each stage boundary the same number of lines (8 in this example) cross from one stage into the next.

Stepping Through the FFT
[Signal flow diagram from Kester. Inputs x0, x4, x2, x6, x1, x5, x3, x7; outputs X0 to X7. Stage 1 uses the weight ω_8^0; stage 2 uses ω_8^0 and ω_8^2; stage 3 uses ω_8^0, ω_8^1, ω_8^2 and ω_8^3. The lower branch of every butterfly carries a factor of -1.]
Next we install the butterflies per the previous equations, which results in this messy diagram, which I hope to unravel for you.
To see the rhyme and reason of the multiplication weights of the second stage, remember that ω_8^0 = 1 and that ω_8^2 is equivalent to ω_4^1.
(Note to self: a future improvement would be to trace out the signal flow for a few output points and show that it corresponds to the equations developed earlier.)

So as we walk through the first stage, we see that it requires four butterflies (N/2). Also note that this first stage took in 8 inputs and produced 8 outputs, which will now serve as the input to the next stage. Since there is no further use for the original inputs, their memory space can be used to store the output results (aka in-place processing). This in-place processing allows the FFT to be implemented on many embedded systems processors that have access to only a small amount of memory.

Similarly, as we work through the second stage, we can see that despite its confusing appearance it also consists of only 4 simple butterflies (N/2).

And the same holds true for the final stage. It also consists of only 4 simple butterflies (N/2).

So the end result for the combined log N stages, in terms of butterfly operations, is N/2 + N/2 + N/2, i.e. (N/2) log N ⇒ O(N log N).

(Note to self for future: for best contrast, select light coloured lines and saturated colours for backgrounds.)
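The butterfly count can be written down directly: log2 N stages with N/2 butterflies each. A tiny sketch (the function name is an assumption) makes the gap to the N^2 multiplications of the matrix form concrete.

```python
def butterfly_count(n):
    """Number of butterflies in an FFT on n points, n a power of 2."""
    if n <= 0 or n & (n - 1):
        raise ValueError("n must be a power of 2")
    stages = n.bit_length() - 1      # log2(n)
    return stages * (n // 2)         # N/2 butterflies per stage

for n in (8, 1024):
    print(n, butterfly_count(n), n * n)
```

For N = 8 this gives 3 stages of 4 butterflies, 12 in total, against 64 multiplications for the full matrix product.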

Algorithm FFT (ReX, ImX)
Input: ReX[ ], ImX[ ] = real, imaginary parts of the time samples
Output: ReX[ ], ImX[ ] = cosine, sine coefficients of frequency domain

N = SizeOf(ReX)
PutInBitReverseOrder(ReX, ImX)            % time domain decomposition
% frequency domain synthesis (done in place)
for k = 1 to log2 N                       % Loop for each stage
    Wre = 1; Wim = 0; θ = 2π/2^k          % Initialize stage constants
    for j = 1 to 2^(k-1)                  % Loop for each sub DFT
        for i = j-1 to N-1 step 2^k       % Loop for each butterfly
            ip = i + 2^(k-1)
            tmpRe = ReX[ip]·Wre - ImX[ip]·Wim
            tmpIm = ReX[ip]·Wim + ImX[ip]·Wre
            ReX[ip] = ReX[i] - tmpRe
            ImX[ip] = ImX[i] - tmpIm
            ReX[i] = ReX[i] + tmpRe
            ImX[i] = ImX[i] + tmpIm
        next i
        tmpRe = Wre                       % Rotate W by the angle -θ
        Wre = tmpRe·cos(θ) + Wim·sin(θ)
        Wim = -tmpRe·sin(θ) + Wim·cos(θ)
    next j
next k
return (ReX, ImX)                         % ReX[ ], ImX[ ] return freq coeffs 0 to N-1

(Smith's code and diagram were used to generate this algorithm and diagram.)
So despite the complexity of the signal flow diagram, we see that we can perform the FFT using an iterative approach, and the FFT boils down to being just 3 nested loops and a little overhead.
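The three nested loops can be transcribed into Python roughly as follows. This is a sketch under a few assumptions: 0-based indexing, N a power of 2, each stage k working on sub-DFTs of size 2^k with 2^(k-1) distinct twiddle factors, and a bit-reversal permutation done with a simple helper. The names fft_in_place and bit_reverse are made up for the example.

```python
import cmath
import math

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def fft_in_place(re, im):
    """Iterative decimation-in-time FFT on separate real/imaginary lists."""
    n = len(re)
    if n == 0 or n & (n - 1) or len(im) != n:
        raise ValueError("need two lists of equal power-of-2 length")
    stages = n.bit_length() - 1
    for i in range(n):                       # put input in bit reverse order
        j = bit_reverse(i, stages)
        if i < j:
            re[i], re[j] = re[j], re[i]
            im[i], im[j] = im[j], im[i]
    for k in range(1, stages + 1):           # loop for each stage
        span, half = 1 << k, 1 << (k - 1)
        cos_t, sin_t = math.cos(2 * math.pi / span), math.sin(2 * math.pi / span)
        w_re, w_im = 1.0, 0.0
        for j in range(half):                # loop for each twiddle factor
            for i in range(j, n, span):      # loop for each butterfly
                ip = i + half
                t_re = re[ip] * w_re - im[ip] * w_im
                t_im = re[ip] * w_im + im[ip] * w_re
                re[ip], im[ip] = re[i] - t_re, im[i] - t_im
                re[i], im[i] = re[i] + t_re, im[i] + t_im
            # rotate W by -theta: W <- W * e^{-i theta}
            w_re, w_im = w_re * cos_t + w_im * sin_t, -w_re * sin_t + w_im * cos_t
    return re, im

x = [1.0, 2.0, 3.0, 4.0, 0.0, -1.0, -2.0, -3.0]
re, im = fft_in_place(list(x), [0.0] * len(x))
direct = [sum(x[n] * cmath.exp(-2j * math.pi * k * n / 8) for n in range(8))
          for k in range(8)]
matches = all(abs(complex(re[k], im[k]) - direct[k]) < 1e-9 for k in range(8))
print(matches)
```

Updating the twiddle factor by repeated rotation mirrors the pseudocode; for large N, recomputing it from cos and sin each time would accumulate less rounding error.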

Multiplying Big Integers
Grade School Revisited: How To Multiply Two Numbers

X = 0011 … (N bits)
Break X into blocks of m = O(log N) bits.

X = 0000 … a_{n-1} … a_6 a_5 a_4 a_3 a_2 a_1 a_0 (n = 2^r blocks of m bits, each held in O(log p) bits)
Break into m = O(log N) bit blocks.
Pad with zeros to 2N bits to hold the product.
Use n blocks, where n is a power of 2, i.e. n = 2^r.
Let p be a prime such that log p ≥ block size = m and p-1 is divisible by n, so Z mod p has n nth roots of unity.
View each block as a finite field element in Z mod p (no actual work).

View the blocks as coefficients of a polynomial:
$$f(x) = a_{n-1}x^{n-1} + \dots + a_5x^5 + a_4x^4 + a_3x^3 + a_2x^2 + a_1x + a_0$$
$$g(x) = b_{n-1}x^{n-1} + \dots + b_5x^5 + b_4x^4 + b_3x^3 + b_2x^2 + b_1x + b_0$$
Note X = f(2^m). Same for Y = g(2^m).
Multiply g×f using the FFT in time O(n log n).
Note X×Y = [g×f](2^m).
Evaluate [g×f](2^m) in O(n) operations, but each op could be on O(n) bit numbers, for a total of O(n^2) time.

Evaluate [g×f](2^m) in time O(n).
$$[g \times f](x) = c_{n-1}x^{n-1} + \dots + c_5x^5 + c_4x^4 + c_3x^3 + c_2x^2 + c_1x + c_0$$
Some texts say the c_i can just be shifted and joined. Problem: the field elements may be too big. Each c_i takes O(log p) bits, which is wider than a block of m bits, so neighbouring c_i overlap.

Instead, shift each c_i by i·m bits and add.
[Figure: the c_i, each O(log p) bits wide, written at offsets that are multiples of m and summed to give X×Y.]
Adding n numbers each n bits long takes O(n^2), but here the numbers are sparse.
At each point, at most two numbers overlap ⇒ the carry is at most one ⇒ O(N) bit operations.
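The whole pipeline (split into blocks, transform over Z mod p, multiply pointwise, transform back, shift each c_i by i·m and add) can be sketched as below. Several choices are assumptions made for the example rather than taken from the slides: the prime P = 998244353 with primitive root 3 (P - 1 = 119·2^23, so nth roots of unity exist for n up to 2^23), the block size m = 8, a recursive transform instead of the butterfly network, and a check that P exceeds every product coefficient so that the c_i are recovered exactly. Python's big integers perform the final carries.

```python
P = 998244353   # assumed prime: P - 1 = 119 * 2**23
G = 3           # assumed primitive root mod P

def ntt(values, root):
    """Recursive FFT over Z mod P; root must be a primitive len(values)-th root of unity."""
    n = len(values)
    if n == 1:
        return list(values)
    root_sq = root * root % P
    even = ntt(values[0::2], root_sq)
    odd = ntt(values[1::2], root_sq)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % P
        out[k] = (even[k] + t) % P             # butterfly: sum
        out[k + n // 2] = (even[k] - t) % P    # butterfly: difference
        w = w * root % P
    return out

def to_blocks(value, m):
    """Split a non-negative integer into m-bit blocks, least significant first."""
    mask = (1 << m) - 1
    blocks = []
    while value:
        blocks.append(value & mask)
        value >>= m
    return blocks

def multiply(x, y, m=8):
    a, b = to_blocks(x, m), to_blocks(y, m)
    n = 1
    while n < len(a) + len(b):                 # pad to a power of 2 that holds the product
        n *= 2
    if n << (2 * m) >= P:
        raise ValueError("inputs too large for this prime and block size")
    a += [0] * (n - len(a))
    b += [0] * (n - len(b))
    root = pow(G, (P - 1) // n, P)             # primitive n-th root of unity mod P
    fa, fb = ntt(a, root), ntt(b, root)        # evaluate f and g at the roots
    fc = [u * v % P for u, v in zip(fa, fb)]   # pointwise product
    inv_n = pow(n, P - 2, P)
    c = [v * inv_n % P for v in ntt(fc, pow(root, P - 2, P))]   # interpolate
    return sum(ci << (i * m) for i, ci in enumerate(c))         # evaluate at 2**m

print(multiply(123456789, 987654321) == 123456789 * 987654321)
```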

X = 11… (N bits)
Y = 10…
X×Y = 10…
Suppose N is really really big. How many bit operations are needed?
FFT time: O(N logN)
Time stated in text: O(N logN loglogN)
Time as far as I can see: O(N logN loglogN logloglogN loglogloglogN …)

Multiplying X:
Input size = N bits
Field element size = N' = log(N) bits
# a_i = n = N/N'
# of field ops = O(n log n)
Time for × field op = ?

Multiplying two field elements X' is the same problem again, one level down:
Input size = N' bits
Field element size = N'' = log(N') bits
# a_i = n' = N'/N''
# of field ops = O(n' log n')
Time for × field op = ?
And so on …
Total time for one × field op: O(N' logN' loglogN' logloglogN' …)

Putting the levels together, with N' = log(N) and n = N/N':
Time for × field op = O(N' logN' loglogN' logloglogN' …)
Total time
= O(n log n) × O(N' logN' loglogN' logloglogN' …)
= O(N/N' log(N/N')) × O(N' loglogN logloglogN loglogloglogN …)
= O(N logN loglogN logloglogN loglogloglogN …)
