Fourier Transformations

Presentation transcript:

Fourier Transformations (Grad Algorithms). Jeff Edmonds, York University, COSC 6111. Outline: Fourier Transformation (sine); Fourier Transformation (JPEG); Change from Time to Polynomial Basis; Evaluating & Interpolating; FFT in n log n Time; Roots of Unity; Same FFT Code & Butterfly; Inverse FFT; Sin and Cos Basis; FFT Butterfly; Polynomial Multiplication; Integer Multiplication.

Sum of sine waves gives a square wave.

Changing Basis. Change of Basis: T([a1,a2,…,ad]) = [A1,A2,…,Ad] changes the basis used to describe an object. The standard basis of a vector space is a tuple <w1,w2,…,wd> of basis objects that is linearly independent and spans the space uniquely: ∀v ∃[a1,a2,…,ad], v = a1w1 + a2w2 + … + adwd. The new basis of a vector space is a tuple <W1,W2,…,Wd> of basis objects that is likewise linearly independent and spans the space uniquely: ∀v ∃[A1,A2,…,Ad], v = A1W1 + A2W2 + … + AdWd. Use small letters aj for the coefficients in the standard basis and capital letters Ak for the coefficients in the new basis.

Changing Basis (example). Change of Basis: T([a1,a2,…,ad]) = [A1,A2,…,Ad] changes the basis used to describe an object, and T-1([A1,A2,…,Ad]) = [a1,a2,…,ad] changes it back. With standard basis [w1,w2] and new basis [W1,W2], the slide's vector v has [a1,a2] = [3,2] and [A1,A2] = [11/5,32/5]. Which matrix performs T-1?

Feed in the new basis vectors one at a time. For v = W1: [A1,A2] = [1,0] and [a1,a2] = [W1[1],W1[2]] = [4/5,-3/5], so the first column of the matrix is W1. For v = W2: [A1,A2] = [0,1] and [a1,a2] = [W2[1],W2[2]] = [3/5,4/5], so the second column is W2. Hence

$\begin{bmatrix} W_1[1] & W_2[1] \\ W_1[2] & W_2[2] \end{bmatrix} \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}$

Changing Basis (matrix form). The matrix whose columns are the new basis vectors maps the new coefficients to the standard ones, and its inverse maps back:

$\begin{bmatrix} W_1[1] & W_2[1] \\ W_1[2] & W_2[2] \end{bmatrix} \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}, \qquad \begin{bmatrix} W_1[1] & W_2[1] \\ W_1[2] & W_2[2] \end{bmatrix}^{-1} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix}$

If the new basis vectors are orthogonal and of uniform length, then $|W_1|^2 = n$ gives $W_1 \cdot W_1 = \sum_j W_1[j] W_1[j] = n$, and $W_1 \perp W_2$ gives $W_1 \cdot W_2 = \sum_j W_1[j] W_2[j] = 0$. Therefore

$\begin{bmatrix} W_1[1] & W_1[2] \\ W_2[1] & W_2[2] \end{bmatrix} \begin{bmatrix} W_1[1] & W_2[1] \\ W_1[2] & W_2[2] \end{bmatrix} = \begin{bmatrix} n & 0 \\ 0 & n \end{bmatrix}$

so the inverse is the transpose scaled by 1/n:

$\frac{1}{n} \begin{bmatrix} W_1[1] & W_1[2] \\ W_2[1] & W_2[2] \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix}$

Changing Basis, viewed a different way. Each new coefficient is a dot product of v with a new basis vector. For a unit-length W1, $A_1 = |v| \cos\theta = v \cdot W_1 = \sum_j a_j W_1[j]$, where θ is the angle between v and W1, because $\cos\theta = \frac{v \cdot W_1}{|v|\,|W_1|}$. This is the correlation between v and W1.
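The orthogonal-basis shortcut can be checked numerically. The sketch below is only an illustration: the basis W1 = [1,1], W2 = [1,-1] (squared length n = 2) and the helper names are assumptions, not taken from the slides.

```python
def to_new_basis(basis, v):
    """Coefficients of v in an orthogonal basis whose vectors all have squared length n.

    Each coefficient is the correlation (dot product) with a basis vector, divided by n.
    """
    n = sum(x * x for x in basis[0])
    return [sum(vj * wj for vj, wj in zip(v, w)) / n for w in basis]


def to_standard_basis(basis, coeffs):
    """Rebuild v = A1*W1 + A2*W2 + ... from its new-basis coefficients."""
    d = len(basis[0])
    return [sum(A * w[j] for A, w in zip(coeffs, basis)) for j in range(d)]


# Hypothetical orthogonal basis with |W1|^2 = |W2|^2 = 2.
W = [[1.0, 1.0], [1.0, -1.0]]
v = [3.0, 2.0]
A = to_new_basis(W, v)
back = to_standard_basis(W, A)
```

No matrix inversion is needed: the transpose and a division by n do all the work.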

Fourier transformations are a change of basis from the time basis to the sine/cosine basis (as in JPG) or to the polynomial basis. Applications: signal processing; compressing data (e.g. images with .jpg); multiplying integers in n log n loglog n time; …. Purposes: some operations on the data are cheaper in the new format; some concepts are easier to read from the data in the new format; some of the bits of the data in the new format are less significant and hence can be dropped. Amazingly, once you include complex numbers, the FFT code for sines/cosines and for polynomials is the SAME. Reference: Steven W. Smith, Ph.D., The Scientist and Engineer's Guide to Digital Signal Processing, http://www.dspguide.com/ch8.htm

Fourier Transformation: Sine & Cosine Basis. Given a continuous periodic function y(t) of time t, find the contribution of each frequency. Swings, capacitors, and inductors all resonate at a given frequency, which is how a circuit picks out the contribution of a given frequency. If the dominant musical note has frequency ω = 2π/T, then all the other basis functions are its harmonic frequencies: ω, 2ω, 3ω, 4ω, 5ω, 6ω, …, which correspond to notes on the piano (C, G, E).

Fourier Transformation: Sine & Cosine Basis. y(x) = x. Surely this can't be expressed as a sum of sines and cosines? Yet y(x) ≈ 2 sin(x) - sin(2x) + 2/3 sin(3x).

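Fourier Transformation: Sine & Cosine Basis. y(x) = x². On -π ≤ x ≤ π, y(x) ≈ π²/3 - 4 cos(x) + cos(2x) - 4/9 cos(3x). Because x² is an even function, its series uses cosines and a constant term rather than sines.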
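The coefficients 2, -1 and 2/3 quoted for y(x) = x can be recovered by correlating y with each sine. This is a rough numerical sketch; the midpoint-rule integration and the function name `sine_coefficient` are choices made here, not part of the slides.

```python
import math


def sine_coefficient(y, k, samples=20000):
    """Approximate (1/pi) * integral over [-pi, pi] of y(x) * sin(k x) dx (midpoint rule)."""
    h = 2 * math.pi / samples
    total = 0.0
    for s in range(samples):
        x = -math.pi + (s + 0.5) * h
        total += y(x) * math.sin(k * x)
    return total * h / math.pi


# Contribution of sin(x), sin(2x), sin(3x) to y(x) = x.
coeffs = [sine_coefficient(lambda x: x, k) for k in (1, 2, 3)]
```

Each value is the amount of one basis function in the signal, which is the correlation idea from the change-of-basis slides.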

Fourier Transformation Time Domain y Frequency Domain Y The value y[j] of the signal at each point in time j. The amount Y[f] of frequency f in the signal for each frequency f.

Fourier Transformation: Change of Basis. Change of Basis: T([a1,a2,…,ad]) = [A1,A2,…,Ad] changes the basis used to describe an object. The time basis of a vector space is a tuple <w1,w2,…,wd> of basis objects that is linearly independent and spans the space uniquely: ∀v ∃[a1,a2,…,ad], v = a1w1 + a2w2 + … + adwd. The Fourier basis of a vector space is a tuple <W1,W2,…,Wd> of basis objects that is likewise linearly independent and spans the space uniquely: ∀v ∃[A1,A2,…,Ad], v = A1W1 + A2W2 + … + AdWd.

Fourier Transformation: Change of Basis. Change of Basis: T(y[0],y[1],…,y[n-1]) = [YRe[0],…,YIm[n/2]] changes the basis used to describe an object.

The time basis [I1,I2,…]: ∀y ∃[y[0],y[1],…,y[n-1]], y = y[0]I1 + y[1]I2 + … + y[n-1]In, where the basis vector Ij[j'] is one at j' = j and zero elsewhere. The coefficient y[j] is the value of the signal at each point in time j. Example: a discrete periodic function with y[0] = 3 and y[1] = 2.

The Fourier basis [c1,s1,…,cn/2,sn/2]: y = YRe[0]∙c1 + YIm[0]∙s1 + … + YRe[n/2]∙cn/2 + YIm[n/2]∙sn/2. The coefficient Y[f] is the amount of frequency f in the signal for each frequency f. Example: YRe[0] = 11/5 and YIm[0] = 32/5.

In matrix form, with the Fourier basis vectors as the columns:

$\begin{bmatrix} s_1[1] & s_2[1] \\ s_1[2] & s_2[2] \end{bmatrix} \begin{bmatrix} Y[1] \\ Y[2] \end{bmatrix} = \begin{bmatrix} y[1] \\ y[2] \end{bmatrix}, \qquad \begin{bmatrix} s_1[1] & s_2[1] \\ s_1[2] & s_2[2] \end{bmatrix}^{-1} \begin{bmatrix} y[1] \\ y[2] \end{bmatrix} = \begin{bmatrix} Y[1] \\ Y[2] \end{bmatrix}$

Fourier Transformation: Orthogonal Basis. Time basis [I1,I2,…]; Fourier basis [c1,s1,…]. Sines and cosines of different frequencies are orthogonal and of (almost) uniform length:

$\begin{bmatrix} s_1[1] & s_1[2] \\ s_2[1] & s_2[2] \end{bmatrix} \begin{bmatrix} s_1[1] & s_2[1] \\ s_1[2] & s_2[2] \end{bmatrix} = \begin{bmatrix} n/2 & 0 \\ 0 & n/2 \end{bmatrix}, \qquad \begin{bmatrix} s_1[1] & s_2[1] \\ s_1[2] & s_2[2] \end{bmatrix}^{-1} = \frac{2}{n} \begin{bmatrix} s_1[1] & s_1[2] \\ s_2[1] & s_2[2] \end{bmatrix}$

So the Fourier coefficients are obtained without inverting a matrix:

$\frac{2}{n} \begin{bmatrix} s_1[1] & s_1[2] \\ s_2[1] & s_2[2] \end{bmatrix} \begin{bmatrix} y[1] \\ y[2] \end{bmatrix} = \begin{bmatrix} Y[1] \\ Y[2] \end{bmatrix}$

Duality of FT: If Y=FT(y), then y=FT(Y).

Fourier Transformation: Duality of FT. Time domain y: cosine wave with f=4; frequency domain Y: delta function (impulse at YRe[4]). Time domain y: delta function (impulse at y[4]); frequency domain Y: cosine wave with f=4. Duality of FT: If Y=FT(y), then y=FT(Y).

Fourier Transformation: Duality of FT. Time domain y: square wave; frequency domain Y: sinc function. Time domain y: sinc function; frequency domain Y: square wave. How do you get these corners? Duality of FT: If Y=FT(y), then y=FT(Y).

Fourier Transformation: Duality of FT. Time domain y: Gaussian; frequency domain Y: Gaussian. Duality of FT: If Y=FT(y), then y=FT(Y).

Fourier Transformation: Continuous Functions.

Fourier Transformation FFT Butterfly Fast Fourier Transform takes O(nlogn) time! (See Recursive Slides) O(log(n)) levels

Fourier Transformation: Radio Signals. Sound signal. Time domain y: how far out the speaker drum is at each point in time. Frequency domain Y: sound is low frequency; high frequencies are filtered out.

Fourier Transformation: Radio Signals. Radio carrier signal. Time domain y: a wave of magnetic field that can travel far. Frequency domain Y: one high-frequency signal.

Fourier Transformation: Radio Signals. Modulation: the transmitted signal is their product, y(i) = y1(i) × y2(i). Frequency domain Y: the carrier signal, the audio signal (shifted), and the audio signal (shifted & flipped).

Fourier Transformation: Linear Filter. This system takes in a signal x[] and outputs a transformed signal y[].

Fourier Transformation: Linear Filter. In order to understand this transformation, we put in a single pulse δ[] and observe the output h[]. This response h[] identifies the system.

Fourier Transformation: Linear Filter. Feed in any signal x[]. The output is the sum of the contributions h[] from each separate pulse. Computationally, figuring out what this electronic system does to a signal of length n with a response of length m takes O(nm) time. How can we do it faster?

Fourier Transformation: Convolution. Time domain y: x[] = input and h[] = impulse response; the output is their convolution, y = x*h. Frequency domain Y: X[] and H[]; the output is their product, Y = X∙H. Multiplication takes O(n) time. Oops: the Fourier Transform takes O(n²) time. But the Fast Fourier Transform takes O(n log n) time!

Fourier Transformation: Convolution. Time domain y: given the impulse response h[] and the input x[], it is not clear from the output x[]*h[] what the system does to the input. Frequency domain Y: multiplying X[] by H[] zeros the low and high frequencies in the input, so the system filters out the low and high frequencies.
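The claim that convolution in the time domain equals pointwise multiplication in the frequency domain can be tested on a small signal. This is a minimal sketch in plain Python, assuming zero-padding to a power of two; the function names are illustrative, not from the slides.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = -1 if invert else 1
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def convolve_direct(x, h):
    """O(nm): add up the response to each separate input pulse."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def convolve_fft(x, h):
    """Transform, multiply pointwise, transform back."""
    size = len(x) + len(h) - 1
    n = 1
    while n < size:
        n *= 2
    X = fft(list(x) + [0] * (n - len(x)))
    H = fft(list(h) + [0] * (n - len(h)))
    Y = [Xk * Hk for Xk, Hk in zip(X, H)]
    y = fft(Y, invert=True)
    return [(v / n).real for v in y[:size]]


x = [1, 2, 3, 4]   # hypothetical input signal
h = [1, -1, 2]     # hypothetical impulse response
direct = convolve_direct(x, h)
fast = convolve_fft(x, h)
```

The two results agree up to floating-point rounding; only the second scales as O(n log n).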

Fourier Transformation: JPG Image Compression. JPEG (image compression) is a two-dimensional Fourier Transform, exactly as done before.

Each 8×8 block of values from the image is encoded separately. It is decomposed as a linear combination of basis functions. Each of the 64 basis functions is a two-dimensional cosine. Each basis function has a coefficient, giving the contribution of this basis function to the image.

The first basis is constant. Its coefficient gives the average value within the block. Because many images have large blocks of the same colour, this one coefficient gives much of the key information!

The second basis "slopes" left to right. Its (positive or negative) coefficient gives whether, left to right, the value tends to increase or decrease. Because many images have a gradual change in colour, this one coefficient gives more key information! There is a similar basis for top to bottom.

The <0,2> basis: its coefficient gives whether the value tends to be smaller in the middle. This helps display the horizontal lines in images.

As seen, the low-frequency components of a signal are more important. Removing 90% of the bits from the high-frequency components might remove only 5% of the encoded information.
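The effect of dropping high-frequency coefficients can be tried on a single 8×8 block. The sketch below assumes the orthonormal DCT-II as the two-dimensional cosine basis and simply zeroes coefficients; real JPEG quantises them instead, and the sample block is made up.

```python
import math

N = 8


def dct_matrix():
    """Orthonormal 8x8 DCT-II matrix: row u is the cosine basis function of frequency u."""
    rows = []
    for u in range(N):
        scale = math.sqrt((1 if u == 0 else 2) / N)
        rows.append([scale * math.cos((2 * x + 1) * u * math.pi / (2 * N)) for x in range(N)])
    return rows


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)] for i in range(N)]


def transpose(A):
    return [list(row) for row in zip(*A)]


C = dct_matrix()


def dct2(block):
    """Block values -> 64 coefficients (change to the cosine basis)."""
    return matmul(matmul(C, block), transpose(C))


def idct2(coeffs):
    """64 coefficients -> block values (the basis is orthonormal, so the inverse is the transpose)."""
    return matmul(matmul(transpose(C), coeffs), C)


def keep_low_frequencies(coeffs, k):
    """Zero every coefficient <u,v> with u + v >= k (a crude stand-in for JPEG quantisation)."""
    return [[coeffs[u][v] if u + v < k else 0.0 for v in range(N)] for u in range(N)]


# Hypothetical block: brightness rises smoothly from left to right.
block = [[100 + 10 * x for x in range(N)] for _ in range(N)]
coeffs = dct2(block)
average = coeffs[0][0] / N                      # first coefficient encodes the block average
approx = idct2(keep_low_frequencies(coeffs, 2))  # keep only 3 of the 64 coefficients
max_error = max(abs(approx[y][x] - block[y][x]) for y in range(N) for x in range(N))
exact = idct2(coeffs)
```

With three coefficients kept, the smooth block is reproduced to within a small fraction of its brightness range.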

Fourier Transformation: Polynomial Basis. Have you seen Taylor expansions of a function? They show that a function f(x) can be expressed by specifying the coefficients of a polynomial: $f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \dots$ E.g. $f(x) = 1/(1-x) = 1 + x + x^2 + x^3 + \dots$

Fourier Transformation: Polynomial Basis. Instead of using sines and cosines as the basis, we will now use polynomials.

Fourier Transformation: Polynomial Basis. Change of Basis: T([y[0],y[1],…,y[n-1]]) = [a0,a1,…,an-1] changes the basis used to describe an object.

The time basis [I0,I1,…]: ∀f ∃[y[0],y[1],…,y[n-1]], f = y0 I0 + y1 I1 + … + yn-1 In-1, where Ij[x] is one at x = xj and zero elsewhere. For a discrete function f(x), y[j] is the value f(xj) of the function at xj (e.g. y[0] = 3, y[1] = 2). These xj are fixed values x0, x1, …, xn-1. For the FFT, we set $x_j = e^{2\pi i j/n}$.

The Fourier (polynomial) basis [1, x, x², x³, …]: ∀f ∃[a0,a1,a2,…,an-1], $f = a_0 + a_1 x + a_2 x^2 + \dots + a_{n-1} x^{n-1}$. The aj are the coefficients of the polynomial.

Fourier Transformation: Evaluating & Interpolating. A Fourier Transform is a change in basis. It changes the representation of a function from the coefficients of the polynomial $f(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_{n-1} x^{n-1}$ to the values $y_i = f(x_i)$ at key values $x_0, x_1, \dots, x_{n-1}$. Going from coefficients to values is evaluating f at these points; going back is interpolation.

Given a set of n points in the plane with distinct x-coordinates, there is exactly one polynomial of degree at most n-1 going through all these points.

In matrix form, evaluation multiplies the coefficient vector by the Vandermonde matrix:

$\begin{bmatrix} x_0^0 & x_0^1 & x_0^2 & \cdots & x_0^{n-1} \\ x_1^0 & x_1^1 & x_1^2 & \cdots & x_1^{n-1} \\ x_2^0 & x_2^1 & x_2^2 & \cdots & x_2^{n-1} \\ \vdots & & & & \vdots \\ x_{n-1}^0 & x_{n-1}^1 & x_{n-1}^2 & \cdots & x_{n-1}^{n-1} \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_{n-1} \end{bmatrix} = \begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_{n-1} \end{bmatrix}$

The Vandermonde matrix is invertible if the xi are distinct. Evaluating this way takes O(n²) time.
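Both directions of this change of basis can be written down without the FFT. The sketch below uses Horner's rule for evaluation and Lagrange interpolation for the inverse; Lagrange is a substitute chosen here for brevity instead of inverting the Vandermonde matrix, and exact fractions are used so the round trip is exact.

```python
from fractions import Fraction


def evaluate(coeffs, xs):
    """Coefficients -> values: one Horner pass per point, O(n^2) in total."""
    values = []
    for x in xs:
        y = Fraction(0)
        for a in reversed(coeffs):
            y = y * x + a
        values.append(y)
    return values


def interpolate(xs, ys):
    """Values -> coefficients by Lagrange interpolation (requires distinct xs)."""
    n = len(xs)
    coeffs = [Fraction(0)] * n
    for i in range(n):
        # Build the numerator of the i-th Lagrange basis polynomial,
        # the product of (x - xs[j]) over all j != i, lowest degree first.
        basis = [Fraction(1)]
        denom = Fraction(1)
        for j in range(n):
            if j == i:
                continue
            basis = [Fraction(0)] + basis          # multiply by x
            for k in range(len(basis) - 1):
                basis[k] -= xs[j] * basis[k + 1]   # subtract xs[j] times the old polynomial
            denom *= xs[i] - xs[j]
        for k in range(n):
            coeffs[k] += ys[i] * basis[k] / denom
    return coeffs


a = [3, 0, 2, 1]        # f(x) = 3 + 2x^2 + x^3 (hypothetical example)
xs = [0, 1, 2, 3]       # n distinct points
ys = evaluate(a, xs)
recovered = interpolate(xs, ys)
```

Both directions cost O(n²) here, which is the baseline the FFT improves on.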

Fast Fourier Transformation: FFT n log n Time. The Fast Fourier Transform (FFT) is a very efficient algorithm for performing a discrete Fourier transform. The FFT principle was first used by Gauss in 18?? (but was not interesting without computers). The FFT algorithm was published by Cooley & Tukey in 1965. In 1969, the 2048-point analysis of a seismic trace took 13 ½ hours. Using the FFT, the same task on the same machine took 2.4 seconds!

Fast Fourier Transformation: FFT n log n Time. Not only do you get faster speed and in-place memory processing but, as a final additional bonus, fewer calculations mean less round-off error. [Figure: error (ppm) versus transform size, 16 to 1024 points, for the DFT and the FFT; diagram from Smith.] Maybe I should take CSE6111 after all!

Fast Fourier Transformation: FFT n log n Time.

N       DFT (N²)       FFT (1.5 N log N)    Times faster
32      1,024          240                  4.3
64      4,096          576                  7.1
128     16,384         1,344                12.2
256     65,536         3,072                21.3
512     262,144        6,912                37.9
1024    1,048,576      15,360               68.2
2048    4,194,304      33,792               124.1
4096    16,777,216     73,728               227.6

Based on experimental times on the most commonly used N values (using a real-valued DFT with sine look-up tables vs a complex FFT), the FFT is 10 to 100 times faster than the DFT, and this performance advantage increases with more input samples. If one considers complex multiplications only (DFT: N²; FFT: (N/2) log N), the FFT is faster still: 227 becomes more than 600. At 150 MHz, a current 32-bit floating-point DSP can do 1024 points in 69 µs (Analog Devices ADSP-TS001 TigerSHARC). The Discrete Fourier Transform is too slow for real time!

Fast Fourier Transformation: FFT n log n Time. Divide & Conquer - Friends - Recursion. Trust your friends to solve any subinstance, as long as it is smaller and is an instance of the same problem. My instance is split into several friends' instances.

Fast Fourier Transformation: FFT n log n Time. My input (start with one x): (a0,a1,a2,…,an-1) & x. My output: $f(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_{n-1} x^{n-1}$. 1st friend's input? feven: (a0,a2,a4,…,an-2) & ? 2nd friend's input? fodd: (a1,a3,a5,…,an-1) & ?

Define $f_{even}(z) = a_0 + a_2 z + a_4 z^2 + a_6 z^3 + \dots + a_{n-2} z^{n/2-1}$ and $f_{odd}(z) = a_1 + a_3 z + a_5 z^2 + \dots + a_{n-1} z^{n/2-1}$. Then

$f(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_{n-1} x^{n-1}$
$= (a_0 + a_2 x^2 + a_4 x^4 + \dots + a_{n-2} x^{n-2}) + (a_1 x + a_3 x^3 + a_5 x^5 + \dots + a_{n-1} x^{n-1})$
$= (a_0 + a_2 x^2 + a_4 x^4 + \dots + a_{n-2} x^{n-2}) + x\,(a_1 + a_3 x^2 + a_5 x^4 + \dots + a_{n-1} x^{n-2})$
$= f_{even}(x^2) + x\, f_{odd}(x^2)$

Fast Fourier Transformation: FFT n log n Time. My input (start with one x): (a0,a1,a2,…,an-1) & x. My output: $f(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_{n-1} x^{n-1}$. 1st friend's input: feven: (a0,a2,a4,…,an-2) & x². 2nd friend's input: fodd: (a1,a3,a5,…,an-1) & x². 1st friend's output: yeven = feven(x²). 2nd friend's output: yodd = fodd(x²). My output: f(x) = yeven + x yodd, because $f(x) = f_{even}(x^2) + x\, f_{odd}(x^2)$. T(n) = 2 T(n/2) + O(1) = O(n). OK, so it takes O(n) time to evaluate.
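The even/odd recursion for a single x can be checked against Horner's rule. This is an illustrative sketch; the function names and the sample polynomial are made up here.

```python
def evaluate_split(coeffs, x):
    """f(x) = f_even(x^2) + x * f_odd(x^2), recursing on each half of the coefficients.

    The slices copy the coefficients for readability; an index-and-stride version
    would do O(1) work per call, as the recurrence T(n) = 2T(n/2) + O(1) assumes.
    """
    if len(coeffs) == 1:
        return coeffs[0]
    y_even = evaluate_split(coeffs[0::2], x * x)   # first friend
    y_odd = evaluate_split(coeffs[1::2], x * x)    # second friend
    return y_even + x * y_odd


def evaluate_horner(coeffs, x):
    """Reference answer: a0 + x(a1 + x(a2 + ...))."""
    y = 0
    for a in reversed(coeffs):
        y = y * x + a
    return y


a = [5, 3, 0, -2, 1, 4, 0, 7]   # hypothetical coefficients a0..a7
split_value = evaluate_split(a, 3)
horner_value = evaluate_horner(a, 3)
```

Both friends receive the same point x², which is what makes sharing work across many points possible later.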

Fast Fourier Transformation: FFT n log n Time. Now evaluate at n points. My input: (a0,a1,a2,…,an-1) and (x0,x1,x2,…,xn-1). My output: (y0,y1,y2,…,yn-1) with yi = f(xi). 1st friend's input? feven: (a0,a2,a4,…,an-2) and (x0²,x1²,x2²,…,xn-1²). 2nd friend's input? fodd: (a1,a3,a5,…,an-1) and (x0²,x1²,x2²,…,xn-1²). 1st friend's output: ∀i y<even,i> = feven(xi²). 2nd friend's output: ∀i y<odd,i> = fodd(xi²). My output: ∀i f(xi) = y<even,i> + xi y<odd,i>. T(n) = 2 T(n/2) + O(n) = O(n log n). Wow! That was easy.

Oops. My input has n coefficients and n values of x, but my 1st friend's input, feven: (a0,a2,a4,…,an-2) and (x0²,x1²,x2²,…,xn-1²), has n/2 coefficients and still n values of x. It does not meet the precondition!

One fix is to give each friend only n/2 values of x. 1st friend's input: feven: (a0,a2,a4,…,an-2) and (x0²,x1²,…,xn/2-1²). 2nd friend's input: fodd: (a1,a3,a5,…,an-1) and (x0²,x1²,…,xn/2-1²). 3rd friend's input: feven: (a0,a2,a4,…,an-2) and (xn/2²,xn/2+1²,…,xn-1²). 4th friend's input: fodd: (a1,a3,a5,…,an-1) and (xn/2²,xn/2+1²,…,xn-1²). My output: ∀i f(xi) = y<even,i> + xi y<odd,i>. T(n) = 4 T(n/2) + O(n) = O(n²). That's no good.

Fast Fourier Transformation Roots of Unity The values (x0,x1,x2,…,xn-1) are said to be special if: There are n distinct values. When you square each of them: The set collapses to n/2 distinct values. Eg: …, -3, -2, -1, 1, 2, 3, … square each of them …, 9, 4, 1, 1, 4, 9, … collapse the set 1, 4, 9, … half as many elements.

Fast Fourier Transformation: Roots of Unity. My input: (a0,a1,a2,…,an-1) and special values (x0,x1,x2,…,xn-1). My output: (y0,y1,y2,…,yn-1) with yi = f(xi). 1st friend's input: feven: (a0,a2,a4,…,an-2) and (x0²,x1²,x2²,…,xn-1²), which are only n/2 distinct values. 2nd friend's input: fodd: (a1,a3,a5,…,an-1) and the same n/2 distinct values. 1st friend's output: ∀i y<even,i> = feven(xi²). 2nd friend's output: ∀i y<odd,i> = fodd(xi²). My output: ∀i f(xi) = y<even,i> + xi y<odd,i>. For example, 3 and -3 both square to 9, so f(3) = feven(9) + 3 fodd(9) and f(-3) = feven(9) - 3 fodd(9) share the friends' outputs feven(9) and fodd(9). T(n) = 2 T(n/2) + O(n) = O(n log n). That's better.

To meet the precondition, the n/2 distinct values handed to the friends also need to be special.

Fast Fourier Transformation: Roots of Unity. The values (x0,x1,x2,…,xn-1) are said to be special if: there are n distinct values; when you square each of them, the set collapses to n/2 distinct values; and these are also special. E.g. start with …, -3, -2, -1, 1, 2, 3, …. Square each of them: …, 9, 4, 1, 1, 4, 9, …. Collapse the set: 1, 4, 9, …. Square each of them: 1, 16, 81, …. These do not collapse, so they are not special.

Fast Fourier Transformation: Roots of Unity. The values (x0,x1,x2,…,xn-1) are said to be special if: there are n distinct values; when you square each of them, the set collapses to n/2 distinct values; and these are also special. E.g. start with -i, -1, 1, i. Square each of them: -1, 1, 1, -1. Collapse the set: -1, 1. Square each of them: 1, 1. Collapse the set: 1. The values ω = -i, -1, 1, i are said to be the 4th roots of unity, because $\omega^4 = 1$.
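The repeated halving can be observed directly with complex numbers. The sketch below starts from the 16th roots of unity and squares and collapses until one value is left; rounding to 9 digits to detect equal floating-point values is a convenience chosen here.

```python
import cmath


def roots_of_unity(n):
    """The n complex values e^(2*pi*i*k/n) for k = 0..n-1."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]


def square_and_collapse(values, digits=9):
    """Square every value and keep one representative of each distinct result."""
    seen = {}
    for v in values:
        sq = v * v
        key = (round(sq.real, digits), round(sq.imag, digits))
        seen.setdefault(key, sq)
    return list(seen.values())


sizes = []
values = roots_of_unity(16)
while True:
    sizes.append(len(values))
    if len(values) == 1:
        break
    values = square_and_collapse(values)

# By contrast, the positive integers 1, 2, 3 do not collapse when squared.
integer_squares = {x * x for x in (1, 2, 3)}
```

The set sizes halve at every level, which is the property the recursion needs.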

Fast Fourier Transformation: Roots of Unity. ω is said to be an nth root of unity (in a field) if $\omega^n = 1$. (There should be n solutions of this polynomial.) Fermat's Little Theorem, $\forall b \neq 0,\ b^{p-1} \equiv 1 \pmod p$, says every nonzero element is an nth root of unity when n = p-1.

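Fast Fourier Transformation: Roots of Unity. ω is said to be an nth root of unity (in a field) if $\omega^n = 1$. ω is said to be a generator of the field if the numbers $1, \omega, \omega^2, \dots, \omega^{n-1}$ are all distinct. These n values are then special (when n is even). Split them into a 1st half $\omega^0, \omega^1, \omega^2, \dots, \omega^{n/2-1}$ and a 2nd half $\omega^{n/2+0}, \omega^{n/2+1}, \omega^{n/2+2}, \dots, \omega^{n/2+n/2-1}$. Square each of them: the 1st half becomes $\omega^0, \omega^2, \omega^4, \dots, \omega^{n-2}$ and the 2nd half becomes $\omega^{n+0}, \omega^{n+2}, \omega^{n+4}, \dots, \omega^{n+n-2}$. Use $\omega^n = 1$: the 2nd half is $\omega^0, \omega^2, \omega^4, \dots, \omega^{n-2}$ as well. Collapse the set: $\omega^0, \omega^2, \omega^4, \omega^6, \dots, \omega^{n-2}$. We need these to be n/2 special values.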

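Fast Fourier Transformation: Roots of Unity. The 16th roots of unity $\omega_{16}^0, \omega_{16}^1, \dots, \omega_{16}^{15}$ sit evenly spaced around the unit circle, with $\omega_{16}^{16} = \omega_{16}^0 = 1$. These could be in Z mod 17 or be complex numbers. $(\omega^{n/2})^2 = \omega^n = 1$, so $\omega^{n/2} = -1$. $(\omega^{n/4})^2 = \omega^{n/2} = -1$, so $\omega^{n/4} = i$. $(\omega^{3n/4})^2 = \omega^{n/2} = -1$, so $\omega^{3n/4} = -i$. In polar form, $re^{\theta i} = r\cos\theta + i\,r\sin\theta$; multiplying two such numbers multiplies the lengths r and adds the angles θ.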
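The Z mod 17 alternative can be explored with integer arithmetic. The sketch below assumes ω = 3 as the generator and checks that assumption in code rather than taking it from the slides.

```python
p = 17          # prime modulus, so there are n = p - 1 = 16 nonzero elements
n = p - 1
omega = 3       # assumed generator; verified below

powers = [pow(omega, k, p) for k in range(n)]      # omega^0 .. omega^15 mod 17
is_generator = len(set(powers)) == n               # all 16 powers distinct
wraps_to_one = pow(omega, n, p) == 1               # omega^16 = 1 (Fermat's Little Theorem)
halfway = pow(omega, n // 2, p)                    # omega^(n/2), expected to be -1 mod 17
quarter = pow(omega, n // 4, p)                    # plays the role of i: its square is -1

# Squaring all 16 values collapses them to 8 distinct values.
squares = sorted({(x * x) % p for x in powers})
```

Here 16 stands for -1, so the element 13 behaves like the imaginary unit i in this field.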

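Fast Fourier Transformation: Roots of Unity. Claim: $re^{\theta i} = r\cos\theta + i\,r\sin\theta$. Let $f(\theta) = re^{\theta i}$ and $g(\theta) = r\cos\theta + i\,r\sin\theta$. Then $f'(\theta) = i\,re^{\theta i}$ and $g'(\theta) = -r\sin\theta + i\,r\cos\theta$, while $f''(\theta) = -re^{\theta i} = -f(\theta)$ and $g''(\theta) = -r\cos\theta - i\,r\sin\theta = -g(\theta)$. Proof 1: both functions satisfy the same second-derivative equation y'' = -y, with f(0) = g(0) and f'(0) = g'(0). Proof 2: $e^{\varepsilon} \approx 1 + \varepsilon$, so $e^{i\varepsilon} \approx 1 + i\varepsilon \approx \cos(\varepsilon) + i\sin(\varepsilon)$.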

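Fast Fourier Transformation: Roots of Unity. The 16th roots of unity $\omega_{16}^0, \omega_{16}^1, \dots, \omega_{16}^{15}$ lie on the unit circle (r = 1) at angles 0, θ, 2θ, 3θ, …, where θ = 2π/n and $\omega = e^{2\pi i/n}$. This follows from $re^{\theta i} = r\cos\theta + i\,r\sin\theta$ and $e^{\theta i} \times e^{\alpha i} = e^{(\theta+\alpha) i}$.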

Fast Fourier Transformation: Roots of Unity. Start with the 16th roots of unity $\omega_{16}^0, \omega_{16}^1, \dots, \omega_{16}^{15}$. Square each of them and collapse: this leaves $\omega_{16}^0, \omega_{16}^2, \omega_{16}^4, \dots, \omega_{16}^{14}$. Are these special? They are the 8th roots of unity $\omega_8^0, \omega_8^1, \dots, \omega_8^7$. Square each of them and collapse: this leaves the 4th roots of unity $\omega_4^0, \omega_4^1, \omega_4^2, \omega_4^3$. Are these special? Square each of them and collapse: this leaves the 2nd roots of unity $\omega_2^0 = 1$ and $\omega_2^1$. Are these special? Square each of them and collapse: this leaves 1.

Fast Fourier Transformation: Roots of Unity. My input: (a0,a1,a2,…,an-1) and the nth roots of unity $\omega_n^i$. My output: (y0,y1,y2,…,yn-1) with $y_i = f(\omega_n^i)$. 1st friend's input: feven: (a0,a2,a4,…,an-2) and the (n/2)th roots of unity $\omega_{n/2}^i$. 2nd friend's input: fodd: (a1,a3,a5,…,an-1) and the (n/2)th roots of unity $\omega_{n/2}^i$. 1st friend's output: ∀i y<even,i> = $f_{even}(\omega_{n/2}^i)$. 2nd friend's output: ∀i y<odd,i> = $f_{odd}(\omega_{n/2}^i)$. My output: ∀i f(xi) = y<even,i> + xi y<odd,i>. T(n) = 2 T(n/2) + O(n) = O(n log n). Excellent.

Fourier Transformation: FFT Code.

Algorithm FFT(a, ω, n):
  Input:  a = [a0,a1,a2,…,an-1] (Time Domain)
          ω = e^(2πi/n) (nth root of unity)
          n = # of samples (a power of 2, n = 2^r)
  Output: y = [y0,y1,y2,…,yn-1] (Frequency Domain)
  % Base case
  If n = 1, Return([a0])
  % Separate even and odd indices
  aeven = [a0,a2,a4,…,an-2]
  aodd = [a1,a3,a5,…,an-1]
  % Recurse
  yeven = FFT(aeven, ω², n/2)     (ω² = e^(2πi·2/n))
  yodd = FFT(aodd, ω², n/2)
  % Combine
  For i = 0 to n/2-1
    y[i] = yeven[i] + ω^i ∙ yodd[i]
    y[i+n/2] = yeven[i] + ω^(i+n/2) ∙ yodd[i]
  Return(y)
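The pseudocode translates almost line for line into runnable code. The sketch below is one possible transcription, assuming n is a power of two; it adds the n = 1 base case and uses ω^(i+n/2) = -ω^i in the combine step. The `naive_dft` helper is included only as an O(n²) cross-check.

```python
import cmath


def fft(a, omega):
    """Evaluate the polynomial with coefficients a at 1, omega, omega^2, ..., omega^(n-1)."""
    n = len(a)
    if n == 1:
        return [a[0]]
    # Recurse on even and odd coefficients with omega^2, an (n/2)th root of unity.
    y_even = fft(a[0::2], omega * omega)
    y_odd = fft(a[1::2], omega * omega)
    y = [0] * n
    w = 1                                # w tracks omega^i
    for i in range(n // 2):
        y[i] = y_even[i] + w * y_odd[i]
        y[i + n // 2] = y_even[i] - w * y_odd[i]   # omega^(i+n/2) = -omega^i
        w *= omega
    return y


def naive_dft(a, omega):
    """Direct O(n^2) evaluation, used only to check the fast version."""
    n = len(a)
    return [sum(a[j] * omega ** (i * j) for j in range(n)) for i in range(n)]


a = [3, 1, 4, 1, 5, 9, 2, 6]                    # hypothetical samples, n = 8
omega = cmath.exp(2j * cmath.pi / len(a))
fast = fft(a, omega)
slow = naive_dft(a, omega)
```

The first output is simply the sum of the inputs, since it is the polynomial evaluated at ω^0 = 1.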

Fourier Transformation: Inverse FFT. A Fourier Transform is a change in basis. It changes the representation of a function from the coefficients of the polynomial $f(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_{n-1} x^{n-1}$ to the values $y_i = f(x_i)$ at key values $x_0, x_1, \dots, x_{n-1}$. Evaluating f at these points goes one way; interpolation goes back.

In matrix form V a = y, where V is the Vandermonde matrix with entries $V_{ij} = (x_i)^j$. It is invertible if the xi are distinct, so $a = V^{-1} y$.

With $x_i = \omega^i$ the entries become $(\omega^i)^j = \omega^{ij}$:

$V = \begin{bmatrix} \omega^0 & \omega^0 & \omega^0 & \cdots & \omega^0 \\ \omega^0 & \omega^1 & \omega^2 & \cdots & \omega^{n-1} \\ \omega^0 & \omega^2 & \omega^4 & \cdots & \omega^{2n-2} \\ \omega^0 & \omega^3 & \omega^6 & \cdots & \omega^{3n-3} \\ \vdots & & & & \vdots \\ \omega^0 & \omega^{n-1} & \omega^{2n-2} & \cdots & \omega^{(n-1)(n-1)} \end{bmatrix}$

The inverse is the same Vandermonde matrix built from $\omega^{-1}$ and scaled by 1/n, with entries $\frac{1}{n}\omega^{-ij}$:

$V^{-1} = \frac{1}{n} \begin{bmatrix} \omega^0 & \omega^0 & \omega^0 & \cdots & \omega^0 \\ \omega^0 & \omega^{-1} & \omega^{-2} & \cdots & \omega^{-(n-1)} \\ \omega^0 & \omega^{-2} & \omega^{-4} & \cdots & \omega^{-(2n-2)} \\ \omega^0 & \omega^{-3} & \omega^{-6} & \cdots & \omega^{-(3n-3)} \\ \vdots & & & & \vdots \\ \omega^0 & \omega^{-(n-1)} & \omega^{-(2n-2)} & \cdots & \omega^{-(n-1)(n-1)} \end{bmatrix}$

Fast Fourier Transformation Inverse FFT The inverse w-1 of w is 164 165 163 166 162 rr 167 161 168 -1 = 160 = 1616 = 1 1615 169 1614 1610 1613 1611 1612

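Fast Fourier Transformation: Inverse FFT
Cancellation Property: the roots of unity are spread evenly around the unit circle, so their sum cancels. For any integer $k$ that is not a multiple of $n$, $\sum_{j=0}^{n-1} \omega^{kj} = 0$.
[Figure: the 16th roots of unity $\omega_{16}^{0}, \dots, \omega_{16}^{15}$ on the unit circle, with $\omega_{16}^{16} = \omega_{16}^{0} = 1$.]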

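Fast Fourier Transformation: Inverse FFT
Proof: Let $A = V^{-1} V$. We want to show that $A = I$, where
\[ A_{ij} = \frac{1}{n} \sum_{k=0}^{n-1} \omega^{-ik}\, \omega^{kj} = \frac{1}{n} \sum_{k=0}^{n-1} \omega^{(j-i)k}. \]
If $i = j$, then every term is $\omega^0 = 1$, so $A_{ii} = \frac{1}{n} \cdot n = 1$.
If $i$ and $j$ are different, then $j - i$ is not a multiple of $n$, so the sum is $0$ by the cancellation property and $A_{ij} = 0$.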
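The claim $V^{-1} = \frac{1}{n} V_{\omega^{-1}}$ can also be checked numerically. The sketch below is only an illustration, not part of the slides: it assumes the complex root $\omega = e^{2\pi i/n}$, and the helper names `vandermonde`, `matmul` and `inverse_check` are made up for this example.

```python
import cmath

def vandermonde(omega, n):
    """Return the n x n matrix whose (i, j) entry is omega**(i*j)."""
    return [[omega ** (i * j) for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Plain O(n^3) product of two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_check(n):
    """Largest absolute deviation of (1/n) V(omega^-1) V(omega) from the identity."""
    omega = cmath.exp(2j * cmath.pi / n)          # assumed complex n-th root of unity
    V = vandermonde(omega, n)
    V_inv = [[entry / n for entry in row] for row in vandermonde(1 / omega, n)]
    A = matmul(V_inv, V)
    return max(abs(A[i][j] - (1 if i == j else 0))
               for i in range(n) for j in range(n))

print(inverse_check(8))   # close to 0, up to floating-point rounding
```

The off-diagonal entries vanish for exactly the reason given in the proof: each one is a sum of the powers of a root of unity other than 1.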

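Fast Fourier Transformation: Inverse FFT
The FFT and inverse FFT can use the same hardware.
FFT: Input: $\langle 1, \omega, a_0, a_1, a_2, \dots, a_{n-1} \rangle$. Output: $\langle y_0, y_1, y_2, \dots, y_{n-1} \rangle$.
Inverse FFT: Input: $\langle 1/n, \omega^{-1}, y_0, y_1, y_2, \dots, y_{n-1} \rangle$. Output: $\langle a_0, a_1, a_2, \dots, a_{n-1} \rangle$.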
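A minimal sketch of the "same hardware" idea, assuming complex arithmetic and a power-of-two length. The function name `fft` and its `(scale, omega, values)` signature are chosen here to mirror the input tuples on the slide; they are not taken from the slides themselves.

```python
import cmath

def fft(scale, omega, values):
    """Evaluate the polynomial with coefficients `values` at omega^0 .. omega^(n-1)
    and multiply every result by `scale`. len(values) must be a power of 2."""
    n = len(values)
    if n == 1:
        return [scale * values[0]]
    # f(x) = E(x^2) + x * O(x^2); omega^2 is an (n/2)-th root of unity.
    even = fft(1, omega * omega, values[0::2])
    odd = fft(1, omega * omega, values[1::2])
    out = [0] * n
    w = 1
    for k in range(n // 2):
        out[k] = scale * (even[k] + w * odd[k])
        out[k + n // 2] = scale * (even[k] - w * odd[k])   # omega^(k + n/2) = -omega^k
        w *= omega
    return out

n = 8
omega = cmath.exp(2j * cmath.pi / n)
a = [3, 1, 4, 1, 5, 9, 2, 6]
y = fft(1, omega, a)                  # FFT:         <1,   omega,    a>
a_back = fft(1 / n, 1 / omega, y)     # inverse FFT: <1/n, omega^-1, y>
```

Only the two leading parameters change between the forward and inverse calls; `a_back` agrees with `a` up to rounding error.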

Fourier Transformation: Polynomial Basis
Basis: $\sin(2\pi f t/n)$, $\cos(2\pi f t/n)$
Basis: $1, x, x^2, x^3, \dots$

Fourier Transformation: Polynomial Basis
Basis: $\sin(2\pi f t/n)$, $\cos(2\pi f t/n)$
The $f$-th basis "vector" has frequency $f$. As $t$ goes from $0$ to $n$, $t/n$ goes from $0$ to $1$ and $2\pi f t/n$ goes from $0$ to $2\pi f$, so $\cos(2\pi f t/n)$ does $f$ full cycles.

Fourier Transformation: Polynomial Basis
Basis: $1, x, x^2, x^3, \dots$

Fourier Transformation: Polynomial Basis
Basis: $1, x, x^2, x^3, \dots$
The $f$-th basis "vector" is $x^f$. We compute this on $x_0, x_1, x_2, x_3, \dots$. The $t$-th value is $x_t = \omega^t$. The $f$-th basis on the $t$-th value is $(x_t)^f = (\omega^t)^f = \omega^{ft}$.

Fast Fourier Transformation: Roots of Unity
16th roots of unity: $re^{\theta i} = r\cos\theta + i\, r\sin\theta$. With $r = 1$ and $\theta = 2\pi/n$: $\omega = e^{2\pi i/n}$.
[Figure: the 16th roots of unity $\omega_{16}^{0}, \omega_{16}^{1}, \dots, \omega_{16}^{15}$ on the unit circle, with radius $r$ and angle $\theta$ marked.]

Fourier Transformation: Polynomial Basis
Basis: $1, x, x^2, x^3, \dots$
The $f$-th basis "vector" is $x^f$. We compute this on $x_0, x_1, x_2, x_3, \dots$. The $t$-th value is $x_t = \omega^t$. The $f$-th basis on the $t$-th value is
\[ (x_t)^f = (\omega^t)^f = \omega^{ft} = \left[e^{2\pi i/n}\right]^{ft} = e^{(2\pi f t/n)\, i} = \cos(2\pi f t/n) + i \sin(2\pi f t/n), \]
using $re^{\theta i} = r\cos\theta + i\, r\sin\theta$ with $r = 1$, $\theta = 2\pi/n$, $\omega = e^{2\pi i/n}$.

Fourier Transformation: Polynomial Basis
Basis: $\sin(2\pi f t/n)$, $\cos(2\pi f t/n)$
Basis: $1, x, x^2, x^3, \dots$
The two bases agree: the $f$-th polynomial basis "vector" $x^f$, computed on $x_t = \omega^t$, is $(x_t)^f = \omega^{ft} = e^{(2\pi f t/n)\, i} = \cos(2\pi f t/n) + i \sin(2\pi f t/n)$.

Fourier Transformation: Polynomial Basis
Basis: $\sin(2\pi f t/n)$, $\cos(2\pi f t/n)$
Basis: $1, x, x^2, x^3, \dots$

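Polynomial Multiplication
$f(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_{n-1} x^{n-1}$
$g(x) = b_0 + b_1 x + b_2 x^2 + \dots + b_{n-1} x^{n-1}$
$[f \times g](x) = c_0 + c_1 x + c_2 x^2 + \dots + c_{2n-2} x^{2n-2}$
$x^5$ coefficient: $c_5 = a_0 b_5 + a_1 b_4 + a_2 b_3 + \dots + a_5 b_0$. This is a convolution.
Too much: computing every coefficient this way takes $O(n^2)$ time.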
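The coefficient formula above translates directly into a double loop. This is a plain sketch of the quadratic method; the name `multiply_naive` is hypothetical and coefficients are stored lowest degree first.

```python
def multiply_naive(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first).
    Every pair (a[i], b[j]) contributes to c[i + j], so the time is O(n^2)."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

# (1 + 2x + 3x^2) * (4 + 5x) = 4 + 13x + 22x^2 + 15x^3
print(multiply_naive([1, 2, 3], [4, 5]))   # [4, 13, 22, 15]
```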

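Polynomial Multiplication
$f(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_{n-1} x^{n-1}$
$g(x) = b_0 + b_1 x + b_2 x^2 + \dots + b_{n-1} x^{n-1}$
$[f \times g](x) = c_0 + c_1 x + c_2 x^2 + \dots + c_{2n-2} x^{2n-2}$
Coefficient domain ($a_j$): $[a_0, a_1, a_2, \dots, a_{n-1}]$ and $[b_0, b_1, b_2, \dots, b_{n-1}]$.
Evaluation domain ($y_i$): $y_i = f(x_i)$ and $z_i = g(x_i)$. The Fast Fourier Transform takes $O(n \log n)$ time!
Multiplying the values pointwise, $y_i \times z_i = [g \times f](x_i)$, takes $O(n)$ time!
The inverse FFT then returns to the coefficient domain: $[c_0, c_1, c_2, \dots, c_{2n-2}]$. Because the product has $2n-1$ coefficients, the polynomials must be evaluated at $2n-1$ or more points.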
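The three steps (evaluate, multiply pointwise, interpolate) can be sketched as follows. This is an illustrative version under several assumptions not stated on the slide: complex floating-point roots of unity, integer coefficients (so the result can be rounded), and zero-padding to the next power of two that holds the whole product. The names `fft` and `multiply_fft` are hypothetical.

```python
import cmath

def fft(a, omega):
    """Recursive radix-2 evaluation of the polynomial a at the powers of omega."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], omega * omega)
    odd = fft(a[1::2], omega * omega)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
        w *= omega
    return out

def multiply_fft(a, b):
    """Multiply integer-coefficient polynomials in O(n log n) arithmetic operations."""
    size = len(a) + len(b) - 1            # number of coefficients in the product
    n = 1
    while n < size:                       # pad to a power of two
        n *= 2
    omega = cmath.exp(2j * cmath.pi / n)
    ya = fft(list(a) + [0] * (n - len(a)), omega)    # coefficient -> evaluation domain
    yb = fft(list(b) + [0] * (n - len(b)), omega)
    yc = [p * q for p, q in zip(ya, yb)]             # pointwise product, O(n)
    c = fft(yc, 1 / omega)                           # inverse FFT, still needs the 1/n
    return [round((v / n).real) for v in c[:size]]

print(multiply_fft([1, 2, 3], [4, 5]))   # [4, 13, 22, 15]
```

Rounding is only safe while the floating-point error stays well below 0.5, which is one reason the big-integer slides later switch to exact arithmetic in Z mod p.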

Multiplying Big Integers
X = 11…10100011101100010010 (N bits)
Y = 10…01001100011001001111
X×Y = 10…1110110101001001010100010100110010011110
The high school algorithm takes $O(N^2)$ bit operations. Can we do it faster? I hope so. See Recursion for one way to do it faster. This is another.

Multiplying Big Integers
X = 11…10100011101100010010 (N bits)
Break into blocks of $m = O(\log N)$ bits.

Multiplying Big Integers
X = 0000 … 0000 0000 0011 1010 0011 1011 0001 0010
Blocks: $a_n \dots a_7\ a_6\ a_5\ a_4\ a_3\ a_2\ a_1\ a_0$, each $m$ bits wide.
Break into blocks of $m = O(\log N)$ bits. $a_i = O(N)$.

Multiplying Big Integers
X = 0000 … 0000 0000 0011 1010 0011 1011 0001 0010 (blocks of $m$ bits)
$f(x) = a_{n-1} x^{n-1} + \dots + a_5 x^5 + a_4 x^4 + a_3 x^3 + a_2 x^2 + a_1 x + a_0$
$g(x) = b_{n-1} x^{n-1} + \dots + b_5 x^5 + b_4 x^4 + b_3 x^3 + b_2 x^2 + b_1 x + b_0$
View the blocks as coefficients of a polynomial. Note $X = f(2^m)$. Same for $Y = g(2^m)$.
Multiply $g \times f$ using the FFT in time $O(n \log n)$. Note $X \times Y = [g \times f](2^m)$.
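The identity $X \times Y = [g \times f](2^m)$ can be checked on the slide's example bits. The sketch below assumes a block size of $m = 4$ and, to keep the arithmetic exact and the example short, multiplies the polynomials with the quadratic convolution rather than the FFT. All function names are hypothetical.

```python
def to_blocks(x, m):
    """Split a non-negative integer into m-bit blocks, least significant first."""
    mask = (1 << m) - 1
    blocks = []
    while x:
        blocks.append(x & mask)
        x >>= m
    return blocks or [0]

def convolve(a, b):
    """Coefficients of the product polynomial (quadratic time; an FFT would go here)."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

def evaluate_at_power_of_two(c, m):
    """Evaluate the polynomial with coefficients c at x = 2**m using shifts."""
    return sum(ci << (i * m) for i, ci in enumerate(c))

def multiply_integers(x, y, m=4):
    """X * Y = [g x f](2^m), where f and g hold the m-bit blocks of X and Y."""
    return evaluate_at_power_of_two(convolve(to_blocks(x, m), to_blocks(y, m)), m)

X = 0b001110100011101100010010    # 0011 1010 0011 1011 0001 0010
Y = 0b1001100011001001111
print(multiply_integers(X, Y) == X * Y)   # True
```

Python's built-in integers hide the cost of the final evaluation; the later slides look at how many bit operations that step really takes.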

The End

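Fast Fourier Transformation: Roots of Unity
16th roots of unity. These could be in Z mod 17 or complex numbers.
$(\omega^{n/4})^2 = \omega^{n/2} = -1$, so $\omega^{n/4} = i$.
$(\omega^{n/2})^2 = 1$, with $\omega^{n/2} = -1$.
$(\omega^{3n/4})^2 = \omega^{n/2} = -1$, so $\omega^{3n/4} = -i$.
$re^{\theta i} = r\cos\theta + i\, r\sin\theta$ and $re^{\theta i} \times se^{\alpha i} = (rs)\, e^{(\theta+\alpha) i}$.
[Figure: the 16th roots of unity $\omega_{16}^{0}, \dots, \omega_{16}^{15}$ on the unit circle, with $\omega_{16}^{16} = \omega_{16}^{0} = 1$ and radius $r$, angle $\theta$ marked.]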

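Fast Fourier Transformation: Roots of Unity
Goal: prove $f(\theta) = g(\theta)$.
Given: $f(0) = g(0)$, $f'(0) = g'(0)$, $f''(\theta) = -f(\theta)$, $g''(\theta) = -g(\theta)$.
Proof by induction (over the reals) that $f(\theta) = g(\theta)$:
For this $\theta$, $f(\theta) = g(\theta)$ and $f'(\theta) = g'(\theta)$.
Because the values and first derivatives agree at $\theta$, for the next value $\theta^+$ we get $f(\theta^+) = g(\theta^+)$.
Because $f''(\theta) = -f(\theta) = -g(\theta) = g''(\theta)$, for the next value $\theta^+$ we also get $f'(\theta^+) = g'(\theta^+)$.

Fast Fourier Transformation: Roots of Unity
Goal: prove $f(\theta) = g(\theta)$ for $f(\theta) = re^{\theta i}$ and $g(\theta) = r\cos\theta + i\, r\sin\theta$.
$g(0) = r\cos 0 + i\, r\sin 0 = r$ and $f(0) = re^{0i} = r$, so $f(0) = g(0)$.
$g'(\theta) = -r\sin\theta + i\, r\cos\theta$ and $f'(\theta) = i\, re^{\theta i}$.
$g'(0) = -r\sin 0 + i\, r\cos 0 = ir$ and $f'(0) = i\, re^{0i} = ir$, so $f'(0) = g'(0)$.
$g''(\theta) = -r\cos\theta - i\, r\sin\theta = -g(\theta)$ and $f''(\theta) = -re^{\theta i} = -f(\theta)$.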

Fast Fourier Transformation: Sin & Cos basis
Modifies the DFT frequency coefficient calculations (for real samples $x[n]$, $0 \le k \le N/2$):
\[ \mathrm{Re}X[k] = \sum_{n=0}^{N-1} x[n] \cos(2\pi k n/N), \qquad \mathrm{Im}X[k] = -\sum_{n=0}^{N-1} x[n] \sin(2\pi k n/N) \]
Uses complex and polar numbers as a shorthand (complex samples, $N = 2^r$):
\[ X_k = \mathrm{Re}X[k] + i\, \mathrm{Im}X[k], \qquad X_k = \sum_{n=0}^{N-1} x_n e^{-i 2\pi k n/N} = \sum_{n=0}^{N-1} x_n \omega^{kn}, \qquad \omega = e^{-i 2\pi/N} \]
with $r \cdot e^{i\theta} = r\cos\theta + i\, r\sin\theta = r\angle\theta$.
Notes: If you need an algorithm that runs faster (but takes longer to write), then you'll need to use the FFT. It starts by modifying the DFT to accept complex-valued time signals and to return N values of k rather than N/2. Using complex numbers changes two summations into one more compact summation, which at first glance appears "complex" and intimidating. The hint to understanding this equation is that $e^{-i 2\pi k n/N}$ has a magnitude of 1, so it really only represents a phase (shift) angle of some multiple, $kn$, of 360º/N. By replacing $e^{-i 2\pi/N}$ with $\omega$ we can temporarily forget that we are dealing with complex numbers. If we now expand the series, we get something that looks like N polynomial equations in $\omega$. When we write this information as a matrix, we get the Vandermonde matrix which Jeff has shown us previously.
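To see that the single complex sum really is shorthand for the two real sums, both can be computed directly. This is a slow $O(N^2)$ sketch for real-valued input; the function names are hypothetical and $k$ is assumed to run from 0 to N/2 in the real form.

```python
import cmath
import math

def dft_real_imag(x):
    """Return (ReX, ImX) for k = 0 .. N/2 using the cosine and sine sums."""
    N = len(x)
    re = [sum(x[n] * math.cos(2 * math.pi * k * n / N) for n in range(N))
          for k in range(N // 2 + 1)]
    im = [-sum(x[n] * math.sin(2 * math.pi * k * n / N) for n in range(N))
          for k in range(N // 2 + 1)]
    return re, im

def dft_complex(x):
    """Return X_k = sum_n x_n * omega^(k n) for k = 0 .. N-1, omega = e^(-i 2 pi / N)."""
    N = len(x)
    omega = cmath.exp(-2j * cmath.pi / N)
    return [sum(x[n] * omega ** (k * n) for n in range(N)) for k in range(N)]

samples = [1.0, 2.0, 3.0, 4.0, 0.0, -1.0, -2.0, -3.0]
re, im = dft_real_imag(samples)
X = dft_complex(samples)
# For k = 0 .. N/2, X[k].real matches re[k] and X[k].imag matches im[k].
# For real input the remaining outputs are redundant: X[N-k] is the conjugate of X[k].
```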

Fast Fourier Transformation: Sin & Cos basis
1. Convert your N real sampled values to complex numbers by adding 0i to them: $x_n = x_n + 0i$, $0 \le n \le N-1$.
2. Feed this as the input to the FFT.
3. Remove the FFT output's redundant information (i.e. all frequencies above N/2).
Because of its speed, there is still an advantage in using the complex FFT to calculate the DFT for real inputs. The FFT just gives additional redundant information that can be discarded.
[Diagram from Kester: ReX and ImX plotted for indices 0 to N−1. ReX has even symmetry about N/2 (fs/2); ImX has odd symmetry about N/2 (fs/2). The upper half of each plot is labelled "Negative" Frequency.]

Fast Fourier Transformation FFT Butterfly

The Ugly Math for the FFT (0)0 (0)1 (0)2 (0)3 … (0)n-1 (1)0 (1)1 (1)2 (1)3 … (1)n-1 (2)0 (2)1 (2)2 (2)3 … (2)n-1 (3)0 (3)1 (3)2 (3)3 … (3)n-1 (n-1)0 (n-1)1 (n-1)2 (n-1)3 …(n-1)n-1 ... = x0 x1 x2 x3 … xN-1 X0 X1 X2 X3 XN-1 Behold the Vandermonde matrix! matrix and epic struggle between good and evil characters from Edmonds However, we have still not gained anything. But that’s O(N2) !! 113

The Ugly Math for the FFT 0 n-1 2n-2 3n-3 … (n-1)(n-1) 0 0 0 0 … 0 0 1 2 3 … n-1 0 2 4 6 … 2n-2 0 3 6 9 … 3n-3 ... = x0 x1 x2 x3 … xN-1 X0 X1 X2 X3 XN-1 But if I multiply the exponents ... However, we have still not gained anything. But that’s still O(N2) !! 114

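Just watch! For example, if N=8 and I use the N roots of unity ...
The exponents $kn$ of $\omega$ (row $k$, column $n$) are:
0  0  0  0  0  0  0  0
0  1  2  3  4  5  6  7
0  2  4  6  8 10 12 14
0  3  6  9 12 15 18 21
0  4  8 12 16 20 24 28
0  5 10 15 20 25 30 35
0  6 12 18 24 30 36 42
0  7 14 21 28 35 42 49
But now, by taking advantage of the properties of the complex roots of unity $\omega^p$, we can start to make some progress:
$\omega^{p+4} = -\omega^{p}$, $\omega^{p} = \omega^{p \bmod 8}$, $\omega^0 = 1$, $\omega^4 = -1$, where $\omega = \omega_8 = e^{-i 2\pi/8}$.
[Figure: the 8th roots of unity $\omega^0, \dots, \omega^7$ on the unit circle.]

Applying these properties, the system becomes
\[
\begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & \omega & \omega^2 & \omega^3 & -1 & -\omega & -\omega^2 & -\omega^3 \\
1 & \omega^2 & -1 & -\omega^2 & 1 & \omega^2 & -1 & -\omega^2 \\
1 & \omega^3 & -\omega^2 & \omega & -1 & -\omega^3 & \omega^2 & -\omega \\
1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\
1 & -\omega & \omega^2 & -\omega^3 & -1 & \omega & -\omega^2 & \omega^3 \\
1 & -\omega^2 & -1 & \omega^2 & 1 & -\omega^2 & -1 & \omega^2 \\
1 & -\omega^3 & -\omega^2 & -\omega & -1 & \omega^3 & \omega^2 & \omega
\end{pmatrix}
\begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{pmatrix}
=
\begin{pmatrix} X_0 \\ X_1 \\ X_2 \\ X_3 \\ X_4 \\ X_5 \\ X_6 \\ X_7 \end{pmatrix}
\]
Now the 2nd half of each row either equals the 1st half or its negative.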

[Four build slides show the same 8×8 system, highlighting in turn the column pairs $(x_0, x_4)$, $(x_1, x_5)$, $(x_2, x_6)$ and $(x_3, x_7)$.]
$x_0$ and $x_4$ have identical coefficients (ignoring sign), as do $x_1$ and $x_5$, $x_2$ and $x_6$, and $x_3$ and $x_7$.

Oh my! Half the columns are gone. What's next?
Now rewrite the matrix as equations in terms of $x_0 \pm x_4$, $x_2 \pm x_6$, $x_1 \pm x_5$, $x_3 \pm x_7$:
$(x_0 + x_4) + (x_2 + x_6) + (x_1 + x_5) + (x_3 + x_7) = X_0$
$(x_0 - x_4) + \omega^2 (x_2 - x_6) + \omega (x_1 - x_5) + \omega^3 (x_3 - x_7) = X_1$
$(x_0 + x_4) - (x_2 + x_6) + \omega^2 (x_1 + x_5) - \omega^2 (x_3 + x_7) = X_2$
$(x_0 - x_4) - \omega^2 (x_2 - x_6) + \omega^3 (x_1 - x_5) + \omega (x_3 - x_7) = X_3$
$(x_0 + x_4) + (x_2 + x_6) - (x_1 + x_5) - (x_3 + x_7) = X_4$
$(x_0 - x_4) + \omega^2 (x_2 - x_6) - \omega (x_1 - x_5) - \omega^3 (x_3 - x_7) = X_5$
$(x_0 + x_4) - (x_2 + x_6) - \omega^2 (x_1 + x_5) + \omega^2 (x_3 + x_7) = X_6$
$(x_0 - x_4) - \omega^2 (x_2 - x_6) - \omega^3 (x_1 - x_5) - \omega (x_3 - x_7) = X_7$
End result: by introducing an addition and a subtraction, we have eliminated half of the more expensive multiplications. We could continue to reduce the number of columns (i.e. multiplications) by combining additions and subtractions into larger ones, but from this point on there is an easier way to get to our goal.
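The eight equations above all follow from $\omega^{k(n+4)} = (-1)^k\, \omega^{kn}$, so each $X_k$ needs only the four sums or the four differences. The sketch below, with hypothetical names and $\omega = e^{-i2\pi/8}$ as on the slides, compares the paired form against the direct 8×8 product.

```python
import cmath

N = 8
omega = cmath.exp(-2j * cmath.pi / N)

def dft_direct(x):
    """Full matrix-vector product: 8 multiplications per output."""
    return [sum(x[n] * omega ** (k * n) for n in range(N)) for k in range(N)]

def dft_paired(x):
    """Pair x[n] with x[n+4]: even k uses the sums, odd k uses the differences,
    so only 4 multiplications per output remain."""
    sums = [x[n] + x[n + 4] for n in range(4)]
    diffs = [x[n] - x[n + 4] for n in range(4)]
    out = []
    for k in range(N):
        paired = sums if k % 2 == 0 else diffs
        out.append(sum(paired[n] * omega ** (k * n) for n in range(4)))
    return out

x = [1, 2, 3, 4, 5, 6, 7, 8]
direct = dft_direct(x)
paired = dft_paired(x)   # agrees with `direct` up to rounding
```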

Think signal flow and construct the equations using the butterfly operator.
Example: $(x_0 + x_4) + (x_2 + x_6) + (x_1 + x_5) + (x_3 + x_7) = X_0$
[Butterfly diagram from Smith: inputs $x_0$ and $x_4$; the lower input is multiplied by $\omega^p$; the outputs are $x_0 + \omega^p x_4$ and $x_0 - \omega^p x_4$.]
For $x_0 \pm x_4$ we would set $\omega^p$ to 1.

Note: the butterfly has a shorthand notation.
[Butterfly diagram from Smith, shorthand form: the lower branch carries the weight $\omega^k$ and a factor $-1$; inputs $x_0$ and $x_4$; outputs $x_0 + \omega^p x_4$ and $x_0 - \omega^p x_4$.]
The slide repeats the eight equations for $X_0, \dots, X_7$ written in terms of $x_0 \pm x_4$, $x_2 \pm x_6$, $x_1 \pm x_5$, $x_3 \pm x_7$.

Decimation in Time & Bit Reverse Order (rearranging the order of the N samples)
"Using Bit Reverse Order and a tree of butterflies, my Decimation in Time Algorithm can solve this in O(N log N)."
"Damn you and your re"cursed" friends! If you're lying, I'll claim your soul!"
"No friends this time. They'd just be overhead invading my stack space."
Binary index:        000 001 010 011 100 101 110 111
Original order:        0   1   2   3   4   5   6   7
After 1st decimation:  0   2   4   6   1   3   5   7
After 2nd decimation:  0   4   2   6   1   5   3   7
Binary index:        000 100 010 110 001 101 011 111
Decimation in time is what we will use to arrange the input to our butterflies, based on what we have seen in the previous equations. Notice that the $x_0 \pm x_4$, $x_2 \pm x_6$, $x_1 \pm x_5$, $x_3 \pm x_7$ columns of the equations on the previous slide correspond to the order of this last row.
Decimation in time is the process of rearranging a set into two smaller sets, where every 2nd item goes into the second smaller set. The process is then repeated on each of the smaller sets as needed. In a multistage process, one can do all this in one step by noticing that the final index of an element corresponds to reversing the bits of its original index (hence the name Bit Reverse Order).
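Both descriptions of the reordering, repeated decimation and direct bit reversal, can be written down and compared. A minimal sketch with hypothetical function names, assuming the number of samples is a power of two:

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index, e.g. 001 -> 100."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def bit_reverse_order(samples):
    """Reorder samples in one step using bit-reversed indices."""
    n = len(samples)
    bits = n.bit_length() - 1            # n is assumed to be a power of two
    return [samples[bit_reverse(i, bits)] for i in range(n)]

def decimate(samples):
    """Multistage version: split into even- and odd-indexed items, then recurse."""
    if len(samples) <= 2:
        return list(samples)
    return decimate(samples[0::2]) + decimate(samples[1::2])

print(bit_reverse_order(list(range(8))))   # [0, 4, 2, 6, 1, 5, 3, 7]
print(decimate(list(range(8))))            # [0, 4, 2, 6, 1, 5, 3, 7]
```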

FFT Block Diagram
[Diagram from Kester: inputs in bit-reversed order $x_0, x_4, x_2, x_6, x_1, x_5, x_3, x_7$; outputs $X_0, \dots, X_7$. Stage 1: four 2-point butterflies. Stage 2: two blocks, each combining 2 two-point butterflies. Stage 3: one block combining 4 two-point butterflies.]
So we start with a block diagram resembling a binary tree structure, with our inputs fed in Bit Reverse Order. Notice that at each stage boundary the same number of lines (8 in this example) cross from one stage into the next.

Stepping Through the FFT
[Diagram from Kester: 8-point FFT signal flow graph with inputs $x_0, x_4, x_2, x_6, x_1, x_5, x_3, x_7$, outputs $X_0, \dots, X_7$, and butterfly weights $\omega_8^0$, $\omega_8^1$, $\omega_8^2$, $\omega_8^3$ and $-1$.]
Next we install the butterflies per the previous equations, which results in this messy diagram, which I hope to unravel for you. To see the rhyme and reason of the multiplication weights of the second stage, remember that $\omega_8^2$ is equivalent to $\omega_4^1$, and $\omega_8^0 = 1$.

Stepping Through the FFT
[Build slides: the same 8-point signal flow graph, with the butterflies of the first stage highlighted one at a time.]
So as we walk through the first stage, we see that it requires four butterflies (N/2). Also note that this first stage took in 8 inputs and produced 8 outputs, which now serve as the input to the next stage. Since there is no further use for the original inputs, their memory space can be used to store the output results (also known as in-place processing). In-place processing allows the FFT to be implemented on many embedded processors that have access to only a small amount of memory.

Stepping Through the FFT
[Build slides: the same 8-point signal flow graph, with the butterflies of the second stage highlighted one at a time.]
Similarly, as we work through the second stage, we can see that despite its confusing appearance it also consists of only 4 simple butterflies. Running total: N/2 + N/2.

Stepping Through the FFT
[Build slides: the same 8-point signal flow graph, with the butterflies of the final stage highlighted one at a time.]
And the same holds true for the final stage. It also consists of only 4 simple butterflies.
So the end result for the combined log N stages, in terms of butterfly operations, is N/2 + N/2 + N/2, which in general is $O(N \log N)$.

Algorithm FFT (ReX, ImX)
Input: ReX[ ], ImX[ ] = real, imaginary parts of the time samples
Output: ReX[ ], ImX[ ] = cosine, sine coefficients of the frequency domain
    N = SizeOf(ReX)
    PutInBitReverseOrder(ReX, ImX)            % time domain decomposition
    % frequency domain synthesis (done in place)
    for k = 1 to log2 N                       % loop for each stage
        Wre = 1; Wim = 0; θ = 2π / 2^k        % initialize stage constants
        for j = 1 to 2^(k-1)                  % loop for each sub DFT
            for i = j-1 to N-1 step 2^k       % loop for each butterfly
                ip = i + 2^(k-1)
                tmpRe = ReX[ip]·Wre - ImX[ip]·Wim
                tmpIm = ReX[ip]·Wim + ImX[ip]·Wre
                ReX[ip] = ReX[i] - tmpRe
                ImX[ip] = ImX[i] - tmpIm
                ReX[i] = ReX[i] + tmpRe
                ImX[i] = ImX[i] + tmpIm
            next i
            tmpRe = Wre                       % advance the twiddle factor by angle θ
            Wre = tmpRe·cos(θ) + Wim·sin(θ)
            Wim = -tmpRe·sin(θ) + Wim·cos(θ)
        next j
    next k
    return (ReX, ImX)                         % ReX[ ], ImX[ ] return freq coeffs 0 to N-1
Smith's code and diagram were used to generate this algorithm and diagram.
So despite the complexity of the signal flow diagram, we can perform the FFT using an iterative approach, and the FFT boils down to just 3 nested loops and a little overhead.
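The same three nested loops can be written as runnable code. The sketch below is an adaptation, not Smith's original: it assumes Python complex numbers in place of the separate ReX/ImX arrays, a power-of-two length, and the convention $\omega = e^{-i2\pi/N}$. The name `fft_in_place` is hypothetical.

```python
import cmath

def fft_in_place(x):
    """Iterative radix-2 decimation-in-time FFT.
    x: list of numbers whose length is a power of 2; overwritten with its DFT."""
    n = len(x)
    bits = n.bit_length() - 1

    # Time domain decomposition: put the samples in bit-reversed order.
    for i in range(n):
        j, tmp = 0, i
        for _ in range(bits):
            j = (j << 1) | (tmp & 1)
            tmp >>= 1
        if i < j:                       # swap each pair only once
            x[i], x[j] = x[j], x[i]

    # Frequency domain synthesis, done in place.
    half = 1                            # plays the role of 2^(k-1) in the pseudocode
    while half < n:                     # loop for each stage
        step = cmath.exp(-1j * cmath.pi / half)   # e^(-i*theta), theta = 2*pi / 2^k
        w = 1 + 0j
        for j in range(half):           # loop for each sub-DFT position
            for i in range(j, n, 2 * half):       # loop for each butterfly
                ip = i + half
                t = w * x[ip]
                x[ip] = x[i] - t
                x[i] = x[i] + t
            w *= step                   # advance the twiddle factor
        half *= 2
    return x

print(fft_in_place([1, 0, 0, 0]))       # an impulse transforms to all ones
```

Updating `w` by repeated multiplication mirrors the pseudocode; recomputing it with `cmath.exp` for each `j` would cost more but accumulate less rounding error.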

Multiplying Big Integers
Grade School Revisited: How To Multiply Two Numbers

Multiplying Big Integers
X = 0011 … 1010 0011 1011 0001 0010
Break into blocks of $m = O(\log N)$ bits.

Multiplying Big Integers
X = 0000 … 0000 0000 0011 1010 0011 1011 0001 0010
Blocks: $a_n \dots a_7\ a_6\ a_5\ a_4\ a_3\ a_2\ a_1\ a_0$
Break into blocks of $m = O(\log N)$ bits.
Pad with zeros to 2N bits to hold the product.
Use n blocks, where n is a power of 2, i.e. $n = 2^r$.
Let p be a prime such that $\log p \ge$ block size $= m$ and $p-1$ is divisible by n, so that Z mod p has n nth roots of unity.
View each block as a finite field element in Z mod p, stored in $O(\log p)$ bits (no actual work).
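The conditions on p can be made concrete with a small search. This is a sketch under simplifying assumptions: trial division for primality (fine only for small p), n a power of two with $n \ge 2$, and hypothetical function names. For $n = 16$ it recovers Z mod 17, the example field mentioned on the roots-of-unity slide.

```python
def is_prime(p):
    """Trial division; adequate for the small primes in this illustration."""
    if p < 2:
        return False
    d = 2
    while d * d <= p:
        if p % d == 0:
            return False
        d += 1
    return True

def find_prime(n, lower):
    """Smallest prime p >= lower with p - 1 divisible by n."""
    p = (lower // n) * n + 1
    while p < lower or not is_prime(p):
        p += n
    return p

def primitive_nth_root(n, p):
    """Return w in Z mod p with w^n = 1 and w^0 .. w^(n-1) all distinct.
    Assumes n >= 2 is a power of two dividing p - 1."""
    for g in range(2, p):
        w = pow(g, (p - 1) // n, p)        # w^n = g^(p-1) = 1 by Fermat's little theorem
        if pow(w, n // 2, p) == p - 1:     # w^(n/2) = -1, so the order of w is exactly n
            return w
    raise ValueError("no primitive n-th root found")

p = find_prime(16, 17)
w = primitive_nth_root(16, p)
print(p, w)                                # 17 3
print(sorted(pow(w, k, p) for k in range(16)) == list(range(1, 17)))   # True
```

In practice p also has to be large enough to hold the product coefficients, a point the following slides return to.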

Multiplying Big Integers
X = 0000 … 0000 0000 0011 1010 0011 1011 0001 0010 (blocks of $m$ bits)
$f(x) = a_{n-1} x^{n-1} + \dots + a_5 x^5 + a_4 x^4 + a_3 x^3 + a_2 x^2 + a_1 x + a_0$
$g(x) = b_{n-1} x^{n-1} + \dots + b_5 x^5 + b_4 x^4 + b_3 x^3 + b_2 x^2 + b_1 x + b_0$
View the blocks as coefficients of a polynomial. Note $X = f(2^m)$. Same for $Y = g(2^m)$.
Multiply $g \times f$ using the FFT in time $O(n \log n)$. Note $X \times Y = [g \times f](2^m)$.
Evaluating $[g \times f](2^m)$ takes $O(n)$ operations, but each operation could be on $O(n)$-bit numbers, for a total of $O(n^2)$ time.

Multiplying Big Integers
Evaluate $[g \times f](2^m)$ in time $O(n)$.
$[g \times f](x) = c_{n-1} x^{n-1} + \dots + c_5 x^5 + c_4 x^4 + c_3 x^3 + c_2 x^2 + c_1 x + c_0$
X×Y = 0011 1010 0011 1011 0001 0010 1110 0011 0010 (blocks of $m$ bits; each $c_i$ has $O(\log p)$ bits)
Some texts say the $c_i$ can just be shifted and joined. Problem: the field elements may be too big.

Multiplying Big Integers
Evaluate $[g \times f](2^m)$ in time $O(n)$.
$[g \times f](x) = c_n x^n + \dots + c_5 x^5 + c_4 x^4 + c_3 x^3 + c_2 x^2 + c_1 x + c_0$
Shift each $c_i$ by $im$ bits. Add.
[Figure: the $O(\log p)$-bit values 101011, 001101, 110001, 000111, 100100, 010011, each shifted $m$ bits further than the last and summed to give X×Y = 01 0101 0110 1011 1010 0001 1111 1011.]
Adding n numbers each n bits long takes $O(n^2)$, but here the numbers are sparse.

Multiplying Big Integers
Evaluate $[g \times f](2^m)$ in time $O(n)$.
Shift each $c_i$ by $im$ bits. Add.
At each point, at most two numbers overlap, so the carry is at most one, giving $O(N)$ bit operations.
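The shift-and-add step can be sketched as a single pass that carries the overflow of each coefficient into the next block. This illustration is more general than the slide's argument: it lets the carry be any size rather than assuming each $c_i$ spans at most two blocks. The function names and the sample coefficients are hypothetical.

```python
def combine_blocks(c, m):
    """Evaluate sum(c[i] * 2**(i*m)) by propagating carries from block to block.
    Returns the result as m-bit blocks, least significant first."""
    mask = (1 << m) - 1
    digits = []
    carry = 0
    for ci in c:
        total = ci + carry            # this coefficient plus overflow from lower blocks
        digits.append(total & mask)   # the low m bits stay in this block
        carry = total >> m            # the rest overlaps the next block
    while carry:                      # flush whatever is left above the top block
        digits.append(carry & mask)
        carry >>= m
    return digits

def blocks_to_int(digits, m):
    """Join m-bit blocks (least significant first) back into one integer."""
    return sum(d << (i * m) for i, d in enumerate(digits))

# Six 6-bit coefficients combined with a block size of m = 4 bits.
c = [0b010011, 0b100100, 0b000111, 0b110001, 0b001101, 0b101011]
digits = combine_blocks(c, 4)
print(blocks_to_int(digits, 4) == sum(ci << (4 * i) for i, ci in enumerate(c)))   # True
```

Each coefficient is touched once, so the number of block operations is linear in the number of coefficients.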

Multiplying Big Integers
X = 11…10100011101100010010 (N bits)
Y = 10…01001100011001001111
X×Y = 10…1110110101001001010100010100110010011110
Suppose N is really really big. How many bit operations are needed?
$O(N \log N)$: FFT time.
$O(N \log N \log\log N)$: time stated in text.
$O(N \log N \log\log N \log\log\log N \log\log\log\log N \dots)$: time as far as I can see.

Multiplying Big Integers
X = …101000111011000100101010001001010… (field elements of N' bits)
Input size = N bits. Field element size = N' = log(N) bits. Number of $a_i$ = n = N/N'. Number of field operations = $O(n \log n)$. Time for one × field operation = ?
X' = 1010 0111 0110 0010 0101 0100 0100 1010 (field elements of N'' bits)
Input size = N' bits. Field element size = N'' = log(N') bits. Number of $a_i$ = n' = N'/N''. Number of field operations = $O(n' \log n')$. Time for one × field operation = ?
And so on …
Total time for one × field operation on N' bits: $O(N' \log N' \log\log N' \log\log\log N' \dots)$

Multiplying Big Integers
X = …101000111011000100101010001001010… (field elements of N' bits)
Input size = N bits. Field element size = N' = log(N) bits. Number of $a_i$ = n = N/N'. Number of field operations = $O(n \log n)$. Time for one × field operation = $O(N' \log N' \log\log N' \log\log\log N' \dots)$.
Total time:
$= O(n \log n) \times O(N' \log N' \log\log N' \log\log\log N' \dots)$
$= O\!\left(\frac{N}{N'} \log \frac{N}{N'}\right) \times O(N' \log\log N \log\log\log N \log\log\log\log N \dots)$
$= O(N \log N \log\log N \log\log\log N \log\log\log\log N \dots)$