CS 591 S1 – Computational Audio

Wayne Snyder, Computer Science Department, Boston University

Lecture 15
The FFT: Fast DFT
The Inverse FFT: Fast Synthesis of Musical Signals
Convolution and Reverberation

Lecture 16
Computing Spectrograms
Application: Unlocked pitch and time shifting

Digital Audio Fundamentals: The Discrete Fourier Transform
Review from last time: The FFT exhibits an “uncertainty principle” with respect to its resolution in time and frequency. At a sample rate of 44100 samples per second, a window of W samples has a time resolution of W / 44100 sec and a frequency resolution of 44100 / W Hz.

Examples:

W (samples)   Time Res   Freq Res
441           0.01 sec   100 Hz
4410          0.1 sec    10 Hz
44100         1 sec      1 Hz

The product of time resolution and frequency resolution is always 1.0.
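The trade-off above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming a sample rate of 44100 Hz; the function name `resolutions` is made up for illustration and is not from the course code.

```python
SAMPLE_RATE = 44100  # assumed sample rate in Hz (samples per second)

def resolutions(window_size, sample_rate=SAMPLE_RATE):
    """Return (time resolution in sec, frequency resolution in Hz) for a window."""
    time_res = window_size / sample_rate   # how long the window lasts
    freq_res = sample_rate / window_size   # spacing between frequency bins
    return time_res, freq_res

for W in (441, 4410, 44100):
    t, f = resolutions(W)
    # The product t * f is 1.0 regardless of W.
    print(f"W = {W:6d}: time res = {t:.2f} sec, freq res = {f:.0f} Hz, product = {t * f:.1f}")
```

Shrinking the window sharpens the timing but widens each frequency bin by the same factor, which is why the product never changes.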

Review from last time: The FFT produces a spectrum which shows “leakage” or “spread” away from the true component frequency whenever the true frequency does not match one of the detectable (bin) frequencies; this gives an inaccurate measure of the amplitude. Hann windows can minimize, but not completely eliminate, this problem by reducing the effect of partial waveforms at the edges of the window.
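Leakage can be measured numerically. The sketch below is an illustration under assumed settings (44100 Hz sample rate, a 4410-sample window giving 10 Hz bins); `leakage_fraction` is a hypothetical helper, and "energy outside the 5 bins nearest the peak" is just one convenient way to quantify spread.

```python
import numpy as np

SR = 44100          # assumed sample rate (Hz)
W = 4410            # window size, so bins are 10 Hz apart
t = np.arange(W) / SR

def leakage_fraction(freq, use_hann):
    """Fraction of spectral energy falling outside the 5 bins nearest the peak."""
    x = np.sin(2 * np.pi * freq * t)
    if use_hann:
        x = x * np.hanning(W)            # taper the edges of the window
    power = np.abs(np.fft.rfft(x)) ** 2
    peak = int(np.argmax(power))
    near = power[max(peak - 2, 0): peak + 3].sum()
    return 1.0 - near / power.sum()

on_bin = leakage_fraction(440.0, use_hann=False)        # 440 Hz is exactly bin 44
off_bin = leakage_fraction(445.0, use_hann=False)       # 445 Hz lies halfway between bins
off_bin_hann = leakage_fraction(445.0, use_hann=True)   # same tone, Hann-windowed

print(on_bin, off_bin, off_bin_hann)
```

With these settings the on-bin tone shows essentially no spread, the off-bin tone spreads noticeably, and the Hann window pulls most of that spread back toward the peak without removing it entirely.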

The Discrete Fourier Transform uses complex numbers to account for phase when probing for frequencies in a linear sequence of “frequency bins”, but the naïve implementation, using two for loops, is prohibitively expensive: O( N^2 ). Compare with the Discrete Sine Transform.
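For reference, a two-loop DFT can be sketched as follows. This is not the lecture's own listing (which did not survive in this transcript); `naive_dft` is an illustrative name, and the sign convention is chosen to match `np.fft.fft`.

```python
import numpy as np

def naive_dft(x):
    """Direct O(N^2) DFT: one loop over frequency bins, one over samples."""
    N = len(x)
    S = np.zeros(N, dtype=complex)
    for k in range(N):                 # probe frequency bin k
        for n in range(N):             # accumulate over every sample
            S[k] += x[n] * np.exp(-2j * np.pi * k * n / N)
    return S

# A sine that completes exactly 3 cycles in a 64-sample window.
x = np.sin(2 * np.pi * 3 * np.arange(64) / 64)
S = naive_dft(x)
print(np.argmax(np.abs(S[:32])))       # the bin with the largest magnitude
```

The inner loop runs N times for each of the N bins, which is where the quadratic cost comes from.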

The Fast Fourier Transform (Cooley and Tukey, 1965, but invented first by Gauss in 1805!) uses divide and conquer to achieve O( N log N ) complexity, which means the algorithm can be used in real time. Compare: if N = 44100 (1 second), then N^2 = 1,944,810,000 and N log2(N) = 680,396.45, which is about 0.035% of N^2.
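The comparison above is easy to reproduce. The sketch assumes N = 44100 (one second at 44100 Hz) and a base-2 logarithm, which is the base that reproduces the quoted figure.

```python
import math

N = 44100                          # one second of samples (assumed 44100 Hz)
n_squared = N ** 2                 # cost scale of the naive DFT
n_log_n = N * math.log2(N)         # cost scale of the FFT
ratio_percent = 100 * n_log_n / n_squared

print(n_squared, round(n_log_n, 2), round(ratio_percent, 3))
```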

The intuition for the FFT is that the periodic (repetitive) structure of sine waves can be exploited by taking alternate sequences of samples, applying the algorithm recursively until the base case (the Nyquist Limit) is reached, and combining the results:
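A compact way to see the recursion is the radix-2 decimation-in-time form sketched below. It is one common textbook formulation, not necessarily the one pictured in the lecture: it assumes the length is a power of two and recurses down to a single sample as its base case. The name `fft_recursive` is illustrative.

```python
import numpy as np

def fft_recursive(x):
    """Radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    if N == 1:
        return x                                  # a single sample is its own DFT
    even = fft_recursive(x[0::2])                 # DFT of even-indexed samples
    odd = fft_recursive(x[1::2])                  # DFT of odd-indexed samples
    twiddle = np.exp(-2j * np.pi * np.arange(N // 2) / N)
    # Combine: the second half reuses the same products with a sign flip.
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
S = fft_recursive(x)
print(np.allclose(S, np.fft.fft(x)))
```

Each level does O(N) work to combine the halves and there are log2(N) levels, which gives the O( N log N ) total.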

We will use the SciPy signal processing library, which has an extensive set of FFT algorithms for various situations, including multiple dimensions:
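As a small illustration of the library, the sketch below uses `scipy.fft.rfft` and `scipy.fft.rfftfreq` on a synthetic two-tone signal. The test signal, the 44100 Hz sample rate, and the variable names are assumptions made for this example, not material from the course.

```python
import numpy as np
from scipy import fft

SR = 44100                                  # assumed sample rate (Hz)
t = np.arange(SR) / SR                      # one second of sample times
x = 0.5 * np.sin(2 * np.pi * 440 * t) + 0.25 * np.sin(2 * np.pi * 880 * t)

spectrum = fft.rfft(x)                      # FFT for real input: non-negative frequencies only
freqs = fft.rfftfreq(len(x), d=1 / SR)      # frequency (Hz) of each bin
amplitudes = 2 * np.abs(spectrum) / len(x)  # rescale magnitudes to sine amplitudes

strongest = freqs[np.argsort(amplitudes)[-2:]]   # the two largest components
print(sorted(strongest))
```

Because the window is exactly one second long, both tones fall on bin frequencies and their amplitudes are recovered without leakage.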

The Inverse FFT: Turning spectra back into signals….. Here is the spectrum from applying realFF(..) to the entire Bach.Brandenburg.2.3.wav file (of duration 107,491 samples = secs) with a frequency resolution of Hz: .

The Inverse FFT: The FFT can be reversed by essentially doing the same thing in reverse to the spectrum. This is a MUCH more efficient way of creating a signal from a spectrum than the makeSignal( … ) method: the IFFT is O( N log N ) and makeSignal is O( N^2 ).

    S = np.fft.fft( X )
    X = np.fft.ifft( S )

These transforms move a signal back and forth between the time and frequency domains with no loss of information (up to the arithmetic accuracy of the machine); hence the following are identity maps:

    X = IFFT( FFT( X ) )
    S = FFT( IFFT( S ) )

[cf. X = makeSignal( spectrumFFT( X ) ) and S = spectrumFFT( makeSignal( S ) )]
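The round trip can be verified numerically. This sketch uses a random test signal (an assumption for illustration); note that `np.fft.ifft` returns a complex array even when the original signal was real, so the imaginary parts are checked separately.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal(1024)          # an arbitrary real test signal

S = np.fft.fft(X)                      # time domain -> frequency domain
X_back = np.fft.ifft(S)                # frequency domain -> time domain (complex dtype)
S_back = np.fft.fft(np.fft.ifft(S))    # the other identity map

print(np.allclose(X_back.real, X))             # signal recovered
print(np.max(np.abs(X_back.imag)))             # imaginary residue is at rounding-error level
print(np.allclose(S_back, S))                  # spectrum recovered
```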

The Inverse FFT

Manipulating Signals using the FFT/IFFT Transform Pair We can do lots of interesting things to a signal by transforming it into the frequency domain, manipulating the spectrum, and then transforming it back: For example, we could FILTER the signal by removing some frequencies… Low Pass Filter:

Manipulating Signals using the FFT/IFFT Transform Pair: Filtering Low Pass Filter

Manipulating Signals using the FFT/IFFT Transform Pair: Filtering Let’s low-pass filter the Bach.Brandenburg.2.3.wave ( secs long) at 2000 Hz: Original: Filtered:
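A frequency-domain low-pass filter can be sketched in a few lines. Since the Bach recording is not available here, the example substitutes a synthetic signal (a 440 Hz tone plus a 5000 Hz tone); `low_pass` is an illustrative name, and zeroing bins outright is the simplest possible filter shape, not necessarily what the lecture's code does.

```python
import numpy as np

def low_pass(x, cutoff_hz, sample_rate):
    """Zero every frequency bin above cutoff_hz and transform back."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    spectrum[freqs > cutoff_hz] = 0            # remove the high frequencies
    return np.fft.irfft(spectrum, n=len(x))    # back to the time domain

SR = 44100                                     # assumed sample rate (Hz)
t = np.arange(SR) / SR
low = np.sin(2 * np.pi * 440 * t)              # component below the cutoff
high = 0.5 * np.sin(2 * np.pi * 5000 * t)      # component above the cutoff
filtered = low_pass(low + high, 2000, SR)

print(np.allclose(filtered, low, atol=1e-8))   # only the 440 Hz tone survives
```

Passing `n=len(x)` to `irfft` keeps the output the same length as the input even when that length is odd.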

Manipulating Signals using the FFT/IFFT Transform Pair HOWEVER, the fact that ALL information in the signal is present in the spectrum means that all non-frequency information (e.g., amplitude, timing, …) is ALSO encoded as frequencies. A brief example: Here is a single 440 Hz sine wave: What do you think the spectrum looks like when we swell this signal?

Manipulating Signals using the FFT/IFFT Transform Pair What do you think the spectrum looks like when we swell this signal? [Think of how the amplitude of a signal would oscillate when a similar frequency was present, causing beats, or how amplitude modulation affects the spectrum.] What do you think will happen if we low-pass filter this signal at 440 Hz?

Manipulating Signals using the FFT/IFFT Transform Pair Low-passing a swelling 440 Hz signal.

Manipulating Signals using the FFT/IFFT Transform Pair ANY amplitude envelope for a 440 Hz (or any other frequency) can be created by an appropriate spectrum:
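The way an amplitude envelope shows up as extra frequencies can be seen with a simple periodic swell. In this sketch the envelope is a raised cosine at 2 Hz, chosen (as an assumption, for illustration) so that every component lands exactly on a 1 Hz bin of a one-second window.

```python
import numpy as np

SR = 44100                                         # assumed sample rate (Hz)
t = np.arange(SR) / SR                             # one second, so bins are 1 Hz apart
carrier = np.sin(2 * np.pi * 440 * t)              # steady 440 Hz sine
envelope = 0.5 * (1 - np.cos(2 * np.pi * 2 * t))   # swells from 0 to 1 and back, twice
swelling = envelope * carrier

amplitudes = 2 * np.abs(np.fft.rfft(swelling)) / len(swelling)
peaks = np.flatnonzero(amplitudes > 0.01)          # bins with significant energy

print(peaks, amplitudes[peaks])
```

The swell adds sidebands at 438 Hz and 442 Hz around the 440 Hz component. A low-pass filter at 440 Hz would remove the upper sideband and so change the envelope, even though the "note" itself is still 440 Hz.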

Manipulating Signals using the FFT/IFFT Transform Pair Just for fun….. here is what happens when we high-pass this signal at 440 Hz:

Manipulating Signals using the FFT/IFFT Transform Pair Punchline: If we can create an arbitrary amplitude envelope for any one frequency in a signal by an appropriate spectrum, then we can add together many of these spectra to create an arbitrary signal involving many different frequencies.

Another use of the FFT/IFFT Transform Pair is Convolution: Discrete convolution is the SAME THING as filtering using weighted averages, as presented a month ago, except that the “filter array” is considered another signal, and is reversed! Example: A = [1, 1, 0], B = [1, 2, 3, 4].
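The example arrays can be convolved directly from the definition. In this sketch `convolve_direct` is an illustrative name; the index `i + j` in the loop is where the "reversal" lives, since output position n collects every product a[k] * b[n - k].

```python
import numpy as np

def convolve_direct(a, b):
    """Discrete convolution from the definition: out[n] = sum over k of a[k] * b[n - k]."""
    out = np.zeros(len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj      # each pair of samples lands at index i + j
    return out

A = [1, 1, 0]
B = [1, 2, 3, 4]
result = convolve_direct(A, B)
print(result)                          # [1. 3. 5. 7. 4. 0.]
print(np.convolve(A, B))               # NumPy's built-in gives the same values
```

The two nested loops make the cost proportional to len(a) * len(b).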

Convolution is a fundamental operation in graphics……

Convolution (and filtering) would normally take O( N^2 ) and be exorbitantly expensive, but the following theorem shows the usefulness of the Transform Pair: some operations are easier in the frequency domain!

Convolution Theorem:

Informal: Convolution in the time domain is equivalent to point-wise multiplication in the frequency domain and vice versa.

Formal: Let F{ X } be the Fourier transform of a signal X, X * Y be the convolution of X and Y, and X . Y be the point-wise multiplication of X and Y. Then:

    F{ X * Y } = F{ X } . F{ Y }

and

    F{ X . Y } = F{ X } * F{ Y }

(for the discrete transform, the second identity holds up to a constant scale factor that depends on how the transform is normalized).

Thus, we could implement convolution ourselves, with a little pre- and post-processing around the FFT and IFFT. But use the numpy version!
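One way to sketch an FFT-based convolution is shown below. The lecture's own listing did not survive in this transcript, so this is an assumed reconstruction of the idea rather than the original code: the "pre-processing" here is zero-padding both signals to the full output length so that the circular convolution computed by the FFT matches ordinary (linear) convolution.

```python
import numpy as np

def convolve_fft(x, y):
    """Linear convolution via the convolution theorem."""
    n = len(x) + len(y) - 1            # length of the full convolution
    X = np.fft.rfft(x, n)              # rfft zero-pads each input to length n
    Y = np.fft.rfft(y, n)
    return np.fft.irfft(X * Y, n)      # point-wise multiply, then transform back

print(np.round(convolve_fft([1, 1, 0], [1, 2, 3, 4]), 6))

rng = np.random.default_rng(2)
a = rng.standard_normal(300)
b = rng.standard_normal(50)
print(np.allclose(convolve_fft(a, b), np.convolve(a, b)))
```

In practice `np.convolve` (or `scipy.signal.fftconvolve` for long signals) is the safer choice, as the slide says.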

We can use convolution to do filtering as we saw a month ago:
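As a reminder of filtering by weighted averages, the sketch below smooths a noisy tone by convolving it with a 5-point moving-average kernel. The kernel, the test signal, and the noise level are all assumptions chosen for illustration; they are not the filter from the earlier lecture.

```python
import numpy as np

SR = 44100                                     # assumed sample rate (Hz)
t = np.arange(SR) / SR
rng = np.random.default_rng(3)

clean = np.sin(2 * np.pi * 100 * t)            # slow 100 Hz tone
noisy = clean + 0.3 * rng.standard_normal(SR)  # add broadband noise

kernel = np.ones(5) / 5                        # equal weights: a simple smoothing filter
smoothed = np.convolve(noisy, kernel, mode="same")   # "same" keeps the input length

print(np.std(noisy - clean), np.std(smoothed - clean))
```

Averaging neighboring samples barely affects the slow tone but cancels much of the sample-to-sample noise, so the smoothed signal sits closer to the clean one.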

The most common use of convolution in the audio industry is to simulate the reverberation of various sonic spaces. This is called Convolution Reverb. An impulse response (in this context) is the recording of a loud, short, random noise (such as a starter pistol or a bursting balloon) which then reverberates. Here is the impulse response of a racquetball court:

Convolution Reverb: Here is the convolution of the racquetball court impulse response with the organ piece we created in HW 04, Problem 1 (b): organ.wav: RacquetballOrgan.wave:
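The idea can be sketched without the recordings. Below, the impulse response is synthetic (exponentially decaying noise standing in for a real room recording) and the "dry" signal is a short 440 Hz tone; both are assumptions for illustration, as are the variable names and the 44100 Hz sample rate.

```python
import numpy as np

SR = 44100                                      # assumed sample rate (Hz)
rng = np.random.default_rng(0)

# Synthetic impulse response: half a second of noise with an exponential decay.
ir_len = SR // 2
impulse_response = rng.standard_normal(ir_len) * np.exp(-6 * np.arange(ir_len) / ir_len)
impulse_response[0] = 1.0                       # the direct (unreflected) sound

# Dry signal: a quarter-second 440 Hz tone.
dry = np.sin(2 * np.pi * 440 * np.arange(SR // 4) / SR)

# Convolve via the FFT, zero-padding to the full output length.
n = len(dry) + len(impulse_response) - 1
wet = np.fft.irfft(np.fft.rfft(dry, n) * np.fft.rfft(impulse_response, n), n)
wet /= np.max(np.abs(wet))                      # normalize to avoid clipping

print(len(dry), len(wet))                       # the output is longer: it includes the tail
```

The samples of `wet` after the dry tone ends are the reverberant tail. With a real recording, the synthetic arrays would be replaced by samples loaded from the impulse response and source files.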

There are many companies that offer convolution reverb software and libraries of impulse responses collected from various spaces around the world.
