# Primer on Analyzing Animal Sounds:

Primer on Analyzing Animal Sounds: Figures and Sample Sounds
Jack Bradbury & Sandra Vehrencamp, Cornell University

Recording Sounds
A sound is a propagated disturbance in the ambient pressure of a medium (air, water, etc.). Each region of higher-than-ambient pressure is followed by a matching region of lower-than-ambient pressure. A microphone converts the variations in pressure created by a passing sound wave into electrical signals that mimic the rise and fall of sound pressure at the microphone.

Describing and Comparing Sounds
A plot of pressure versus time is called the waveform of a sound. It is a description of a sound in the time domain. How can we describe and compare these signals? (Figure: example waveforms, pressure versus time, for a bellbird and an oropendola.)

Simple Waveforms
The simplest type of signal one could ever record is a single sine wave that does not change in either amplitude or frequency.

Time Domain Measurements
There are two measures that could easily be made on this waveform: amplitude and frequency.
Amplitude: What are the maximum or average deviations in pressure from ambient levels? Common measures are the peak-to-peak amplitude (A_peak-peak) and the root-mean-square average amplitude (A_RMS). Rather than absolute values, one usually compares amplitude to some soft reference sound, as dB = 20 log10 (A_obs / A_ref).
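The decibel comparison can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function names `rms` and `decibels` and the test tones are assumptions made for the example.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a list of pressure samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def decibels(a_obs, a_ref):
    """Relative amplitude in dB: 20 * log10(a_obs / a_ref)."""
    return 20 * math.log10(a_obs / a_ref)

# Hypothetical reference tone and an observed tone ten times larger.
soft = [math.sin(2 * math.pi * 5 * i / 100) for i in range(100)]
loud = [10 * x for x in soft]

print(round(decibels(rms(loud), rms(soft)), 2))
```

A tenfold increase in amplitude corresponds to 20 dB, and a doubling to about 6 dB.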

Time Domain Measurements
Frequency: How many cycles/sec (= Hz) are present? It is easiest to measure the time between successive cycles (the period, T) and take the reciprocal: f = 1/T.
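One way to apply f = 1/T to a digitized sine wave is to time its upward zero crossings. The sketch below assumes a clean, single-frequency signal; the function name, the 8 kHz sample rate, and the 440 Hz test tone are all illustrative choices.

```python
import math

def frequency_from_crossings(samples, sample_rate):
    """Estimate f = 1/T from the mean time between upward zero crossings."""
    crossings = []
    for i in range(1, len(samples)):
        if samples[i - 1] < 0 <= samples[i]:
            # Linear interpolation gives a sub-sample crossing time.
            frac = -samples[i - 1] / (samples[i] - samples[i - 1])
            crossings.append((i - 1 + frac) / sample_rate)
    if len(crossings) < 2:
        raise ValueError("need at least two upward zero crossings")
    period = (crossings[-1] - crossings[0]) / (len(crossings) - 1)
    return 1.0 / period

rate = 8000  # samples per second (assumed)
tone = [math.sin(2 * math.pi * 440 * i / rate + 1.0) for i in range(rate // 10)]
print(round(frequency_from_crossings(tone, rate), 1))
```

This only works for signals close to a pure sine wave; the waveforms discussed later need Fourier analysis instead.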

Frequency Domain Measures
It is convenient to plot these two measures on their own graph, known as a frequency-domain description of the sound. (Figure: time-domain waveform with T and A_peak-peak marked, beside a frequency-domain plot of amplitude versus frequency with a single line at f = 1/T.)

Waves That Are Not Sine Waves
But how can we describe these waves? In the first example (Waveform 1), the frequency is not constant. What should we put in the frequency-domain plot? In the second example (Waveform 2), both the shape and amplitude of the successive “waves” change. What can we do with this one?

Fourier Analysis There is hope!
Any continuous waveform can be broken down into a set of pure sine waves with frequency and amplitude values that can be computed or measured (Fourier analysis). Frequency-domain plots provide us with a very powerful way to describe and compare any set of sounds.

Fourier Analysis
Applying the Fourier solution, we get a plot of amplitude versus frequency for each waveform. Such a plot of amplitude versus frequency components is called the frequency spectrum (or power spectrum) of a sound. (Figure: amplitude-versus-frequency spectra of Waveform 1 and Waveform 2.)
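To make the idea of a frequency spectrum concrete, here is a small sketch that computes one with a naive discrete Fourier transform. The function name `amplitude_spectrum` and the two-tone test signal are assumptions for illustration; real analysis software uses a fast Fourier transform, which gives the same result more efficiently.

```python
import cmath
import math

def amplitude_spectrum(samples):
    """Naive discrete Fourier transform; returns one amplitude per frequency bin.

    Bin k holds the component that completes k whole cycles in the record.
    """
    n = len(samples)
    spectrum = []
    for k in range(n // 2):
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        spectrum.append(2 * abs(total) / n)
    return spectrum

n = 64
# Two sine waves: 5 cycles at amplitude 1.0 plus 12 cycles at amplitude 0.5.
wave = [math.sin(2 * math.pi * 5 * i / n) + 0.5 * math.sin(2 * math.pi * 12 * i / n)
        for i in range(n)]

spectrum = amplitude_spectrum(wave)
peaks = [k for k, a in enumerate(spectrum) if a > 0.1]
print(peaks)  # [5, 12]
```

Bin k corresponds to a frequency of k × (sample rate) / n in Hz.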

Fourier Analysis
But what do we do if the waveform keeps changing during the signal, as in this lark sparrow song? (Figure: lark sparrow song waveform, pressure versus time.) The solution is to break the song into homogeneous segments and create a frequency spectrum for each segment. These are then strung together along the timeline so we can see how the frequency spectra change as the song progresses. Such a plot is called a spectrogram, and we shall come back to how these are generated.

Predicting Power Spectra from Waveforms
There are three types of deviations from a single sine wave, and most animal signals are some combination of these: periodic nonsinusoidal signals, amplitude modulation (AM), and frequency modulation (FM). If we can predict the frequency spectrum for each type of deviation, we can predict the spectrum for nearly any signal.

Analysis of Typical Waveforms
Sinusoidal amplitude modulation (AM). Two time-domain measures are possible: (1) the carrier frequency, f = 1/T, where T is the period of the carrier; and (2) the modulation rate, w = 1/t, the number of complete modulation cycles per second, where t is the period of the modulating wave. The frequency spectrum is three lines: the carrier at f and two sidebands at f – w and f + w. The greater the amplitude of w, the higher the sidebands, but these never exceed the amplitude of f. (Figure: time-domain AM waveform with T and t marked; frequency-domain plot with lines at f – w, f, and f + w.)
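The three-line AM spectrum can be checked numerically. The sketch below is illustrative only: the helper `amplitude_at`, the carrier and modulation values, and the modulation depth of 0.8 are assumptions, and frequencies are expressed as whole cycles per record so that each component falls exactly on one bin.

```python
import cmath
import math

def amplitude_at(samples, cycles):
    """Amplitude of the sine component that completes `cycles` cycles in the record."""
    n = len(samples)
    total = sum(x * cmath.exp(-2j * math.pi * cycles * i / n)
                for i, x in enumerate(samples))
    return 2 * abs(total) / n

n = 256
f, w, depth = 40, 4, 0.8  # carrier and modulation in cycles per record; depth <= 1

# Carrier sine wave whose amplitude rises and falls sinusoidally at rate w.
am = [(1 + depth * math.cos(2 * math.pi * w * i / n)) * math.sin(2 * math.pi * f * i / n)
      for i in range(n)]

lines = {k: round(amplitude_at(am, k), 3) for k in (f - w, f, f + w)}
print(lines)
```

For this signal the carrier line should come out near 1.0 and each sideband near depth / 2 = 0.4, so with a depth of at most 1 the sidebands stay at or below half the carrier amplitude.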

Analysis of Typical Waveforms
Sinusoidal frequency modulation (FM). Suppose we keep amplitude fixed, but modulate the frequency of a sine wave sinusoidally. What can we measure in the time domain? From the shortest period, T1, we get the maximum frequency, fmax = 1/T1. From the longest period, T2, we get the minimum frequency, fmin = 1/T2. The carrier is f = (fmax + fmin) / 2. From the period of one complete modulation cycle, t, we get the modulating frequency, w = 1/t. Finally, the modulation index = (fmax – fmin) / w. (Figure: time-domain FM waveform with T1, T2, and t marked, and a plot of frequency versus time.)
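These time-domain FM measures are simple arithmetic, sketched below. The function name `fm_measures` and the example periods are hypothetical; the formulas are the ones given above.

```python
def fm_measures(t1, t2, t_mod):
    """Time-domain FM measures.

    t1: shortest carrier period (s); t2: longest carrier period (s);
    t_mod: period of one complete modulation cycle (s).
    """
    f_max = 1.0 / t1
    f_min = 1.0 / t2
    w = 1.0 / t_mod
    return {
        "f_max": f_max,
        "f_min": f_min,
        "carrier": (f_max + f_min) / 2,
        "w": w,
        "modulation_index": (f_max - f_min) / w,
    }

# Hypothetical call sweeping between 1800 and 2200 Hz, 50 times per second.
measures = fm_measures(t1=1 / 2200, t2=1 / 1800, t_mod=1 / 50)
for name, value in measures.items():
    print(f"{name}: {value:.1f}")
```

For these example values the carrier is 2000 Hz, w is 50 Hz, and the modulation index is 8.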

Analysis of Typical Waveforms
Sinusoidal frequency modulation (FM). The frequency spectrum for a sinusoidally FM waveform has a line at the carrier, f, and sidebands at each f ± nw around the carrier, where n is a positive integer (1, 2, 3, etc.) with no upper limit. If the modulation index is <10, then the carrier has the highest amplitude and sideband amplitudes decrease with n. If the modulation index is >20, then the sidebands and the carrier have the same frequency values as before, but the carrier can have a lower amplitude than the sidebands. (Figure: frequency-domain plots with lines at f – 3w, f – 2w, f – w, f, f + w, f + 2w, and f + 3w for low and high modulation index.)
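The location of the FM sidebands can be illustrated with a synthetic signal. In this sketch the helper `amplitude_at` and all numeric values are assumptions. The parameter `beta` is the peak frequency deviation divided by w, which is half of (fmax – fmin) / w as the index is defined above; it is kept small here so that the carrier dominates.

```python
import cmath
import math

def amplitude_at(samples, cycles):
    """Amplitude of the sine component that completes `cycles` cycles in the record."""
    n = len(samples)
    total = sum(x * cmath.exp(-2j * math.pi * cycles * i / n)
                for i, x in enumerate(samples))
    return 2 * abs(total) / n

n = 256
f, w, beta = 40, 4, 0.5  # carrier and modulation in cycles per record

# Constant-amplitude sine wave whose frequency swings sinusoidally around f.
fm = [math.sin(2 * math.pi * f * i / n + beta * math.sin(2 * math.pi * w * i / n))
      for i in range(n)]

# Amplitude of the carrier (order 0) and of the sidebands at f + order * w.
sidebands = {order: amplitude_at(fm, f + order * w) for order in range(-3, 4)}
for order, amp in sidebands.items():
    print(f + order * w, round(amp, 3))
```

Energy appears only at f ± nw, the lines are symmetric about the carrier, and for this small modulation the amplitudes fall off as n increases.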

Analysis of Typical Waveforms
Periodic nonsinusoidal waveforms. Any shape of waveform is allowed under this category as long as there is a clearly repeating unit. The major measurement we can make on such a waveform in the time domain is the period of the repeats, t, and thus the repeat rate, w = 1/t. The frequency spectrum of a periodic waveform contains components at w, 2w, 3w, etc., to infinity. When spectrum components are integer multiples of some frequency w, we call the set a harmonic series. The fundamental is w, 2w is the second harmonic, and so on. (Figure: time-domain periodic waveform with t marked; frequency-domain plot with lines at w, 2w, 3w, 4w, 5w, 6w, and 7w.)
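A sawtooth wave is a convenient test case for the harmonic series. The sketch below assumes a sawtooth that repeats every 32 samples in a 256-sample record, so its repeat rate w is 8 cycles per record; the helper `amplitude_at` and the 0.01 threshold are illustrative choices.

```python
import cmath
import math

def amplitude_at(samples, cycles):
    """Amplitude of the sine component that completes `cycles` cycles in the record."""
    n = len(samples)
    total = sum(x * cmath.exp(-2j * math.pi * cycles * i / n)
                for i, x in enumerate(samples))
    return 2 * abs(total) / n

n, period = 256, 32
w = n // period  # repeat rate in cycles per record

# Sawtooth: ramps from -0.5 up to just under +0.5, then drops back.
saw = [(i % period) / period - 0.5 for i in range(n)]

harmonics = [k for k in range(1, n // 2) if amplitude_at(saw, k) > 0.01]
print(harmonics[:4])  # [8, 16, 24, 32]
```

Every component sits at an integer multiple of w, and for this waveform the amplitude drops steadily with harmonic number.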

Analysis of Typical Waveforms
Periodic nonsinusoidal waveforms. The amplitude of successively higher harmonics tends to decrease in an exponential manner (Dirichlet’s Rule), unless the wave is half-wave symmetric. To determine whether it is, divide a complete cycle of the periodic waveform in half on the time axis, then reflect the right half upside down. If the two halves are different, the waveform is half-wave asymmetric and the spectrum shows all harmonics. (Figure: frequency-domain plot with lines at w, 2w, 3w, 4w, 5w, 6w, and 7w.)

Analysis of Typical Waveforms
Periodic nonsinusoidal waveforms. Now try this on a different periodic waveform. Measure t and compute w = 1/t. Again, isolate one complete cycle and divide it in half on the time axis, then flip the right half upside down. If the two halves are the same, the waveform is half-wave symmetric and the amplitudes of all even harmonics are zero. Only the odd harmonics are present, and they follow Dirichlet’s Rule. (Figure: frequency-domain plot with lines at w, 3w, 5w, and 7w.)
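The half-wave symmetry test translates directly into code: split one cycle in half, flip the second half, and compare. In this sketch the function names and the square-wave and sawtooth examples are assumptions chosen for illustration.

```python
import cmath
import math

def is_half_wave_symmetric(cycle, tol=1e-9):
    """True if the second half of one cycle is the first half flipped upside down."""
    half = len(cycle) // 2
    if len(cycle) % 2:
        return False
    return all(abs(cycle[i] + cycle[i + half]) <= tol for i in range(half))

def amplitude_at(samples, cycles):
    """Amplitude of the sine component that completes `cycles` cycles in the record."""
    n = len(samples)
    total = sum(x * cmath.exp(-2j * math.pi * cycles * i / n)
                for i, x in enumerate(samples))
    return 2 * abs(total) / n

square = [1.0] * 16 + [-1.0] * 16            # one cycle of a square wave
sawtooth = [i / 32 - 0.5 for i in range(32)]  # one cycle of a sawtooth

print(is_half_wave_symmetric(square), is_half_wave_symmetric(sawtooth))

# Harmonics 1 to 6 of the square wave: the even ones vanish.
square_harmonics = [amplitude_at(square, m) for m in range(1, 7)]
print([round(a, 3) for a in square_harmonics])
```

The square wave passes the test and its even harmonics are zero, while the sawtooth fails it and keeps all harmonics.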

Analysis of Typical Waveforms
Periodic nonsinusoidal waveforms. Another deviation from Dirichlet’s Rule occurs if there are “multiple maxima” in the waveform. We can measure the usual period between repeats of the periodic waveform, t, and use it to predict the fundamental, w = 1/t, of the harmonic series that will occur in the frequency spectrum. But we can also measure the interval between the multiple maxima, t_z, and use it to compute a second frequency, z = 1/t_z. Because t_z < t, z > w. The spectrum is still a harmonic series based on the fundamental w, but whenever a harmonic of w is close to an integer multiple of z, it has lower amplitude than intermediate harmonics. The result is bands of harmonics that have higher amplitudes (lobes) separated by harmonics with low amplitudes (nodes). Note that the harmonics still gradually decrease following Dirichlet’s Rule. (Figure: time-domain waveform with t and t_z marked; frequency-domain plot in which 5w ≈ z, 10w ≈ 2z, and 15w ≈ 3z, with lobes and nodes labeled.)

Analysis of Typical Waveforms
Compound waveforms. While a few birds can emit pure sine waves, most animal sounds are some combination of AM, FM, and/or nonsinusoidal periodic signals. We call these compound waveforms. Compound waveforms can always be decomposed into a set of carrier sine waves and a set of modulating sine waves. We can then use the simple rules of AM and FM to add the appropriate sidebands for each modulating sine wave around each carrier sine wave. This is the spectrum of the compound wave.

Analysis of Typical Waveforms
Compound waveforms. Consider the following example of a frog call. It appears to be a pure sine wave that has been amplitude-modulated with a repeating waveform that is not sinusoidal.
The first thing to do is characterize the carrier. We see that it is a pure sine wave, so we measure the period of the sine waves inside the pulses and use the resulting T to compute the carrier frequency, f = 1/T.
The next step is to characterize the modulating waveform. This repeats every t seconds, giving a repetition rate of w = 1/t. The frequency spectrum of the modulating waveform will be a harmonic series with a fundamental of w.
We draw the envelope of the waveform to see the shape of the modulating waveform. It is not half-wave symmetric, so the modulating spectrum will show all harmonics. Also, there is only one maximum in each repeat of the modulating waveform, so there will be no lobes or nodes in its spectrum.

Analysis of Typical Waveforms
Compound waveforms. Putting these results together, we have two frequency spectra to deal with: the carrier, a single line at f, and the modulating waveform, a harmonic series at w, 2w, 3w, and so on. Since the modulating waveform now consists of a series of pure sine waves, we can use the AM rules to modulate the carrier f with each of them in turn. First, using the fundamental of the modulating waveform, w, we add sidebands at f – w and f + w. Next, we sinusoidally amplitude-modulate the carrier with the second harmonic, 2w, which adds sidebands at f – 2w and f + 2w. Because this component has less amplitude than w in the modulating waveform spectrum, its sidebands also have lower amplitude. We continue until we have sinusoidally amplitude-modulated f with every harmonic in the series constituting the modulating waveform spectrum.
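The build-up of sidebands for a compound AM signal can be sketched with a modulating waveform that has just two harmonics. Everything specific here is an assumption for illustration: the envelope, with harmonic amplitudes 0.5 at w and 0.25 at 2w, is not the frog call from the slides, and `amplitude_at` is a hypothetical helper.

```python
import cmath
import math

def amplitude_at(samples, cycles):
    """Amplitude of the sine component that completes `cycles` cycles in the record."""
    n = len(samples)
    total = sum(x * cmath.exp(-2j * math.pi * cycles * i / n)
                for i, x in enumerate(samples))
    return 2 * abs(total) / n

n = 256
f, w = 40, 4  # carrier and modulation repeat rate, in cycles per record

def envelope(i):
    """Nonsinusoidal modulating waveform: fundamental w plus a weaker second harmonic."""
    theta = 2 * math.pi * w * i / n
    return 1 + 0.5 * math.cos(theta) + 0.25 * math.cos(2 * theta)

call = [envelope(i) * math.sin(2 * math.pi * f * i / n) for i in range(n)]

spectrum = {k: round(amplitude_at(call, k), 3)
            for k in (f - 2 * w, f - w, f, f + w, f + 2 * w)}
print(spectrum)
```

Each harmonic of the modulating waveform contributes its own pair of sidebands, and the weaker second harmonic produces the weaker pair at f ± 2w.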

Analysis of Typical Waveforms
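Compound waveforms. Nearly all animal sounds are compound waveforms. Any combination of carrier, modulating waveform, and type of modulation (AM or FM) is possible. (Figure: examples showing the modulating waveform, the carrier, the type of modulation, and the resulting waveform.)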

The Uncertainty Principle
Any Fourier analyzer needs several cycles of a signal to compute component frequencies. The more cycles of a stable frequency component that an analyzer can measure, the more accurate the measurement of that frequency. If the analyzer has only a short time to estimate frequencies, each component will appear as a wide band in the frequency spectrum; if a longer time is available, frequency components will be narrow bands. (Figure: spectra of a sinusoidal AM signal from short-, medium-, and long-duration samples.)
The bandwidth, Δf, of an analyzer is the minimum difference between two adjacent frequencies that can be distinguished. Clearly, short-duration samples result in large Δf values, and long-duration samples result in small Δf values. If we let Δt be the duration of the shortest sampling time available to a Fourier analyzer, the Uncertainty Principle for sound analysis states that Δf·Δt ≈ 1: a small Δt gives a large Δf, and a long Δt gives a small Δf.

Making Spectrograms
We noted earlier that a spectrogram is created by dividing a sound into segments, computing the frequency spectrum for each segment, and then stringing the segments together along the time axis. Thus, we might take the lark sparrow song that we saw earlier and divide it into separate time segments of duration Δt. We then compute the frequency spectrum for each segment and align these along the time axis (imagine the peaks sticking out of the plane of the graphs). Then we use black to mark those portions of the overall graph that have higher peaks, white to mark the lower-amplitude components, and grey for intermediate portions. The result is a spectrogram with frequency on the vertical axis, time on the horizontal axis, and the amplitude of a frequency component at a given time indicated by darkness on the plot.
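The segment-by-segment procedure can be sketched as follows. This is a bare-bones illustration with non-overlapping rectangular segments and a naive Fourier transform; the function name `spectrogram`, the frame size, and the two-note test "song" are assumptions, and real software adds overlap, tapered windows, and a fast transform.

```python
import cmath
import math

def spectrogram(samples, frame_size):
    """Amplitude spectrum of each consecutive, non-overlapping segment."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        segment = samples[start:start + frame_size]
        spectrum = []
        for k in range(frame_size // 2):
            total = sum(x * cmath.exp(-2j * math.pi * k * i / frame_size)
                        for i, x in enumerate(segment))
            spectrum.append(2 * abs(total) / frame_size)
        frames.append(spectrum)
    return frames

frame = 32
# Test signal: a low note (4 cycles per frame) followed by a high note (10 cycles per frame).
song = ([math.sin(2 * math.pi * 4 * i / frame) for i in range(2 * frame)] +
        [math.sin(2 * math.pi * 10 * i / frame) for i in range(2 * frame)])

# Strongest frequency bin in each time segment.
strongest = [max(range(len(s)), key=s.__getitem__) for s in spectrogram(song, frame)]
print(strongest)  # [4, 4, 10, 10]
```

Each inner list is one column of the spectrogram; mapping its amplitudes to shades of grey gives the plot described above.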

Spectrograms and Bandwidth
The spectrogram we just made uses a pretty large Δt. This gives us very fine frequency resolution (Δf = 5 Hz), but much of the temporal resolution has been lost. Can we get by with a smaller Δt?
Let’s decrease Δt by 4×. This gives us Δf = 20 Hz. This starts to restore some of the temporal pattern, and the frequency bands are still pretty thin.
Let’s decrease Δt by 4× again. This gives us Δf = 80 Hz. We get a much better temporal pattern and even a better frequency pattern, because FM signals show as FM, not as their components!
Let’s decrease Δt by 4× once more. This gives us Δf = 320 Hz. This is similar to the prior bandwidth, but we can see the temporal pattern in the last notes better.
Let’s decrease Δt by 4× again. This gives us Δf = 1280 Hz. Now, large bands start to appear instead of fine lines, although the temporal pattern is retained.
Let’s decrease Δt by 4× yet again. This gives us Δf = 5120 Hz. We have now lost any decent frequency resolution, but the temporal pattern is retained.
Clearly, an intermediate bandwidth, Δf, provides the optimal balance of frequency resolution and temporal resolution. In general, you want a bandwidth that is small enough to separate harmonics clearly, big enough to show FM undecomposed, and big enough to show AM undecomposed.
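The sequence of bandwidths above follows from repeatedly dividing the segment duration by 4. The sketch below assumes the approximate relation that bandwidth is the reciprocal of segment duration, and a starting duration of 0.2 s, which is inferred from the 5 Hz starting bandwidth rather than stated in the slides; the function name is hypothetical.

```python
def bandwidth_series(first_duration_s, steps, factor=4):
    """Segment durations shrinking by `factor` each step, paired with bandwidth = 1 / duration."""
    durations = [first_duration_s / factor ** i for i in range(steps)]
    return [(dt, 1.0 / dt) for dt in durations]

# 0.2 s is an assumed starting duration, chosen so the first bandwidth is 5 Hz.
for dt, df in bandwidth_series(0.2, 6):
    print(f"segment = {dt * 1000:8.3f} ms   bandwidth = {df:6.0f} Hz")
```

Each fourfold cut in segment duration buys finer time resolution at the cost of a fourfold wider bandwidth.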

Digital Sound Analysis
Computers and DAT recorders sample (digitize) the continuous rise and fall of sound amplitudes at some fixed rate and store a long column (vector) of amplitude values. Music CDs sample at 44.1 kHz.

Digital Sound Analysis
At each sample point, the computer also digitizes the amplitude value into one of N equidistant categories. The number of categories depends on how many “bits” are used to store each value: N = 2^(number of bits). Music CDs store 16 bits/sample and thus divide the full amplitude range into 2^16 = 65,536 possible values.

Digital Sound Analysis
The higher the sampling rate and the higher the bit depth, the more accurately the digital recording captures the original sound. However, increasing sampling rate or bit depth or both increases the size of the digital file that must be stored. In stereo recording, two columns of numbers must be stored, taking up even more memory.
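The bit-depth and file-size trade-offs can be put into numbers. This sketch assumes uncompressed audio with no file header, and the function names are hypothetical.

```python
def quantization_levels(bits):
    """Number of distinct amplitude values available at a given bit depth."""
    return 2 ** bits

def pcm_bytes(sample_rate, bits, channels, seconds):
    """Storage needed for uncompressed samples (file headers ignored)."""
    return sample_rate * (bits / 8) * channels * seconds

print(quantization_levels(16))          # 65536
print(pcm_bytes(44100, 16, 2, 60))      # one minute of CD-quality stereo
print(pcm_bytes(44100, 16, 1, 60))      # the same minute in mono
```

One minute of 16-bit stereo at 44.1 kHz needs about 10.6 million bytes, and the mono version needs half of that.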

Digital Sound Analysis
Nyquist frequency: A digital recorder or computer must be able to take at least 2 samples/cycle to be able to identify each frequency. Thus, if you digitize your sounds at R samples/sec, you will be unable to properly capture any component with frequency >R/2. This latter value is called the Nyquist frequency.

Digital Sound Analysis
Aliasing: If you do not sample your sounds at a high enough rate, any frequency in the sounds that is higher than half the sampling rate is aliased. This means you will see an artifact in your spectrograms consisting of an inverted version of what the sounds should have looked like if you had sampled at a sufficiently high rate. Not nice!
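The frequency at which an undersampled component shows up can be computed by folding it back around the sampling rate. The sketch below assumes ideal sampling with no anti-aliasing filter; the function name and the example frequencies are illustrative.

```python
def apparent_frequency(true_hz, sample_rate):
    """Frequency at which a component appears after sampling at `sample_rate`."""
    # Fold the true frequency back into the range 0 .. sample_rate / 2.
    return abs(true_hz - sample_rate * round(true_hz / sample_rate))

rate = 44100
nyquist = rate / 2

print(nyquist)                           # 22050.0
print(apparent_frequency(10000, rate))   # below Nyquist: unchanged
print(apparent_frequency(30000, rate))   # above Nyquist: folded down to 14100
```

A sweep rising from 22,050 Hz toward 44,100 Hz would therefore appear to fall from 22,050 Hz toward 0 Hz, which is the inverted artifact described above.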

Digital Sound Analysis
Digital Bandwidths: In most computer sound analysis programs, you do not set the bandwidth Δf directly, but instead set the segment duration, Δt. Instead of setting a time, you indicate Δt by specifying the number of consecutive sample points to be used for each frequency spectrum in the spectrogram. This is often called “frame size.”
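Converting a frame size into a segment duration and an approximate bandwidth is a one-line calculation each. The function name and the example values below are assumptions; the bandwidth uses the approximation that bandwidth is the reciprocal of segment duration, and tapered windows widen the effective bandwidth somewhat beyond this figure.

```python
def frame_resolution(frame_size, sample_rate):
    """Segment duration (s) and approximate bandwidth (Hz) for a given frame size."""
    duration = frame_size / sample_rate
    bandwidth = sample_rate / frame_size   # same as 1 / duration
    return duration, bandwidth

for size in (128, 512, 2048):
    dt, df = frame_resolution(size, 44100)
    print(f"frame = {size:5d} samples   segment = {dt * 1000:6.2f} ms   bandwidth = {df:7.1f} Hz")
```

Doubling the frame size doubles the segment duration and halves the bandwidth.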

Digital Sound Analysis
Windowing: If you cut a sound directly into segments (a rectangular window) to make a spectrogram, you introduce artifacts at the beginning and end of each segment. This occurs because, with rectangular windows, each segment begins with no sound and is suddenly switched “on” and suddenly “off.” The frequency spectrum of sudden onsets and offsets must contain a wide smear of frequencies.

Digital Sound Analysis
Windowing: To reduce artifacts, most analyzers use “tapered” cutting windows (Hann, Blackman, etc.) that turn the segment on and off slowly. (Figure: lark sparrow spectrograms at Δf = 320 Hz, one made with a rectangular window and one with a Blackman window.)
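The effect of a tapered window can be seen by measuring how much energy leaks into frequency bins far from the true tone. This sketch uses a Hann window in its periodic form; the function names, the test tone at 10.5 cycles per segment (chosen so the segment ends mid-cycle), and the range of "far" bins are assumptions for illustration.

```python
import cmath
import math

def hann(n):
    """Hann window (periodic form): rises smoothly from zero and falls back to zero."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def windowed_amplitude(segment, window, k):
    """Amplitude in bin k after multiplying the segment by the window."""
    n = len(segment)
    total = sum(x * wt * cmath.exp(-2j * math.pi * k * i / n)
                for i, (x, wt) in enumerate(zip(segment, window)))
    return 2 * abs(total) / sum(window)

n = 64
# A tone that does not complete a whole number of cycles in the segment.
segment = [math.sin(2 * math.pi * 10.5 * i / n) for i in range(n)]

far_bins = range(20, 31)  # bins well away from the tone near bin 10.5
rect_leak = sum(windowed_amplitude(segment, [1.0] * n, k) for k in far_bins)
hann_leak = sum(windowed_amplitude(segment, hann(n), k) for k in far_bins)
print(hann_leak < rect_leak)  # True
```

The tapered window leaves far less spurious energy in distant bins, at the cost of a somewhat wider main peak around the tone itself.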
