1
**Speech & Audio Processing**

Digital Systems: Hardware Organization and Design, 4/16/2017. Speech & Audio Coding Examples. Architecture of a Representative 32 Bit Processor.

2
**Linear Prediction Analysis**

A Simple Speech Coder: LPC-Based Analysis Structure

[Block diagram. Blocks: Pre-emphasis, Windowing Analysis, Linear Prediction Analysis (Autocorrelation, Levinson-Durbin), Analysis Filter, Quantization. Signals: Audio Input, Residual, Filter Coeffs]

16 April 2017, Veton Këpuska
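The analysis chain named in the diagram can be sketched for a single frame as follows. This is a minimal sketch rather than the author's implementation: the synthetic test signal, the pre-emphasis coefficient 0.97, the 20 ms frame, the Hamming window, and the order p = 10 are all illustrative assumptions. It relies on the Signal Processing Toolbox functions `hamming`, `xcorr`, and `levinson`.

```matlab
% Single-frame LPC analysis sketch (all parameter values are assumptions).
fs = 8000;                          % assumed sampling rate, Hz
n  = (0:fs-1)';                     % one second of samples
rng(0);                             % reproducible noise
x  = sin(2*pi*200*n/fs) + 0.5*sin(2*pi*1200*n/fs) + 0.01*randn(fs,1);  % stand-in for speech

% Pre-emphasis: first-order high-pass, y[n] = x[n] - 0.97*x[n-1]
xp = filter([1 -0.97], 1, x);

% Windowing: one 20 ms frame, Hamming-weighted
N     = round(0.020*fs);            % frame length in samples
frame = xp(1001:1000+N) .* hamming(N);

% Autocorrelation for lags 0..p, then Levinson-Durbin
p = 10;                             % assumed LPC order
r = xcorr(frame, p, 'biased');      % lags -p..p
r = r(p+1:end);                     % keep lags 0..p
[a, Ep, k] = levinson(r, p);        % a = [1 a1 ... ap], k = reflection coefficients

% Analysis (inverse) filter A(z) produces the residual
res = filter(a, 1, frame);
fprintf('Prediction gain: %.1f dB\n', 10*log10(sum(frame.^2)/sum(res.^2)));
```

In a full coder the same steps would run on every frame, and `a` (or an equivalent representation) and `res` would be passed to the quantization stage.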

3
**Windowing Analysis Stage**

N: length of the analysis window, typically 10-30 msec.
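The window duration translates into a sample count once a sampling rate is fixed. A small sketch, assuming a 16 kHz sampling rate (the slides do not state one):

```matlab
% Convert an analysis-window duration to a length in samples.
fs   = 16000;                  % assumed sampling rate, Hz
T_ms = [10 20 30];             % window durations from the 10-30 msec range
N    = round(T_ms/1000 * fs);  % window lengths in samples
disp([T_ms; N]);               % first row: msec, second row: samples
```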

4
Some Analysis Windows

5
**MATLAB Useful Functions**

- wintool: use "doc wintool" for more information.
- window: use "doc window" for the list of supported windows.
- Define your own window if needed, e.g., the sine window and the Vorbis window.
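The two user-defined windows mentioned above can be written directly from their standard definitions. The sketch below assumes a window length of 512 samples and also checks the power-complementary (Princen-Bradley) property that both windows share.

```matlab
% User-defined windows (the length N is an assumed example value).
N = 512;
n = (0:N-1)';
w_sine   = sin(pi*(n + 0.5)/N);                    % sine window
w_vorbis = sin((pi/2) * sin(pi*(n + 0.5)/N).^2);   % Vorbis window

% Both satisfy w[n]^2 + w[n + N/2]^2 = 1, the condition MDCT-based
% coders rely on for perfect reconstruction with 50% overlap.
h = N/2;
err_sine   = max(abs(w_sine(1:h).^2   + w_sine(h+1:end).^2   - 1));
err_vorbis = max(abs(w_vorbis(1:h).^2 + w_vorbis(h+1:end).^2 - 1));
fprintf('Max deviation from 1: sine %.2e, Vorbis %.2e\n', err_sine, err_vorbis);

plot(n, [w_sine w_vorbis]); legend('Sine', 'Vorbis'); xlabel('n');
```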

6
**LPC Analysis Stage**

- LPC method described in: Ch5-Analysis_&_Synthesis_of_Pole-Zero_Speech_Models.ppt
- Summary:
  - Perform autocorrelation.
  - Solve the system of equations with the Levinson-Durbin method.
- MATLAB help: doc lpc, etc.
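The two summary steps can be written out and compared with the toolbox `lpc` function. This is a sketch under assumptions: the AR(2) test signal and the order p = 4 are arbitrary choices, and the sign convention is A(z) = 1 + a1 z^-1 + ... + ap z^-p, as used by `lpc`.

```matlab
% Levinson-Durbin recursion written out, checked against lpc().
rng(1);
x = filter(1, [1 -1.5 0.7], randn(2000,1));   % assumed test signal: AR(2) process
p = 4;                                        % assumed LPC order

r = xcorr(x, p, 'biased');  r = r(p+1:end);   % autocorrelation, lags 0..p

a = 1;  E = r(1);  k = zeros(p,1);
for i = 1:p
    % reflection coefficient from the order-(i-1) predictor
    k(i) = -(r(i+1:-1:2).' * a(:)) / E;
    % order update: a_i = [a_{i-1}; 0] + k_i * flipud([a_{i-1}; 0])
    a = [a(:); 0] + k(i) * [0; flipud(a(:))];
    % prediction error shrinks by (1 - k_i^2) at each order
    E = E * (1 - k(i)^2);
end
a = a.';

a_ref = lpc(x, p);
fprintf('Max |a - a_ref| = %.2e\n', max(abs(a - a_ref)));
```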

7
**Example of MATLAB Code**

    function myLPCCodec(wavfile, N)
    % myLPCCodec  LPC analysis/synthesis demonstration.
    %   wavfile - input MS wav file
    %   N       - LPC filter order
    [x, fs] = audioread(wavfile);    % audioread replaces the removed wavread
    x = x(:, 1);                     % use the first channel only
    % plot(x);
    % Playing original signal (pause until playback ends)
    soundsc(x, fs);  pause(numel(x)/fs);
    % Performing LPC analysis using MATLAB lpc function
    a = lpc(x, N);                   % a = [1 a(2) ... a(N+1)]
    % Filtering with the estimated coefficients to produce predicted samples
    est_x = filter([0 -a(2:end)], 1, x);
    % Error (residual) signal
    e = x - est_x;
    % Testing the quality of predicted samples
    soundsc(est_x, fs);  pause(numel(x)/fs);
    % Synthesis stage with zero loss of information:
    % the all-pole filter 1/A(z) driven by the residual reconstructs x
    syn_x = filter(1, a, e);
    soundsc(syn_x, fs);

[Figure labels: ge[n], ŝ[n]]

8
**Analysis of Quantization Errors**

Use MATLAB functions to research the effects of quantization errors introduced by the precision of the arithmetic operations and by the representation of the filter and error signal:

- Double (float64) representation (software emulation)
- Float (float32) representation (software emulation)
- Int (int32) representation (hardware emulation)
- Short (int16) representation (hardware emulation)

Useful MATLAB functions: fix, floor, round, ceil.

Example: sig_hat=fix(sig*2^(B-1))/2^(B-1); truncates sig to B bits.
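The truncation formula above can be applied at several word lengths to see how the signal-to-noise ratio changes. A minimal sketch, assuming a sine test signal scaled into (-1, 1); the `single` and `int16` casts at the end are one possible way to emulate the listed representations, not a prescribed one.

```matlab
% Quantize a signal to B bits with fix/round and measure the SNR.
fs  = 8000;  n = (0:fs-1)';
sig = 0.9*sin(2*pi*440*n/fs);             % assumed test signal in (-1, 1)
for B = [8 12 16]
    q       = 2^(B-1);                    % quantization levels per unit amplitude
    sig_fix = fix(sig*q)/q;               % truncation toward zero (formula above)
    sig_rnd = round(sig*q)/q;             % rounding to the nearest level
    snr_fix = 10*log10(sum(sig.^2)/sum((sig - sig_fix).^2));
    snr_rnd = 10*log10(sum(sig.^2)/sum((sig - sig_rnd).^2));
    fprintf('B = %2d: SNR fix %.1f dB, round %.1f dB\n', B, snr_fix, snr_rnd);
end

% Built-in numeric types can emulate the listed representations.
x32 = single(sig);                        % float32
x16 = int16(round(sig*32767));            % int16, assumed full-scale mapping
```

Rounding roughly halves the maximum error compared with truncation, so its SNR should come out higher at every word length.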

9
**Quantization of Error Signal & Filter Coefficients**

- ADPCM can be applied to the error signal.
- Filter coefficients in the direct filter form are found to be sensitive to quantization errors:
  - A small quantization error can have a large effect on the filter characteristics.
  - The issue is that the polynomial coefficients have a non-linear mapping to the poles of the filter (i.e., the roots of the polynomial).
- Alternate representations are possible that have significantly better tolerance to quantization error.
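The sensitivity can be explored by quantizing the same filter in two representations and comparing the resulting pole radii. The sketch below is an assumption-laden illustration: the pole positions and the 8-bit step size are arbitrary, and the same absolute step is used for both representations. It uses the Signal Processing Toolbox functions `poly2rc` and `rc2poly`. The direct-form outcome depends on the chosen poles and word length, whereas the lattice form is stable whenever every quantized reflection coefficient has magnitude below 1, which truncation toward zero preserves.

```matlab
% Coefficient quantization: direct form vs. reflection coefficients.
pl = 0.98*exp(1j*pi*[0.10 0.15 0.22]);   % assumed poles (upper half-plane)
a  = real(poly([pl conj(pl)]));          % direct-form coefficients of A(z)
k  = poly2rc(a);                         % equivalent reflection coefficients

B  = 8;  q = 2^(B-1);                    % assumed word length; step size 1/q
a_q  = fix(a*q)/q;                       % quantized direct-form coefficients
k_q  = fix(k*q)/q;                       % quantized reflection coefficients
a_kq = rc2poly(k_q);                     % direct form rebuilt from quantized k

fprintf('Max pole radius, original:    %.4f\n', max(abs(roots(a))));
fprintf('Max pole radius, quantized a: %.4f\n', max(abs(roots(a_q))));
fprintf('Max pole radius, quantized k: %.4f\n', max(abs(roots(a_kq))));
```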

10
**LPC Filter Representations**

As noted previously, when the Levinson-Durbin algorithm was introduced, one alternate representation of the filter coefficients was also mentioned: the PARCOR coefficients.

LPC to PARCOR: [equation]

11
**PARCOR Filter Representation**

PARCOR to LPC: [equation]
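Both conversions can be sketched with the standard step-down and step-up recursions. The recursions below follow the usual textbook form with A(z) = 1 + a1 z^-1 + ... + ap z^-p, which may differ in sign convention from the slide's equations; the example polynomial is an assumption. Results are checked against the toolbox function `poly2rc`.

```matlab
% LPC -> PARCOR (step-down) and PARCOR -> LPC (step-up).
a = [1 -1.5 0.9 -0.2];                   % assumed example A(z), minimum phase
p = numel(a) - 1;

% Step-down: peel off one reflection coefficient per order
k  = zeros(p,1);
ai = a(:);
for i = p:-1:1
    k(i) = ai(end);                                  % k_i is the last coefficient
    ai   = (ai - k(i)*flipud(ai)) / (1 - k(i)^2);    % order-(i-1) predictor
    ai   = ai(1:end-1);                              % drop the zeroed last term
end

% Step-up: rebuild A(z) from the reflection coefficients
ar = 1;
for i = 1:p
    ar = [ar; 0] + k(i)*[0; flipud(ar)];
end

fprintf('k = [%s]\n', num2str(k.', ' %.4f'));
fprintf('Toolbox agreement: %.1e, round trip: %.1e\n', ...
        max(abs(k - poly2rc(a))), max(abs(ar.' - a)));
```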

12
**Line Spectral Frequency Representation**

It turns out that PARCOR coefficients can be represented with LSFs, which have significantly better quantization properties.

Note that: [equation]

[Figure: PARCOR lattice structure of the LPC synthesis filter. Labels: Input, Output; A_p, A_{p-1}, ..., A_0; B_p, B_{p-1}, ..., B_0; reflection coefficients k_p, k_{p-1}; unit delays z^-1; k_{p+1} = ∓1, k_0 = -1]

13
**Line Spectral Frequency Representation**

From the previous slide the following holds: [equation]

From this realization of the filter the LSP representation is derived: [equation]

14
LSF Representation
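One common way to obtain the LSFs is to extend the lattice by one stage with k_{p+1} = +1 and k_{p+1} = -1, which gives a symmetric polynomial P(z) and an antisymmetric polynomial Q(z) whose roots lie on the unit circle. The sketch below assumes this standard construction and an arbitrary 4th-order example filter, and compares the result with the toolbox functions `poly2lsf` and `lsf2poly`.

```matlab
% Line spectral frequencies from the sum/difference polynomials.
pl = [0.9*exp(1j*0.3*pi), 0.8*exp(1j*0.6*pi)];   % assumed pole positions
a  = real(poly([pl conj(pl)]));                   % A(z), order p = 4

% P(z) = A(z) + z^-(p+1) A(1/z),  Q(z) = A(z) - z^-(p+1) A(1/z)
P = [a 0] + [0 fliplr(a)];
Q = [a 0] - [0 fliplr(a)];

% LSFs are the root angles in (0, pi); the trivial roots at z = 1 and
% z = -1 and the conjugate (negative-angle) roots are discarded.
w   = angle([roots(P); roots(Q)]);
lsf = sort(w(w > 1e-6 & w < pi - 1e-6));

fprintf('LSFs (rad): %s\n', num2str(lsf.', ' %.4f'));
fprintf('Toolbox agreement: %.1e\n', max(abs(lsf - poly2lsf(a))));
a_back = lsf2poly(lsf);                           % back to direct form
```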

15
**LPC Synthesis Filter with LSF**


16
**A Simple Speech Coder: LPC-Based Synthesis Structure**

[Block diagram. Blocks: Residual Decoding, Synthesis Filter, De-emphasis. Signals: Residual Signal, Filter Coeffs, Audio Output]

17
Audio Coding

18
**Audio Coding**

Most audio coding standards use principles of psychoacoustics.

Example of the basic structure of an MP3 encoder:

[Block diagram. Blocks: Filterbank & Transform, Psychoacoustic Model, Quantization. Signals: Audio Input, Bit-stream]

19
**Basic Structure of Audio Coders**

- Filterbank Processing
- Psychoacoustic Model
- Quantization

20
**Filter Bank Analysis Synthesis**

21
**Filterbank Processing:**

Splitting the full-band signal into several sub-bands:

- Uniform sub-bands (FFT)
- Critical bands (FFT followed by a non-linear transformation)
  - Reflect the human auditory apparatus.
  - Mel-scale and Bark-scale transformations

22
Mel-Scale

23
Bark-Scale
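The two frequency warpings can be sketched with widely used closed-form approximations. Which formulas the slides plot is not recoverable from the transcript, so the choices below (the O'Shaughnessy mel formula and the Zwicker-Terhardt Bark formula) are assumptions, and the function names `hz_to_mel` and `hz_to_bark` are hypothetical.

```matlab
% Frequency-warping functions (common textbook forms, assumed here).
hz_to_mel  = @(f) 2595*log10(1 + f/700);                        % mel scale
hz_to_bark = @(f) 13*atan(0.00076*f) + 3.5*atan((f/7500).^2);   % Bark scale

f = [100 500 1000 4000 8000];                                   % Hz
disp(table(f.', hz_to_mel(f).', hz_to_bark(f).', ...
     'VariableNames', {'Hz', 'Mel', 'Bark'}));
```

Both mappings are close to linear at low frequencies and compress the high frequencies, which is the behaviour the critical-band sub-bands are meant to capture.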

24
**Analysis Structure of Filterbank**

- h_k[n]: impulse response of the k-th quadrature mirror filter
- N: number of channels, typically 32
- ↓: down-sampling
- MDCT: Modified Discrete Cosine Transform

[Block diagram: Audio Input feeds N parallel branches, each consisting of h_k[n], ↓, and MDCT (k = 1, ..., N), followed by Quantization, which produces the Bit Stream]

25
**Synthesis Structure of Filterbank**

- g_k[n]: impulse response of the k-th inverse quadrature mirror filter
- N: number of channels, typically 32
- ↑: up-sampling
- IMDCT: Inverse Modified Discrete Cosine Transform

[Block diagram: the Bit Stream enters Decoding, which feeds N parallel branches, each consisting of IMDCT, ↑, and g_k[n] (k = 1, ..., N), producing the Audio Output]
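The MDCT and IMDCT blocks of the two structures can be sketched as a round trip with 50% overlap-add. This covers only the transform stage, not the quadrature mirror filters or the quantizer, and it uses the direct matrix form rather than a fast algorithm. The block size M = 32, the sine window, and the random test signal are assumptions.

```matlab
% MDCT / IMDCT round trip with 50% overlap-add (direct matrix form).
M = 32;                                     % coefficients per block (assumed)
n = (0:2*M-1)';  k = 0:M-1;
C = cos(pi/M * (n + 0.5 + M/2) * (k + 0.5));    % 2M-by-M cosine basis
w = sin(pi*(n + 0.5)/(2*M));                % sine window, w[n]^2 + w[n+M]^2 = 1

rng(0);
x  = randn(8*M, 1);                         % assumed test signal
xp = [zeros(M,1); x; zeros(M,1)];           % pad so every sample is covered twice
y  = zeros(size(xp));

for b = 0:(numel(xp)/M - 2)
    idx = b*M + (1:2*M);
    X   = C.' * (w .* xp(idx));             % MDCT: 2M samples -> M coefficients
    y(idx) = y(idx) + w .* ((2/M) * (C * X));   % IMDCT, window, overlap-add
end

err = max(abs(y(M+1:end-M) - x));
fprintf('Max reconstruction error: %.2e\n', err);
```

Each block alone contains time-domain aliasing; it cancels only when adjacent overlapping blocks are added, which is why the window is applied on both the analysis and the synthesis side.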

26
**Psycho-Acoustic Modeling**

27
**Psychoacoustic Model**

- Computes the masking threshold according to human auditory perception.
- The masking threshold is used to quantize the Discrete Cosine Transform coefficients.
- Analysis is done in the frequency domain, represented by the DFT and computed by the FFT.

28
**Threshold of Hearing**

- Absolute threshold of audibly perceptible events in quiet conditions (no other sounds).
- Any signal below the threshold can be removed without effect on the perception.
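A frequently used closed-form approximation of the threshold in quiet is Terhardt's formula. Whether the slides use this exact curve is an assumption; the handle name `ath` is hypothetical.

```matlab
% Terhardt's approximation of the absolute threshold of hearing (dB SPL).
ath = @(f) 3.64*(f/1000).^(-0.8) ...
         - 6.5*exp(-0.6*(f/1000 - 3.3).^2) ...
         + 1e-3*(f/1000).^4;

f = logspace(log10(20), log10(20000), 400);     % Hz
semilogx(f, ath(f)); grid on;
xlabel('Frequency (Hz)'); ylabel('Threshold (dB SPL)');

[~, i] = min(ath(f));
fprintf('Most sensitive near %.0f Hz\n', f(i));
```

Spectral components whose level falls below this curve are candidates for removal, as stated above.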

29
Threshold of Hearing

30
**Frequency Masking**

- Schröder spreading function: [equation]
- Bark scale function: [equation]
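The Schröder spreading function is commonly quoted in the form below, with the argument expressed as the distance in Bark between the masked component and the masker. The slide's own equation is not preserved, so this form and the handle name `sf_dB` are assumptions.

```matlab
% Schroeder spreading function in dB versus Bark distance dz.
sf_dB = @(dz) 15.81 + 7.5*(dz + 0.474) - 17.5*sqrt(1 + (dz + 0.474).^2);

dz = -3:0.1:8;                     % maskee position minus masker position, Bark
plot(dz, sf_dB(dz)); grid on;
xlabel('\Delta z (Bark)'); ylabel('Spreading (dB)');

fprintf('SF(-1) = %.1f dB, SF(0) = %.1f dB, SF(+1) = %.1f dB\n', ...
        sf_dB(-1), sf_dB(0), sf_dB(1));
```

The curve falls off more slowly toward higher frequencies than toward lower ones, which reflects the upward spread of masking.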

31
Masking Curve

32
Primary Tone 1kHz

33
Masked Tone 900 Hz

34
**Combined Sound 1kHz + 0.9kHz**


35
**Combined 1kHz + 0.9kHz (-10dB)**


36
**Combined 1kHz + 5kHz (-10dB)**

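The tone combinations named on these slides can be regenerated for listening. The sampling rate, the one-second duration, and the 0.5 mixing gain are assumptions; only the frequencies and the -10 dB level come from the slide titles.

```matlab
% Generate the masking demonstration tones.
fs = 44100;  t = (0:fs-1)'/fs;            % one second at an assumed 44.1 kHz
tone = @(f, dB) 10^(dB/20) * sin(2*pi*f*t);

primary  = tone(1000, 0);                 % 1 kHz primary tone
masked   = tone(900, -10);                % 900 Hz tone, 10 dB below the primary
far_tone = tone(5000, -10);               % 5 kHz tone, 10 dB below the primary

near_mix = 0.5*(primary + masked);        % tones close in frequency
far_mix  = 0.5*(primary + far_tone);      % tones far apart in frequency

level_dB = 20*log10(sqrt(mean(masked.^2)) / sqrt(mean(primary.^2)));
fprintf('Secondary tone level: %.1f dB relative to primary\n', level_dB);

% Uncomment to listen:
% soundsc(near_mix, fs); pause(1.5); soundsc(far_mix, fs);
```

Comparing the two mixes by ear is one way to check how strongly the 1 kHz tone masks a nearby component compared with a distant one.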

37
END
