1
**Overview of Adaptive Multi-Rate Narrow Band (AMR-NB) Speech Codec**

Presented by Peter

2
**AMR Narrow Band**

- Adaptive Multi-Rate codec for narrow-band speech (AMR-NB)
- Specified by 3GPP for GSM/3G systems
- Input: 8 kHz sampling rate, 13-bit PCM
- 20 ms frames, no overlap
- 8 modes + comfort noise
- Output bit rate from 4.75 to 12.2 kbit/s
- Algebraic Code Excited Linear Prediction (ACELP) is used as the speech coding algorithm
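The frame figures above translate directly into samples and bits per frame. A minimal sketch of that arithmetic follows; the function names are illustrative, and the mode list is the set of eight AMR-NB bit rates.

```python
# Frame parameters stated on the slide.
SAMPLE_RATE_HZ = 8000
FRAME_MS = 20

# The eight AMR-NB mode bit rates in kbit/s.
MODES_KBITS = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]


def samples_per_frame(sample_rate_hz=SAMPLE_RATE_HZ, frame_ms=FRAME_MS):
    """Number of PCM samples in one frame."""
    return sample_rate_hz * frame_ms // 1000


def bits_per_frame(rate_kbits, frame_ms=FRAME_MS):
    """kbit/s times ms gives bits; round() absorbs floating-point error."""
    return round(rate_kbits * frame_ms)


def frame_table():
    """Map each mode bit rate to its number of bits per 20 ms frame."""
    return {rate: bits_per_frame(rate) for rate in MODES_KBITS}
```

For example, `bits_per_frame(12.2)` gives 244 bits and `bits_per_frame(4.75)` gives 95 bits for each 160-sample frame.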

3
Frequency Response

4
**Speech Encoder**

- Pre-processing
- Linear prediction analysis and quantization
- Open-loop pitch analysis
- Impulse response computation
- Target signal computation
- Adaptive codebook
- Algebraic codebook
- Quantization of the adaptive and fixed codebook gains
- Memory update

5
**Principles of the adaptive multi-rate speech encoder**

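- Eight source codecs with bit rates of 12.2, 10.2, 7.95, 7.40, 6.70, 5.90, 5.15 and 4.75 kbit/s
- A 10th-order linear prediction (LP), or short-term, synthesis filter is used, which is given by $H(z) = \dfrac{1}{\hat{A}(z)} = \dfrac{1}{1 + \sum_{i=1}^{m} \hat{a}_i z^{-i}}$, where $\hat{a}_i$ are the quantized LP parameters and $m = 10$
- The long-term, or pitch, synthesis filter is given by $\dfrac{1}{B(z)} = \dfrac{1}{1 - g_p z^{-T}}$, where $T$ is the pitch delay and $g_p$ is the pitch gain
- The pitch synthesis filter is implemented using the adaptive codebook approach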

6
ACELP

7
**Pre-Processing**

Two pre-processing functions:

- High-pass filtering: a filter with a cut-off frequency of 80 Hz is used
- Signal down-scaling, to prevent overflow
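The two pre-processing functions can be combined in one second-order filter by folding the scale factor into the numerator. The sketch below assumes a generic Butterworth high-pass designed with the bilinear transform and a scale factor of 0.5; the standard defines its own fixed coefficients, which are not reproduced here, and the function names are illustrative.

```python
import math


def highpass_coeffs(cutoff_hz=80.0, sample_rate_hz=8000.0, scale=0.5):
    """Second-order Butterworth high-pass via the bilinear transform.

    `scale` folds the signal down-scaling into the numerator (assumed 0.5).
    """
    k = math.tan(math.pi * cutoff_hz / sample_rate_hz)
    q = 1.0 / math.sqrt(2.0)
    norm = 1.0 / (1.0 + k / q + k * k)
    b = [scale * norm, -2.0 * scale * norm, scale * norm]
    a = [1.0, 2.0 * (k * k - 1.0) * norm, (1.0 - k / q + k * k) * norm]
    return b, a


def preprocess(samples, b, a):
    """Direct-form I biquad with zero initial state."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out
```

With these coefficients a constant (DC) input decays to zero, while a component near half the sampling rate passes with a gain of 0.5.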

8
**Linear Prediction Analysis**

- The frame is split into four sub-frames
- 12.2 kbit/s mode
  - Performed twice per frame
  - 30 ms asymmetric window
  - No lookahead
- 10.2, 7.95, 7.40, 6.70, 5.90, 5.15, 4.75 kbit/s modes
  - Performed once per frame
  - 5 ms lookahead

9
**Windowing and Auto-correlation Computation**

- 12.2 kbit/s mode
  - Two different asymmetric windows
  - 1st window concentrates on the 2nd sub-frame
  - 2nd window concentrates on the 4th sub-frame

10
**Windowing and Auto-correlation Computation**

- 10.2, 7.95, 7.40, 6.70, 5.90, 5.15, 4.75 kbit/s modes
  - One asymmetric window
  - Concentrates on the 4th sub-frame
  - 5 ms (40 samples) lookahead

11
**Auto-correlation Computation**

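- The auto-correlations for lags 0 to 10 are computed: $r(k) = \sum_{n=k}^{239} s'(n)\, s'(n-k)$, $k = 0, \dots, 10$, where $s'(n)$ is the windowed speech
- A 60 Hz bandwidth expansion is applied by lag windowing: $w_{lag}(i) = \exp\left[-\tfrac{1}{2}\left(\tfrac{2\pi f_0 i}{f_s}\right)^2\right]$, $i = 1, \dots, 10$, with $f_0 = 60$ Hz and $f_s = 8000$ Hz
- $r(0)$ is multiplied by the white noise correction factor 1.0001, which is equivalent to adding a noise floor at -40 dB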
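The three steps on this slide can be sketched as follows. This is a floating-point illustration, not the fixed-point reference code; it assumes a Gaussian lag window with a 60 Hz parameter and a white noise correction factor of 1.0001 (the value consistent with a -40 dB floor), and the function names are illustrative.

```python
import math

LP_ORDER = 10


def autocorrelation(windowed, order=LP_ORDER):
    """r(k) = sum over n of s'(n) * s'(n - k), for k = 0..order."""
    n = len(windowed)
    return [sum(windowed[i] * windowed[i - k] for i in range(k, n))
            for k in range(order + 1)]


def lag_window(order=LP_ORDER, f0_hz=60.0, sample_rate_hz=8000.0):
    """Gaussian lag window for lags 1..order (bandwidth expansion)."""
    return [math.exp(-0.5 * (2.0 * math.pi * f0_hz * i / sample_rate_hz) ** 2)
            for i in range(1, order + 1)]


def condition(r, white_noise_factor=1.0001):
    """Apply the white noise correction to r(0) and the lag window to r(1..)."""
    w = lag_window(len(r) - 1)
    return [r[0] * white_noise_factor] + [r[i] * w[i - 1] for i in range(1, len(r))]
```

The factor 1.0001 raises r(0) by one part in 10^4, and 10·log10(0.0001) = -40 dB, which is where the noise floor figure comes from.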

12
**Levinson‑Durbin algorithm**

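- The LP filter coefficients $a_k$, $k = 1, \dots, 10$, are obtained by solving the set of equations $\sum_{k=1}^{10} a_k\, r(|i-k|) = -r(i)$, $i = 1, \dots, 10$
- The Levinson-Durbin algorithm uses the following recursion, starting from $E(0) = r(0)$, for $i = 1, \dots, 10$:
  - $k_i = -\left[ r(i) + \sum_{j=1}^{i-1} a_j^{(i-1)}\, r(i-j) \right] / E(i-1)$
  - $a_i^{(i)} = k_i$
  - $a_j^{(i)} = a_j^{(i-1)} + k_i\, a_{i-j}^{(i-1)}$, $j = 1, \dots, i-1$
  - $E(i) = (1 - k_i^2)\, E(i-1)$
- The final solution is given as $a_j = a_j^{(10)}$, $j = 1, \dots, 10$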
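A minimal floating-point sketch of the Levinson-Durbin recursion is shown below. It assumes the sign convention A(z) = 1 + Σ a_i z^-i, so the returned list starts with 1; the function name is illustrative.

```python
def levinson_durbin(r, order=10):
    """Solve the normal equations for LP coefficients.

    r     : auto-correlations r[0..order]
    return: (a, err) with a[0] = 1 and err the final prediction error energy
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        # Update the lower-order coefficients from the previous stage.
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a, err
```

For the auto-correlation sequence 1, 0.5, 0.25 (a first-order process), a second-order fit returns a_1 = -0.5 and a_2 = 0.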

13
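**LP to LSP conversion**

- The LP filter coefficients $a_k$, $k = 1, \dots, 10$, are converted to the line spectral pair (LSP) representation for quantization and interpolation purposes
- The LSPs are defined as the roots of the sum and difference polynomials $F_1'(z) = A(z) + z^{-11} A(z^{-1})$ and $F_2'(z) = A(z) - z^{-11} A(z^{-1})$
- All roots of these polynomials are on the unit circle and they alternate with each other
- The roots $z = -1$ of $F_1'(z)$ and $z = 1$ of $F_2'(z)$ are eliminated: $F_1(z) = F_1'(z) / (1 + z^{-1})$ and $F_2(z) = F_2'(z) / (1 - z^{-1})$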
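The sum and difference polynomials, with the fixed roots at z = -1 and z = 1 divided out, can be built with a short recursion. The sketch below assumes A(z) = 1 + Σ a_i z^-i with even order m; it stops at the polynomial coefficients and does not perform the root search itself. The function name is illustrative.

```python
def lsp_polynomials(a):
    """Return coefficients of F1(z) and F2(z), each of degree m.

    a : [1, a1, ..., am] with even m
    """
    m = len(a) - 1
    ext = list(a) + [0.0]  # treat a[m+1] as 0
    # Sum and difference polynomials of degree m + 1.
    f1p = [ext[k] + ext[m + 1 - k] for k in range(m + 2)]
    f2p = [ext[k] - ext[m + 1 - k] for k in range(m + 2)]
    # Divide F1' by (1 + z^-1) and F2' by (1 - z^-1).
    f1 = [f1p[0]]
    f2 = [f2p[0]]
    for k in range(1, m + 1):
        f1.append(f1p[k] - f1[k - 1])
        f2.append(f2p[k] + f2[k - 1])
    return f1, f2
```

For a second-order example, A(z) = 1 - 0.9 z^-1 + 0.2 z^-2 gives F1(z) = 1 - 1.7 z^-1 + z^-2 and F2(z) = 1 - 0.1 z^-1 + z^-2, whose unit-circle roots have cosines 0.85 and 0.05, so the two frequencies interlace.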

14
LP to LSP conversion

15
**Quantization of the LSP coefficients**

- 12.2 kbit/s mode
  - Two sets of LSPs are quantized using the representation in the frequency domain (line spectral frequencies, LSFs)
  - 1st-order MA prediction is applied
  - The two residual LSF vectors are jointly quantized using split matrix quantization (SMQ)
  - A weighted LSP distortion measure is used in the quantization process
- 10.2, 7.95, 7.40, 6.70, 5.90, 5.15, 4.75 kbit/s modes
  - The residual LSF vector is quantized using split vector quantization
  - A weighted LSP distortion measure is used

16
**Interpolation of the LSPs**

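- 12.2 kbit/s mode: the interpolated LSP vectors at the 1st and 3rd subframes are given by
  - $\hat{q}_1^{(n)} = 0.5\, \hat{q}_4^{(n-1)} + 0.5\, \hat{q}_2^{(n)}$
  - $\hat{q}_3^{(n)} = 0.5\, \hat{q}_2^{(n)} + 0.5\, \hat{q}_4^{(n)}$
- 10.2, 7.95, 7.40, 6.70, 5.90, 5.15, 4.75 kbit/s modes: the interpolated LSP vectors at the 1st, 2nd, and 3rd subframes are given by
  - $\hat{q}_1^{(n)} = 0.75\, \hat{q}_4^{(n-1)} + 0.25\, \hat{q}_4^{(n)}$
  - $\hat{q}_2^{(n)} = 0.5\, \hat{q}_4^{(n-1)} + 0.5\, \hat{q}_4^{(n)}$
  - $\hat{q}_3^{(n)} = 0.25\, \hat{q}_4^{(n-1)} + 0.75\, \hat{q}_4^{(n)}$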
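The interpolation is a per-element linear blend of neighbouring LSP vectors. The sketch below assumes the weights 0.5/0.5 for the 12.2 kbit/s mode and 0.75/0.25, 0.5/0.5, 0.25/0.75 for the other modes, as I understand them from the AMR specification; they should be checked against the standard. Function names are illustrative.

```python
def interpolate_lsp_12k2(q4_prev, q2, q4):
    """12.2 kbit/s mode: subframes 2 and 4 are known, 1 and 3 are interpolated."""
    q1 = [0.5 * p + 0.5 * c for p, c in zip(q4_prev, q2)]
    q3 = [0.5 * b + 0.5 * c for b, c in zip(q2, q4)]
    return [q1, list(q2), q3, list(q4)]


def interpolate_lsp_other(q4_prev, q4):
    """Other modes: only subframe 4 is known, 1 to 3 are interpolated."""
    # (weight on previous frame's 4th subframe, weight on current 4th subframe)
    weights = [(0.75, 0.25), (0.5, 0.5), (0.25, 0.75), (0.0, 1.0)]
    return [[wp * p + wc * c for p, c in zip(q4_prev, q4)]
            for wp, wc in weights]
```

Both functions return four LSP vectors, one per 5 ms subframe, which are then converted back to LP coefficients.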

17
**Open‑loop pitch analysis**

- Performed twice per frame (every 10 ms) for the 12.2, 10.2, 7.95, 7.40, 6.70 and 5.90 kbit/s modes
- Performed once per frame for the 5.15 and 4.75 kbit/s modes
- The pre-processed signal is filtered with a perceptual weighting filter $W(z) = A(z/\gamma_1) / A(z/\gamma_2)$

[Figure labels: original, weighted, unit circle; cases: flat, tilted]
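The weighting step can be sketched as below, assuming the usual form W(z) = A(z/γ1)/A(z/γ2), where A(z/γ) has coefficients a_i·γ^i. The default factors 0.94 and 0.6 are assumptions of this sketch (the first factor depends on the mode in the standard), the pitch search itself is not shown, and the function names are illustrative.

```python
def weight_lp(a, gamma):
    """Coefficients of A(z/gamma): a[i] * gamma**i."""
    return [c * gamma ** i for i, c in enumerate(a)]


def perceptual_weighting(signal, a, gamma1=0.94, gamma2=0.6):
    """Filter `signal` through W(z) = A(z/gamma1) / A(z/gamma2), zero state."""
    num = weight_lp(a, gamma1)
    den = weight_lp(a, gamma2)
    out = []
    for n in range(len(signal)):
        # Feed-forward part: A(z/gamma1).
        acc = sum(num[i] * signal[n - i] for i in range(len(num)) if n - i >= 0)
        # Feedback part: 1 / A(z/gamma2).
        acc -= sum(den[i] * out[n - i] for i in range(1, len(den)) if n - i >= 0)
        out.append(acc)
    return out
```

Scaling the coefficients by γ^i moves the roots of A(z) toward the origin, which is what the pole-zero figure on this slide illustrates.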

18
**Impulse response computation**

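- The impulse response $h(n)$ of the weighted synthesis filter $H(z) W(z) = A(z/\gamma_1) / [\hat{A}(z)\, A(z/\gamma_2)]$ is computed each subframe
- It is needed for the search of the adaptive and fixed codebooks
- It is computed by filtering the vector of coefficients of the filter $A(z/\gamma_1)$, extended by zeros, through the two filters $1/\hat{A}(z)$ and $1/A(z/\gamma_2)$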
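The computation described here, zero-extending a coefficient vector and passing it through two all-pole filters, can be sketched as follows. The weighting factors 0.94 and 0.6 and the 40-sample subframe length are assumptions of this sketch, and the function names are illustrative.

```python
def weight_lp(a, gamma):
    """Coefficients of A(z/gamma): a[i] * gamma**i."""
    return [c * gamma ** i for i, c in enumerate(a)]


def all_pole_filter(x, den):
    """y[n] = x[n] - sum_{i>=1} den[i] * y[n-i], with zero initial state."""
    y = []
    for n, xn in enumerate(x):
        acc = xn - sum(den[i] * y[n - i]
                       for i in range(1, len(den)) if n - i >= 0)
        y.append(acc)
    return y


def impulse_response(a, a_hat, gamma1=0.94, gamma2=0.6, subframe=40):
    """h(n) of A(z/gamma1) / (A_hat(z) * A(z/gamma2)) over one subframe.

    a     : unquantized LP coefficients [1, a1, ...]
    a_hat : quantized LP coefficients [1, a1, ...]
    """
    # Coefficients of A(z/gamma1), extended by zeros to the subframe length.
    x = (weight_lp(a, gamma1) + [0.0] * subframe)[:subframe]
    return all_pole_filter(all_pole_filter(x, a_hat), weight_lp(a, gamma2))
```

Feeding the numerator coefficients in as the input signal is equivalent to exciting the full filter with a unit impulse, so no separate FIR stage is needed.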

19
**Adaptive codebook**

- The adaptive codebook search is performed on a subframe basis
- The parameters are the delay and gain of the pitch filter
- The codebook contains entries taken from the previously synthesized excitation signal
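A simplified closed-loop search over integer delays might look like the sketch below. It assumes a normalized-correlation criterion between the target and the filtered past excitation and clamps the gain to the range 0 to 1.2; fractional delays and the fixed-point details of the real codec are omitted, and all names are illustrative.

```python
import math


def adaptive_codevector(past_exc, lag, length):
    """Copy `length` samples starting `lag` samples back, repeating if lag < length."""
    buf = list(past_exc)
    start = len(buf)
    for n in range(length):
        buf.append(buf[start + n - lag])
    return buf[start:]


def convolve(v, h):
    """Zero-state convolution of v with h, truncated to len(v)."""
    return [sum(v[i] * h[n - i] for i in range(n + 1) if n - i < len(h))
            for n in range(len(v))]


def search_adaptive_codebook(past_exc, target, h, min_lag, max_lag):
    """Return (delay, gain) maximizing the normalized correlation.

    Requires len(past_exc) >= max_lag.
    """
    best_lag, best_score, best_gain = None, -math.inf, 0.0
    for lag in range(min_lag, max_lag + 1):
        y = convolve(adaptive_codevector(past_exc, lag, len(target)), h)
        energy = sum(s * s for s in y)
        if energy <= 0.0:
            continue
        corr = sum(x * s for x, s in zip(target, y))
        score = corr / math.sqrt(energy)
        if score > best_score:
            # Gain clamp is an assumption of this sketch.
            gain = min(max(corr / energy, 0.0), 1.2)
            best_lag, best_score, best_gain = lag, score, gain
    return best_lag, best_gain
```

When the past excitation is periodic, the search returns the shortest delay that reproduces the target, because ties are resolved in favour of the first delay tested.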

20
**Algebraic codebook**

- Encodes the random portion of the excitation signal
- The periodic portion of the weighted residual is removed first, so only the random portion remains to be coded by the fixed codebook
- The codebook is searched by minimizing the error between the perceptually weighted input speech and the reconstructed speech
- Based on the interleaved single-pulse permutation (ISPP) design
- A few sparse impulse sequences that are phase-shifted versions of each other
- All the pulses have the same magnitude
- Amplitudes are +1 or -1
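The sparse, interleaved structure can be illustrated by building a code vector from a handful of signed unit pulses. The layout below, 40 positions split into 5 interleaved tracks of 8 positions, is an assumption of this sketch (the number of pulses and tracks depends on the mode), and the names are illustrative.

```python
SUBFRAME = 40
NUM_TRACKS = 5  # assumed layout for illustration


def track_positions(track, num_tracks=NUM_TRACKS, subframe=SUBFRAME):
    """Positions belonging to one interleaved track: track, track + 5, ..."""
    return list(range(track, subframe, num_tracks))


def build_codevector(pulses, subframe=SUBFRAME):
    """Build the fixed-codebook vector from (position, sign) pairs.

    Every pulse has unit magnitude; sign must be +1 or -1.
    """
    c = [0] * subframe
    for pos, sign in pulses:
        if sign not in (1, -1):
            raise ValueError("pulse sign must be +1 or -1")
        c[pos] += sign
    return c
```

Because only positions and signs are transmitted, the codebook needs no stored table: the decoder rebuilds the vector from the indices with the same rule.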

21
**Speech decoder**

- Codebook parameters are decoded by table look-up
- LSP coefficients are interpolated and converted to LP coefficients
- Excitation = sum of the adaptive and fixed codebook vectors multiplied by their respective gains, in each subframe
- Speech = excitation passed through the vocal tract (LP synthesis) filter
- Perceived quality is enhanced by adaptive post-filtering
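The two decoder equations, forming the excitation and passing it through the synthesis filter, can be sketched per subframe as below. This is a floating-point illustration that assumes the convention Â(z) = 1 + Σ â_i z^-i; post-filtering is not included and the function names are illustrative.

```python
def build_excitation(adaptive_vec, fixed_vec, pitch_gain, code_gain):
    """u(n) = g_p * v(n) + g_c * c(n) for one subframe."""
    return [pitch_gain * v + code_gain * c
            for v, c in zip(adaptive_vec, fixed_vec)]


def synthesize(excitation, a_hat, memory=None):
    """Filter the excitation through 1 / A_hat(z).

    a_hat  : [1, a1, ..., am] with m >= 1
    memory : last m output samples of the previous subframe, most recent last
    return : (speech, updated memory)
    """
    order = len(a_hat) - 1
    hist = list(memory) if memory is not None else [0.0] * order
    out = []
    for u in excitation:
        s = u - sum(a_hat[i] * hist[-i] for i in range(1, order + 1))
        hist.append(s)
        out.append(s)
    return out, hist[-order:]
```

Passing the returned memory into the next call keeps the filter state continuous across subframes.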

22
Speech decoder

23
Synthesis model

24
**Synthesis model**

To reconstruct speech:

- A noise-like excitation
- A pitch filter model of the glottal vibrations
- A linear prediction filter model of the vocal tract

25
**Post-processing**

- Adaptive post-filtering
  - Cascade of two filters: a formant postfilter and a tilt compensation filter
  - Updated every subframe of 5 ms
- High-pass filter
  - Removes undesired low-frequency components
  - A cut-off frequency of 60 Hz is used
- Up-scaling by a factor of 2, to compensate for the down-scaling by 2 that is applied to the input signal
