Minjie Xie, Dave Lindbergh, and Peter Chu


1 Minjie Xie, Dave Lindbergh, and Peter Chu
ITU-T G.722.1 Annex C: A New Low-Complexity 14 kHz Audio Coding Standard
Minjie Xie, Dave Lindbergh, and Peter Chu
ICASSP 2006

2 G.722.1C: First ITU-T Super-wideband Audio Coding Standard
Audio bandwidth: 14 kHz
Sample rate: 32 kHz
Bit rate: 24, 32, and 48 kbit/s
Algorithm: Transform coding (Siren14™)
Frame size: 20 ms
Algorithmic delay: 40 ms
Complexity: <11 WMOPS (encoder+decoder)
Very high audio quality
Suitable for video and teleconferencing and Internet streaming
Available on royalty-free licensing terms

3 Overview of Main G.722.1 Mode
Wideband coding standard approved by ITU-T in 1998
Provides 50-7000 Hz audio bandwidth at 24 and 32 kbit/s
Based on transform coding, using a Modulated Lapped Transform (MLT)
Operates on frames of 20 ms, corresponding to 320 samples at a 16 kHz sampling rate
A look-ahead of 20 ms due to the 50% overlap between frames
Total algorithmic delay of 40 ms
Very low computational complexity (about 5.3 WMOPS)
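The 20 ms look-ahead follows from the 50% overlap: each transform block spans two frames, so a frame can only be reconstructed once the next one has arrived. The sketch below illustrates this with a generic sine-windowed MDCT, which is closely related to the MLT. It is not the G.722.1 reference code; the function names and the small block size are assumptions chosen for illustration.

```python
import math


def sine_window(n_coefs):
    """Sine window of length 2N; it satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * n_coefs))
            for n in range(2 * n_coefs)]


def mdct(block, window):
    """Map one windowed block of 2N samples to N coefficients."""
    n_coefs = len(block) // 2
    return [
        sum(block[n] * window[n]
            * math.cos(math.pi / n_coefs * (n + 0.5 + n_coefs / 2) * (k + 0.5))
            for n in range(2 * n_coefs))
        for k in range(n_coefs)
    ]


def imdct(coefs, window):
    """Map N coefficients back to 2N windowed, time-aliased samples."""
    n_coefs = len(coefs)
    return [
        2.0 / n_coefs * window[n]
        * sum(coefs[k]
              * math.cos(math.pi / n_coefs * (n + 0.5 + n_coefs / 2) * (k + 0.5))
              for k in range(n_coefs))
        for n in range(2 * n_coefs)
    ]


def roundtrip(signal, n_coefs):
    """Analyse and resynthesise a signal using blocks that overlap by 50%.

    len(signal) is assumed to be a multiple of n_coefs (one frame = N samples).
    """
    window = sine_window(n_coefs)
    padded = [0.0] * n_coefs + list(signal) + [0.0] * n_coefs
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_coefs + 1, n_coefs):
        block = padded[start:start + 2 * n_coefs]
        rebuilt = imdct(mdct(block, window), window)
        for n, value in enumerate(rebuilt):
            out[start + n] += value  # overlap-add cancels the time aliasing
    return out[n_coefs:n_coefs + len(signal)]
```

Each frame of N new samples yields N coefficients, yet every output sample is the sum of two overlapping blocks. With N = 320 at 16 kHz, that overlap corresponds to the one-frame (20 ms) look-ahead quoted on the slide.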

4 G.722.1C: Extension Mode of G.722.1
Audio signal sampled at 32 kHz
Double the audio bandwidth from 7 kHz to 14 kHz
Same algorithmic steps as the main mode of G.722.1
Same frame size as G.722.1: 20 ms
Total algorithmic delay of 40 ms
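The frame figures for both modes follow from simple arithmetic on the sample rate, bit rate, and frame length. A minimal sketch, assuming a hypothetical helper named `frame_parameters` (the bits-per-frame values are derived here and are not quoted on the slides):

```python
def frame_parameters(sample_rate_hz, bit_rate_bps, frame_ms=20, lookahead_ms=20):
    """Derive per-frame quantities from the rates quoted on the slides."""
    return {
        "samples_per_frame": sample_rate_hz * frame_ms // 1000,
        "bits_per_frame": bit_rate_bps * frame_ms // 1000,
        # One frame of buffering plus one frame of look-ahead.
        "algorithmic_delay_ms": frame_ms + lookahead_ms,
    }


# Main mode (16 kHz) and extension mode (32 kHz), both at 32 kbit/s.
main_mode = frame_parameters(16_000, 32_000)
annex_c = frame_parameters(32_000, 32_000)
print(main_mode["samples_per_frame"], annex_c["samples_per_frame"])
```

Doubling the sample rate doubles the samples per frame (320 to 640) while the frame duration, and therefore the 40 ms algorithmic delay, stays the same.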

5 Block Diagram of the G.722.1C Encoder

6 Block Diagram of the G.722.1C Decoder

7 Encoder of G.722.1 Annex C
Double the MLT transform length from 320 to 640 samples
Double the number of frequency regions from 14 to 28
Double the Huffman coding tables for encoding quantized region power indices
Double the threshold for adjusting the number of available bits from 320 to 640
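The doubling of transform length and region count lines up with the doubling of audio bandwidth. The sketch below works through the arithmetic; the region width of 20 MLT coefficients is an assumption taken from the G.722.1 Recommendation rather than from these slides, and `region_layout` is a hypothetical helper.

```python
COEFS_PER_REGION = 20  # assumption: region width from G.722.1, not stated on the slides


def region_layout(sample_rate_hz, transform_len, num_regions):
    """Return (coefficient spacing, region width, coded bandwidth), all in Hz."""
    bin_hz = (sample_rate_hz / 2) / transform_len
    region_hz = bin_hz * COEFS_PER_REGION
    return bin_hz, region_hz, region_hz * num_regions


print(region_layout(16_000, 320, 14))  # main mode of G.722.1
print(region_layout(32_000, 640, 28))  # G.722.1 Annex C
```

Under that assumption the coefficient spacing is the same in both modes, and 28 regions cover twice the bandwidth of 14.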

8 Decoder of G.722.1 Annex C
Double the number of frequency regions from 14 to 28
Double the threshold for adjusting the number of available bits from 320 to 640
Extend the centroid table for reconstruction of MLT coefficients
Double the IMLT transform length from 320 to 640 samples

9 Computational Complexity and Memory Requirements of G.722.1C
Bit rate (kbit/s)   Encoder (WMOPS)   Decoder (WMOPS)   Enc.+Dec. (WMOPS)
24                  4.5               5.3               9.7
32                  4.8               5.5               10.3
48                  5.1               5.9               10.9

Memory requirements
RAM (K bytes)       18
ROM (K bytes)       30

10 Computational Complexity of G.722.1C versus the 3GPP Audio Codecs
Bit rate (kbit/s)   G.722.1C (WMOPS)   eAAC+ (WMOPS)   AMR-WB+ (WMOPS)
24                  9.7                40.8            80.1
32                  10.3               42.6            86.7
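To put the comparison in relative terms, the ratios can be computed directly from the WMOPS figures quoted on this slide. This is a small illustrative calculation; the `complexity_ratios` helper is hypothetical.

```python
# WMOPS figures as quoted on the slide (encoder + decoder).
WMOPS = {
    24: {"G.722.1C": 9.7, "eAAC+": 40.8, "AMR-WB+": 80.1},
    32: {"G.722.1C": 10.3, "eAAC+": 42.6, "AMR-WB+": 86.7},
}


def complexity_ratios(table, baseline="G.722.1C"):
    """Express each codec's complexity as a multiple of the baseline codec."""
    return {
        rate: {
            codec: round(value / row[baseline], 1)
            for codec, value in row.items()
            if codec != baseline
        }
        for rate, row in table.items()
    }


print(complexity_ratios(WMOPS))
```

On these numbers, eAAC+ needs roughly four times and AMR-WB+ roughly eight times the operations of G.722.1C at both bit rates.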

11 Algorithmic Delay of G.722.1C versus the 3GPP Audio Codecs
Codec       Algorithmic delay (ms)
G.722.1C    40.0
eAAC+       129.9 [1]
AMR-WB+     113.8 [2]

Note 1: Without bit-reservoir (see 3GPP TR V6.1.0)
Note 2: ISF = 25.6 kHz (see 3GPP TR V6.1.0)

12 ITU-T Subjective Characterization Tests
Subjective tests performed by France Telecom according to a test plan designed by ITU-T SG12 SQEG
Characterization test Phase 1: Speech
- ACR for clean speech and DCR for noisy speech
Characterization test Phase 2: Music and mixed content
- MUSHRA method
Reference codec: MPEG-4 AAC-LD PCEnc/DecPro
Additional reference codecs: 3GPP eAAC+ and AMR-WB+
Requirements: Not worse than the reference codec for a 99% confidence interval

13 ITU-T Subjective Test Results (Phase 1)
(MOS)

14 ITU-T Subjective Test Results (Phase 1)
(DMOS)

15 ITU-T Subjective Test Results (Phase 1)
(DMOS)

16 ITU-T Subjective Test Results (Phase 1)
(DMOS)

17 ITU-T Subjective Test Results (Phase 2)
(MUSHRA)

18 ITU-T Subjective Test Results (Phase 2)
(MUSHRA)

19 ITU-T Subjective Test Results (Phase 2)
(MUSHRA)

20 Conclusion
G.722.1C met all performance requirements
Phase 1 (clean and noisy speech)
- 24 kbit/s: Better than AAC-LD and not worse than eAAC+
- 32 kbit/s: Better than AAC-LD, not worse than eAAC+, and not worse than AMR-WB+ in most of the tests
- 48 kbit/s: Not worse than AAC-LD at 48 and 64 kbit/s
Phase 2 (music and mixed content)
- 24 kbit/s: Better than AAC-LD
- 32 kbit/s: Better than AAC-LD
- 48 kbit/s: Better than AAC-LD at 48 and 64 kbit/s
Executables, audio samples, and more information available at :

21 Acknowledgment
The authors would like to acknowledge Claude Lamblin, ITU-T Q.10/SG16 Rapporteur, and Catherine Quinquis, ITU-T Q.7/SG12 Rapporteur, for their great work guiding this project to completion. In addition, the authors would like to thank the speech quality experts and staff who performed the subjective characterization tests at France Telecom.

