Minjie Xie, Dave Lindbergh, and Peter Chu

Presentation transcript:

ITU-T G.722.1 ANNEX C: A NEW LOW-COMPLEXITY 14 kHz AUDIO CODING STANDARD
Minjie Xie, Dave Lindbergh, and Peter Chu
ICASSP 2006

G.722.1C: First ITU-T Super-wideband Audio Coding Standard
- Audio bandwidth: 14 kHz
- Sample rate: 32 kHz
- Bit rate: 24, 32, and 48 kbit/s
- Algorithm: Transform coding (Siren14™)
- Frame size: 20 ms
- Algorithmic delay: 40 ms
- Complexity: <11 WMOPS (encoder+decoder)
- Very high audio quality
- Suitable for video and teleconferencing and Internet streaming
- Available on royalty-free licensing terms
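The bit budget per frame follows from the bit rates and the 20 ms frame size listed above. A minimal sketch of that arithmetic (the variable names are illustrative, not taken from the standard):

```python
# Bits available per 20 ms frame at each G.722.1C bit rate.
FRAME_MS = 20                  # frame size from the slide
BIT_RATES_KBPS = (24, 32, 48)  # bit rates from the slide

# kbit/s multiplied by ms gives bits.
bits_per_frame = {rate: rate * FRAME_MS for rate in BIT_RATES_KBPS}

for rate, bits in bits_per_frame.items():
    print(f"{rate} kbit/s -> {bits} bits ({bits // 8} bytes) per frame")
```

Each frame therefore fits in a small fixed-size payload, which is convenient for packet-based conferencing.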

Overview of Main G.722.1 Mode
- Wideband coding standard approved by ITU-T in 1998
- Provides 50-7000 Hz audio bandwidth at 24 and 32 kbit/s
- Based on transform coding, using a Modulated Lapped Transform (MLT)
- Operates on frames of 20 ms, corresponding to 320 samples at a 16 kHz sampling rate
- Look-ahead of 20 ms due to the 50% overlap between frames
- Total algorithmic delay of 40 ms
- Very low computational complexity (about 5.3 WMOPS)
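To illustrate why a 50% overlapped transform needs one frame of look-ahead, here is a minimal floating-point sketch of a textbook MLT (an MDCT with a sine window). It is an assumption-laden illustration, not the bit-exact fixed-point procedure of the Recommendation: the function names are hypothetical, and a small block size M = 16 is used for speed where the main mode would use 320.

```python
import math
import random

def mlt_window(M):
    # Sine window over 2M samples; satisfies h[n]^2 + h[n+M]^2 = 1.
    return [math.sin((n + 0.5) * math.pi / (2 * M)) for n in range(2 * M)]

def mlt(block, M):
    """Forward MLT: 2M input samples -> M coefficients."""
    h = mlt_window(M)
    scale = math.sqrt(2.0 / M)
    return [
        scale * sum(
            h[n] * block[n]
            * math.cos(math.pi / M * (n + (M + 1) / 2) * (k + 0.5))
            for n in range(2 * M)
        )
        for k in range(M)
    ]

def imlt(coefs, M):
    """Inverse MLT: M coefficients -> 2M windowed, time-aliased samples."""
    h = mlt_window(M)
    scale = math.sqrt(2.0 / M)
    return [
        scale * h[n] * sum(
            coefs[k] * math.cos(math.pi / M * (n + (M + 1) / 2) * (k + 0.5))
            for k in range(M)
        )
        for n in range(2 * M)
    ]

M = 16  # illustrative frame length (hypothetical; chosen small for speed)
random.seed(0)
x = [random.uniform(-1.0, 1.0) for _ in range(3 * M)]

# Two consecutive blocks that overlap by one frame (50%).
y0 = imlt(mlt(x[0:2 * M], M), M)
y1 = imlt(mlt(x[M:3 * M], M), M)

# Overlap-add cancels the aliasing and recovers the shared frame.
recon = [y0[M + n] + y1[n] for n in range(M)]
max_err = max(abs(recon[n] - x[M + n]) for n in range(M))
print(f"max reconstruction error: {max_err:.2e}")
```

The shared frame can only be reconstructed once the following block has been transformed, which is where the extra 20 ms of look-ahead comes from.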

G.722.1C: Extension Mode of G.722.1
- Audio signal sampled at 32 kHz
- Doubles the audio bandwidth from 7 kHz to 14 kHz
- Same algorithmic steps as the main mode of G.722.1
- Same frame size as G.722.1: 20 ms
- Total algorithmic delay of 40 ms
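The frame length in samples and the total delay can be checked with a few lines of arithmetic. This sketch assumes that the 20 ms look-ahead stated for the main mode carries over to the extension mode, which is consistent with the identical 40 ms total delay; the names are illustrative.

```python
FRAME_MS = 20      # frame size, same in both modes
LOOKAHEAD_MS = 20  # one extra frame, from the 50% overlap (assumed for both modes)

SAMPLE_RATES_HZ = {"G.722.1": 16000, "G.722.1C": 32000}

# Samples per 20 ms frame in each mode.
samples_per_frame = {
    name: rate * FRAME_MS // 1000 for name, rate in SAMPLE_RATES_HZ.items()
}

# Algorithmic delay is the frame itself plus the look-ahead.
algorithmic_delay_ms = FRAME_MS + LOOKAHEAD_MS

print(samples_per_frame, algorithmic_delay_ms)
```

Doubling the sample rate doubles the samples per frame but leaves the delay unchanged, because the delay depends only on the frame duration.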

Block Diagram of the G.722.1C Encoder

Block Diagram of the G.722.1C Decoder

Encoder of G.722.1 Annex C
- Double the MLT transform length from 320 to 640 samples
- Double the number of frequency regions from 14 to 28
- Double the Huffman coding tables for encoding quantized region power indices
- Double the threshold for adjusting the number of available bits from 320 to 640
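The doubling of the transform length and region count maps directly onto the doubled audio bandwidth. The sketch below assumes 20 MLT coefficients per region, a figure taken from the G.722.1 design rather than from this slide; the function and key names are hypothetical.

```python
COEFS_PER_REGION = 20  # assumption: region size from G.722.1, not stated on the slide

def region_layout(sample_rate_hz, mlt_length, num_regions):
    """Return the frequency layout implied by the transform parameters."""
    # The MLT coefficients span 0 Hz up to the Nyquist frequency.
    hz_per_coef = (sample_rate_hz / 2) / mlt_length
    coded_coefs = num_regions * COEFS_PER_REGION
    return {
        "hz_per_coef": hz_per_coef,
        "region_width_hz": hz_per_coef * COEFS_PER_REGION,
        "coded_coefs": coded_coefs,
        "coded_bandwidth_hz": coded_coefs * hz_per_coef,
    }

main_mode = region_layout(16000, 320, 14)
annex_c = region_layout(32000, 640, 28)

print(main_mode)
print(annex_c)
```

Under that assumption both modes use the same 25 Hz coefficient spacing and 500 Hz regions, so Annex C simply appends 14 more regions above 7 kHz.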

Decoder of G.722.1 Annex C
- Double the number of frequency regions from 14 to 28
- Double the threshold for adjusting the number of available bits from 320 to 640
- Extend the centroid table for reconstruction of MLT coefficients
- Double the IMLT transform length from 320 to 640 samples

Computational Complexity and Memory Requirements of G.722.1C
Bit rate (kbit/s)   Encoder (WMOPS)   Decoder (WMOPS)   Enc.+Dec. (WMOPS)
24                  4.5               5.3               9.7
32                  4.8               5.5               10.3
48                  5.1               5.9               10.9
Memory requirements
RAM (K bytes)   18
ROM (K bytes)   30

Computational Complexity of G.722.1C versus the 3GPP Audio Codecs
Bit rate (kbit/s)   G.722.1C (WMOPS)   eAAC+ (WMOPS)   AMR-WB+ (WMOPS)
24                  9.7                40.8            80.1
32                  10.3               42.6            86.7
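The complexity gap is easier to read as a ratio. A short sketch using only the WMOPS figures from the comparison above (the dictionary layout is illustrative):

```python
# Encoder+decoder complexity in WMOPS, copied from the comparison table.
wmops = {
    24: {"G.722.1C": 9.7, "eAAC+": 40.8, "AMR-WB+": 80.1},
    32: {"G.722.1C": 10.3, "eAAC+": 42.6, "AMR-WB+": 86.7},
}

# Complexity of each 3GPP codec relative to G.722.1C at the same bit rate.
ratios = {
    rate: {
        codec: round(value / row["G.722.1C"], 1)
        for codec, value in row.items()
        if codec != "G.722.1C"
    }
    for rate, row in wmops.items()
}

for rate, row in ratios.items():
    print(f"{rate} kbit/s: {row}")
```

By these figures eAAC+ needs roughly four times, and AMR-WB+ roughly eight times, the computation of G.722.1C.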

Algorithmic Delay of G.722.1C versus the 3GPP Audio Codecs
Codec      Algorithmic delay (ms)
G.722.1C   40.0
eAAC+      129.9 [1]
AMR-WB+    113.8 [2]
Note 1: Without bit-reservoir (see 3GPP TR 26.936 V6.1.0)
Note 2: ISF = 25.6 kHz (see 3GPP TR 26.936 V6.1.0)

ITU-T Subjective Characterization Tests
- Subjective tests performed by France Telecom according to a test plan designed by ITU-T SG12 SQEG
- Characterization test Phase 1: Speech - ACR for clean speech and DCR for noisy speech
- Characterization test Phase 2: Music and mixed content - MUSHRA method
- Reference codec: MPEG-4 AAC-LD PCEnc/DecPro
- Additional reference codecs: 3GPP eAAC+ and AMR-WB+
- Requirements: Not worse than the reference codec for a 99% confidence interval

ITU-T Subjective Test Results (Phase 1) (MOS)

ITU-T Subjective Test Results (Phase 1) (DMOS)

ITU-T Subjective Test Results (Phase 1) (DMOS)

ITU-T Subjective Test Results (Phase 1) (DMOS)

ITU-T Subjective Test Results (Phase 2) (MUSHRA)

ITU-T Subjective Test Results (Phase 2) (MUSHRA)

ITU-T Subjective Test Results (Phase 2) (MUSHRA)

Conclusion
G.722.1C met all performance requirements.
Phase 1 (clean and noisy speech)
- 24 kbit/s: Better than AAC-LD and not worse than eAAC+
- 32 kbit/s: Better than AAC-LD, not worse than eAAC+, and not worse than AMR-WB+ in most tests
- 48 kbit/s: Not worse than AAC-LD at 48 and 64 kbit/s
Phase 2 (music and mixed content)
- 24 kbit/s: Better than AAC-LD
- 32 kbit/s: Better than AAC-LD
- 48 kbit/s: Better than AAC-LD at 48 and 64 kbit/s
Executables, audio samples, and more information are available at: http://www.polycom.com/Siren14

Acknowledgment
The authors would like to acknowledge Claude Lamblin, ITU-T Q.10/SG16 Rapporteur, and Catherine Quinquis, ITU-T Q.7/SG12 Rapporteur, for their great work guiding this project to completion. In addition, the authors would like to thank the speech quality experts and staff who performed the subjective characterization tests at France Telecom.