
A STUDY OF DESIGN COMPROMISES FOR SPEECH CODERS IN PACKET NETWORKS

Roch Lefebvre, Philippe Gournay
University of Sherbrooke, Sherbrooke, Quebec, Canada

Redwan Salami
VoiceAge Corp., Montreal, Quebec, Canada

1. INTRODUCTION

In voice over packet networks, the coding gain achieved by prediction-based speech coders is offset by packet losses. Concealment must be applied to the missing packets, which reduces quality for two main reasons:

- Not all missing packets can be concealed, especially when concealment uses only the past signal → onsets, transients.
- The concealment error can propagate over several frames, even frames received correctly → culprit: desynchronisation of the excitation content (LTP).

We propose to compare two approaches for alleviating this problem:

- Adding redundancy to increase the robustness of a baseline predictive encoder (G.729).
- Using a speech coding model which does not have interframe dependencies (iLBC).

To be compared, solutions should have comparable bit rates.

2. ADDED REDUNDANCY versus FRAME INDEPENDENCE

Approach 1: Use a lower bit rate, predictive (CELP) coder, and add channel redundancy to improve robustness to missing frames.

Approach 2: Use a higher bit rate, non-predictive or « frame-independent » codec, to improve robustness to missing frames in the core codec itself.

Codec_P: G.729 (CELP-based)
Codec_FI: iLBC (frame-independent)

[Figure: Codec_P uses 10 ms frames with long-term prediction from the past excitation; Codec_FI uses a 20 ms frame encoded in « absolute ».]

[Figure: anticipated gains in quality. Quality (robustness to frame loss) versus % FER for Codec_P, Codec_FI or Codec_P + R, and Codec_P + R + Delay; total payload bit rate of Codec_P plus redundancy R compared with Codec_FI. The values 11.8 and 15 appear in the figure.]

3. PROPOSED APPROACHES FOR ADDING REDUNDANCY

Consider only G.729 at 8 kbps (baseline predictive coder) and add redundancy to obtain bit rates similar to iLBC at 15.2 kbps.

Content of each 20-ms packet, where P(k) is a packet and F(n) is a 10-ms G.729 frame:

G.729-0:            P(k) = F(2k), F(2k+1)   (two G.729 frames)
G.729-1:            P(k) = F(2k), F(2k+1), F(2k+2)
G.729-2 / G.729-3:  P(k) = F(2k), F(2k+1), F'(2k-2), F'(2k-1)
G.729-4:            P(k) = F(2k), F(2k+1), F(2k-2), F(2k-1)

In G.729-2 and G.729-3, F'(n) denotes F(n) without the 18 LSF bits and the pitch parity bit (hence, frame F'(n) has 19 bits fewer than frame F(n)). The missing LSFs have to be extrapolated at the decoder when a frame is missing. G.729-2 and G.729-3 differ at the decoder:

- G.729-2: Decode packet P(k) when it arrives (do not wait for packet P(k+1)). If packet P(k) is missing, apply concealment, then resynchronise the filter memories using F'(2k) and F'(2k+1), which are received when packet P(k+1) arrives. Then start decoding packet P(k+1).
- G.729-3: Decode packet P(k) only after packet P(k+1) has arrived (additional delay of 20 ms). If packet P(k) was missing, just use F'(2k) and F'(2k+1), which are carried as redundancy in packet P(k+1). No concealment is applied in this case.
- G.729-4: At the decoder, wait for packet P(k+1) before decoding packet P(k).

Bit rate and algorithmic delay:

Solution    R (kbps)    D (ms)
G.729-0      8           25
G.729-1     12           35
G.729-2     14.1         25
G.729-3     14.1         45
G.729-4     16           45
iLBC        15.2         25

(In the original plot, point size is proportional to quality at 10 % FER.)

4. EFFECT ON ERROR PROPAGATION

[Figure: G.729 synthesis with 20 ms packets and the 3rd packet lost, followed by the error at the decoder for G.729-0, G.729-1, G.729-2, G.729-3, G.729-4 and iLBC. The iLBC error is measured against iLBC synthesis without frame loss.]

- G.729-0: Every missing 20-ms packet implies that two consecutive 10-ms frames of G.729 are lost. Concealment and propagation introduce large artefacts.
- G.729-1: Every missing 20-ms packet reduces to a single 10-ms frame loss in G.729. Concealment is more effective, and propagation is reduced.
- G.729-2: Concealment followed by approximate resynchronisation of filter memories.
- G.729-3: Limited concealment (there would be no concealment if F' were equal to F).
- G.729-4: No effective loss for any single packet loss.
- iLBC: Concealment, but limited error propagation (only due to post-filtering at the decoder to smooth frame transitions).

5. SUBJECTIVE EXPERIMENT

A formal listening test was conducted to compare the different solutions for increasing robustness to missing packets. The main features of this test are:

- clean speech, narrowband, IRS filtered
- 4 male, 4 female speakers
- 32 naive listeners, listening over binaural headphones
- guidelines of ITU-T Rec. P.800 followed
- 36 conditions in total, including MNRU and other reference conditions
- 0 – 20 % random packet losses, synchronized between iLBC and G.729

6. LISTENING TEST RESULTS

[Figure: listening test results.]

7. CONCLUSIONS

From the test results, we can draw the following conclusions:

- In clean channel conditions, iLBC at 15.2 kbps has quality equivalent to G.729 at 8 kbps (i.e. a much higher bit rate is necessary in a « frame-independent » coder to improve quality in both clean channel and frame loss conditions) → extreme example = G.711 at 64 kbps.
- The best quality in frame loss conditions was achieved by using a low-rate CELP coder with added redundancy and delay (G.729-4), with a total bit rate close to that of iLBC (16 kbps compared to 15.2 kbps).
- The approaches studied to increase robustness represent only a subset of all possible combinations. Only solutions based on a standard CELP coder (G.729) were considered, and some of them are not optimal (e.g. G.729-2). Improved results could be expected by designing a solution without the constraint of using standard core codecs.
- The G.729 RTP payload can already support solutions G.729-1 and G.729-4.
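The packet contents described for G.729-0 to G.729-4 can be checked with a short sketch. The Python below is only an illustration under stated assumptions: the names `SCHEMES`, `bit_rate_kbps` and `unrecovered_frames` are hypothetical, the 80 bits per frame follow from G.729 at 8 kbps with 10-ms frames, and the G.729-1 layout (each packet also carries the first frame of the next packet) is one reading of the packet diagram that is consistent with a 12 kbps payload. A reduced frame F' is counted as "recovered" even though its LSFs still have to be extrapolated.

```python
FRAME_BITS = 80               # G.729 at 8 kbps: 80 bits per 10-ms frame
REDUCED_BITS = FRAME_BITS - 19  # F': no 18 LSF bits, no pitch parity bit
PACKET_MS = 20                # one packet every 20 ms

# Hypothetical description of each scheme: frame offsets, relative to 2k,
# carried in packet P(k) as full copies (F) or reduced copies (F').
SCHEMES = {
    "G.729-0":   {"full": [0, 1],         "reduced": []},
    "G.729-1":   {"full": [0, 1, 2],      "reduced": []},       # assumed layout
    "G.729-2/3": {"full": [0, 1],         "reduced": [-2, -1]},
    "G.729-4":   {"full": [0, 1, -2, -1], "reduced": []},
}


def bit_rate_kbps(scheme):
    """Payload bit rate: bits per packet divided by packet duration in ms."""
    s = SCHEMES[scheme]
    bits = len(s["full"]) * FRAME_BITS + len(s["reduced"]) * REDUCED_BITS
    return bits / PACKET_MS   # bits per millisecond equals kbps


def unrecovered_frames(scheme, lost_packets, n_packets):
    """Frames 0 .. 2*n_packets-1 that no received packet carries."""
    s = SCHEMES[scheme]
    offsets = s["full"] + s["reduced"]
    received = set()
    for k in range(n_packets):
        if k in lost_packets:
            continue
        for off in offsets:
            received.add(2 * k + off)
    return [f for f in range(2 * n_packets) if f not in received]


if __name__ == "__main__":
    for name in SCHEMES:
        lost = unrecovered_frames(name, lost_packets={2}, n_packets=5)
        print(name, bit_rate_kbps(name), "kbps, frames not recovered:", lost)
```

With the third packet (index 2) lost out of five, the sketch reports two missing frames for G.729-0, one for G.729-1 and none for the schemes that repeat the previous packet's frames, which matches the behaviour described in Section 4. It does not model concealment, delay or error propagation.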

