Adaptive Multi-Frame-Rate Scheme for Distributed Speech Recognition Based on a Half Frame-Rate Front-End. IEEE MMSP 2005, Shanghai, China, November 1, 2005.

Presentation transcript:

Slide 1. Adaptive Multi-Frame-Rate Scheme for Distributed Speech Recognition Based on a Half Frame-Rate Front-End
Zheng-Hua Tan, Paul Dalsgaard and Børge Lindberg
Aalborg University, Denmark
IEEE MMSP 2005, Shanghai, China, November 1, 2005

Slide 2. Outline
- Background and motivation
- Half frame-rate front-end
- Experimental evaluation
- Adaptive multi-frame-rate DSR scheme
- Experimental evaluation
- Conclusions

Slide 3. Background and motivation
Distributed speech recognition (DSR): automatic speech recognition (ASR) over mobile networks.
Challenges introduced by networking:
- Bandwidth limitations
- Transmission errors
Diagram: Speech -> Feature extraction -> Source & channel coding -> Network constraints -> Source & channel decoding -> ASR decoding -> Words

Slide 4. Background and motivation
Existing solutions:
- Source coding to compress speech features, e.g. split vector quantization, discrete cosine transform
- Channel coding and error concealment to protect and recover speech features
Our alternative solutions: work in the front-end feature extraction stage, based on the redundancies known to exist in full frame-rate (FFR) features:
- Half frame-rate (HFR) front-end
- Adaptive multi-frame-rate scheme

Slide 5. Full frame-rate front-end
Temporal correlation between speech features is caused by:
- Vocal tract inertia
- Overlapping in the feature extraction procedure
Figure: consecutive 25 ms frames with a 15 ms overlap, i.e. a 10 ms frame shift (25 ms length minus 15 ms overlap).

Slide 6. Half frame-rate front-end
25 ms frame length & 20 ms frame shift -> 5 ms overlap
But why is the FFR front-end prevalent in ASR systems? And why is the HFR front-end promising in DSR?
Figure: consecutive 25 ms frames with a 20 ms frame shift and a 5 ms overlap.
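The frame arithmetic on the two front-end slides can be checked with a short sketch. It assumes the FFR frame shift is 10 ms (25 ms frame length minus the 15 ms overlap quoted for the FFR front-end); the function names and the 1000 ms signal duration are illustrative and not taken from the slides.

```python
def frame_starts(signal_ms, frame_len_ms=25, frame_shift_ms=10):
    """Return the start time (in ms) of every full frame that fits in the signal."""
    if signal_ms < frame_len_ms:
        return []
    return list(range(0, signal_ms - frame_len_ms + 1, frame_shift_ms))


def overlap_ms(frame_len_ms, frame_shift_ms):
    """Overlap between consecutive frames; zero if the shift exceeds the length."""
    return max(frame_len_ms - frame_shift_ms, 0)


# Illustrative 1-second signal (assumed duration).
ffr = frame_starts(1000, frame_len_ms=25, frame_shift_ms=10)  # full frame rate
hfr = frame_starts(1000, frame_len_ms=25, frame_shift_ms=20)  # half frame rate

print(len(ffr), overlap_ms(25, 10))
print(len(hfr), overlap_ms(25, 20))
```

With these settings the HFR front-end produces half as many frames as the FFR front-end, and its frame start times coincide with every second FFR frame.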

Slide 7. HFR front-end in DSR
Observation: the performance degradation of DSR is marginal when packet loss occurs in short bursts, provided that a proper error concealment technique is applied.
So why not deliberately drop some packets (speech frames)?
-> HFR + repetition 'error concealment': prior to server-side recognition, each HFR feature vector is repeated once to construct the equivalent FFR vector sequence.
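The repetition step described on this slide can be sketched in a few lines. This is a minimal illustration, assuming feature vectors are plain lists of floats; the function name `repeat_hfr` and the sample values are hypothetical.

```python
def repeat_hfr(hfr_vectors):
    """Repeat each HFR feature vector once to build an FFR-length sequence."""
    ffr_equivalent = []
    for vec in hfr_vectors:
        # Copy the vector twice so the two output frames are independent lists.
        ffr_equivalent.append(list(vec))
        ffr_equivalent.append(list(vec))
    return ffr_equivalent


# Two illustrative 2-dimensional feature vectors (made-up values).
hfr_features = [[1.0, 2.0], [3.0, 4.0]]
ffr_features = repeat_hfr(hfr_features)
print(ffr_features)
```

The output sequence has twice as many vectors as the input, so it can be passed to a recogniser trained on FFR features without retraining the models.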

Slide 8. Experiments
Recognition accuracy (%) across the front-ends for three databases using FFR models.
Table (values not preserved in this transcript): rows FFR, HFR-Repetition, HFR-NoRepetition; columns Danish digits, City names, Aurora 2 (TI digits).
Repetition of each HFR feature vector is critical!

Slide 9. Derived DSR schemes
- The FFR-based ETSI-DSR standard
- The HFR front-end: half the bit rate
- FFR-based one-frame coding
- FFR-based interleaving24: no delay when there are no transmission errors, as opposed to regular interleaving!
- FFR-based multiple description coding (MDC): odd-numbered & even-numbered feature vectors
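The odd/even split behind the MDC scheme can be illustrated as follows. The slide only states that the two descriptions carry the odd-numbered and even-numbered feature vectors; recovering a lost description by repeating the surviving one, and treating loss at the granularity of a whole description, are assumptions made here for illustration. All function names are hypothetical.

```python
def split_descriptions(ffr_vectors):
    """Split an FFR sequence into odd- and even-numbered frames (1-based)."""
    odd = ffr_vectors[0::2]   # frames 1, 3, 5, ...
    even = ffr_vectors[1::2]  # frames 2, 4, 6, ...
    return odd, even


def merge_descriptions(odd, even):
    """Rebuild an FFR-length sequence; pass None for a lost description."""
    if odd is not None and even is not None:
        merged = []
        for i, vec in enumerate(odd):
            merged.append(vec)
            if i < len(even):
                merged.append(even[i])
        return merged
    survivor = odd if odd is not None else even
    if survivor is None:
        return []
    # Assumed concealment: repeat each surviving vector once (as in HFR repetition).
    return [vec for v in survivor for vec in (v, v)]


frames = [[0.0], [1.0], [2.0], [3.0]]
odd, even = split_descriptions(frames)
print(merge_descriptions(odd, even))
print(merge_descriptions(odd, None))
```

When both descriptions arrive, the original sequence is restored exactly; when one is lost, the result is equivalent to the HFR front-end with repetition. For an odd number of input frames the repeated sequence may differ from the original length by one frame.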

Slide 10. Comparison of DSR schemes
Robustness against transmission errors (word error rate, %), Aurora 2 database corrupted by GSM error pattern 3 (4 dB C/I ratio).
Chart (WER values not preserved in this transcript): Error-free, MDC, Interleaving24, Half frame-rate with repetition, ETSI-DSR standard, No CRC.
Which is the best?

Slide 11. Adaptive multi-frame-rate scheme
Block diagram:
- Client front-end: Speech -> FFR front-end / HFR front-end -> Split VQ coder -> Channel encoder
- Error-prone channel
- Server back-end: Channel decoder incl. EC -> Split VQ decoder -> Recogniser -> Words
- Network context (additional label in the diagram)

Slide 12. Conclusions
Half frame-rate front-end for DSR:
- Half frame-rate, half bit-rate, half client-side computation.
- Comparable performance, but repetition of HFR features is critical.
Adaptive multi-frame-rate DSR scheme:
- HFR one-frame coding
- Interleaving: no delay when there are no transmission errors
- MDC: performance close to that of an error-free channel