IIT Bombay Dr. Prem C. Pandey Dr. Pandey is a Professor in Electrical Engineering at IIT Bombay. He is currently also the Associate.


1 IIT Bombay pcpandey@ee.iitb.ac.in Dr. Prem C. Pandey
Dr. Pandey is a Professor in Electrical Engineering at IIT Bombay. He is currently also the Associate Dean of Academic Programmes. He received a B.Tech. in electronics engineering from Banaras Hindu University in 1979, an M.Tech. in electrical engineering from IIT Kanpur in 1981, and a Ph.D. in electrical & biomedical engineering from the University of Toronto (Canada) in 1987. In 1987, he joined the University of Wyoming (USA) as an assistant professor, and he joined IIT Bombay in 1989. His research interests include speech & signal processing; biomedical signal processing & instrumentation; and electronic instrumentation & embedded system design. His R&D efforts have focused on speech and hearing and on impedance cardiography.

2 P. C. Pandey (EE Dept, IIT Bombay): "Speech Processing for Persons with Moderate Sensorineural Hearing Impairment", Plenary talk, Second International Conference on Intelligent Interactive Technologies and Multimedia (IITM 2013), 09-11 March 2013, Allahabad, India

Abstract

Our objective is to develop techniques for improving speech perception by listeners with moderate-to-severe sensorineural loss and to implement these techniques on a low-power DSP chip for real-time operation with acceptable signal delay (< 60 ms). Here we present two techniques to reduce the adverse effects of the increased spectral masking associated with sensorineural loss. The first technique reduces the effects of noise in the listening environment, and the second reduces the effects of increased intra-speech spectral masking.

A spectral subtraction technique is presented for real-time speech enhancement in the aids used by hearing-impaired listeners. To reduce computational complexity and memory requirement, it uses a cascaded-median based estimation of the noise spectrum without voice activity detection. The technique is implemented and tested for satisfactory real-time operation on a 16-bit fixed-point DSP processor with on-chip FFT hardware, with a sampling frequency of 12 kHz, a window length of 30 ms with 50% overlap, and noise estimation by a 3-frame 4-stage cascaded median. Enhancement of speech with different types of additive stationary and non-stationary noise resulted in an SNR advantage of 4–13 dB.

Widening of auditory filters in persons with sensorineural hearing impairment leads to increased spectral masking and degraded speech perception. Multi-band frequency compression of the complex spectral samples using pitch-synchronous processing has been reported to improve speech perception by persons with moderate sensorineural loss. It is shown that implementing multi-band frequency compression using fixed-frame processing along with least-squares error based signal estimation reduces the processing delay, and that the speech output is perceptually similar to that from pitch-synchronous processing. The processing is implemented on a 16-bit fixed-point DSP processor, and real-time operation is achieved using about one-tenth of its computing capacity.

References

S. K. Waddi, P. C. Pandey, and N. Tiwari, "Speech Enhancement Using Spectral Subtraction and Cascaded Median Based Noise Estimation for Hearing Impaired Listeners", Proc. NCC 2013, Delhi, 15-17 Feb. 2013, Paper 3.2_2_1569696063.
N. Tiwari, P. C. Pandey, and P. N. Kulkarni, "Real-time Implementation of Multi-band Frequency Compression for Listeners with Moderate Sensorineural Impairment", Proc. Interspeech 2012, Portland, Oregon, 9-13 Sept. 2012, Paper 689.
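The noise-estimation step named in the abstract on slide 2 can be illustrated with a minimal Python sketch. It is not the authors' DSP implementation. It assumes that a "3-frame 4-stage cascaded median" means each stage takes the median of 3 successive values and hands it to the next stage, so that the last stage approximates a median over 3^4 = 81 frames while storing only 3 values per stage. The class name `CascadedMedian`, the function `spectral_subtract`, and the parameters `alpha` and `beta` are hypothetical and chosen only for illustration; the frame arithmetic uses the 12 kHz sampling rate and the 30 ms window with 50% overlap stated in the abstract.

```python
from statistics import median

# Framing parameters taken from the abstract.
FS_HZ = 12_000                           # sampling frequency
WINDOW_MS = 30                           # analysis window length
FRAME_LEN = FS_HZ * WINDOW_MS // 1000    # samples per window
HOP = FRAME_LEN // 2                     # 50% overlap


class CascadedMedian:
    """Noise-magnitude tracker for one frequency bin (illustrative sketch).

    Assumption: each stage buffers `frames_per_stage` values, emits their
    median to the next stage, and clears its buffer. The output of the
    last stage becomes the new noise estimate.
    """

    def __init__(self, frames_per_stage=3, stages=4, initial=0.0):
        self.frames_per_stage = frames_per_stage
        self.buffers = [[] for _ in range(stages)]
        self.estimate = initial

    def update(self, magnitude):
        value = magnitude
        for buf in self.buffers:
            buf.append(value)
            if len(buf) < self.frames_per_stage:
                return self.estimate     # this stage is not full yet
            value = median(buf)          # pass the stage median onward
            buf.clear()
        self.estimate = value            # last stage produced an output
        return self.estimate


def spectral_subtract(magnitudes, noise, alpha=2.0, beta=0.01):
    """Magnitude spectral subtraction with a spectral floor.

    alpha (over-subtraction factor) and beta (floor factor) are
    illustrative values, not taken from the talk.
    """
    return [max(x - alpha * n, beta * n) for x, n in zip(magnitudes, noise)]
```

In use, one tracker per FFT bin would be updated with that bin's magnitude every frame, and the resulting noise estimates passed to `spectral_subtract` before resynthesis. Under these assumptions the estimate is refreshed once every 81 frames, and each tracker stores at most 12 values instead of the 81 a direct median would need.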

3 Noise Suppression
[1] H. Levitt, J. M. Pickett, and R. A. Houde (eds.), Sensory Aids for the Hearing Impaired. New York: IEEE Press, 1980.
[2] J. M. Pickett, The Acoustics of Speech Communication: Fundamentals, Speech Perception Theory, and Technology. Boston, Mass.: Allyn and Bacon, 1999, pp. 289–323.
[3] H. Dillon, Hearing Aids. New York: Thieme Medical, 2001.
[4] T. Lunner, S. Arlinger, and J. Hellgren, "8-channel digital filter bank for hearing aid use: preliminary results in monaural, diotic, and dichotic modes," Scand. Audiol. Suppl., vol. 38, pp. 75–81, 1993.
[5] P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti, "Binaural dichotic presentation to reduce the effects of spectral masking in moderate bilateral sensorineural hearing loss," Int. J. Audiol., vol. 51, no. 4, pp. 334–344, 2012.
[6] J. Yang, F. Luo, and A. Nehorai, "Spectral contrast enhancement: Algorithms and comparisons," Speech Commun., vol. 39, no. 1–2, pp. 33–46, 2003.
[7] T. Arai, K. Yasu, and N. Hodoshima, "Effective speech processing for various impaired listeners," in Proc. 18th Int. Cong. Acoust., 2004, Kyoto, Japan, pp. 1389–1392.
[8] P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti, "Multi-band frequency compression for improving speech perception by listeners with moderate sensorineural hearing loss," Speech Commun., vol. 54, no. 3, pp. 341–350, 2012.
[9] P. C. Loizou, "Speech processing in vocoder-centric cochlear implants," in A. R. Moller (ed.), Cochlear and Brainstem Implants, Adv. Otorhinolaryngol., vol. 64, Basel: Karger, 2006, pp. 109–143.
[10] P. C. Loizou, Speech Enhancement: Theory and Practice. New York: CRC, 2007.
[11] R. Martin, "Spectral subtraction based on minimum statistics," in Proc. Eur. Signal Process. Conf., 1994, pp. 1182–1185.
[12] I. Cohen, "Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging," IEEE Trans. Speech Audio Process., vol. 11, no. 5, pp. 466–475, 2003.
[13] H. Hirsch and C. Ehrlicher, "Noise estimation techniques for robust speech recognition," in Proc. IEEE ICASSP, 1995, pp. 153–156.

4 Multiband frequency compression
[1] H. Levitt, J. M. Pickett, and R. A. Houde (eds.), Sensory Aids for the Hearing Impaired. New York: IEEE Press, 1980.
[2] B. C. J. Moore, An Introduction to the Psychology of Hearing. London, UK: Academic, 1997, pp. 66–107.
[3] J. M. Pickett, The Acoustics of Speech Communication: Fundamentals, Speech Perception Theory, and Technology. Boston, Mass.: Allyn and Bacon, 1999, pp. 289–323.
[4] H. Dillon, Hearing Aids. New York: Thieme Medical, 2001.
[5] T. Lunner, S. Arlinger, and J. Hellgren, "8-channel digital filter bank for hearing aid use: preliminary results in monaural, diotic, and dichotic modes," Scand. Audiol. Suppl., vol. 38, pp. 75–81, 1993.
[6] P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti, "Binaural dichotic presentation to reduce the effects of spectral masking in moderate bilateral sensorineural hearing loss," Int. J. Audiol., vol. 51, no. 4, pp. 334–344, 2012.
[7] T. Baer, B. C. J. Moore, and S. Gatehouse, "Spectral contrast enhancement of speech in noise for listeners with sensorineural hearing impairment: effects on intelligibility, quality, and response times," Int. J. Rehab. Res., vol. 30, no. 1, pp. 49–72, 1993.
[8] J. Yang, F. Luo, and A. Nehorai, "Spectral contrast enhancement: Algorithms and comparisons," Speech Commun., vol. 39, no. 1–2, pp. 33–46, 2003.
[9] T. Arai, K. Yasu, and N. Hodoshima, "Effective speech processing for various impaired listeners," in Proc. 18th Int. Cong. Acoust., 2004, Kyoto, Japan, pp. 1389–1392.
[10] K. Yasu, M. Hishitani, T. Arai, and Y. Murahara, "Critical-band based frequency compression for digital hearing aids," Acoustical Science and Technology, vol. 25, no. 1, pp. 61–63, 2004.
[11] P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti, "Multi-band frequency compression for reducing the effects of spectral masking," Int. J. Speech Tech., vol. 10, no. 4, pp. 219–227, 2009.
[12] P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti, "Multi-band frequency compression for improving speech perception by listeners with moderate sensorineural hearing loss," Speech Commun., vol. 54, no. 3, pp. 341–350, 2012.
[13] E. Zwicker, "Subdivision of the audible frequency range into critical bands (Frequenzgruppen)," J. Acoust. Soc. Am., vol. 33, no. 2, p. 248, 1961.
[14] D. G. Childers and H. T. Hu, "Speech synthesis by glottal excited linear prediction," J. Acoust. Soc. Am., vol. 96, no. 4, pp. 2026–2036, 1994.

