EEL 6586: AUTOMATIC SPEECH PROCESSING Windows Lecture Mark D. Skowronski Computational Neuro-Engineering Lab University of Florida February 10, 2003.



No, not MS Windows ® …

…not those either!

Speech windows: Speech is NONSTATIONARY.

Speech windows: Assume speech is stationary over a ‘short’ window of time. [Figure: utterance ‘SEVEN’]

What is a ‘short’ window of time? 10 μs: smallest difference detectable by auditory system (localization), 3 ms: shortest phoneme (plosive burst), 10 ms: glottal pulse period, 100 ms: average phoneme duration, 4 s: exhale period during speech. ‘Short’ depends on application.

Applications using windows: automatic speech recognition, speech coding/decoding, speaker identification, text-to-speech synthesis, noise reduction. Typical window (frame) length: ms. Typical frame rate: 100 frames/sec.

Short-time analysis: s(n): entire speech utterance; w(n): window function; x(n): frame of speech. The window function is non-zero for N samples, n = 0,…,N-1.
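The slide's formula is not preserved in the transcript. The sketch below assumes the usual form, in which each frame x(n) is an N-sample segment of s(n) multiplied sample by sample by w(n). The sampling rate, the 20 ms frame length, and the helper name `frame_signal` are illustrative assumptions; the 10 ms hop follows from the 100 frames/sec rate quoted earlier.

```python
import numpy as np

def frame_signal(s, window, hop):
    """Cut s into overlapping frames and taper each one with the window."""
    frame_len = len(window)
    # Only complete frames are kept; a short tail at the end is dropped.
    n_frames = max(0, 1 + (len(s) - frame_len) // hop)
    frames = np.empty((n_frames, frame_len))
    for i in range(n_frames):
        start = i * hop
        frames[i] = s[start:start + frame_len] * window
    return frames

fs = 8000                         # assumed sampling rate in Hz
frame_len = fs * 20 // 1000       # assumed 20 ms frame -> 160 samples
hop = fs // 100                   # 100 frames/sec -> 80-sample hop
s = np.random.default_rng(0).standard_normal(fs)   # 1 s of noise standing in for speech
w = np.hamming(frame_len)         # any window of length frame_len works here

frames = frame_signal(s, w, hop)
print(frames.shape)
```

Each row of `frames` is one x(n); consecutive rows overlap by half a frame under these assumed settings.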

Short-term Fourier Transform: s(m): entire speech utterance; w(m): window function; X(n,ω): STFT of speech at time n. The STFT is a smoothed version of the original spectrum.
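The STFT formula itself is missing from the transcript. A common way to compute it in practice, assumed here, is to take the DFT of each windowed frame, which samples X(n,ω) on a grid of frequencies (up to a phase convention). The function name `stft` and all parameter values are hypothetical.

```python
import numpy as np

def stft(s, window, hop, n_fft=None):
    """DFT of successive windowed frames; rows are time, columns are frequency."""
    N = len(window)
    n_fft = n_fft or N
    n_frames = max(0, 1 + (len(s) - N) // hop)
    X = np.empty((n_frames, n_fft // 2 + 1), dtype=complex)
    for i in range(n_frames):
        frame = s[i * hop:i * hop + N] * window
        X[i] = np.fft.rfft(frame, n=n_fft)   # zero-pads the frame to n_fft points
    return X

fs = 8000                                  # assumed sampling rate in Hz
t = np.arange(fs) / fs
s = np.sin(2 * np.pi * 1000 * t)           # 1 kHz test tone, 1 s long
X = stft(s, np.hamming(160), hop=80, n_fft=256)

peak_bin = np.argmax(np.abs(X), axis=1)    # strongest bin in every frame
peak_hz = peak_bin * fs / 256
print(X.shape, peak_hz[0])
```

For a stationary tone every frame peaks at the same frequency; for speech the peak positions move from frame to frame, which is the point of the short-time view.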

STFT example: s(n): pure sinewave of infinite length; w(n): rectangular window, w(n) = 1 for n = 0,…,N-1 and 0 otherwise.

STFT example: |W(ω)| * |S(ω)| = |X(ω)|. [Figure: window spectrum, sinewave spectrum with a line at ω0, and the resulting spectrum centered at ω0]
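The example can be checked numerically. The sketch below windows a cosine with a rectangular window and confirms that its spectrum equals two shifted, half-amplitude copies of the window spectrum, one at +ω0 and one at -ω0. The length N, the FFT size, and the tone frequency are assumed values chosen so that ω0 falls exactly on an FFT bin.

```python
import numpy as np

N = 64            # window length in samples (assumed)
n_fft = 4096      # zero-padded FFT size, gives a dense frequency grid
k0 = 1024         # tone sits on FFT bin k0, i.e. w0 = 2*pi*k0/n_fft = pi/2 rad/sample

n = np.arange(N)
w = np.ones(N)                               # rectangular window
x = np.cos(2 * np.pi * k0 * n / n_fft) * w   # the N samples the window keeps

X = np.fft.fft(x, n_fft)
W = np.fft.fft(w, n_fft)

# A cosine is half the sum of two complex exponentials, so X is the sum of
# two copies of W, circularly shifted to +k0 and -k0 and scaled by 1/2.
X_pred = 0.5 * np.roll(W, k0) + 0.5 * np.roll(W, -k0)
max_err = np.max(np.abs(X - X_pred))

mag = np.abs(X[:n_fft // 2])
peak_bin = int(np.argmax(mag))
step = n_fft // N                            # FFT points between window-spectrum nulls
first_sidelobe = mag[k0 + step:k0 + 2 * step].max()
sidelobe_db = 20 * np.log10(first_sidelobe / mag[k0])
print(max_err, peak_bin, round(sidelobe_db, 1))
```

The single spectral line of the infinite sinewave is replaced by the main lobe and side lobes of the window, which is the smoothing the STFT introduces.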

Window types: Rectangular, Hann (cosine), Hamming (raised cosine), Blackman, Kaiser-Bessel. Tradeoff between leakage and blurring.

Window tradeoff: Blurring: main lobe width (A). Leakage: side lobe suppression (B). [Figure: window spectrum with A and B marked]

Popular windows
Window          Unit BW   Sidelobe
Rectangle       1         -13 dB
Hann            2         -31 dB
Hamming         2         -43 dB
Blackman        3         -68 dB
Kaiser-Bessel   4         -91 dB
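Figures like those in the table can be estimated from a zero-padded FFT of each window. The sketch below measures the null-to-null main lobe width (reported relative to the rectangle) and the peak side lobe level. The helper `window_metrics`, the length N = 64, and the Kaiser beta are assumptions; measured values depend on the exact window definition and length, so they need not match the table digit for digit (NumPy's Blackman, for instance, measures closer to -58 dB).

```python
import numpy as np

def window_metrics(w, n_fft=1 << 16):
    """Return (null-to-null main lobe width in DFT bins, peak side lobe in dB)."""
    N = len(w)
    mag = np.abs(np.fft.rfft(w, n=n_fft))
    mag /= mag[0]                              # normalize to the main lobe peak
    # The first point where the magnitude starts rising marks the main lobe edge.
    null = np.nonzero(np.diff(mag) > 0)[0][0]
    width_bins = 2 * null * N / n_fft
    sidelobe_db = 20 * np.log10(mag[null:].max())
    return width_bins, sidelobe_db

N = 64                                         # assumed window length
windows = {
    "Rectangle": np.ones(N),
    "Hann": np.hanning(N),
    "Hamming": np.hamming(N),
    "Blackman": np.blackman(N),
    # beta = pi*sqrt(15) is an assumed setting, not taken from the slides
    "Kaiser-Bessel": np.kaiser(N, np.pi * np.sqrt(15)),
}

results = {name: window_metrics(w) for name, w in windows.items()}
rect_width = results["Rectangle"][0]
for name, (width, sl) in results.items():
    print(f"{name:14s} unit BW {width / rect_width:4.2f}   side lobe {sl:6.1f} dB")
```

Reading down the output shows the tradeoff directly: every gain in side lobe suppression is paid for with a wider main lobe.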

Practical issues: Rule of thumb:
–Time domain, use Rectangle window
–Freq domain, use Hamming window
Why?

Time domain issues: Tapered windows interfere with correlation in the time domain. Example: 20 ms of /eh/, male utterance, pitch measurement by normalized autocorrelation. The first side peak is lower when a Hamming window is used.
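The effect can be reproduced with a synthetic periodic signal in place of the /eh/ recording. The sketch below compares the normalized autocorrelation at the pitch lag for a rectangular and a Hamming window; the sampling rate, the 100 Hz pitch, and the harmonic amplitudes are all assumed values.

```python
import numpy as np

def normalized_autocorr(x):
    """Autocorrelation for lags 0..len(x)-1, scaled so that lag 0 equals 1."""
    r = np.correlate(x, x, mode="full")[len(x) - 1:]
    return r / r[0]

fs = 8000                       # assumed sampling rate in Hz
f0 = 100                        # assumed pitch in Hz
period = fs // f0               # pitch lag: 80 samples
N = fs * 20 // 1000             # 20 ms frame: 160 samples, i.e. two pitch periods
n = np.arange(N)

# Five harmonics with 1/h amplitudes stand in for a voiced sound.
s = sum(np.cos(2 * np.pi * h * f0 * n / fs) / h for h in range(1, 6))

r_rect = normalized_autocorr(s)                   # rectangular window: no taper
r_hamm = normalized_autocorr(s * np.hamming(N))   # tapered frame

peak_rect = r_rect[period]
peak_hamm = r_hamm[period]
print(round(peak_rect, 3), round(peak_hamm, 3))
```

The taper attenuates the samples near the frame edges, which are exactly the samples that must line up at the pitch lag, so the side peak shrinks and becomes harder to separate from spurious peaks.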

Frequency domain issues: fs = 12.5 kHz, /eh/, 800 samples, male speaker. Evidence of the blurring/leakage tradeoff:
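The figure is not preserved, but the tradeoff can be illustrated with a synthetic harmonic signal standing in for the vowel. The sketch below uses the slide's sampling rate and frame length and compares the spectral level midway between harmonics, which leakage fills in, for a rectangular and a Hamming window. The pitch, the number of harmonics, the random phases, and the helper names are assumptions.

```python
import numpy as np

fs = 12500                 # sampling rate from the slide
N = 800                    # frame length from the slide (64 ms)
f0 = 120.0                 # assumed pitch in Hz (hypothetical value)
n_fft = 1 << 14            # zero-padded FFT size for a dense frequency grid

n = np.arange(N)
rng = np.random.default_rng(0)
harmonics = np.arange(1, 21)                       # 20 equal-amplitude harmonics
phases = rng.uniform(0, 2 * np.pi, harmonics.size)
s = np.cos(2 * np.pi * np.outer(harmonics, n) * f0 / fs + phases[:, None]).sum(axis=0)

def log_spectrum(x):
    """Magnitude spectrum in dB relative to its own maximum."""
    mag = np.abs(np.fft.rfft(x, n=n_fft))
    return 20 * np.log10(mag / mag.max() + 1e-12)

def valley_level(spec_db):
    """Mean dB level midway between harmonics 5 to 15, a rough leakage floor."""
    levels = []
    for h in range(5, 15):
        lo = int((h + 0.4) * f0 / fs * n_fft)
        hi = int((h + 0.6) * f0 / fs * n_fft)
        levels.append(spec_db[lo:hi].mean())
    return float(np.mean(levels))

valley_rect = valley_level(log_spectrum(s))                  # rectangular window
valley_hamm = valley_level(log_spectrum(s * np.hamming(N)))  # Hamming window
print(round(valley_rect, 1), round(valley_hamm, 1))
```

With the Hamming window the valleys between harmonics drop well below the rectangular case, at the cost of wider harmonic peaks, which is consistent with the rule of thumb to prefer Hamming in the frequency domain.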