Guided by DINAKAR DAS.C.N (Assistant Professor, ECE). Presented by ARUN.V.S, S7 EC, Roll No: 2

1 Guided by DINAKAR DAS.C.N (Assistant Professor, ECE). Presented by ARUN.V.S, S7 EC, Roll No: 2

2 INTRODUCTION
The field of Automatic Speech Recognition (ASR) is about 60 years old.
The first speech recognizer was built at Bell Labs in 1952.
Progress in ASR was gradual until Hidden Markov Models (HMMs) were introduced.

3 CONTENT
Speech recognition
Speech recognition based on HMM
Architecture of HMM-based speech recognition system
Application
Advantages and disadvantages
Conclusion

4 SPEECH RECOGNITION
Speech recognition task
Speech recognition system concept
Efficiency
Limitations

5 Speech recognition task
Getting a computer to understand spoken language.
By "understand" we might mean:
  React appropriately
  Convert the input speech into another medium, e.g. text

6 SPEECH RECOGNITION CONCEPT

7 SPEECH RECOGNITION IN COMPUTERS
Digitization
Acoustic analysis of the speech signal
Linguistic interpretation
Diagram: acoustic waveform -> acoustic signal -> speech recognition

8 EFFICIENCY
Recognition accuracy in a clean environment: 99.5%
Recognition accuracy in a noisy environment: 88%

9 LIMITATIONS IN SPEECH RECOGNITION
Digitization: converting the analogue signal into a digital representation
Signal processing: separating speech from background noise
Phonetics: variability in human speech
Phonology: recognizing individual sound distinctions (similar phonemes)
Lexicology and syntax: disambiguating homophones; features of continuous speech
Syntax and pragmatics: interpreting prosodic features
Pragmatics: filtering of performance errors

10 Digitization
Analogue to digital conversion.
Sampling and quantizing.
Use filters to measure energy levels for various points on the frequency spectrum.
Knowing the relative importance of different frequency bands (for speech) makes this process more efficient.
E.g. high-frequency sounds are less informative, so they can be sampled using a broader bandwidth (log scale).
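The sampling, quantizing and log-scale banding described above can be sketched as follows. This is a minimal illustration, not the presenter's implementation: the function names, the 8 kHz sample rate and the 8-bit depth are assumptions, and the mel scale is used here as one common log-like frequency scale (the slide only says "log scale").

```python
import math

def sample_and_quantize(signal, duration_s, sample_rate_hz, n_bits):
    """Sample signal(t) (values expected in [-1, 1]) at a fixed rate and
    quantize each sample to a signed integer using n_bits bits."""
    n_samples = round(duration_s * sample_rate_hz)
    max_level = 2 ** (n_bits - 1) - 1          # e.g. 127 for 8 bits
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                 # sampling: discrete time steps
        x = max(-1.0, min(1.0, signal(t)))     # clip to the valid range
        samples.append(round(x * max_level))   # quantizing: discrete amplitudes
    return samples

def hz_to_mel(f_hz):
    """Standard mel-scale formula (an assumed choice of log-like scale)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(f_min_hz, f_max_hz, n_bands):
    """Band edges equally spaced on the mel scale, so that bands become
    wider (in Hz) as frequency increases."""
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / n_bands
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 1)]

# Hypothetical usage: a 440 Hz tone, 10 ms, 8 kHz, 8 bits.
tone = lambda t: math.sin(2.0 * math.pi * 440.0 * t)
pcm = sample_and_quantize(tone, 0.01, 8000, 8)
edges = mel_band_edges(0.0, 4000.0, 8)
```

Because the band edges are equally spaced on the mel scale, the high-frequency bands cover more Hz each, which matches the point that high frequencies can be analysed with a broader bandwidth.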

11 Separating speech from background noise
Noise-cancelling microphones:
  Two mics, one facing the speaker, the other facing away
  Ambient noise is roughly the same for both mics
Knowing which bits of the signal relate to speech:
  Spectrograph analysis
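The two-microphone idea can be illustrated with a deliberately idealised sketch. It assumes the away-facing mic picks up only ambient noise, that the noise is identical and time-aligned on both mics, and that a fixed gain is adequate; practical systems typically rely on adaptive filtering instead. The function name and the sample values are hypothetical.

```python
def subtract_ambient(primary, reference, gain=1.0):
    """Subtract the reference (away-facing) mic from the primary
    (speaker-facing) mic, sample by sample.

    Idealised assumption: the reference channel contains only ambient
    noise, and that noise is the same in both channels."""
    if len(primary) != len(reference):
        raise ValueError("both channels must have the same number of samples")
    return [p - gain * r for p, r in zip(primary, reference)]

# Hypothetical data: speech plus noise on the primary mic, noise only on the reference mic.
speech = [0.0, 0.5, -0.5, 0.25]
noise = [0.1, -0.2, 0.3, 0.0]
primary = [s + n for s, n in zip(speech, noise)]
cleaned = subtract_ambient(primary, noise)
```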

12 Variability in individuals' speech
Variation among speakers due to:
  Vocal range (f0 and pitch range)
  Voice quality (growl, whisper, physiological elements such as nasality, adenoidality, etc.)
  ACCENT (especially vowel systems, but also consonants, allophones, etc.)
Variation within speakers due to:
  Health, emotional state
  Ambient conditions
  Speech style: formal read vs spontaneous

13 Speaker-(in)dependent systems
Speaker-dependent systems:
  Require "training" to "teach" the system your individual idiosyncrasies
    The more the merrier, but typically nowadays 5 or 10 minutes is enough
    The user is asked to pronounce some key words, which allow the computer to infer details of the user's accent and voice
    Fortunately, languages are generally systematic
  More robust
  But less convenient
  And obviously less portable
Speaker-independent systems:
  Language coverage is reduced to compensate for the need to be flexible in phoneme identification
  A clever compromise is to learn on the fly

14 (Dis)continuous speech
Discontinuous speech is much easier to recognize:
  Single words tend to be pronounced more clearly
Continuous speech involves contextual coarticulation effects:
  Weak forms
  Assimilation
  Contractions

15 Interpreting prosodic features
Pitch, length and loudness are used to indicate "stress".
All of these are relative:
  On a speaker-by-speaker basis
  And in relation to context
Pitch and length are phonemic in some languages.

16 Performance errors
Performance "errors" include:
  Non-speech sounds
  Hesitations
  False starts, repetitions
Filtering implies handling at the syntactic level or above.
Some disfluencies are deliberate and have a pragmatic effect; this is not something we can handle in the near future.

17 ARCHITECTURE OF HMM-BASED SPEECH RECOGNITION SYSTEM

18 HMM-based speech recognition system
Receiving and digitizing the input speech signal.
Extracting features from each input speech signal with the MFCC (Mel-Frequency Cepstral Coefficient) algorithm, which converts each signal's features into a feature vector.
Classifying the feature vectors into phonetic categories at each frame using the HMM algorithm.
Finally, performing a Viterbi search, an algorithm that computes the optimal (most likely) state sequence in an HMM given a sequence of observed outputs.
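The Viterbi search in the last step can be sketched on a toy model. This is an illustrative sketch under stated assumptions, not the presented system: the three states, the discrete symbols "a", "b", "c" (standing in for per-frame MFCC feature vectors) and all probabilities are hypothetical.

```python
import math

def log(p):
    """Log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0.0 else float("-inf")

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely state sequence and its log-probability."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t-1][s]: best predecessor of state s at time t
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]

# Hypothetical three-state model with discrete observation symbols.
states = ["s1", "s2", "s3"]
start = {"s1": 1.0, "s2": 0.0, "s3": 0.0}
trans = {
    "s1": {"s1": 0.5, "s2": 0.5, "s3": 0.0},
    "s2": {"s1": 0.0, "s2": 0.5, "s3": 0.5},
    "s3": {"s1": 0.0, "s2": 0.0, "s3": 1.0},
}
emit = {
    "s1": {"a": 0.8, "b": 0.1, "c": 0.1},
    "s2": {"a": 0.1, "b": 0.8, "c": 0.1},
    "s3": {"a": 0.1, "b": 0.1, "c": 0.8},
}
log_start = {s: log(p) for s, p in start.items()}
log_trans = {s: {t: log(p) for t, p in row.items()} for s, row in trans.items()}
log_emit = {s: {o: log(p) for o, p in row.items()} for s, row in emit.items()}

path, log_prob = viterbi(["a", "a", "b", "c", "c"], states,
                         log_start, log_trans, log_emit)
```

Working in log-probabilities avoids numerical underflow on long utterances, where the product of many small probabilities would otherwise round to zero.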

19 HMM Model in Speech
The most common model used for speech is constrained, allowing a state to transition only to itself or to a single succeeding state.
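The constraint above corresponds to a transition matrix that is non-zero only on the diagonal (self-loop) and the entry just to its right (next state). A minimal sketch, assuming a single shared self-loop probability; the function name and the value 0.6 are hypothetical.

```python
def left_to_right_transitions(n_states, self_loop_prob):
    """Build a transition matrix where each state may only stay in place
    or move to the single succeeding state. The last state only loops."""
    if n_states < 1:
        raise ValueError("n_states must be at least 1")
    if not 0.0 <= self_loop_prob < 1.0:
        raise ValueError("self_loop_prob must be in [0, 1)")
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        matrix[i][i] = self_loop_prob            # stay in the same state
        matrix[i][i + 1] = 1.0 - self_loop_prob  # advance to the next state
    matrix[-1][-1] = 1.0                         # final state has no successor
    return matrix

# Hypothetical usage: three states, 0.6 chance of staying in each state.
m = left_to_right_transitions(3, 0.6)
```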

20 APPLICATION
Banking
Phone dialing system
Computer

21 ADVANTAGES
A speech-enabled IVR (interactive voice response) system gives users much greater flexibility.
Call routers become easier for users.
Users can provide open-ended input.

22 CONCLUSION
HMMs treat the speech signal as a piecewise stationary (short-time stationary) signal.
HMMs are popular because they can be trained automatically.
Implemented with the help of MATLAB.
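The short-time stationarity assumption is usually applied by cutting the signal into short overlapping frames and analysing each frame as if it were stationary. A minimal sketch of that framing step; the function name and the frame and hop sizes in the usage lines are assumptions, not values from the presentation.

```python
def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into overlapping frames of frame_len samples,
    starting a new frame every hop samples. Trailing samples that do not
    fill a whole frame are dropped."""
    if frame_len < 1 or hop < 1:
        raise ValueError("frame_len and hop must be positive")
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

# Hypothetical usage: 10 samples, frames of 4 samples, a new frame every 2 samples.
frames = frame_signal(list(range(10)), 4, 2)
```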
