1 Speech Recognition Xiaofeng Lai

2 What is speech recognition?
- Speech recognition: the ability of a machine or program to identify words and phrases in spoken language and convert them to a machine-readable format.

3 Outline
- Brief history of speech recognition
- Introduction to how it works
- Applications
- Dragon Dictation

4 Brief History
- 1950s: AT&T Bell Laboratories designed “Audrey”.
- 1960s: IBM demonstrated “Shoebox”.
- 1970s: research advanced with the help of the DoD’s DARPA.
- 1980s: the hidden Markov model (HMM) helped improve recognition.
- 1990s: speech recognition software reached consumers, for example Dragon.
- 2000s: progress in computer speech recognition largely stalled; later examples include Google Voice Search and Siri.

5 How it works
[Diagram: Input speech -> ADC -> Statistical modeling systems -> Output]

6 How it works
- Input speech
  - Discrete
  - Continuous
- Analog-to-digital converter (ADC)
  - Converts the vibrations created by speech into a digital format.
  - Extract phonemes
  - Organize grammar
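As a rough illustration of the ADC step above, the sketch below samples a continuous signal at fixed intervals and quantizes each sample to a signed integer. The function name `digitize`, the 16 kHz sample rate, the 16-bit depth, and the 440 Hz test tone are all assumptions chosen for illustration; none of them come from the slides.

```python
import math

def digitize(signal, duration_s, sample_rate_hz=16000, bits=16):
    """Sample a continuous signal and quantize it to signed integers.

    `signal` is a function of time (seconds) returning values in [-1, 1].
    All parameter defaults here are illustrative assumptions.
    """
    max_level = 2 ** (bits - 1) - 1              # 32767 for 16 bits
    n_samples = round(duration_s * sample_rate_hz)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                   # time of the n-th sample
        value = max(-1.0, min(1.0, signal(t)))   # clip to the valid range
        samples.append(round(value * max_level)) # quantize to an integer
    return samples

def tone(t):
    """A 440 Hz sine wave standing in for the microphone signal."""
    return math.sin(2 * math.pi * 440 * t)

pcm = digitize(tone, duration_s=0.01)
print(len(pcm), pcm[:5])
```

A real recognizer would receive these integers from audio hardware; the later steps (extracting phonemes, applying a grammar) operate on this digital data.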

7 How it works
- Statistical modeling systems
  - The hidden Markov model (HMM)
    - The most commonly used model, applied in everything from data compression to sound recognition.
  - Artificial neural networks (ANN)
    - Originally developed to model human brain function.
    - Biology
  - The difference
    - An HMM can be viewed as a special case of an ANN.
    - ANNs are capable of modeling extremely complex biological functions.

8 The Hidden Markov Model
- It is a directed graph augmented with probability scores.
- N1 N2 N3 = 0.4 x 0.8 x 0.5 = 0.16
- N1 N2 N2 N2 N3 N3 N3 N3 N3 = 0.4 x 0.2 x 0.2 x 0.8 x 0.5 x 0.5 x 0.5 x 0.5 = 0.0008
- N1 N1 N2 N2 N3 = 0.6 x 0.4 x 0.2 x 0.8 x 0.5 = 0.0192
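The path scores on this slide are products of the probabilities along the path through the graph. The sketch below shows that calculation. The transition table is an assumption: the graph image is not in the transcript, so the values are inferred from the factors in the short paths, including a guessed 0.5 probability of leaving N3. The name `path_probability` and the `END` marker are hypothetical.

```python
from math import prod  # available in Python 3.8+

# Assumed transition probabilities, inferred from the factors on the slide.
TRANSITIONS = {
    ("N1", "N1"): 0.6,
    ("N1", "N2"): 0.4,
    ("N2", "N2"): 0.2,
    ("N2", "N3"): 0.8,
    ("N3", "N3"): 0.5,
    ("N3", "END"): 0.5,  # assumption: the last factor is the exit from N3
}

def path_probability(states):
    """Multiply the probabilities of every step along a state path,
    including the final step from the last state to END."""
    states = list(states)
    steps = zip(states, states[1:] + ["END"])
    return prod(TRANSITIONS[step] for step in steps)

# Three-state path from the slide; the result is approximately 0.16.
print(round(path_probability(["N1", "N2", "N3"]), 4))
```

A recognizer compares such scores across candidate paths and prefers the most probable one; practical systems usually sum log-probabilities instead of multiplying, to avoid numeric underflow on long paths.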

9 Example (pronunciations of "tomato")
- t ow m aa t ow: British English
- t ah m ey t ow: American English
- t ah m ey t a: possible pronunciation when speaking quickly
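One way to picture how several phoneme sequences resolve to the same word is a pronunciation lexicon that maps each variant to its spelling. This is a minimal sketch, not how any particular recognizer stores its dictionary. The word "tomato" is inferred from the phonemes, and `LEXICON` and `lookup` are hypothetical names.

```python
# Hypothetical pronunciation lexicon: each phoneme sequence maps to a word.
LEXICON = {
    ("t", "ow", "m", "aa", "t", "ow"): "tomato",  # British English
    ("t", "ah", "m", "ey", "t", "ow"): "tomato",  # American English
    ("t", "ah", "m", "ey", "t", "a"): "tomato",   # fast speech
}

def lookup(phonemes):
    """Return the word for a phoneme sequence, or None if it is unknown."""
    return LEXICON.get(tuple(phonemes))

print(lookup("t ah m ey t ow".split()))
```

In a statistical system each variant would also carry a probability, so the lookup becomes a scoring step, not an exact match.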

10 Applications
- Healthcare
- Military
- Telephone
- Business
- People with disabilities
- Google’s Voice Search, available on Android phones and iPhones

11 Applications
- Dragon Dictation
  - Powered by Nuance’s Dragon NaturallySpeaking software.
  - With version 2.0, you can send texts or emails to your friends and notes or reminders to yourself, all using your voice.

12 Applications

13 Thank you  Questions?

14 References
- http://www.generation5.org/content/2002/howsrworks.asp
- http://electronics.howstuffworks.com/gadgets/high-tech-gadgets/speech-recognition1.htm
- http://en.wikipedia.org/wiki/Speech_recognition

