Midterm Review
Spoken Language Processing
Prof. Andrew Rosenberg

Lecture 1 - Overview
Applications
–speech recognition
–speech synthesis
–other applications: indexing, language id, etc.
Information in speech
–words
–speaker identity
–speaker state
–discourse acts

Lecture 2 – From Sounds to Language
Differences between orthography and sounds
Phonetic symbol sets
–e.g., IPA, ARPAbet
Vocal organs
–articulators
Classes of sounds
Coarticulation

Lecture 3 – Spoken Dialog Systems
Maxims of Conversational Implicature
Dialog System Architecture
–Speech Recognition
–Dialog Management
–Response Generation
–Speech Synthesis
Dialog Strategies

Lecture 4 – Acoustics of Speech
Phone Recognition
Prosody
Speech Waveforms
Analog to Digital Conversion
Nyquist Rate
Pitch Doubling and Halving
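As a quick refresher on the Nyquist rate: frequencies above half the sampling rate cannot be represented and fold back as aliases. The sketch below uses illustrative values of my own choosing, not course code.

```python
import numpy as np

# A 5 kHz tone sampled at 8 kHz (Nyquist frequency 4 kHz) is
# indistinguishable from a 3 kHz tone after sampling.
fs = 8000                      # sampling rate (Hz)
f_tone = 5000                  # above fs / 2, so it will alias
n = np.arange(80)              # 10 ms worth of samples
x = np.sin(2 * np.pi * f_tone * n / fs)
alias = np.sin(2 * np.pi * (fs - f_tone) * n / fs)   # the 3 kHz "image"
print(np.allclose(x, -alias))  # True: the sampled 5 kHz tone equals a sign-flipped 3 kHz tone
```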

Lecture 5 – Speech Recognition Overview
History of Speech Recognition
–Rule-based recognition
–Dynamic Time Warping
–Statistical Modeling
What qualities make speech recognition difficult?
Noisy Channel Model
Training and Test Corpora
Word Error Rate
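Word Error Rate is the word-level edit distance between reference and hypothesis, normalized by the reference length: WER = (substitutions + deletions + insertions) / N. A minimal Python sketch, assuming whitespace-tokenized strings:

```python
def word_error_rate(ref, hyp):
    """Levenshtein distance over words, divided by reference length.
    A minimal sketch; real scoring tools also report S/D/I counts separately."""
    r, h = ref.split(), hyp.split()
    # dp[i][j] = min edits turning the first i reference words into the first j hypothesis words
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i
    for j in range(len(h) + 1):
        dp[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = 0 if r[i - 1] == h[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + sub)  # substitution or match
    return dp[len(r)][len(h)] / len(r)

print(word_error_rate("the cat sat on the mat", "the cat sat mat"))  # 2 deletions / 6 words = 0.333...
```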

Lecture 6 – Fast Fourier Transform
Multiplying Polynomials
Divide-and-Conquer for multiplying polynomials
Relationship between multiplying polynomials and the cosine transform
Complex roots of unity
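The connection between the FFT and polynomial multiplication: evaluate both polynomials at the complex roots of unity, multiply pointwise, then interpolate back to coefficients. A small sketch using NumPy's FFT (the function name poly_multiply is mine, not from the lecture):

```python
import numpy as np

def poly_multiply(a, b):
    """Multiply two coefficient vectors via pointwise multiplication
    in the frequency domain."""
    n = len(a) + len(b) - 1                    # length of the product polynomial
    size = 1 << (n - 1).bit_length()           # next power of two for the FFT
    fa = np.fft.fft(a, size)                   # evaluate a at the complex roots of unity
    fb = np.fft.fft(b, size)
    product = np.fft.ifft(fa * fb).real[:n]    # interpolate back to coefficients
    return np.round(product).astype(int)

# (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
print(poly_multiply([1, 2], [3, 4]))  # [ 3 10  8]
```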

Lecture 7 - MFCC
What is the MFCC used for?
Overlapping Windows
Mel Frequency Spectrogram
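Overlapping windows in practice: the waveform is cut into short frames before the Mel filterbank and cepstral steps. The framing sketch below assumes typical but illustrative parameter values (25 ms frames every 10 ms, Hamming window); it is not the course's code.

```python
import numpy as np

def frame_signal(x, sample_rate, frame_ms=25, step_ms=10):
    """Slice a waveform into overlapping, windowed frames,
    as done before computing MFCCs."""
    frame_len = int(sample_rate * frame_ms / 1000)
    step = int(sample_rate * step_ms / 1000)
    n_frames = 1 + max(0, (len(x) - frame_len) // step)
    window = np.hamming(frame_len)
    return np.stack([x[i * step : i * step + frame_len] * window
                     for i in range(n_frames)])

frames = frame_signal(np.random.randn(16000), sample_rate=16000)
print(frames.shape)  # (98, 400): 98 overlapping 25 ms frames from 1 second of audio
```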

Lecture 8 – Statistical Modeling
Probabilities
–Bayes Rule
–Bayesians vs. Frequentists
Maximum Likelihood Estimation
Multinomial Distribution
–Bernoulli Distribution
Gaussian Distribution
–Multidimensional Gaussian
Difference between Classification, Clustering, Regression
Black Swans and the Long Tail
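Two of these ideas in miniature, with made-up numbers: the maximum likelihood estimates for a Gaussian are the sample mean and the 1/N variance, and Bayes rule turns class-conditional likelihoods and priors into posteriors.

```python
import numpy as np

# MLE for a 1-D Gaussian: sample mean and (biased, 1/N) sample variance. Toy data.
x = np.array([4.1, 3.8, 5.0, 4.4, 4.7])
mu_hat = x.mean()
var_hat = ((x - mu_hat) ** 2).mean()   # MLE uses 1/N, not the unbiased 1/(N-1)
print(mu_hat, var_hat)

# Bayes rule for classification: P(class | x) is proportional to p(x | class) * P(class).
prior = {"vowel": 0.4, "consonant": 0.6}
likelihood = {"vowel": 0.02, "consonant": 0.005}     # hypothetical densities at some x
evidence = sum(likelihood[c] * prior[c] for c in prior)
posterior = {c: likelihood[c] * prior[c] / evidence for c in prior}
print(posterior)  # {'vowel': ~0.727, 'consonant': ~0.273}
```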

Lecture 9 – Acoustic Modeling
What does an Acoustic Model do?
Gaussian Mixture Model
Potential Problems
–Inconsistent Numbers of Gaussians
–Singularities
Training Acoustic Models
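A GMM acoustic model scores each feature frame with a weighted sum of Gaussians. The sketch below uses scikit-learn's GaussianMixture on toy data purely to illustrate the idea; it is not the toolkit used in the course.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Fit a 2-component GMM to toy 2-D "feature vectors" (stand-ins for MFCC frames).
rng = np.random.default_rng(0)
frames = np.vstack([rng.normal(0.0, 1.0, size=(200, 2)),
                    rng.normal(4.0, 0.5, size=(200, 2))])

# Diagonal covariances are the common choice in acoustic modeling.
gmm = GaussianMixture(n_components=2, covariance_type="diag", random_state=0)
gmm.fit(frames)

# score_samples returns per-frame log-likelihoods under the mixture,
# which is the quantity an acoustic model hands to the decoder.
print(gmm.score_samples(frames[:3]))
```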

Lecture 10 – Hidden Markov Model
The Markov Assumption
Difference between states and observations
Finite State Automata
Decoding using Viterbi
Forced Alignment
Flat Start
Silence
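Viterbi decoding finds the single most likely state sequence by keeping, for every time step and state, the best path score and a backpointer. A compact log-space sketch for a discrete-observation HMM; the model values at the bottom are made up.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    """log_pi[i] = initial, log_A[i, j] = transition i->j, log_B[i, o] = emission."""
    n_states, T = len(log_pi), len(obs)
    delta = np.zeros((T, n_states))            # best log-prob of any path ending in each state
    back = np.zeros((T, n_states), dtype=int)  # backpointers
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A        # scores[i, j]: come from i, go to j
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    # Trace back the best path from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Tiny 2-state example with 2 observation symbols.
log_pi = np.log([0.6, 0.4])
log_A = np.log([[0.7, 0.3], [0.4, 0.6]])
log_B = np.log([[0.9, 0.1], [0.2, 0.8]])
print(viterbi(log_pi, log_A, log_B, [0, 0, 1]))
```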

Lecture 11 - Pronunciation Modeling
Dictionary
Finite State Automata
Use in speech recognition
Using morphology for pronunciation modeling
Grapheme to Phoneme Conversion
–Letter to Sound rules
Machine Learning for G-to-P
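Pronunciation modeling in miniature: look the word up in a dictionary of phone strings, and fall back to letter-to-sound rules for out-of-vocabulary words. The entries and the trivial fallback below are toy illustrations, not CMUdict or the course's rules.

```python
# A toy pronunciation dictionary in ARPAbet-style phones (illustrative entries).
pron_dict = {
    "speech": ["S", "P", "IY", "CH"],
    "recognize": ["R", "EH", "K", "AH", "G", "N", "AY", "Z"],
}

def pronounce(word, dictionary):
    """Dictionary lookup first; fall back to a naive letter-to-sound guess
    for out-of-vocabulary words (real systems use learned G-to-P models)."""
    if word in dictionary:
        return dictionary[word]
    # Naive fallback: map each letter to an uppercase placeholder "phone".
    return [ch.upper() for ch in word]

print(pronounce("speech", pron_dict))
print(pronounce("zyzzyva", pron_dict))  # OOV: falls back to letter-by-letter
```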

Lecture 12 – Language Modeling
Using a Context-Free Grammar to define a set of recognized word sequences
–Terminals, non-terminals, start symbol
N-Gram models
–Mathematical underpinnings
–Theoretical background
How a "word" is defined
Learning n-gram statistics
Terminology
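Learning n-gram statistics by maximum likelihood is just counting: P(w | w_prev) = count(w_prev, w) / count(w_prev). A toy bigram example with assumed sentence-boundary markers:

```python
from collections import Counter

# Maximum likelihood bigram estimates from a tiny toy corpus with <s> / </s> markers.
corpus = ["<s> i am sam </s>", "<s> sam i am </s>", "<s> i like speech </s>"]
tokens = [s.split() for s in corpus]

unigrams = Counter(w for sent in tokens for w in sent)
bigrams = Counter((sent[i], sent[i + 1]) for sent in tokens for i in range(len(sent) - 1))

def p_bigram(w_prev, w):
    """P(w | w_prev) = count(w_prev, w) / count(w_prev); zero for unseen bigrams,
    which is why smoothing matters in practice."""
    return bigrams[(w_prev, w)] / unigrams[w_prev]

print(p_bigram("<s>", "i"))   # 2/3
print(p_bigram("i", "am"))    # 2/3
print(p_bigram("am", "sam"))  # 1/2
```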

Next Class
Midterm Exam
Reading: J&M Chapter 4