Machine Learning in Spoken Language Processing
Lecture 21, Spoken Language Processing
Prof. Andrew Rosenberg

Spoken Language Processing
How machines can interact with speech:
– Speech Recognition
– Speech Synthesis
– Analysis of Speech
– Lexical Processing
– Acoustic Signal Processing

Processing Tools
In many applications, a common trajectory can be observed:
– Manually written rule-based systems
– Corpus-based analysis
– Automatic training of systems via machine learning

Rule-based Systems
– Encode expert knowledge of a domain
– Often based on untested hypotheses
– Brittle:
  – Difficult to modify
  – Often have complex interdependencies
  – Rarely able to attach a "confidence" to their output

Machine Learning Systems
Learn the relationship between
– a feature vector, and
– a dependent variable (a label or a number)
(Diagram: feature vectors and labels from training data are fed to a learning algorithm, which produces a classifier hypothesis; the trained classifier then maps a new feature vector to a label.)
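As a minimal sketch of this setup (using Python and scikit-learn purely for illustration; the feature values and labels below are invented):

from sklearn.tree import DecisionTreeClassifier

# Each row of X is a feature vector (e.g., pitch mean, energy, duration);
# each entry of y is the dependent variable, here a categorical label.
X = [[210.0, 0.62, 0.31],
     [115.0, 0.40, 0.55],
     [198.0, 0.70, 0.28],
     [102.0, 0.35, 0.60]]
y = ["question", "statement", "question", "statement"]

clf = DecisionTreeClassifier().fit(X, y)   # learning algorithm + training data -> classifier hypothesis
print(clf.predict([[205.0, 0.66, 0.30]]))  # apply the classifier to a new feature vector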

Where do we use machine learning?
The trajectory in natural language processing and speech has been from
– manually written rules, to
– automatically generated rules learned from an abundance of data.
Examples include speech recognition, speech synthesis, prosodic analysis, segmentation, grapheme-to-phoneme conversion, speech act classification, disfluency identification, emotion classification, speech segmentation, part-of-speech tagging, parsing, translation, turn-taking, and information extraction.

How do we use machine learning?
The standard approach to learning:
– Identify labeled training data
– Decide what to label (e.g., syllables or words)
– Extract aggregate acoustic features over each labeled region
– Train a supervised classifier
– Evaluate using cross-validation or a held-out test set
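A hedged sketch of this pipeline in scikit-learn, assuming the aggregate acoustic features have already been extracted into a matrix X (one row per labeled syllable or word) with a label vector y; the data below is random placeholder data:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))    # placeholder aggregate acoustic features
y = rng.integers(0, 2, size=200)  # placeholder binary labels

# Hold out a test set, train a supervised classifier, evaluate.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = SVC(kernel="rbf").fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))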

What is the role of linguistics?
How do rule-based systems inform machine learning? Through feature representations:
– The way we represent an entity or phenomenon is informed by intuitions and prior study.
– The process of hand-writing rules has moved to hand-designing feature extraction methods.

What are the favorite tools in SLP?
– Decision Trees
– Support Vector Machines
– Conditional Random Fields
– Neural Networks
– Hidden Markov Models
– k-means
– k-nearest neighbors
– Graphical Models
– Expectation Maximization

Training, Development and Testing
Available data is commonly divided into three sets:
– Training: used to train the model
– Development: used to choose the best parameter settings
– Testing: used to evaluate the model trained on the training data with the parameter settings chosen on the development set
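A sketch of this three-way split on placeholder data, tuning a single parameter (the SVM's C) on the development set and reporting once on the test set:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 10))
y = rng.integers(0, 2, size=300)

# Carve off the test set first, then split the rest into training and development.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_dev, y_train, y_dev = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

# Choose the parameter setting that performs best on the development set...
best_C = max([0.1, 1.0, 10.0],
             key=lambda C: SVC(C=C).fit(X_train, y_train).score(X_dev, y_dev))

# ...then evaluate that setting once on the untouched test set.
final = SVC(C=best_C).fit(X_train, y_train)
print("best C:", best_C, "test accuracy:", final.score(X_test, y_test))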

Cross-Validation
Cross-validation is a technique to estimate the generalization performance of a classifier:
– Divide the available data into n "folds"
– Train on n-1 folds
– Test on the remaining fold
– Repeat for each fold and average the performance
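A sketch of n-fold cross-validation (n = 5) with scikit-learn; X and y are placeholder data standing in for extracted features and labels:

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(150, 8))
y = rng.integers(0, 2, size=150)

# Train on n-1 folds, test on the remaining fold, repeat, then average.
scores = cross_val_score(DecisionTreeClassifier(), X, y, cv=5)
print("per-fold accuracy:", scores)
print("average accuracy:", scores.mean())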

Stratified Cross-Validation
Some classes have skewed distributions
– For example, part-of-speech tags: function words far outnumber verbs and adjectives.
When creating cross-validation folds, the class distribution is maintained across all folds.
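A sketch of stratified folds; the heavily skewed label counts below are invented to mimic such a distribution:

import numpy as np
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 5))
y = np.array([0] * 70 + [1] * 25 + [2] * 5)  # skewed classes, e.g., function words vs. verbs vs. adjectives

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(skf.split(X, y)):
    # Each fold's test set keeps roughly the same 70/25/5 class proportions.
    print("fold", fold, "test class counts:", np.bincount(y[test_idx]))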

Dimensionality
In general, the more dimensions a feature vector has, the more training data is necessary for reliable learning.
– Some classifiers are more sensitive to this than others.
When we have a vocabulary of size N, a word feature is often converted to N binary variables. This can quickly lead to an enormous feature space.
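A small illustration of the vocabulary case, where a single word-valued feature becomes N binary indicator columns (the tiny vocabulary here is made up):

from sklearn.preprocessing import OneHotEncoder

words = [["the"], ["cat"], ["sat"], ["the"], ["mat"]]
enc = OneHotEncoder()
X = enc.fit_transform(words).toarray()

print(enc.categories_[0])  # the learned vocabulary
print(X.shape)             # one binary column per vocabulary item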

Dimensionality Reduction Techniques
Dimensionality reduction techniques are commonly used to reduce the number of dimensions while keeping as much information as possible:
– Regularization
– Principal Components Analysis (sketched below)
– Multi-dimensional scaling
– Quantization
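As one example, a sketch of Principal Components Analysis reducing 50-dimensional placeholder feature vectors to 10 dimensions and reporting how much variance the kept components explain:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 50))  # 50-dimensional feature vectors

pca = PCA(n_components=10)      # project onto the top 10 principal components
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                      # (200, 10)
print(pca.explained_variance_ratio_.sum())  # fraction of variance retained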

Next Time
– Working session
– Anonymous course feedback