Techniques for Emotion Classification Kaushal N Lahankar Oct 12,2009 COMS 6998.

Slides:



Advertisements
Similar presentations
Chapter 5: Space and Form Form & Pattern Perception: Humans are second to none in processing visual form and pattern information. Our ability to see patterns.
Advertisements

Cornell Notes.
Rulebase Expert System and Uncertainty. Rule-based ES Rules as a knowledge representation technique Type of rules :- relation, recommendation, directive,
Research Methodology Lecture No : 11 (Goodness Of Measures)
Research Methodology For reader assistance, have an introductory paragraph in which attention is given to the organization of the section in relation to.
CS305: HCI in SW Development Evaluation (Return to…)
First, let’s talk about some of your introductions from last time: – What did you think was good about it? – What did you think was poor about it? What.
© 2004 Prentice-Hall, Inc.Chap 1-1 Basic Business Statistics (9 th Edition) Chapter 1 Introduction and Data Collection.
CAP 252 Lecture Topic: Requirement Analysis Class Exercise: Use Cases.
Learning in the Wild Satanjeev “Bano” Banerjee Dialogs on Dialog March 18 th, 2005 In the Meeting Room Scenario.
© 2002 Prentice-Hall, Inc.Chap 1-1 Statistics for Managers using Microsoft Excel 3 rd Edition Chapter 1 Introduction and Data Collection.
TRADING OFF PREDICTION ACCURACY AND POWER CONSUMPTION FOR CONTEXT- AWARE WEARABLE COMPUTING Presented By: Jeff Khoshgozaran.
1 Evidence of Emotion Julia Hirschberg
Marakas: Decision Support Systems, 2nd Edition © 2003, Prentice-Hall Chapter Chapter 7: Expert Systems and Artificial Intelligence Decision Support.
Techniques for Emotion Classification Julia Hirschberg COMS 4995/6998 Thanks to Kaushal Lahankar.
An evaluation framework
ML ALGORITHMS. Algorithm Types Classification (supervised) Given -> A set of classified examples “instances” Produce -> A way of classifying new examples.
1 A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions Zhihong Zeng, Maja Pantic, Glenn I. Roisman, Thomas S. Huang Reported.
Scaling and Attitude Measurement in Travel and Hospitality Research Research Methodologies CHAPTER 11.
Basic Business Statistics (8th Edition)
RESEARCH METHODS IN EDUCATIONAL PSYCHOLOGY
Producing Emotional Speech Thanks to Gabriel Schubiner.
© 2011 Pearson Prentice Hall, Salkind. Nonexperimental Research: Qualitative Methods.
Chapter One of Your Thesis
Statistical Natural Language Processing. What is NLP?  Natural Language Processing (NLP), or Computational Linguistics, is concerned with theoretical.
© 2015 albert-learning.com Marketing Research Process MARKETING RESEARCH PROCESS.
Dr. Engr. Sami ur Rahman Assistant Professor Department of Computer Science University of Malakand Research Methods in Computer Science Lecture: Research.
Extracting Places and Activities from GPS Traces Using Hierarchical Conditional Random Fields Yong-Joong Kim Dept. of Computer Science Yonsei.
Knowledge Base approach for spoken digit recognition Vijetha Periyavaram.
Lect 6 chapter 3 Research Methodology.
Public and Private Families Chapter 1. Increasing ambivalence Women in workforce vs. children in day care Divorce vs. unhappy marriage.
Classroom Assessments Checklists, Rating Scales, and Rubrics
Chapter SixChapter Six. Figure 6.1 Relationship of Qualitative Research to the Previous Chapters and the Marketing Research Process Focus of This Chapter.
Machine Learning in Spoken Language Processing Lecture 21 Spoken Language Processing Prof. Andrew Rosenberg.
Methods of Media Research Communication covers a broad range of topics. Also it draws heavily from other fields like sociology, psychology, anthropology,
Evaluating a Research Report
Intelligent Vision Systems ENT 496 Object Shape Identification and Representation Hema C.R. Lecture 7.
SPEECH CONTENT Spanish Expressive Voices: Corpus for Emotion Research in Spanish R. Barra-Chicote 1, J. M. Montero 1, J. Macias-Guarasa 2, S. Lufti 1,
Recognition of spoken and spelled proper names Reporter : CHEN, TZAN HWEI Author :Michael Meyer, Hermann Hild.
Multimodal Information Analysis for Emotion Recognition
Basic Business Statistics
Exploratory Research and Proper Problem Definition Lecture 3.
The GIS Project First Steps. Introduction Designing a GIS project. –What is the nature of the project? –What is the scope of the project? Project management.
1 The Theoretical Framework. A theoretical framework is similar to the frame of the house. Just as the foundation supports a house, a theoretical framework.
Copyright 2010, The World Bank Group. All Rights Reserved. Testing and Documentation Part II.
Performance Comparison of Speaker and Emotion Recognition
Predicting Voice Elicited Emotions
0 / 27 John-Paul Hosom 1 Alexander Kain Brian O. Bush Towards the Recovery of Targets from Coarticulated Speech for Automatic Speech Recognition Center.
Basic Business Statistics, 8e © 2002 Prentice-Hall, Inc. Chap 1-1 Inferential Statistics for Forecasting Dr. Ghada Abo-zaid Inferential Statistics for.
Business Applications with Object-Oriented Paradigm (Modeling Concepts) Professor Chen School of Business Gonzaga University Spokane, WA
From Words to Senses: A Case Study of Subjectivity Recognition Author: Fangzhong Su & Katja Markert (University of Leeds, UK) Source: COLING 2008 Reporter:
To Personalize or Not to Personalize: Modeling Queries with Variation in User Intent Presented by Jaime Teevan, Susan T. Dumais, Daniel J. Liebling Microsoft.
Acoustic Cues to Emotional Speech Julia Hirschberg (joint work with Jennifer Venditti and Jackson Liscombe) Columbia University 26 June 2003.
RESEARCH MOTHODOLOGY SZRZ6014 Dr. Farzana Kabir Ahmad Taqiyah Khadijah Ghazali (814537) SENTIMENT ANALYSIS FOR VOICE OF THE CUSTOMER.
Presented By Meet Shah. Goal  Automatically predicting the respondent’s reactions (accept or reject) to offers during face to face negotiation by analyzing.
Interpreting Ambiguous Emotional Expressions Speech Analysis and Interpretation Laboratory ACII 2009.
Research Methodology Proposal Prepared by: Norhasmizawati Ibrahim (813750)
Multi-Class Sentiment Analysis with Clustering and Score Representation Yan Zhu.
Implicature. I. Definition The term “Implicature” accounts for what a speaker can imply, suggest or mean, as distinct from what the speaker literally.
CRITICALLY APPRAISING EVIDENCE Lisa Broughton, PhD, RN, CCRN.
Detection Of Anger In Telephone Speech Using Support Vector Machine and Gaussian Mixture Model Prepared By : Siti Marahaini Binti Mahamood.
Chapter 1 Introduction and Data Collection
School of Computer Science & Engineering
Supervised Time Series Pattern Discovery through Local Importance
Child Outcomes Summary (COS) Process Training Module
Mean Shift Segmentation
Child Outcomes Summary (COS) Process Training Module
Emotional Speech Julia Hirschberg CS /8/2018.
Child Outcomes Summary (COS) Process Training Module
Planning Training Programs
Presentation transcript:

Techniques for Emotion Classification Kaushal N Lahankar Oct 12,2009 COMS 6998

Topics covered Describing the emotional states expressed in speech Identification of Confusion and Surprise in Spoken Dialog using Prosodic Features Emotion clustering using the results of subjective opinion tests for emotion recognition in Infants’ cries

Describing the emotional states expressed in speech Emotions are not easily captured in symbols. A suitable descriptive system for emotions does not exist as of yet. Objective - To encourage the speech community towards standardization of key terms and descriptive techniques.

Emotion (WSD) from a philosophical point of view Fullblown emotions – Natural discrete units which can be counted and possess distinct boundaries. Is it worth researching? Can emotions be treated as tangible states distinguishable from each other? Emotional states - An attribute of certain states. E.g. – “Her voice was tinged with emotion”. Intangible and difficult to quantify, but definitely worth researching These distinctions exist to that the domain and the scope of the research can be determined and to ensure that there is scope for research on one objective to support research on the others.

Cause and effect type descriptors Cause type – The type of description that identifies the emotion related internal states and external factors that caused a person’s speech to have particular characteristics. Effect type – This type of description would describe what effect the characteristics mentioned earlier would be likely to have on a typical listener. Cause type descriptors ultimately tend to focus attention on the physiological systems which can be used to describe emotions. Effect type descriptors favor describing emotional states in terms of categories and dimensions that people find natural.

Basic emotion categories Basic or primary emotions – Few emotional states which are pure and primitive in a way that the other are not. E.g. – The big six (Ekman)– – Fear, anger, happiness, sadness, surprise,disgust Second order emotions – The emotional states that are not so basic can be termed as second order emotions.

Wheel of emotions More information

Complementary representation Need to come up with a complementary representation that offers ways of drawing fewer, grosser distinctions to make things such as acquiring speech co-relates manageable. Feeltrace might be such a system?

Emotion related states emotion-proper property : An emotion-proper property is the property that is proper to, or ‘belongs to’, a type of emotion.. For example, being frightening is the emotion-proper property for fear Emotion terms are surrounded by terms that bear resemblance with the terms in spite of being different on some levels. They are called emotion related terms and the states associated with them are called emotion related states.

The states discussed Arousal – States of arousal have maximum vocal overlap with emotional proper. – Problems faced while addressing this issue? Attitude – “A person who exhibits a particular attitude approaches a situation prepared to find certain kinds of problem or opportunity and to take certain kinds of actions.” – Similar problems faced. Any other states?

Emotion and everyday terms Some words used to describe emotions have a lot of implicit information. E.g. – vengeful Basic vs Secondary emotions in terms of arousal and attitude. Other factors involved in description of everyday emotional states?

Biological Representation It is assumed that descriptions of emotions are surrogates for descriptions of physiological states. So, can we replace verbal descriptions with physiological parameters? Which theoretical perspective does it seem to follow? Pros and cons?

Continuous representation Represent emotions in 2-D space in terms of evaluation (X axis) and Activation (Y axis). Dimensions can be increased to represent the relationships which lie very close in the 2-D space. Pros and cons?

Feeltrace system

Structural models Cognitive approach of describing emotions Based on the hypothesis that distinct types of emotion correspond to distinct ways of appraising the situation evoking the emotion.

Timing For the purpose of speech research, emotions can be analyzed over time and both the short and the long time span can be considered to synthesize as well as to recognize emotional speech, i.e., trainers can be trained on both type of data. Pros and cons?

Interactions which determine how underlying emotional tendencies are expressed Restraint Ambivalence Simulation

Tools to describe emotion for analyzing it’s relationship with speech Naturalistic Database – Belfast Naturalistic Database annotated using Feeltrace Feeltrace - Basic Emotion Vocabulary -

Discussion Can a descriptive system which captures everything described in the paper possible? Which representation would make more sense? In which situations? Everything described in this paper is concerned with natural emotions. What about simulated emotions?

Identification of Confusion and Surprise in Spoken Dialog using Prosodic Features Objective – To detect the speaker’s states of confusion and surprise using prosodic features found in utterances. Previous Work – – High accuracies in detecting annoyance and frustration using prosodic and language based models trained on data that was manually annotated with categorical labels of emotions. – Batliner proposed to look for indicators of trouble in communication as indirect evidence of emotional responses from users.

“Innovative experimental paradigm” In this experiment, self-report questionnaires were used as a gold standard for training classifiers instead of data annotated by experimenters.

Confusion and Surprise The authors have related Uncertainty, Lack of Clarity and Inappropriateness to the emotions of Confusion and Surprise.

Methodology Participants were invited to interact with an SDS. Every participant was asked a set of 16 questions by the SDS which were meant to elicit emotions of uncertainty, lack of clarity and inappropriateness from the participants. The participants were asked to rate the clarity of system’s intention, appropriateness of the questions and certainty of their responses on a scale of 0 to 5, 5 corresponding to perfect clarity, appropriateness and certainty in respective cases.

Evaluation of the elicitation

Features used for classification 52 features were used for each of the 257 utterances. F0 and Power contours with fixed frame size were extracted for each utterance. Example of an F0 contour for the following file Example of an F0 contour In the contour, analysis window length = s Frame interval = 0.01s

Classification experiments and results The classifier was trained on the sets of labels – Non-aggregated labels – Aggregated labels Classification accuracies-

Conclusion Improvement in classification accuracy for both uncertainty and lack of clarity was inconclusive. The classification accuracy for inappropriateness was higher than the human classification in both the experiments. An improvement of >27% over the baseline and ≈12% over the human score. Seems too good?

Discussion Were the selected humans experts? Does listening to only 105 utterances and classifying them enough? Only 7 instances of inappropriateness, out of which 6 elicited correctly….enough data? Uncertainty, lack of Clarity and Inappropriateness were the 3 emotions selected and later, their selection was justified. Could any other emotions such as anger or boredom contribute as well? Why were only prosodic features used?

Emotion clustering Using the results of subjective opinion tests for emotion recognition in infant’s cries Objective – To design an emotion clustering algorithm for emotion detection in infants’ cries. Previous work – Acoustic analysis of an infant’s cries has been performed. – Classification between “hunger” and “sleepiness” has been studied. – Some emotion detection products currently available in the market employ simple acoustic techniques.

Methodology Mothers were asked to fill up an emotion table, consisting of 10(!) emotion tags, after recording each cry. Baby rearing experts were also asked fill up the same table based only on the recorded cries.

Clustering technique used Here, an emotion i was selected from a cluster X={e1,…,eI} and j, j≠i was selected from cluster Y={e1,…,e(i-1),eΦ,e(i+1),…eI}. This is a type of hierarchical clustering where the conditional entropy is the objective function to be minimized.

Emotion recognition Using Bayes’ theorem, we can find the probability when the infant utters the sequence z caused by the emotion cluster e the acoustic evidence q will be observed. We can find e and z which maximize the conditional probability to ultimately find the emotional cluster to which the cry belongs to. Let a cry z be represented by a series of N acoustic segments.

Clustering trees

Discussion Are 10 emotions actually needed? Instead of relying on the input from baby- rearing experts, could the inputs from the mothers have been used? Is the input data reliable? Improvements?