The Computerised FDA Application: Formulating a System of Acoustic Objective Measures for the Frenchay Dysarthria Assessment Tests


Background: The paper-based FDA
The type and severity of a given instance of dysarthria (disordered speech arising from impaired articulator control) can be diagnosed with an assessment procedure known as the Frenchay Dysarthria Assessment (FDA) tests.
Two of the three FDA intelligibility tests are concerned with the measurement of intelligibility… but what exactly is intelligibility anyway?
“The degree of success in establishing communication between the sender and intended recipient of a message.”

Intelligibility, a very variable percept
Are both of these speech samples equally intelligible?
Initially, a listener finds a newly encountered accent harder to understand than a familiar one. Nonetheless, increased exposure to the initially unfamiliar speaking style usually triggers a subconscious adaptation, a learning effect, which makes that speech easier to understand. This holds true even for dysarthric speech.
[Figure: Learning effect from repeated exposure to dysarthric speech data. Mean score improvement, round 1 vs. rounds 2-5, for naïve listeners and expert listeners (from ABI Corpus, Birmingham Uni.)]

Modelling the Naïve Listener
If the learning effect alters a listener’s perception of a particular individual’s speaking style, is that listener’s judgement still representative of the naïve listener?
If the learning effect introduces an inevitable bias, can a computer model be built which behaves like an “eternal” naïve listener (i.e. one that never adapts to an unfamiliar speaking style and is therefore always consistent in its assessment)?
Possible solution: using HMMs to emulate the naïve listener
A hidden Markov model (HMM) is, essentially, a statistical representation of a speech unit at the phone, word or utterance level. HMMs are “trained” by analysing the acoustic features of multiple utterances representing the specified speech unit.
[Figure: Multiple speech samples from multiple speakers]

Goodness of Fit
Once trained, an HMM word model can be used to estimate the likelihood that a given speech sound could actually have been produced by that word model. This likelihood is called a goodness of fit (GOF) and can be expressed as a log likelihood, e.g (or simply -35).
[Cartoon: “Mr. HMM Model, could you’ve been my daddy?” “Hmm, with a log likelihood of , I’m not so sure…”]
The more acoustically dissimilar an utterance is from what the intelligibility estimator (IE) has been trained on, the lower the GOF score.
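As a rough illustration of how a trained model yields a log-likelihood GOF score, the sketch below runs the standard forward algorithm on a toy discrete-observation HMM. It is not the CFDA implementation: the two-state left-to-right model, its probabilities and the three quantised "acoustic symbols" are all invented for the example, and real speech HMMs would normally model continuous acoustic features.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf

def log_likelihood(obs, start, trans, emit):
    """Forward algorithm: log P(obs | model) for a discrete HMM.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability that state i emits symbol o
    """
    n = len(start)
    alpha = [safe_log(start[i]) + safe_log(emit[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][o])
            for j in range(n)
        ]
    return log_sum_exp(alpha)

# Hypothetical toy "word model": 2 states, 3 quantised acoustic symbols (0, 1, 2).
START = [1.0, 0.0]
TRANS = [[0.6, 0.4], [0.0, 1.0]]
EMIT = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]

typical = [0, 0, 2, 2]   # resembles what the toy model expects
atypical = [1, 1, 1, 1]  # acoustically dissimilar to the model

gof_typical = log_likelihood(typical, START, TRANS, EMIT)
gof_atypical = log_likelihood(atypical, START, TRANS, EMIT)
print(gof_typical, gof_atypical)
```

In this toy setting the dissimilar sequence receives the lower (more negative) log likelihood, which is the behaviour the slide describes.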

Using Forced-Alignment GOF scoring to measure Intelligibility
Since two of the FDA intelligibility tests require the repetition of words/phrases from a pre-selected vocabulary, HMM utterance models can be built for these words/phrases.
Furthermore, the incoming speech can be matched to the corresponding utterance model to determine the goodness of fit. This matching of a speech sample to a specific utterance model, and only that model, is called forced alignment.
We hypothesise that force-aligning a speech sample with its corresponding “everyman” word model will yield GOF scores which are systematically related to that speech sample’s intelligibility. When HMMs are used in this way, we call them intelligibility estimators (IEs).

…so, how does it work in practice?
IE utterance models are trained on normal speech from a variety of speakers, and a range of GOF scores is established for normal speech test data: typically between -5 and -10.
Ranges have also been established for speech of moderate and low intelligibility (which, in an FDA diagnostic context, means dysarthric speech): typically between -11 and -20 for moderately intelligible speech and below -20 for low intelligibility.
These scores are relative to the maximum likelihood utterance (i.e. the speech file with the highest GOF score) in the IE’s training set.
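The score ranges above can be sketched as a simple banding function. This is only an illustration under stated assumptions: the slide does not say how a score is made "relative" to the maximum likelihood training utterance, so plain subtraction of log likelihoods is assumed here, and the cut-offs at -10 and -20 (including how scores falling between the quoted ranges are treated) are a guess at where the boundaries sit. The function names are hypothetical.

```python
def relative_gof(sample_loglik, best_training_loglik):
    """GOF relative to the best-scoring training utterance.

    Assumption: 'relative' means a simple difference of log likelihoods.
    """
    return sample_loglik - best_training_loglik

def intelligibility_band(gof):
    """Map a relative GOF score to one of the three bands quoted on the slide.

    Assumed boundaries: >= -10 normal, >= -20 moderate, otherwise low.
    """
    if gof >= -10:
        return "normal"
    if gof >= -20:
        return "moderate"
    return "low"

# Hypothetical raw log likelihoods for one test utterance and the best training file.
score = relative_gof(sample_loglik=-135.0, best_training_loglik=-120.0)
print(score, intelligibility_band(score))
```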

Sample GOF scores
[Tables: GOF scores for isolated single words; GOF scores for short sentence utterances]

Problem: How do we make IEs truly naïve?
“Everyman” HMM utterance models are not really “everyman”: it is not feasible to train them on speech data representing all the world’s anglophone accents. In this experiment, the utterance models were trained on speech principally from the South Yorkshire region, so accents not represented in the HMM training data could receive GOF scores which do not truly reflect that speech sample’s intelligibility as perceived by a naïve listener.
A non-trivial problem: certain anglophone accents, due to their prestige, are more universally intelligible than others (e.g. Estuary English and RP), while others are a lot less intelligible internationally (e.g. the Glaswegian accent). What mix of accents should be used to train an HMM word model to make it truly representative of a “typical” naïve listener?

Objective #2: Overall Diagnosis
After collecting data from all 28 FDA sub-tests, how do we arrive at a dysarthria sub-type diagnosis?
Usually by template matching and symptom categorisation (e.g. “At-rest tasks performed better than in-speech tasks? If so, spastic dysarthria most likely”).
Can these processes be automated? Yes, via a neural network combined with an expert system. The neural network does the basic pattern matching while the rule-based expert system attempts to disambiguate diagnostic information not directly represented in the FDA letter grades.
[Figure: Example of CFDA expert system rule-based data disambiguation. Decision nodes with yes/no branches: “Uncontrollably rapid speech rate?” and “Slow speech rate?”. Outcomes: “Hypokinetic dysarthria most likely of 5 types”, “Extrapyramidal dysarthria less likely than other 4 types”, “Flaccid dysarthria most likely of 5 types”.]
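One way to picture the hybrid arrangement is a rule that adjusts pattern-matching scores after the fact. The sketch below is purely illustrative and is not the CFDA system: the per-type scores stand in for a neural network's output, the numeric grades (higher assumed to mean better performance), the boost factor and the function names are all hypothetical. Only the rule itself ("at-rest tasks performed better than in-speech tasks, so spastic dysarthria most likely") comes from the slide.

```python
def apply_rest_vs_speech_rule(scores, at_rest_grades, in_speech_grades, boost=1.5):
    """Raise the 'spastic' score when at-rest tasks outperform in-speech tasks.

    scores           : dict of dysarthria type -> score from the pattern matcher
    at_rest_grades   : numeric grades for at-rest tasks (higher = better, assumed)
    in_speech_grades : numeric grades for in-speech tasks (higher = better, assumed)
    boost            : hypothetical multiplier applied when the rule fires
    """
    adjusted = dict(scores)
    rest_mean = sum(at_rest_grades) / len(at_rest_grades)
    speech_mean = sum(in_speech_grades) / len(in_speech_grades)
    if rest_mean > speech_mean:
        adjusted["spastic"] = adjusted.get("spastic", 0.0) * boost
    total = sum(adjusted.values())
    # Renormalise so the adjusted scores still sum to 1.
    return {name: value / total for name, value in adjusted.items()}

def most_likely(scores):
    """Return the dysarthria type with the highest score."""
    return max(scores, key=scores.get)

# Invented scores standing in for the neural network's output.
network_scores = {"spastic": 0.40, "flaccid": 0.45, "extrapyramidal": 0.15}
adjusted = apply_rest_vs_speech_rule(
    network_scores, at_rest_grades=[7, 8, 6], in_speech_grades=[3, 4, 2]
)
print(most_likely(network_scores), most_likely(adjusted))
```

Here the rule tips a close call between two candidate types, which is the kind of disambiguation the expert system is described as providing.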

Diagnostic Accuracy of Hybrid System

The automated diagnostic system will even tell you why it came to a given decision…

Future Work
Acquisition of HMM technology which (for the Intelligibility Estimator) doesn’t have prohibitively high licence fees.
Collection of dysarthric data to build an FDA-specific dysarthric speech database.
More interviews with experienced speech therapists to increase the diagnostic expert system’s knowledge database.
Results of NHS field trials of the CFDA application.