Conversational Technologies
Using Speech Recognition for Speech Therapy: A multimodal application for users with aphasia
Deborah A. Dahl
SpeechTEK, August 20-23, 2007

The Problem
- Acquired aphasia is a disability that affects 1 million American adults
- Aphasia is a general term that refers to difficulty with language because of an injury to the parts of the brain that control language
- Aphasia leads to social isolation and inability to work
- Insurance only pays for a limited amount of speech therapy
- Speech recognition could provide feedback to these users about the accuracy of their words

Participants
- Myrna Schwartz and Ruth Fink (MossRehab): MossRehab is part of the Albert Einstein Healthcare Network, a member of the Jefferson Health System. MossRehab has consistently been named one of “America's Best Hospitals” by U.S. News & World Report and focuses on innovative research and outstanding clinical care in medical and physical rehabilitation.
- Deborah Dahl (Conversational Technologies): Conversational Technologies provides consulting services in speech technology which enable its customers to apply speech recognition and text to speech to create innovative applications and products.
- Funding: This project is funded, in part, under a grant with the Pennsylvania Department of Health. The Department specifically disclaims responsibility for any analyses, interpretations or conclusions. MossTalk Words® was developed under partial funding from the McLean Contributionship and MossRehab.

Users
- Users have aphasia, most often due to a stroke
- They are very motivated to improve their speech and language abilities and are receptive to using computer programs
- However, they vary in their ability to use a mouse or keyboard, read, speak, and understand spoken language

The Approach
- An earlier program, MossTalk Words® (Fink et al., 2002), shows users a picture, and they try to say the word that corresponds to the picture
- Cues are available to help the user remember the word if necessary
- In the original program, users self-monitored the correctness of their words, or worked with a clinician
- This project adds speech recognition to reduce the need for a clinician to be present
- As soon as the user says the right word, the system plays a tone, says “that’s right”, says the target word, and shows the text of the target word
- Advantages of the computer:
  - Low cost
  - Available 24/7
  - Consistent
  - User’s performance is automatically recorded

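The feedback sequence described on this slide can be sketched roughly as follows. This is only an illustration in Python, not the project's actual applet code: the function name `feedback_actions`, the action labels, and the choice to return an empty list when the word is not recognized are all assumptions made for the example.

```python
def feedback_actions(recognized, accepted_phrases, target):
    """Return the ordered feedback steps for one naming attempt.

    Hypothetical sketch: the action names ("play_tone", "speak",
    "show_text") are placeholders, not a real speech API.
    """
    normalized = recognized.strip().lower()
    accepted = {phrase.strip().lower() for phrase in accepted_phrases}

    if normalized not in accepted:
        # Assumption: the system stays silent, so the user can retry
        # or ask for a cue.
        return []

    # Order follows the slide: tone, "that's right", spoken target word,
    # then the written target word.
    return [
        ("play_tone", "success"),
        ("speak", "that's right"),
        ("speak", target),
        ("show_text", target),
    ]
```

Returning a list of steps rather than performing them keeps the therapy logic separate from audio and display code, which also makes each attempt easy to log.
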
Architecture
[Diagram. Components shown: Web Browser; MossTalk Words (HTML pages); Naming Exercise (applet); therapy logic; speech recognizer; speech grammar; recorded files; custom exercises; logging]

Speech Recognition
- System uses the Microsoft 6.1 speech recognition engine
- Dynamic grammars are modified to include just the words for each picture
- In addition to the correct word and its synonyms, the grammars also include minor “acceptable” phonological variations

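As a rough sketch of the per-picture grammar idea, the Python below collects the target word, its synonyms, and acceptable variants into one phrase list. It does not use the Microsoft engine's API; `PictureItem`, `build_grammar`, and `is_accepted` are hypothetical names, and the example words in the usage note are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PictureItem:
    """One picture in a naming exercise (hypothetical structure)."""
    target: str
    synonyms: list = field(default_factory=list)
    variants: list = field(default_factory=list)  # acceptable phonological variations


def build_grammar(item):
    """Return the phrases the recognizer should accept for this picture only."""
    seen = set()
    phrases = []
    for phrase in [item.target, *item.synonyms, *item.variants]:
        key = phrase.strip().lower()
        if key and key not in seen:  # skip blanks and duplicates
            seen.add(key)
            phrases.append(key)
    return phrases


def is_accepted(recognized, item):
    """True if the recognized text is in this picture's grammar."""
    return recognized.strip().lower() in build_grammar(item)
```

For example, `build_grammar(PictureItem("couch", ["sofa"], ["cowch"]))` yields a three-phrase list for that picture. Keeping the grammar this small is what restricts the recognizer to the current picture's vocabulary.
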
Testing
- Pretest with 8 people without aphasia
- Initial test with 7 users who had:
  - mild to moderate aphasia, specifically anomia, in which errors are word-based rather than sound-based
  - good articulation

Test Results: Speech Recognition
- Goal: measure accuracy of speech recognition in this application
- Metric is correct acceptance: when the user says the correct word, the recognizer responds
- If the recognizer doesn’t respond, that counts as an error

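The correct-acceptance metric can be computed along these lines. This is a minimal sketch: the trial representation (a pair of booleans per attempt) and the function name are assumptions, and the trial data in the usage note are made up, not results from the study.

```python
def correct_acceptance_rate(trials):
    """Fraction of correct-word attempts to which the recognizer responded.

    trials: iterable of (said_correct_word, recognizer_responded) booleans.
    Attempts where the user did not say the correct word are excluded,
    since the metric only covers correct productions.
    Returns None when there are no correct-word attempts.
    """
    responses = [responded for said_correct, responded in trials if said_correct]
    if not responses:
        return None
    return sum(responses) / len(responses)
```

With the illustrative input `[(True, True), (True, False), (False, False), (True, True)]`, three attempts count toward the metric and two of them were accepted.
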
Test Results: User Satisfaction
- Goal: assess users’ subjective response to the system and to speech recognition
- Developed a 7-item questionnaire (5-point scale)
- Example questions:
  - I enjoyed talking to the computer
  - I would recommend this system to other people with aphasia
  - I would like to have this computer program for practicing
- Tested with default speech profile and female profile

Results (seven users): User Satisfaction and Speech Recognizer Accuracy
[Chart. Legend: blue = male with male profile; green = female with female profile; red = female with male profile. Annotation: same user with different profiles]

Users’ Comments
- “when we said it again, it understood me, it was perfect” (S1)
- “the computer was good at understanding me” (S2)
- “I liked it a lot when the computer understood me” (S3)
- “I enjoyed hearing the computer tell me I was correct” (S5)
- “It didn’t bother me when the computer didn’t understand me, I knew I was right” (S6)

Conclusions
- Users with good articulation can get satisfactory speech recognition
- Could use for naming practice
- Next test: Could this system be used for pronunciation practice?
  - Test with users with articulation problems
  - How closely does ASR match intuitions of speech therapists about acceptable/unacceptable pronunciations?

More Information
Contact: http://www.ncrrn.org/contact/fink http://www.mosstalk.com
References to earlier versions of MossTalk Words®:
- Fink, R.B., Brecher, A.R., Montgomery, M., & Schwartz, M.F. (2001). MossTalk Words [computer software manual]. Philadelphia: Albert Einstein Healthcare Network.
- Fink, R.B., Brecher, A., Schwartz, M.F., & Robey, R.R. (2002). A computer implemented protocol for treatment of naming disorders: Evaluation of clinician-guided and partially self-guided instruction. Aphasiology, 16, 1061-1086.