Presentation on theme: "Spoken Dialogue Technology Achievements and Challenges"— Presentation transcript:
1 Spoken Dialogue Technology Achievements and Challenges
Michael McTear, University of Ulster
2 Overview
- Introduction: What is a spoken dialogue system?
- Examples of spoken dialogue systems
- Technical issues and challenges
- Future Prospects
3 What is a spoken dialogue system? A spoken dialogue system is an automated system that engages in a dialogue with a human user using spoken language as the medium of interaction.
4 Types of dialogue system
Two main types of spoken dialogue system:
- Task-oriented: involves the use of dialogues to accomplish a task, e.g. making a hotel booking or planning a family holiday
- Non-task-oriented: engaging in conversational interaction, but without necessarily being involved in a task that needs to be accomplished, e.g. a conversational companion for the elderly
5 Application Domains for SDS
- Telephone-based services and transactions
  - Call-routing, directory assistance, travel enquiries, bank balance, bank transactions, flight / hotel / car rental reservations
- In-car interactive and entertainment systems
- Automated trouble-shooting
- Smart home applications
- Health-care systems, e.g. patient monitoring
- Educational, e.g. Intelligent Tutoring Systems, Foreign Language Learning
- Computer games
6 Three generations of task-oriented spoken dialogue system
- Informational: to retrieve information, e.g. flight times, football scores, …
- Transactional: to assist the user to perform a transaction, e.g. book a flight, pay a bill
- Problem-solving: to support the user in solving a problem, e.g. to troubleshoot a PC that is not working
7 Why is dialogue interesting?
- Fundamental aspect of human behaviour
- Model human conversational competence
- Simulate human conversational behaviour
- Provide tool for interacting with data, services, resources on computers
- Research challenges
- Applications in assistive and educational environments
- Commercial opportunities
8 Commercial Systems
Focus on:
- Business opportunities, return on investment (ROI)
- Benefits for end users
- Benefits for providers
- Human factors: performance, usability
- Tools and languages for design and maintainability
- Application areas: call centre, enquiries, transactions, healthcare, …
9 Academic Systems
Focus on:
- Technologies: speech recognition, spoken language understanding, dialogue management
- AI inspired: planning, reasoning, machine learning
- Statistical v symbolic approaches
- Advanced dialogue control, error handling, adaptivity, context representation
10 Overview
- Introduction: What is a spoken dialogue system?
- Examples of spoken dialogue systems
- Technical issues and challenges
- Future Prospects
11 Example 1: Voice Menu
System: Hello and welcome …. Main menu. For customer service, say ‘service’. To enquire about an existing order, say ‘order’ …
User: Service
System: Customer service. Would you like to report a fault or enquire about an extended warranty?
User: Fault
System: Do you have a PC or a laptop?
User: Laptop
System: And the name of the manufacturer?
User: Sony
System: Thank you. Please hold while I transfer you to the Sony …
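A voice menu of this kind can be read as a small finite-state machine: each state has a prompt and a fixed set of words that move the caller to the next state. The sketch below is only an illustration under that assumption; the state names, the `MENU` table and `run_menu` are hypothetical and do not come from the system in the example.

```python
# Hypothetical menu graph modelled on the example dialogue.
# States missing from MENU ("manufacturer", "order", "warranty") are terminal here.
MENU = {
    "main": {
        "prompt": "Main menu. For customer service, say 'service'. "
                  "To enquire about an existing order, say 'order'.",
        "options": {"service": "service", "order": "order"},
    },
    "service": {
        "prompt": "Customer service. Would you like to report a fault "
                  "or enquire about an extended warranty?",
        "options": {"fault": "device", "warranty": "warranty"},
    },
    "device": {
        "prompt": "Do you have a PC or a laptop?",
        "options": {"pc": "manufacturer", "laptop": "manufacturer"},
    },
}


def run_menu(user_turns, start="main"):
    """Walk the menu with a list of recognised words; return the states visited."""
    state = start
    visited = [state]
    for word in user_turns:
        if state not in MENU:  # terminal state: nothing more to ask
            break
        options = MENU[state]["options"]
        # An out-of-vocabulary word keeps the caller in the same state (re-prompt).
        state = options.get(word.strip().lower(), state)
        visited.append(state)
    return visited


if __name__ == "__main__":
    print(run_menu(["Service", "Fault", "Laptop"]))
```

The system keeps the initiative throughout: the caller can only choose among the words listed for the current state, which is what separates this style from the open-ended prompt in Example 2.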
12 Example 2: Research System (Mercury: MIT)
- Open-ended prompt
  - How may I help you?
- Disfluencies in input
  - August twenty-first no August twelfth
  - I'd like to fly from Boston to Minneapolis on Tuesday no Wednesday November 21st
- Inexact response
  - Prompt: Can you provide the approximate departure time or airline preference
  - User: Yeah I'd like to fly United and I'd like to leave in the afternoon
13 Example 2: continued
- Response generation
  - There are more than 3 flights. The earliest departure leaves at 1.45 pm.
- Mixed initiative: user asks question
  - Do you have something leaving around 4.45?
- Relative date reference
  - I’d like to return the following Tuesday
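A relative date such as "the following Tuesday" only has a value once it is anchored to a date already in the dialogue context, here the outbound date. The sketch below shows one possible resolution rule and is not a description of how Mercury does it. The function name `following_weekday`, the reading "first Tuesday strictly after the anchor" and the year 2001 (chosen because 21 November falls on a Wednesday that year) are all assumptions.

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def following_weekday(anchor: date, weekday_name: str) -> date:
    """Return the first date strictly after `anchor` that falls on `weekday_name`."""
    target = WEEKDAYS.index(weekday_name.lower())
    # Offset is always between 1 and 7, so the anchor day itself is never returned.
    days_ahead = (target - anchor.weekday() - 1) % 7 + 1
    return anchor + timedelta(days=days_ahead)


if __name__ == "__main__":
    outbound = date(2001, 11, 21)  # assumed year; a Wednesday
    print(following_weekday(outbound, "Tuesday"))
```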
14 Example 3: Voice Search GOOG411
GOOG-411 (or Google Voice Local Search) is Google's new 411 service.
With GOOG-411, you can find local business information completely free, directly from your phone.
You can access GOOG-411 from any phone, anywhere, at any time.
15 GOOG411: Prompts
- What city and state?
- What business name or category?
- (Lists services) Number one, …
- Connects to requested service
16 GOOG411: What can you say?
- At any point in the call:
  - To go back say "go back"
  - To start over say "start over" or press * (all phones)
- When asked for a city and state:
  - Say the full names, for example "Palo Alto California"
  - To enter a zip code say it or enter with keypad
- When asked for business name or category:
  - Say the full names, for example "Joe's Pizzaria" or "Pizza"
- When given results:
  - To navigate between results say or press the listing number
  - To receive an SMS say "text message"
  - To receive a map say "map it"
  - To get more details say "details"
17 Overview
- Introduction: What is a spoken dialogue system?
- Examples of spoken dialogue systems
- Technical issues and challenges
- Future Prospects
18 Architecture of a spoken dialogue system
[Diagram] Components: Speech Recognition (ASR; HMM acoustic model, N-gram), Spoken Language Understanding (SLU), Dialogue Manager (DM; dialogue control, context model), Backend, Response Generation, Text-to-Speech Synthesis (TTS).
Data passed between stages: audio (a --> xu), words (yu, c), concepts (ã, c).
Legend:
- a: user dialogue act (intended)
- xu: user acoustic signal
- yu: speech recognition hypothesis (words)
- ã: user dialogue act (interpreted)
- c: confidence
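The architecture can be read as a chain of functions: ASR maps the acoustic signal xu to words yu with a confidence c, SLU maps the words to an interpreted dialogue act ã, the dialogue manager picks a system act, and response generation and TTS turn that act back into audio. The sketch below wires toy stand-ins together to show the data flow only. Every `toy_*` function is a hypothetical placeholder (the "ASR" simply decodes bytes as text), and the backend is left out.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Pipeline:
    asr: Callable[[bytes], Tuple[str, float]]         # xu -> (yu, c)
    slu: Callable[[str, float], Tuple[dict, float]]   # (yu, c) -> (ã, c)
    dm: Callable[[dict, float], dict]                 # (ã, c) -> system act
    rg: Callable[[dict], str]                         # system act -> text
    tts: Callable[[str], bytes]                       # text -> audio

    def turn(self, audio: bytes) -> bytes:
        """Process one user turn end to end."""
        words, conf = self.asr(audio)
        act, conf = self.slu(words, conf)
        system_act = self.dm(act, conf)
        return self.tts(self.rg(system_act))


def toy_asr(audio):
    # Placeholder: treat the "audio" as already-transcribed text.
    return audio.decode("utf-8"), 0.9


def toy_slu(words, conf):
    # Placeholder: the word after the first "to" is taken as the destination.
    tokens = words.split()
    dest = tokens[tokens.index("to") + 1] if "to" in tokens[:-1] else None
    return {"type": "inform", "destination": dest}, conf


def toy_dm(act, conf):
    # Ask again if nothing was understood or confidence is low.
    if act["destination"] is None or conf < 0.5:
        return {"type": "request", "slot": "destination"}
    return {"type": "request", "slot": "date", "destination": act["destination"]}


def toy_rg(system_act):
    if system_act["slot"] == "destination":
        return "Where would you like to fly to?"
    return f"What date would you like to fly to {system_act['destination']}?"


def toy_tts(text):
    return text.encode("utf-8")


pipeline = Pipeline(toy_asr, toy_slu, toy_dm, toy_rg, toy_tts)

if __name__ == "__main__":
    print(pipeline.turn(b"a flight to boston"))
```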
19 Component Technologies
- Automatic Speech Recognition (ASR)
- Spoken Language Understanding (SLU)
- Response Generation (RG)
- Text to speech synthesis (TTS)
- Dialogue Management (DM)
20 Issues in ASR for Dialogue
- recognising spontaneous speech in noisy environments
- word accuracy does not have to be 100%
- use of confidence scores in combination with other information to determine DM actions
- use of additional information (ASR and parse probabilities, semantic and contextual features) to re-score recognition hypotheses
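The last two points can be illustrated with a toy example: re-rank an N-best list by combining the ASR score with a contextual feature, then map the winning confidence to a dialogue manager action. This is a minimal sketch with made-up numbers. The additive scoring, the weight and the two thresholds are assumptions for illustration and are not tuned or taken from any system.

```python
def rescore(nbest, expected_words, weight=0.3):
    """Pick the best (hypothesis, asr_score) pair after adding a context bonus.

    `expected_words` stands for words the dialogue context makes likely,
    e.g. cities the system has just offered. The bonus is a hypothetical feature.
    """
    def combined(item):
        text, asr_score = item
        overlap = sum(1 for w in text.lower().split() if w in expected_words)
        return asr_score + weight * overlap

    return max(nbest, key=combined)


def choose_action(confidence, accept=0.8, reject=0.4):
    """Map a confidence score to a dialogue manager action (assumed thresholds)."""
    if confidence >= accept:
        return "accept"
    if confidence >= reject:
        return "confirm"
    return "reject"


if __name__ == "__main__":
    nbest = [("austin", 0.55), ("boston", 0.50)]
    best_text, best_score = rescore(nbest, expected_words={"boston"})
    print(best_text, choose_action(best_score))
```

In the usage example the acoustically weaker hypothesis wins after re-scoring, but its raw confidence is still only moderate, so the dialogue manager asks for confirmation instead of accepting it outright.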
21 Issues in SLU for Dialogue
- grammars and parsers for spontaneous speech (disfluencies, errors)
- robust understanding
- problems with hand-crafted approaches
- use of statistical / data-driven methods
- combined approaches, e.g. TINA (MIT)
  - hand-crafted rules with trained probabilities
  - robust strategy: if the full sentence cannot be parsed, parse and combine fragments, else use word spotting
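The robust strategy is a three-level fallback: try a full parse, then try to parse and combine fragments, and only then fall back to word spotting. The sketch below imitates that control flow with regular expressions. It is not TINA and has no trained probabilities; the patterns, the city list and `understand` are hypothetical.

```python
import re

CITIES = {"boston", "minneapolis", "denver"}   # hypothetical vocabulary
CITY = "|".join(sorted(CITIES))

# Level 1: a pattern covering the whole sentence.
FULL = re.compile(
    rf"^i want to fly from (?P<origin>{CITY}) to (?P<destination>{CITY})$"
)
# Level 2: independent fragments that each fill one slot.
FRAGMENTS = {
    "origin": re.compile(rf"\bfrom ({CITY})\b"),
    "destination": re.compile(rf"\bto ({CITY})\b"),
}


def understand(utterance):
    """Return (strategy_used, result) using full parse, fragments, then spotting."""
    text = utterance.lower().strip()

    match = FULL.match(text)
    if match:
        return "full", match.groupdict()

    frame = {}
    for slot, pattern in FRAGMENTS.items():
        found = pattern.search(text)
        if found:
            frame[slot] = found.group(1)
    if frame:
        return "fragments", frame

    # Level 3: word spotting keeps any known city but cannot assign it a role.
    return "spotting", {"cities": [w for w in text.split() if w in CITIES]}


if __name__ == "__main__":
    print(understand("uh from Boston um to Minneapolis please"))
```

Each level gives up some structure in exchange for coverage: word spotting still recovers a city name from "Denver please", but no longer knows whether it is the origin or the destination.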
22 Issues in Response Generation for Dialogue
- Content selection
  - Determining what to say, selecting and ranking options
- Discourse planning
  - discourse relations, e.g. comparison, contrast
  - user-adapted information
  - Presentation ordering
- Referring expression generation
- Aggregation: grouping propositions into clauses and sentences
- Use of discourse cues (e.g. firstly, finally, however, moreover, …)
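Aggregation and discourse cues can be shown with plain string templates: several propositions about the same subject are merged into one sentence, and ordered clauses are prefixed with cue words. This is a deliberately naive sketch; `aggregate` and `with_cues` are hypothetical helpers, and a real generator would work from discourse relations, not list position.

```python
def aggregate(subject, predicates):
    """Group several predicates about one subject into a single sentence."""
    if not predicates:
        return ""
    if len(predicates) == 1:
        body = predicates[0]
    else:
        body = ", ".join(predicates[:-1]) + " and " + predicates[-1]
    return f"{subject} {body}."


def with_cues(clauses):
    """Prefix ordered clauses with simple cues chosen by position only."""
    if len(clauses) < 2:
        return [f"{c[0].upper()}{c[1:]}." for c in clauses if c]
    out = []
    for i, clause in enumerate(clauses):
        if i == 0:
            cue = "Firstly"
        elif i == len(clauses) - 1:
            cue = "Finally"
        else:
            cue = "Moreover"
        out.append(f"{cue}, {clause}.")
    return out


if __name__ == "__main__":
    print(aggregate("The flight",
                    ["leaves at 1.45 pm", "is operated by United", "has one stop"]))
    print(with_cues(["there are three flights",
                     "two are direct",
                     "the earliest leaves at 1.45 pm"]))
```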
23 Issues in Dialogue Management
- Dialogue Control
  - Scripts, frames, intelligent agents
- Representations
  - Information State Theory
- Error handling
- Dialogue design
  - Traditional approaches
  - Statistical approaches
    - Reinforcement learning
    - Corpus / example based approaches
24 Overview
- Introduction: What is a spoken dialogue system?
- Examples of spoken dialogue systems
- Technical issues and challenges
- Future Prospects
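Of the dialogue control styles listed, frame-based control is the easiest to show in a few lines: the system keeps a frame of slots, fills whichever slots the user supplies in any order, and asks for the first one still missing. The sketch below is a minimal illustration under that reading. The slot names and the `Frame` class are hypothetical, and error handling and confirmation strategies are left out.

```python
from dataclasses import dataclass, field


def _empty_slots():
    # Hypothetical slots for a flight enquiry.
    return {"origin": None, "destination": None, "date": None}


@dataclass
class Frame:
    slots: dict = field(default_factory=_empty_slots)

    def update(self, values):
        """Fill any known slots mentioned in this turn, in any order."""
        for slot, value in values.items():
            if slot in self.slots and value is not None:
                self.slots[slot] = value

    def next_action(self):
        """Request the first empty slot, or confirm once the frame is full."""
        for slot, value in self.slots.items():
            if value is None:
                return ("request", slot)
        return ("confirm_all", dict(self.slots))


if __name__ == "__main__":
    frame = Frame()
    # The user volunteers two slots in one turn (mixed initiative).
    frame.update({"destination": "Minneapolis", "origin": "Boston"})
    print(frame.next_action())
```

Unlike a script, the frame does not fix the order of questions: a caller who gives origin and destination in a single utterance is only asked for the date.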
25 A vision for the future
Develop systems that can interact intelligently and co-operatively across a range of environments using a range of appropriate modalities to support people in the activities of their daily lives.
26 Fundamental research topics
- Modelling human conversational competence
- Dialogue-related issues for ASR, SLU, NLG, TTS
- Comparison of methods for dialogue management: rule-based v stochastic
- Representation and use of contextual information
- Integration and usage of modalities to complement and supplement speech
- Incremental processing in dialogue
27 Areas of application
- Voice search
- Dialogue in vehicles
- Mobile speech applications
- Multimodal embodied and situated systems
- Troubleshooting applications
- Dialogue systems for ambient intelligence and as assistive technologies
28 Concluding remarks
Spoken Dialogue Technology
- embraces a range of speech and language technologies
- poses lots of theoretical as well as practical challenges
- is interesting for commercial developers as well as academic researchers
- has a wide range of potential applications
29 Recommended reading
- McTear, M. (2004) Spoken Dialogue Technology. Springer.
- Lopez Cozar, R. & Araki, M. (2005) Spoken, multilingual and multimodal dialogue systems. John Wiley & Sons.
- Aghajan, H., Augusto, J.C., Lopez Cozar, R. (2009) Human-Centric Interfaces for Ambient Intelligence. Elsevier.
- Jokinen, K. & McTear, M. (2010) Spoken Dialogue Systems. Morgan & Claypool Publishers.
- Wilks, Y. (ed.) (2010) Close Engagements with Artificial Companions: Key social, psychological, ethical and design issues. John Benjamins Publishing Company.