Integrated Interpretation and Generation of Task-Oriented Dialogue. Alfredo Gabaldon (Carnegie Mellon Silicon Valley), Ben Meadows (University of Auckland), Pat Langley (Carnegie Mellon Silicon Valley and University of Auckland). IWSDS 2014.


Integrated Interpretation and Generation of Task-Oriented Dialogue

Alfredo Gabaldon (Carnegie Mellon Silicon Valley), Ben Meadows (University of Auckland), Pat Langley (Carnegie Mellon Silicon Valley and University of Auckland)

IWSDS 2014

The Problem: Conversational Assistance

We want systems that interact with humans in natural language to help them carry out complex tasks. Such a system should:
- Interpret users’ utterances and other environmental input;
- Infer common ground (Clark, 1996) about joint beliefs/goals; and
- Generate utterances to help users achieve the joint task.

In this talk, we report on two systems for task-oriented dialogue and the architecture that supports them.

Target Domains

We have focused on task-oriented dialogue settings in which:
- Two cooperating agents adopt and pursue a shared goal;
- The agents construct models of each other’s mental states;
- One agent (the advisor) has the expertise to solve the problem but cannot act directly; and
- The other agent (the novice) lacks expertise but can act when provided with instruction.

However, the architecture is not limited to these settings and should apply to other areas that involve social cognition.

A Medic Assistant

One application focuses on scenarios in which a human medic helps injured teammates with system assistance:
- The medic has limited training but can provide situational information and affect the environment;
- The system has medical expertise but cannot sense or alter the environment directly; it can only offer instructions;
- The medic and the system collaborate to achieve their shared goal of helping the injured person.

The system uses a Web interface similar to a messaging app; we hope to replace it with a spoken-language interface in the future.

Sample Dialogue* for Medic Assistant (M: medic, A: advisor)

M: We have a man injured!
A: Ok. What type of injury?
M: He’s bleeding.
A: How bad is the bleeding?
M: Pretty bad. I think it is the artery.
A: Ok. Where is the injury?
M: It’s on the left leg.
A: Apply pressure on the leg’s pressure point.
M: Roger that.
A: Has the bleeding stopped?
M: No. He’s still bleeding.
A: Ok. Apply a tourniquet.
M: Where do I put the tourniquet?
A: Just below the joint above the wound.
M: Ok. The bleeding stopped.
A: Good job.

* We provide the system with the meaning for each of M’s utterances.

A System for Meeting Support

A second system supports cyber-physical meetings in which people and devices interact to carry out a joint task. One scenario is a physical therapy session in which a therapist and a patient:
- Participate in the meeting from remote locations;
- Share the goal of completing a physical therapy task;
- Contribute to a dialogue that includes our system; and
- Interact with sensors (e.g., a motion detector) and effectors (e.g., a video player).

Users communicate by sending controlled English sentences via a menu-based phone interface (also to be replaced).

Sample Dialogue for Meeting Support (T: therapist, P: patient, MS: motion sensor, Sys: meeting support system)

T: John, please do leg lifts.
P: Okay. I will do leg lifts.
TV: [starts playing video tutorial]
MS: [detects movement; sends signal]
Sys: Great start John!
MS: [sending more movement signals]
Sys: You are doing great, John!
P: I can’t do any more leg lifts.
T: It’s okay, John. That was good.
Sys: John, you did eight repetitions.
P: How many did I do in the previous session?
Sys: In the previous session you did five repetitions.

Observations about Sample Dialogues

These sample dialogues, although simple, raise several issues:
- Behavior is goal-directed and involves communicative actions about joint activity;
- Participants develop common ground for the situation that includes each other’s beliefs and goals;
- Many beliefs and goals are not explicitly stated but rather inferred by the participating agents; and
- The overall process alternates between drawing inferences and executing goal-directed activities.

We have focused on these issues in our work, but not on other important challenges, such as speech recognition.

System Requirements

These observations suggest that a dialogue system should have the ability to:
- Incrementally interpret both the physical situation and others’ goals and beliefs;
- Carry out complex goal-directed activities, including utterance generation, in response to this understanding; and
- Utilize both domain and dialogue-level knowledge to support such inference and execution.

We have developed an integrated architecture that responds to these requirements.

Dialogue Architecture

Both systems are based on a novel architecture for task-oriented dialogue. The architecture integrates:
- Dialogue-level and domain-level knowledge, i.e., generic knowledge about dialogues and content specific to a particular domain; and
- The use of both for understanding and execution.

We will describe in turn the architecture’s representations and the mechanisms that operate over them.

Representing Beliefs and Goals

Domain content: reified relations/events in the form of triples:
  [inj1, type, injury]  [inj1, location, left_leg]

Speech acts: inform, acknowledge, question, accept, etc.:
  inform(medic, computer, [inj1, location, left_leg])

Dialogue-level predicates:
  belief(medic, [inj1, type, injury], ts, te)
  goal(medic, [inj1, status, stable], ts, te)
  belief(medic, goal(computer, [inj1, status, stable], t1, t2), ts, te)
  goal(medic, belief(computer, [inj1, location, torso], t1, t2), ts, te)
  belief(computer, belief_if(medic, [inj1, location, torso], t1, t2), ts, te)
  belief(computer, belief_wh(medic, [inj1, location], t1, t2), ts, te)
  belief(computer, inform(medic, computer, [inj1, location, left_leg]), ts, te)
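These structures can be sketched in Python. This is a minimal illustration of the representation on this slide; the class and field names are our own, not the system's actual implementation.

```python
# Sketch of the slide's representation: domain triples, speech acts, and
# nested belief/goal predicates with symbolic start/end time points.
from dataclasses import dataclass
from typing import Any, Tuple

Triple = Tuple[str, str, str]   # e.g. ("inj1", "location", "left_leg")

@dataclass(frozen=True)
class SpeechAct:
    """inform, acknowledge, question, accept, ..."""
    act: str
    speaker: str
    listener: str
    content: Any                # a Triple or a nested predicate

@dataclass(frozen=True)
class Belief:
    """belief(agent, content, ts, te): content held from ts to te."""
    agent: str
    content: Any                # Triple, Goal, Belief, or SpeechAct
    ts: str                     # symbolic start time point
    te: str                     # symbolic end time point

@dataclass(frozen=True)
class Goal:
    """goal(agent, content, ts, te)."""
    agent: str
    content: Any
    ts: str
    te: str

# belief(medic, goal(computer, [inj1, status, stable], t1, t2), ts, te)
b = Belief("medic",
           Goal("computer", ("inj1", "status", "stable"), "t1", "t2"),
           "ts", "te")
```

Because beliefs and goals take arbitrary content, the nesting on the slide (beliefs about goals, beliefs about beliefs) falls out of plain composition.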

Representing Domain Knowledge

Our framework assumes that domain-level knowledge includes:
- Conceptual rules that associate situational patterns with predicates; and
- Skills that associate conditions, effects, and subskills with predicates.

Both concepts and skills are organized into hierarchies, with complex structures defined in terms of simpler ones.

Representing Dialogue Knowledge

Our framework assumes three kinds of dialogue knowledge:
- Speech-act rules that link belief/goal patterns with an act type;
- Skills that specify domain-independent conditions, effects, and subskills (e.g., a skill to communicate a command); and
- A dialogue grammar that states relations among speech acts (e.g., a question followed by an inform with suitable content).

Together, these provide the background content needed to carry out high-level dialogues about joint activities.
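The dialogue-grammar idea can be illustrated as adjacency relations among speech-act types. This toy sketch is ours; the actual grammar would also constrain the content shared by adjacent acts.

```python
# Toy dialogue grammar: for each speech-act type, the set of act types
# that may coherently follow it (content constraints omitted here).
GRAMMAR = {
    "question": {"inform", "reject"},
    "inform":   {"acknowledge", "question"},
    "command":  {"accept", "reject"},
}

def coherent(prev_act: str, next_act: str) -> bool:
    """True if next_act is a grammatical continuation of prev_act."""
    return next_act in GRAMMAR.get(prev_act, set())
```

For instance, a question followed by an inform is coherent, while a command followed directly by an inform is not, under this toy grammar.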

Example of Dialogue Knowledge

Speech-act rules associate a speech act with a pattern of beliefs and goals, as in:

  inform(S, L, X) ←
    belief(S, X, T0, T4),
    goal(S, belief(L, X, T3, T7), T1, T5),
    belief(S, inform*(S, L, X), T2, T6),
    belief(S, belief(L, X, T3, T7), T2, T8),
    T0 < T1, T1 < T2, T2 < T3, T1 < T4, T6 < T5.

This rule refers to the content, X, of the speech act that occurs with the given pattern of beliefs and goals. The content is a variable, and the pattern of beliefs and goals is domain independent.
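The core of this rule can be paraphrased procedurally: inform(S, L, X) is licensed when the speaker S believes X and has the goal that listener L come to believe X. The sketch below is our simplification, checking only the first temporal constraint (T0 < T1) and omitting the rest of the pattern.

```python
# Working memory holds 5-tuples: (kind, agent, content, t_start, t_end),
# with nested tuples for beliefs about beliefs. Times are integers here
# so the ordering constraint is easy to test.
def inform_applicable(wm, S, L, X):
    """True if the simplified inform(S, L, X) pattern holds in wm."""
    for kind, agent, content, t0, _t4 in wm:
        if (kind, agent, content) != ("belief", S, X):
            continue                       # need: speaker believes X
        for k2, a2, c2, t1, _t5 in wm:
            # need: speaker's goal that the listener believe X, adopted
            # after the speaker's own belief began (T0 < T1)
            if (k2, a2) == ("goal", S) and isinstance(c2, tuple) \
                    and c2[:3] == ("belief", L, X) and t0 < t1:
                return True
    return False

X = ("inj1", "location", "left_leg")
wm = [("belief", "medic", X, 0, 9),
      ("goal", "medic", ("belief", "computer", X, 3, 9), 1, 8)]
# inform_applicable(wm, "medic", "computer", X) -> True
```

A full implementation would match all four belief/goal conditions and the five temporal constraints via unification rather than nested loops.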

Architectural Overview

At a high level, our architecture operates in a manner similar to production-system architectures like Soar and ACT-R. It runs in discrete cycles (speech-act observation, conceptual inference, skill execution) during which it:
- Observes new speech acts, including ones it generates itself;
- Uses inference to update beliefs/goals in working memory; and
- Executes hierarchical skills to produce utterances based on this memory state.

Dialogue Interpretation

Our inference module accepts environmental input (speech acts and sensor values) and incrementally:
- Retrieves rule instances connected to working-memory elements;
- Uses coherence and other factors to select a rule to apply; and
- Makes default assumptions about agent beliefs/goals as needed.

This abductive process carries out heuristic search for a coherent explanation of observed events. The resulting inferences form a situation model that influences system behavior.
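The steps above can be caricatured as a greedy loop. This is our illustrative reduction, not the system's actual abduction mechanism: it scores each candidate rule by how many premises it shares with working memory, applies the best one, and assumes unmet premises as defaults.

```python
# Greedy sketch of coherence-guided abduction over ground rule instances.
def abduce(wm, rules):
    """rules: list of (premises, conclusions) pairs of hashable facts."""
    wm = set(wm)
    while True:
        # candidates whose conclusions are not yet all in working memory
        live = [r for r in rules if not set(r[1]) <= wm]
        if not live:
            break
        # coherence heuristic: prefer the rule most connected to wm
        best = max(live, key=lambda r: sum(p in wm for p in r[0]))
        if sum(p in wm for p in best[0]) == 0:
            break               # nothing connects: stop explaining
        wm |= set(best[0])      # assume unmet premises as defaults
        wm |= set(best[1])      # add the rule's conclusions
    return wm
```

A real implementation would search over alternative explanations rather than commit greedily, and would record which elements were assumed so they can later be revised.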

Dialogue Generation

On each architectural cycle, the hierarchical execution module:
- Selects a top-level goal that is currently unsatisfied;
- Finds a skill that should achieve the goal and whose conditions match elements in the situation model;
- Selects a path downward through the skill hierarchy that ends in a primitive skill; and
- Executes this primitive skill in the external environment.

In the current setting, execution involves producing speech acts. The system generates utterances by filling in templates for the selected type of speech act.
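The template-filling step can be sketched directly; the template strings and slot names below are our illustration, not the system's actual inventory.

```python
# Each primitive speech-act skill fills a canned template with content
# drawn from the situation model.
TEMPLATES = {
    "question_wh": "Ok. Where is the {thing}?",
    "inform":      "The {thing} is {value}.",
    "command":     "Apply {treatment} on the {location}.",
}

def generate(act_type: str, **slots) -> str:
    """Render an utterance for the selected type of speech act."""
    return TEMPLATES[act_type].format(**slots)

# generate("question_wh", thing="injury") -> "Ok. Where is the injury?"
```

This matches the style of the medic dialogue, e.g. the advisor's question about the injury's location.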

Claims about Approach

In our architecture, integration occurs along two dimensions:
- Knowledge integration at the domain and dialogue levels; and
- Processing integration of understanding and execution.

We make two claims about this integrated architecture:
- Dialogue-level knowledge is useful across distinct, domain-specific knowledge bases; and
- The architecture’s mechanisms operate effectively over both forms of content.

We have tested these claims by running our dialogue systems over sample conversations in different domains.

Evaluation of Dialogue Architecture

We have tested our dialogue architecture on three domains:
- Medic scenario: 30 domain predicates and 10 skills;
- Elder assistance: 6 domain predicates and 16 skills;
- Emergency calls: 16 domain predicates and 12 skills.

Dialogue knowledge consists of about 60 rules that we held constant across domains. Domain knowledge is held constant across test sequences of speech acts within each domain.

Test Protocol

In each test, we provide the system with a file that contains the speech acts of the person who needs assistance. These speech acts are interleaved with the special speech acts "over" and "out", which simplify turn taking.

In a test run, the system operates in discrete cycles that:
- Read speech acts from the file until the next "over";
- Add these speech acts to working memory; and
- Invoke the inference and execution modules.

The speech acts from the file, together with speech acts that the system generates, should form a coherent dialogue.
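This protocol amounts to a simple driver loop. The sketch below assumes an agent object with a working-memory list and infer/execute methods; the helper names are ours.

```python
# Driver for the test protocol: consume the user's speech acts up to each
# "over" marker, add them to working memory, then run inference/execution.
def run_test(lines, agent):
    turn = []
    for act in lines:
        if act == "out":            # "out" ends the whole dialogue
            break
        if act == "over":           # "over" ends the user's turn
            agent.wm.extend(turn)   # add the turn's acts to working memory
            agent.infer()           # update beliefs and goals
            agent.execute()         # may generate reply speech acts
            turn = []
        else:
            turn.append(act)
```

The system's replies, produced inside execute, would be appended to the same working memory, so the file's acts and the generated acts together form the dialogue being checked for coherence.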

Test Results

Using this testing regimen, the integrated system successfully:
- Produces coherent task-oriented dialogues; and
- Infers the participants’ mental states at each stage.

We believe these tests support our claims and suggest that:
- Separating generic dialogue knowledge from domain content supports switching domains with relative ease; and
- Integrating inference and execution in the architecture operates successfully over both kinds of knowledge.

The results are encouraging, but we should extend the system to handle a broader range of speech acts and dialogues.

Related Research

A number of earlier efforts have addressed similar research issues:
- Dialogue systems: TRIPS (Ferguson & Allen, 1998); Collagen (Rich, Sidner, & Lesh, 2001); the WITAS dialogue system (Lemon et al., 2002); RavenClaw (Bohus & Rudnicky, 2009).
- Integration of inference and execution: many robotic architectures; ICARUS (Langley, Choi, & Rogers, 2009).

Our approach incorporates many of their ideas, but adds a uniform representation for dialogue and domain knowledge.

Plans for Future Work

In future research, we plan to extend our architecture to address:
- Dialogue processing with more types of speech acts;
- Interpretation and generation of subdialogues;
- Mechanisms for recovering from misunderstandings; and
- Belief revision to recover from faulty assumptions.

These steps will further integrate cognitive processes and increase the architecture’s generality.

Summary Remarks

We have developed an architecture for task-oriented dialogue that:
- Cleanly separates domain-level from dialogue-level content;
- Integrates inference for situation understanding with execution for goal achievement; and
- Utilizes these mechanisms to process both forms of content.

Our experimental runs with the architecture demonstrate that:
- Dialogue-level content works with different domain content; and
- Inference and execution operate over both types of knowledge.

These encouraging results suggest our approach is worth pursuing.

End of Presentation