1 Team Talk – A Report
2 Introduction Project done as part of 11-754, Spring ’03. Design and implementation of a spoken dialog system. Thrust of project: multi-participant dialog. Dialog involves: –a single user, and… –two agents.
3 Domain Description Guide two robots through a maze –Using only spoken natural language. Maze consists of interconnecting hallways. Robots do not have a map of the maze. Robots are only aware of what they “see”: –which hallway they are currently in. –the hall’s north/south/east/west orientation. –where the exits lie.
4 Examples of user-robot dialog Asking robot about location: –User: “Clyde, where are you?” –Robot Clyde: “At the north end of hallway 5300” Making the robot go somewhere: –User: “Bashful, go to the north end of the hall” –Robot Bashful: “Sorry, already at the north end of the hall”; or –Robot Bashful: “Going towards the north end of the hall”
5 List of User Commands “Where are you?” “Go [relative distance]? towards [the direction-end-of]? [goal]” –[relative distance] = “all the way”; “halfway”; “a third of the way”; “two-thirds of the way”; “a fourth of the way”; “three-fourths of the way” –[direction] = “north”; “south”; “east”; “west” “Move [direction] [x] meters” –[direction] as above; [x] = 1…20
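The command list above can be sketched as a small pattern-based parser. This is only an illustration, not the project's actual grammar (the slides do not show how the parse was implemented). It assumes that "[the direction-end-of]" means a phrase such as "the north end of", that "to" is accepted alongside "towards" (as in the slide 4 example), and that the number of meters arrives as digits. The names `parse_command`, `GO_RE`, and `MOVE_RE` are hypothetical.

```python
import re
from fractions import Fraction

# Relative-distance phrases from the slide, mapped to fractions of the way.
RELATIVE_DISTANCE = {
    "all the way": Fraction(1),
    "halfway": Fraction(1, 2),
    "a third of the way": Fraction(1, 3),
    "two-thirds of the way": Fraction(2, 3),
    "a fourth of the way": Fraction(1, 4),
    "three-fourths of the way": Fraction(3, 4),
}
DIRECTIONS = ("north", "south", "east", "west")

_DIST = "|".join(re.escape(p) for p in RELATIVE_DISTANCE)
_DIR = "|".join(DIRECTIONS)

WHERE_RE = re.compile(r"^where are you\??$")
# Assumption: "[the direction-end-of]" is read as "the <direction> end of".
GO_RE = re.compile(
    rf"^go(?: (?P<dist>{_DIST}))? (?:towards|to)"
    rf"(?: the (?P<dir>{_DIR}) end of)? (?P<goal>.+)$"
)
MOVE_RE = re.compile(rf"^move (?P<dir>{_DIR}) (?P<x>\d+) meters?$")


def parse_command(text):
    """Return a dict describing the command, or None if it is not recognized."""
    # Normalize case, whitespace, and trailing sentence punctuation.
    s = " ".join(text.lower().strip().rstrip(".!").split())
    if WHERE_RE.match(s):
        return {"type": "where"}
    m = GO_RE.match(s)
    if m:
        dist = m.group("dist")
        return {
            "type": "go",
            "fraction": RELATIVE_DISTANCE[dist] if dist else None,
            "direction": m.group("dir"),
            "goal": m.group("goal"),
        }
    m = MOVE_RE.match(s)
    if m:
        meters = int(m.group("x"))
        if 1 <= meters <= 20:  # the slide limits [x] to 1...20
            return {"type": "move", "direction": m.group("dir"), "meters": meters}
    return None
```

For example, `parse_command("Move north 5 meters")` yields a "move" command, while a distance outside 1 to 20 is rejected and returns `None`.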
6 Who is Being Addressed? User communicates with the two robots by using their names – “Bashful” and “Clyde”. Each robot is represented by a separate Dialog Manager (DM). Both DMs receive the user’s parsed utterance, and each determines whether it is the one being “addressed”: –a DM is addressed if and only if its own name (e.g. “Bashful” or “Clyde”) or a group designation (e.g. “everyone”) was the last name uttered by the user.
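The addressing rule can be illustrated with a minimal sketch in which each DM keeps track of the last name it heard. This is not the RavenClaw implementation. It assumes the "last name uttered" persists across utterances until another name is spoken, and it works on raw text rather than a parse. `AddresseeTracker` is a hypothetical name.

```python
import re

ROBOT_NAMES = {"bashful", "clyde"}
GROUP_NAMES = {"everyone"}  # group designation given as an example on the slide


class AddresseeTracker:
    """One instance per dialog manager; remembers the last name the user said."""

    def __init__(self, own_name):
        self.own_name = own_name.lower()
        self.last_name = None  # no name heard yet, so nobody is addressed

    def update(self, utterance):
        """Scan the utterance for names and report whether this DM is addressed."""
        for word in re.findall(r"[a-z]+", utterance.lower()):
            if word in ROBOT_NAMES or word in GROUP_NAMES:
                self.last_name = word
        return self.last_name == self.own_name or self.last_name in GROUP_NAMES
```

Both trackers see every utterance. After "Clyde, where are you?", a follow-up without a name is still treated as addressed to Clyde, which is the behavior that works for user-initiative dialog.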
7 DM’s Interaction with the Backend Each DM is connected to its robot through a “backend server” (BE). Backend server has two main functions: –Maintain status (e.g. position) of robot –Convert user’s commands into low-level robot commands (e.g. move vector) Addressed dialog manager passes user command to its backend server. Backend server sends low-level messages to robot, and returns a message to the DM –E.g.: “moving to the north end of the hallway”, “sorry, already at the north end of the hallway”, etc.
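The two backend functions (tracking status and turning a command into a move vector plus a reply for the DM) can be sketched as follows. This is a toy model under stated assumptions, not the project's backend: the hallway is taken to run north-south with its south end at y = 0, the robot is assumed to reach its target instantly, and nothing is actually sent to a robot. `BackendServer`, `go_to_end`, and `move` are hypothetical names.

```python
from dataclasses import dataclass

# Assumed coordinate frame: x grows to the east, y grows to the north (meters).
UNIT_VECTORS = {
    "north": (0.0, 1.0),
    "south": (0.0, -1.0),
    "east": (1.0, 0.0),
    "west": (-1.0, 0.0),
}


@dataclass
class BackendServer:
    hall_length: float  # assumed length of a north-south hallway, in meters
    x: float = 0.0      # robot status: position east of the hallway axis
    y: float = 0.0      # robot status: position along the hallway (0 = south end)

    def go_to_end(self, end):
        """Return (move_vector, reply) for 'go to the <end> end of the hall'."""
        if end not in ("north", "south"):
            raise ValueError("this sketch models a north-south hallway only")
        target = self.hall_length if end == "north" else 0.0
        if abs(target - self.y) < 1e-6:
            return None, f"sorry, already at the {end} end of the hallway"
        vector = (0.0, target - self.y)
        self.y = target  # a real backend would update status from the robot
        return vector, f"moving to the {end} end of the hallway"

    def move(self, direction, meters):
        """Return (move_vector, reply) for 'move <direction> <x> meters'."""
        ux, uy = UNIT_VECTORS[direction]
        vector = (ux * meters, uy * meters)
        self.x += vector[0]
        self.y += vector[1]
        return vector, f"moving {meters} meters {direction}"
```

The reply strings mirror the examples on the slide, and the returned vector stands in for the low-level message that would go to the robot.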
8 Overall System Architecture [Diagram. Components shown: User; ASR; TTS; Dialog Manager 1 and Dialog Manager 2; Backend Server 1 and Backend Server 2; Robot 1 and Robot 2.]
9 Research Questions How to figure out which robot is being addressed? –Our solution works for user-initiative dialog. –Our solution is not elegant for mixed-initiative dialog. E.g.: If the user was addressing Bashful, and Clyde barged in, and then the user starts talking again without explicitly addressing a robot, who is she addressing? How can a robot barge in optimally? –A big issue only for mixed-initiative dialog. How can the user tell which robot is speaking? –Our solution: give the two robots different voices. –This solution does not scale up beyond a few robots.
10 Software Components of the System ASR: Sphinx 2. TTS: Festival. Dialog Manager: RavenClaw. Robot software platform: Carmen.
11 Hardware Components Robot hardware: Pioneer 1 (2). Computers: IBM ThinkPad (3) –One each for the two robots, running Linux –One for Sphinx/Festival/RavenClaw etc., running Windows. Wireless connectivity: external Orinoco wireless cards (3). Serial connection from laptops to robots: PCMCIA card (2).