Presentation is loading. Please wait.

Presentation is loading. Please wait.

Learning to Interpret Natural Language Instructions Monica Babeş-Vroman +, James MacGlashan *, Ruoyuan Gao +, Kevin Winner * Richard Adjogah *, Marie desJardins.

Similar presentations


Presentation on theme: "Learning to Interpret Natural Language Instructions Monica Babeş-Vroman +, James MacGlashan *, Ruoyuan Gao +, Kevin Winner * Richard Adjogah *, Marie desJardins."— Presentation transcript:

1 Learning to Interpret Natural Language Instructions Monica Babeş-Vroman +, James MacGlashan *, Ruoyuan Gao +, Kevin Winner * Richard Adjogah *, Marie desJardins *, Michael Littman + and Smaranda Muresan ++ *Department of Computer Science and Electrical Engineering University of Maryland, Baltimore County +Computer Science Department, Rutgers University ++School of Communication and Information, Rutgers University NAACL-2012 WORKSHOP From Words to Actions: Semantic Interpretation in an Actionable Context

2 Outline Motivation and Problem Our approach Pilot study Conclusions and future work

3 Motivation Bring me the red mug that I left in the conference room

4 Motivation

5

6 Train an artificial agent to learn to carry out complex multistep tasks specified in natural language. Problem

7 Another Example of A Task of pushing an object to a room. Ex : square and red room Abstract task: move obj to color room move square to red roommove star to green roomgo to green room

8 Outline What is the problem? Our approach Pilot study Conclusions and future work

9 Training Data: S2S2 F1F2F3F4F5F6 S3S3 S4S4 S1S1 Object-oriented Markov Decision Process (OO-MDP) [Diuk et al., 2008]

10 “Push the star into the teal room”

11 “Push the star into the teal room” Task Learning from Demonstrations Semantic Parsing Task Abstraction F2 F6 Our Approach push(star1, room1) P1(room1, teal) F1F2F3F4F5F6

12 “Push the star into the teal room” Task Learning from Demonstrations Semantic Parsing Task Abstraction F2 F6 Our Approach “teal” push(star1, room1) P1(room1, teal) P1 color “push“ objToRoom

13 “Push the star into the teal room” Task Learning from Demonstrations Semantic Parsing Task Abstraction Our Approach

14 π*π* S1S1 S2S2 S3S3 S4S4 30% 70% 0% 100% 0% RL Task Learning from Demonstration = Inverse Reinforcement Learning (IRL) But first let’s see Reinforcement Learning Problem S1S1 S2S2 S3S3 S4S4 MDP (S, A, T, γ ) R

15 πEπE S1S1 S2S2 S3S3 S4S4 30% 70% 0% 100% 0% Task Learning From Demonstrations Inverse Reinforcement Learning

16 πEπE S1S1 S2S2 S3S3 S4S4 30% 70% 0% 100% 0% Task Learning From Demonstrations Inverse Reinforcement Learning

17 πEπE S1S1 S2S2 S3S3 S4S4 30% 70% 0% 100% 0% R Task Learning From Demonstrations Inverse Reinforcement Learning S1S1 S2S2 S3S3 S4S4 MDP (S, A, T, γ ) IRL

18 Maximum Likelihood IRL [M. Babeş, V. Marivate, K. Subramanian and M. Littman 2011] θ Pr( ) R π Boltzmann Distribution update in the direction of ∇ Pr( ) φ MLIRL Assumption: Reward function is a linear combination of a known set of features. (e.g., ) Goal: Find θ to maximize the probability of the observed trajectories. (s,a): (state, action) pair R: reward function φ: feature vector π: policy Pr( ): probability of demonstrations

19 “Push the star into the teal room” Task Learning from Demonstrations Semantic Parsing Task Abstraction Our Approach

20 Semantic model (ontology) NP (w, ) → Adj (w 1, ), NP (w 2, ): Φ i NP (w, ) → Det (w 1, ), NP (w 2, ): Φ i … Syntax + Semantics + Ontological Constraints Grammar PARSINGPARSING Semantic representation text Semantic Parsing Lexicalized Well-Founded Grammars (LWFG) [Muresan, 2006; 2008; 2012]

21 Empty Semantic model Grammar PARSINGPARSING X1.is=teal, X.P1=X1, X.isa=room teal room Semantic Parsing Lexicalized Well-Founded Grammars (LWFG) [Muresan, 2006; 2008; 2012] room1 teal P1 uninstantiated variable ≅ NP (w, ) → Adj (w 1, ), NP (w 2, ): Φ i NP (w, ) → Det (w 1, ), NP (w 2, ): Φ i …

22 Syntax/Semantics/Ontological Constraints Grammar PARSINGPARSING X1.is=green, X.color=X1, X.isa=room teal room Semantic Parsing Lexicalized Well-Founded Grammars (LWFG) [Muresan, 2006; 2008; 2012] room1 color ≅ Semantic model (ontology) green NP (w, ) → Adj (w 1, ), NP (w 2, ): Φ i NP (w, ) → Det (w 1, ), NP (w 2, ): Φ i …

23 Learning LWFG [Muresan and Rambow 2007; Muresan, 2010; 2011] Currently learning is done offline No assumption of existing semantic model (loud noise, ) (the noise, ) … Small Training Data RELATIONAL LEARNING MODEL LWFG NP PP Representative Examples (loud noise) (the noise) (loud clear noise) (the loud noise) … Generalization Set NP NP (w, ) → Adj (w 1, ), NP (w 2, ):Φ i NP (w, ) → Det (w 1, ), NP (w 2, ):, Φ i …

24 Using LWFG for Semantic Parsing Obtain representation with same underlying meaning despite different syntactic forms “go to the green room” vs “go to the room that is green” – Dynamic Semantic Model Can start with an empty semantic model Augmented based on feedback from the TA component push  blockToRoom, teal  green

25 “Push the star into the teal room” Task Learning from Demonstrations Semantic Parsing Task Abstraction Our Approach

26 Task Abstraction Creates an abstract task to represent a set of similar grounded tasks – Specifies goal conditions and reward function in terms of OO-MDP propositional functions Combines information from SP and IRL and provides feedback to both – Provides SP domain specific semantic information – Provides IRL relevant features

27 Similar Grounded Tasks Goal Condition objectIn(o2, l1) Other Terminal Facts isSquare(o2) isYellow(o2) isRed(l1) isGreen(l2) agentIn(o46, l2) … Goal Condition objectIn(o15, l2) Other Terminal Facts isStar(o15) isYellow(o15) isGreen(l2) isRed(l3) objectIn(o31, l3) … Some grounded tasks are similar in logical goal conditions, but different in object references and facts about those referenced objects

28 Abstract Task Definition Define task conditions as a partial first order logic expression The expression takes propositional function parameters to complete it

29 System Structure Semantic Parsing Inverse Reinforcement Learning (IRL) Task Abstraction

30 Pilot Study Unigram Language Model Data: (S i, T i ). Hidden Data: z ji : Pr(R j | (S i, T i ) ). Parameters: x kj : Pr(w k | R j ).

31 Pilot Study Data: (S i, T i ). Hidden Data: z ji : Pr(R j | (S i, T i ) ). Parameters: x kj : Pr(w k | R j ). Unigram Language Model Pr(“push”|R1)Pr(“room”|R1)... Pr(“push”|R 2 )Pr(“room”|R 2 )...

32 Pilot Study New Instruction, S N : “Go to the green room.” Pr(“push”|R1)Pr(“room”|R1)... Pr(“push”|R 2 )Pr(“room”|R 2 )...

33 Pilot Study New Instruction, S’ N : “Go with the star to green.” Pr(“push”|R1)Pr(“room”|R1)... Pr(“push”|R 2 )Pr(“room”|R 2 )...

34 Outline What is the problem? Our approach Pilot study Conclusions and future work

35 Conclusions Proposed new approach for training an artificial agent to learn to carry out complex multistep tasks specified in natural language, from pairs of instructions and demonstrations Showed pilot study with a simplified model

36 Current/Future Work SP, TA and IRL integration High-level multistep tasks Probabilistic LWFGs – Probabilistic domain (semantic) model – Learning/Parsing algorithms with both hard and soft constraints TA – add support for hierarchical task definitions that can handle temporal conditions IRL – find a partitioning of the execution trace where each partition has it own reward function

37 Thank you!


Download ppt "Learning to Interpret Natural Language Instructions Monica Babeş-Vroman +, James MacGlashan *, Ruoyuan Gao +, Kevin Winner * Richard Adjogah *, Marie desJardins."

Similar presentations


Ads by Google