Connecting Learning and Logic
Eyal Amir, U. of Illinois, Urbana-Champaign (Cambridge Present., May 2006)
Joint work with: Dafna Shahaf, Allen Chang

1 Connecting Learning and Logic
Eyal Amir, U. of Illinois, Urbana-Champaign
Joint work with: Dafna Shahaf, Allen Chang
Cambridge Present., May 2006

2 Problem: Learn Actions’ Effects
Given: a sequence of observations over time
–Example: Action a was executed
–Example: State feature f has value T
Want: an estimate of actions’ effect model
–Example: a is executable if the state satisfies some property
–Example: under condition _, a has effect _

3 Example: Light Switch
Time  Action  Observe (after action)
              Posn.  Bulb  Switch
0             E            ~up
1     go-W    ~E     ~on
2     sw-up   ~E     ~on   FAIL
3     go-E    E            ~up
4     sw-up   E            up
5     go-W    ~E     on

4 Demo

5 Example: Light Switch
[Diagram: State 1 (~up ^ ~on ^ E) → State 2 (up ^ on ^ E), each state drawn as a west room and an east room]
Flipping the switch changes world state
We do not observe the state fully

6 Motivation: Exploration Agents
Exploring partially observable domains
–Interfaces to new software
–Game-playing/companion agents
–Robots exploring buildings, cities, planets
–Agents acting in the WWW
Difficulties:
–No a priori knowledge of actions’ effects
–Many features
–Partially observable domain

7 Rest of This Talk
1. Actions in partially observed domains
2. Efficient learning algorithms
3. Related Work & Conclusions
4. [Theory behind Algorithms]

8 Learning Transition Models
Learning: Update knowledge of the transition relation and state of the world
[Diagram: knowledge states k1..k4 above world states s1..s4, advanced by actions a1..a4; labels: Transition Relation, Transition Knowledge]

9 Action Model: Set
Problem: n world features → 2^(2^n) transitions
[Diagram: world states paired with candidate transition relations T1, T2, T3]
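
To get a feel for how quickly the slide's 2^(2^n) count grows, here is a small Python sketch that tabulates it next to the number of world states, 2^n. The function names are illustrative and not from the talk.

```python
def num_world_states(n):
    """Number of truth assignments to n Boolean world features."""
    return 2 ** n


def slide_count(n):
    """The doubly exponential count 2^(2^n) quoted on the slide."""
    return 2 ** num_world_states(n)


if __name__ == "__main__":
    # Print n, the number of states, and the slide's count for small n.
    for n in range(1, 6):
        print(n, num_world_states(n), slide_count(n))
```

Already at n = 5 the count exceeds four billion, which is why the rest of the talk replaces explicit sets with logical formulas.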

10 Rest of This Talk
1. Actions in partially observed domains
2. Efficient algorithms
   1. Updating a Directed Acyclic Graph (DAG)
   2. Factored update (flat formula repn.)
3. Related Work & Conclusions
4. [Theory behind Algorithms]

11 Compact Encoding (Sometimes)
Transition Belief State = a logical formula (transition relation and state)
Observation = logical state formulae

12 Compact Encoding (Sometimes)
Transition Belief State = a logical formula (transition relation and state)
Observation = logical state formulae
Actions = propositional symbols assert effect rules
–“sw-up causes on ^ up if E”
–“go-W keeps up” (= “go-W causes up if up” …)
–Prop symbol: go-W^{≈up}, sw-up^{on}_{E}, sw-up^{up}_{E}

13 Updating the Status of “Locked”: Time 0
[Diagram: expl(0) points to init locked; tr1 = “PressB causes ¬locked if locked”, tr2 = “PressB causes locked if ¬locked”]

14 Updating the Status of “Locked”: Time t
[Diagram: as on slide 13, with a node expl(t) built on top of expl(0)]

15 Updating the Status of “Locked”: Time t+1
[Diagram: expl(t+1) combines two branches over expl(t), tr1 and tr2: “locked” holds because PressB caused it, or “locked” holds because PressB did not change it]

16 Algorithm: Update of a DAG
1. Given: action a, observation o, transition-belief formula φ_t
2. For each fluent f,
   a. kb := kb ∧ logic formula “a is executable”
   b. expl'_f := logical formula for the possible explanations for f’s value after action a
   c. replace every fluent g in expl'_f with a pointer to expl_g
   d. update expl_f := expl'_f
3. φ_{t+1} is the result of 2 together with o
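
The update loop above can be sketched in Python as follows. This is a simplified illustration, not the talk's implementation. It assumes actions are always executable (so step 2a is omitted) and non-conditional, so that a fluent f holds after action a exactly when "a causes f" holds, or "a keeps f" holds and f held before. The class name, the proposition names `causes`/`keeps`/`init`, and the tuple encoding of formulas are all hypothetical.

```python
def Var(name):
    return ("var", name)

def Not(x):
    return ("not", x)

def And(*xs):
    return ("and",) + xs

def Or(*xs):
    return ("or",) + xs


def evaluate(node, model):
    """Evaluate a formula node under a truth assignment (dict: name -> bool)."""
    kind = node[0]
    if kind == "var":
        return model[node[1]]
    if kind == "not":
        return not evaluate(node[1], model)
    if kind == "and":
        return all(evaluate(child, model) for child in node[1:])
    if kind == "or":
        return any(evaluate(child, model) for child in node[1:])
    raise ValueError("unknown node kind: %r" % (kind,))


class DagLearner:
    def __init__(self, fluents):
        self.fluents = list(fluents)
        # expl[f] is the current explanation node for f; initially a leaf init_f.
        self.expl = {f: Var(("init", f)) for f in self.fluents}
        self.kb = []  # conjunction of constraints gathered from observations

    def update(self, action, observation):
        new_expl = {}
        for f in self.fluents:
            causes = Var(("causes", action, f))
            keeps = Var(("keeps", action, f))
            # The old node is reused by reference (a pointer), never copied,
            # so each step adds a constant number of new nodes per fluent.
            new_expl[f] = Or(causes, And(keeps, self.expl[f]))
        self.expl = new_expl
        # Conjoin the observation: observed fluents constrain their explanations.
        for f, value in observation.items():
            self.kb.append(self.expl[f] if value else Not(self.expl[f]))

    def consistent(self, model):
        """True if the candidate action model and initial state satisfy kb."""
        return all(evaluate(constraint, model) for constraint in self.kb)
```

For example, after `DagLearner(["on"]).update("sw-up", {"on": True})` with the bulb initially off, a model in which sw-up causes `on` is consistent with the knowledge base, while one in which sw-up merely keeps `on` is ruled out.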

17 Fast Update: DAG Action Model
DAG-update algorithm takes constant time (using hash table) to update formula
Algorithm is exact
Result DAG has size O(T·n^k + |φ_0|)
–T steps, n features, k features in action preconditions
–Still only n features/variables
Use φ_t with a DAG-DPLL SAT-solver

18 Experiments: DAG Update

19 Experiments: DAG Update

20 Experiments: DAG Queries

21 Rest of This Talk
1. Actions in partially observed domains
2. Efficient algorithms
   1. Updating a Directed Acyclic Graph (DAG)
   2. Factored update (flat formula repn.)
3. Related Work & Conclusions
4. [Theory behind Algorithms]

22 Distribution for Some Actions
Project[a](φ ∨ ψ) ≡ Project[a](φ) ∨ Project[a](ψ)
Project[a](φ ∧ ψ) ≡ Project[a](φ) ∧ Project[a](ψ)
Project[a](¬φ) ≡ ¬Project[a](φ) ∧ Project[a](TRUE)
Compute update for literals in the formula separately, and combine the results
Conditions marked on the slide: Known Success/Failure; 1:1 Actions

23 Actions that map states 1:1
Reason for distribution over ∧: Project[a](φ ∧ ψ) ≡ Project[a](φ) ∧ Project[a](ψ)
[Diagram: a 1:1 mapping between states versus a non-1:1 mapping]

24 Algorithm: Factored Learning
Given: action a, observation o, transition-belief formula φ_t
1. Precompute update for every literal
2. Decompose φ_t recursively, update every literal separately, and combine the results
3. Conjoin the result of 2. with o, producing φ_{t+1}
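
The three steps can be sketched as a short recursive function. This is a minimal illustration under stated assumptions: formulas are nested tuples `("and", ...)` / `("or", ...)` over string literals, and the precomputed per-literal updates of step 1 are passed in as a lookup table whose contents are left abstract here (the placeholder strings in the usage note are hypothetical, not the talk's formulas).

```python
def project(formula, literal_table):
    """Step 2: push the update through AND/OR down to the literals."""
    if isinstance(formula, str):
        # A literal such as "E" or "~up": look up its precomputed update.
        return literal_table[formula]
    op, *args = formula
    # Distribute over the connective and recombine with the same connective.
    return (op, *[project(arg, literal_table) for arg in args])


def factored_update(formula, literal_table, observation):
    """Steps 2 and 3: project the formula, then conjoin the observed literals."""
    return ("and", project(formula, literal_table), *observation)
```

Calling `factored_update(("and", "E", ("or", "~up", "on")), {"E": "P[E]", "~up": "P[~up]", "on": "P[on]"}, ["~E", "~on"])` visits each literal once, which is where the linear update time in the formula size comes from.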

25 Fast Update of Action Model
Factored Learning algorithm takes time O(|φ_t|) to update formula
Algorithm is exact when
–We know that actions are 1:1 mappings between states
–Actions’ effects are always the same
Otherwise, approximate result: includes exact action model, but also others
Resulting representation is flat (CNF)

26 Compact Flat Representation: How?
Keep some property invariant, e.g.,
–k-CNF (CNF with k literals per clause)
–#clauses bounded
Factored Learning: compact repn. if
–We know if action succeeded, or
–Action failure leaves affected propositions in a specified nondeterministic state, or
–Approximate: We discard large clauses (allows more states)

27 Compact Representation in CNF
Action affects and depends on ≤ k features ⇒ |φ_{t+1}| ≤ |φ_t|·n^{k(k+1)}
Actions always have same effect ⇒ |φ_{t+1}| ≤ O(t·n)
–If also every feature observed every ≤ k steps ⇒ |φ_{t+1}| ≤ O(n^{k+1})
–If (instead) the number of actions ≤ k ⇒ |φ_{t+1}| ≤ O(n·2^{k log k})

28 Experiments: Factored Learning

29 Summary
Learning of effects and preconditions of actions in partially observable domains
Showed in this talk:
–Exact DAG update for any action
–Exact CNF update, if actions 1:1 or w/o conditional effects
–Can update model efficiently without increase in #variables in belief state
–Compact representation
Adventure games, virtual worlds, robots

30 Innovation in this Research
First scalable learning algorithm for partially observable dynamic domains
Algorithm (DAG)
–Always exact and optimal
–Takes constant update time
Algorithm (Factored)
–Exact for actions that always have same effect
–Takes polynomial update time
Can solve problems with n>1000 domain features (> states)

31 Current Approaches and Work
Reinforcement Learning & HMMs
–[Chrisman ’92], [McCallum ’95], [Boyen & Koller ’98], [Murphy et al. ’00], [Kearns, Mansour, Ng ’00]
–Maintain probability distribution over current state
–Problem: Exact solution intractable for domains of high (>100) dimensionality
–Problem: Approximate solutions have unbounded errors, or make strong mixing assumptions
Learning AI-Planning operators
–[Wang ’95], [Benson ’95], [Pasula et al. ’04],…
–Problem: Assume fully observable domain

32 Open Problems
Efficient Inference with learned formula
Compact, efficient stochastic learning
Average case of formula size?
Dynamic observation models, filtering in expanding worlds
Software:

33 Acknowledgements
Dafna Shahaf
Megan Nance
Brian Hlubocky
Allen Chang
… and the rest of my incredible group of students

34 THE END

35 Talk Outline
1. Actions in partially observed domains
2. Representation and update of models
3. Efficient algorithms
4. Related Work & Conclusions

38 Example: Light Switch
Initial belief state (time 0) = set of pairs:
{⟨E, ~up, ~on⟩, ⟨E, ~up, on⟩} × all transition rels.
Space = O(2^(2^n))
New encoding: E ∧ ~up
Space = 2
Question: how to update new representation?

39 Updating Action Model
Transition belief state represented by φ
Action-Definition(a)_{t,t+1} ≡
  ⋀_f ((a_t ∧ (a^f ∨ (a^f_f ∧ f_t))) → f_{t+1})
  ∧ ⋀_f ((a_t ∧ f_{t+1}) → (a^f ∨ (a^f_f ∧ f_t)))
(effect axioms + explanation closure)
Update: Project[a](φ_t) = logical results_{t+1} of φ_t ∧ Action-Definition(a)_{t,t+1}

40 Example Update: Light Switch
Transition belief state: φ_t = E_t ∧ ~up_t
Project[sw-on](φ_t) = (E t+1  sw-on E E  sw-on E )  (~up t+1  sw-on ~up ~up  sw-on ~up )  …
Update: Project[a](φ_t) = logical results_{t+1} of φ_t ∧ Action-Definition(a)_{t,t+1}

41 Updating Action Model
Transition belief state represented by φ
φ_{t+1} = Update[o](Project[a](φ_t))
Actions: Project[a](φ_t) = logical results_{t+1} of φ_t ∧ Action-Definition(a)_{t,t+1}
Observations: Update[o](φ) = φ ∧ o
Theorem: formula filtering equivalent to set-of-pairs semantics
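
The set semantics that the formula update is compared against can be illustrated by brute force on a toy domain. The sketch below is an assumption-laden illustration, not code from the talk: it uses one Boolean fluent ("on"), a single deterministic and always-executable action, and represents each candidate transition model as a tuple indexed by the current state. The function names are hypothetical.

```python
from itertools import product

STATES = [False, True]  # the single fluent "on" is either false or true


def all_transitions():
    """Every deterministic map state -> next state, as a tuple indexed by int(state)."""
    return list(product(STATES, repeat=len(STATES)))


def project(belief):
    """Apply the action: each pair (state, model) moves to (model[state], model)."""
    return {(model[int(state)], model) for (state, model) in belief}


def update(belief, observed):
    """Keep only the pairs whose state agrees with the observation."""
    return {(state, model) for (state, model) in belief if state == observed}


if __name__ == "__main__":
    # Bulb known to be off, action model completely unknown.
    belief = {(False, model) for model in all_transitions()}
    belief = update(project(belief), True)   # act, then observe "on"
    print(sorted(belief))
```

After one step, only the models that map off to on survive; a second step with a different observation narrows the set further. The explicit set has size exponential in the number of states, which is the cost the logical encoding avoids.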

42 Larger Picture: An Exploration Agent
[Diagram of modules: World Model, Interface Module, Learning Module, Filtering Module, Decision Making Module, Knowledge Base, Commonsense extraction]

43 Example: Light Switch
Initial belief state (time 0) = set of pairs:
{⟨E, ~up, ~on⟩, ⟨E, ~up, on⟩} × all transition rels.
Apply action a = go-W.
Resulting belief state (after action)
{ } × { transitions map to same state }
{ } × { transitions set position to ~E }
….

44 Example: Light Switch
Resulting belief state (after action)
{ } × { transitions map to same state }
{ } × { transitions set position to ~E }
….
Observe: ~E, ~on

45 Experiments w/DAG-Update

46 Some Learned Rules
Pickup(b1) causes Holding(b1)
Stack(b3,b5) causes On(b3,b5)
Pickup() does not cause Arm-Empty
Move(room1,room4) causes At(book5,room4) if In-Briefcase(book5)
Move(room1,room4) does not cause At(book5,room4) if ¬In-Briefcase(book5)

47 Approximate Learning
Always: result of Factored-Learning(φ_t) includes exact action model
Same compactness results apply
Approximation decreases size: Discard clauses with more than k literals (allows more action models) ⇒ |φ_t| = O(n^k)
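
Discarding large clauses is easy to check on a small CNF. The sketch below is an illustration with hypothetical helper names; clauses are lists of nonzero integers in the DIMACS style (+i for variable i, -i for its negation). It drops clauses longer than k and confirms by brute force that every model of the original formula still satisfies the truncated one.

```python
from itertools import product


def discard_large_clauses(cnf, k):
    """Keep only the clauses with at most k literals."""
    return [clause for clause in cnf if len(clause) <= k]


def models(cnf, num_vars):
    """All satisfying assignments of a CNF, found by exhaustive enumeration."""
    found = set()
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in cnf):
            found.add(bits)
    return found


if __name__ == "__main__":
    cnf = [[1], [-1, 2, 3]]
    approx = discard_large_clauses(cnf, 2)
    print(len(models(cnf, 3)), len(models(approx, 3)))
```

Because removing a clause can only weaken a conjunction, the truncated formula always admits a superset of the original models, which matches the slide's remark that the approximation allows more action models.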

48 More in the Paper
Algorithm that uses deduction to update the model representation exactly in all cases
Arbitrary preconditions and conditional effects
Formal justification of algorithms and complexity results

49 Experiments

50 DAG-SLAF: The Algorithm
Input: a formula φ, an action-observation sequence ⟨a_i, o_i⟩, i=1..t
Initialize: for each fluent f, expl_f := init_f
kb := φ, where each f is replaced by init_f
Process Sequence: for i=1..t do Update-Belief(a_i, o_i)
return kb ∧ base ∧ (f ↔ expl_f)

51 Experiments: Same-Effect Actions
[Plot: average space per step; axes: update step vs. update space (literals); series: ~210, ~185, ~160, ~135, ~110 features]

52 Current Game + Translation
LambdaMOO
–MUD code base
–Uses database to store game world
–Emphasis on player-world interaction
–Powerful in-game programming language
Game sends agents logical description of world

56 Present Contribution
Identify (useful) sufficient conditions for efficient computation of action models
–Actions map states 1:1
–Deterministic actions with limited effect
Polynomial-time algorithms for exact update of action model
(Useful) sufficient conditions for compact representation of action model

58 Effect of Concept Expansion

59 Text Translation to Logic
Difficulties:
–Counterintuitive LambdaMOO actions
–Enumerate observations for action result
1. Create small game world
2. Predicates needed to describe world
3. Define STRIPS-like actions required to interact with game world
4. MUD outputs logical descriptions of world

60 Map Propositions to Semantics
E.g., assume that every action a is non-failing, deterministic, non-conditional
–For every domain description D,
–T_D = Rules_D ∧ ⋀_f (a^{≈f} ∨ a^{f} ∨ a^{-f})
–Rules_D = { a^{f}_{g} | “a causes f if g” ∈ D }

61 Mapping Theory to Semantics
Every set of state-transition pairs represented using a logical formula
Theorem: Every consistent propositional theory maps to a set of pairs and vice versa
Have formulas for deterministic actions
–Conditional effect, sometimes inexecutable
–Non-conditional, sometimes inexecutable
–Non-conditional, always executable

62 Distribution Properties
Project[a](φ ∨ ψ) ≡ Project[a](φ) ∨ Project[a](ψ)
Filtering a DNF belief state by factoring

63 Distribution Properties
Project[a](φ ∨ ψ) ≡ Project[a](φ) ∨ Project[a](ψ)
Project[a](φ ∧ ψ) ≡ Project[a](φ) ∧ Project[a](ψ)
Project[a](¬φ) ≡ ¬Project[a](φ) ∧ Project[a](TRUE)
Filtering a DNF belief state by factoring

64 Knowledge About Actions
Goal 1: conclude that one of sw-up, go-W, go-E causes on
Goal 2: show that sw-up is possible only when E is true (with some assumptions)

65 Challenges (disregarding NLP for now):
Partially observable world
–E.g., cannot see the light bulb in east room
Incomplete knowledge about state of the world and effects of actions
–E.g., do not know the effect/preconditions of flipping switch

67 Relating Effect Propositions
Action-Definition(a)_{t,t+1} ≡
  ⋀_f ((a_t ∧ (a^f ∨ (a^f_f ∧ f_t))) → f_{t+1})
  ∧ ⋀_f ((a_t ∧ f_{t+1}) → (a^f ∨ (a^f_f ∧ f_t)))

68 Example: Moving Items
Initial belief state:
set of all world states × all transition rels.
Apply action move(r,closet,room)_t
Resulting belief state
set of all world states × all transition rels.
Apply observation at(r,closet,room)
Resulting belief state
set of { at(r,closet,room) } × all transition rels.

69 Example: Moving Items
From here: Assume that actions are non-conditional and always succeed
Initial belief state:
set of { at(r,closet,room) } × all transition rels.
Apply action a = move(r,closet,room)_t
Resulting belief state
{ at(r,closet,room) } × { a at(r,closet,room) at(r,closet,room) … }
{ -at(r,closet,room) } × { a -at(r,closet,room) at(r,closet,room) .. }
….

70 Example: Moving Items
Initial belief state:
{ at(r,closet,room) } × { a at(r,closet,room) at(r,closet,room) … }
{ -at(r,closet,room) } × { a -at(r,closet,room) at(r,closet,room) .. }
….
Apply observation -at(r,closet,room)_t
Resulting belief state
{ -at(r,closet,room) } × { a -at(r,closet,room) at(r,closet,room) .. }
….

71 Filtering Stochastic Processes
Dynamic Bayes Nets (DBNs): factored representation
[Diagram: DBN over variables s1..s5]

72 Filtering Stochastic Processes
Dynamic Bayes Nets (DBNs): factored representation
[Diagram: DBN over variables s2..s5]

73 Filtering Stochastic Processes
Dynamic Bayes Nets (DBNs): factored representation
[Diagram: DBN over variables s3, s4, s5]

74 Filtering Stochastic Processes
Dynamic Bayes Nets (DBNs): factored representation
[Diagram: DBN over variables s4, s5]
O(2^n) space, O(2^{2n}) time

75 Filtering Stochastic Processes
Dynamic Bayes Nets (DBNs): factored representation: O(2^n) space, O(2^{2n}) time
Kalman Filter: Gaussian belief state and linear transition model
[Diagram: chain over variables s1..s5]

76 Filtering Stochastic Processes
Dynamic Bayes Nets (DBNs): factored representation: O(2^n) space, O(2^{2n}) time
Kalman Filter: Gaussian belief state and linear transition model
[Diagram: chain over variables s4, s5]
O(n^2) space, O(n^3) time

77 Example: Moving Items
Initial belief state: set of all world states
Apply action move(r,closet,room)_t
Resulting belief state: all states that satisfy ¬in(r,closet)_{t+1}
Reason:
–If initially ¬in(r,closet), then still ¬in(r,closet)
–If initially in(r,closet), then now ¬in(r,closet)
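
The reasoning on this slide can be checked by enumerating states. The sketch below is a toy illustration under assumptions not stated on the slide: only two fluents are tracked (`in_closet`, `in_room`, both hypothetical names), and the move always succeeds, with the effects "now in the room, no longer in the closet".

```python
from itertools import product

FLUENTS = ("in_closet", "in_room")


def move(state):
    """Apply move(r, closet, room), assumed to always succeed."""
    new_state = dict(state)
    new_state["in_closet"] = False
    new_state["in_room"] = True
    return new_state


def all_states():
    """Every truth assignment to the tracked fluents."""
    return [dict(zip(FLUENTS, values))
            for values in product([False, True], repeat=len(FLUENTS))]


if __name__ == "__main__":
    after = [move(state) for state in all_states()]
    # Whatever the initial state was, r is not in the closet afterwards.
    print(all(not state["in_closet"] for state in after))
```

Starting from complete ignorance, every successor state satisfies the negated literal, so the belief state after the action can be summarized by that single literal.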

78 Example: Filtering a Literal
Initial knowledge (time t): in(r,closet)
Apply move(r,closet,room)
Preconds: in(r,closet) ∧ ¬locked(closet)
Effects: in(r,room) ∧ ¬in(r,closet)
Resulting knowledge (time t+1):
(in(r,room) ∧ ¬in(r,closet)) ∨ (locked(closet) ∧ in(r,closet))

79 Example: Filtering a Formula
Initial knowledge (time t): in(broom,closet) ∧ ¬locked(closet)
Apply move(r,closet,room)
Preconds: in(r,closet) ∧ ¬locked(closet)
Effects: in(r,room) ∧ ¬in(r,closet)
Resulting knowledge (time t+1):
in(r,room) ∧ ¬in(r,closet) ∧ ¬locked(closet)

80 Brief Outline of Future Effort
Filtering FOL representations
Compact reductions from FOL to Propositional Logic
More classes of filtering that maintain compact representation
Learning world models in partially observable domains
Stochastic filtering

81 Open Problems
More families of actions/observations
–Stochastic conditions on observations
–Different data structures (BDDs? Horn?)
Compact, efficient stochastic filtering
Average case?
Relational / first-order filtering
Dynamic observation models, filtering in expanding worlds

82 STRIPS-Filter: Experimental Results [A. & Russell ’03]

83 STRIPS-Filter: Experimental Results [A. & Russell ’03]

84 Example: Text Adventure Games
You enter Research MUD.
Plain Room [exits: west]
This is a small, bare room. There is an open door in the wall in front of you, a switch next to it, and a small table underneath the switch.
> flip switch
> go west
You go through a corridor and enter:
Plain Room [exits: north, east]
There is a light bulb in the ceiling. It is on.
> go east
You go through a corridor and enter:…

