Published by Whitney Robinson. Modified over 2 years ago.

1
Connecting Learning and Logic
Eyal Amir, U. of Illinois, Urbana-Champaign
Joint work with: Dafna Shahaf, Allen Chang
Cambridge Present., May 2006

2
Problem: Learn Actions’ Effects
Given: a sequence of observations over time
–Example: Action a was executed
–Example: State feature f has value T
Want: an estimate of actions’ effect model
–Example: a is executable if the state satisfies some property
–Example: under condition _, a has effect _

3
Example: Light Switch
Observations are made after the action.
Time  Action  Posn.  Bulb  Switch  Result
0             E            ~up
1     go-W    ~E     ~on
2     sw-up   ~E     ~on           FAIL
3     go-E    E            ~up
4     sw-up   E            up
5     go-W    ~E     on

4
Demo

5
Example: Light Switch
[Figure: west and east rooms in two states. State 1: ~up ^ ~on ^ E. State 2: up ^ on ^ E.]
Flipping the switch changes world state
We do not observe the state fully

6
Motivation: Exploration Agents
Exploring partially observable domains
–Interfaces to new software
–Game-playing/companion agents
–Robots exploring buildings, cities, planets
–Agents acting in the WWW
Difficulties:
–No knowledge of actions’ effects a priori
–Many features
–Partially observable domain

7
Rest of This Talk
1. Actions in partially observed domains
2. Efficient learning algorithms
3. Related Work & Conclusions
4. [Theory behind Algorithms]

8
Learning Transition Models
Learning: Update knowledge of the transition relation and state of the world
[Figure: actions a1..a4, world states s1..s4, knowledge states k1..k4; labels: Transition Relation, Transition Knowledge]

9
Action Model: Set
Problem: n world features, 2^(2^n) transitions
[Figure: world states 1, 2, 3 paired with transition relations T1, T2, T3]

10
Rest of This Talk
1. Actions in partially observed domains
2. Efficient algorithms
   1. Updating a Directed Acyclic Graph (DAG)
   2. Factored update (flat formula repn.)
3. Related Work & Conclusions
4. [Theory behind Algorithms]

11
Compact Encoding (Sometimes)
Transition Belief State = a logical formula (transition relation and state)
Observation = logical state formulae

12
Compact Encoding (Sometimes)
Transition Belief State = a logical formula (transition relation and state)
Observation = logical state formulae
Actions = propositional symbols assert effect rules
–“sw-up causes on ^ up if E”
–“go-W keeps up” (= “go-W causes up if up” …)
–Prop symbols: go-W^≈up, sw-up^on_E, sw-up^up_E

13
Updating the Status of “Locked”: Time 0
PressB causes ¬locked if locked
PressB causes locked if ¬locked
[Figure: nodes tr1, tr2, init_locked, expl(0)]

14
Updating the Status of “Locked”: Time t
PressB causes ¬locked if locked
PressB causes locked if ¬locked
[Figure: nodes tr1, tr2, init_locked, expl(0), expl(t)]

15
Updating the Status of “Locked”: Time t+1
PressB causes ¬locked if locked
PressB causes locked if ¬locked
[Figure: nodes tr1, tr2, init_locked, expl(0), expl(t), expl(t+1); expl(t+1) combines two cases]
“locked” holds because PressB caused it
“locked” holds because PressB did not change it
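The two explanation cases for “locked” can be sketched in Python. This is a minimal illustration, not the talk's implementation. It assumes that tr1 names the rule “PressB causes ¬locked if locked”, that tr2 names “PressB causes locked if ¬locked”, and that these two rules are the only candidate effects of PressB. The helper names (var, press_b, consistent_models) are hypothetical.

```python
from itertools import product

# Formulas are nested tuples. A sub-formula such as expl(t) is reused by
# reference inside expl(t+1), so the structure is a DAG, not a tree.
def var(name): return ("var", name)
def neg(x): return ("not", x)
def conj(*xs): return ("and",) + xs
def disj(*xs): return ("or",) + xs

def evaluate(node, assignment):
    op = node[0]
    if op == "var":
        return assignment[node[1]]
    if op == "not":
        return not evaluate(node[1], assignment)
    if op == "and":
        return all(evaluate(child, assignment) for child in node[1:])
    return any(evaluate(child, assignment) for child in node[1:])  # "or"

def press_b(expl):
    """Build expl(t+1) for "locked" from expl(t)."""
    tr1 = var("tr1")  # assumed: PressB causes ~locked if locked
    tr2 = var("tr2")  # assumed: PressB causes locked if ~locked
    caused = conj(tr2, neg(expl))     # PressB caused "locked"
    unchanged = conj(neg(tr1), expl)  # PressB did not change "locked"
    return disj(caused, unchanged)

def consistent_models(constraints):
    """Enumerate truth assignments that satisfy every constraint."""
    names = ("init_locked", "tr1", "tr2")
    found = []
    for values in product([False, True], repeat=len(names)):
        assignment = dict(zip(names, values))
        if all(evaluate(c, assignment) for c in constraints):
            found.append(assignment)
    return found

expl = var("init_locked")   # expl(0)
kb = [expl]                 # observe "locked" at time 0
expl = press_b(expl)        # expl(1)
kb.append(neg(expl))        # observe "~locked" after PressB
models = consistent_models(kb)
```

Every surviving assignment sets tr1 to true, so the agent has learned that PressB unlocks a locked door, while tr2 stays undetermined because the door was never seen unlocked before a press.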

16
Algorithm: Update of a DAG
1. Given: action a, observation o, transition-belief formula φ_t
2. For each fluent f,
   a. kb := kb ∧ logical formula “a is executable”
   b. expl'_f := logical formula for the possible explanations for f’s value after action a
   c. replace every fluent g in expl'_f with a pointer to expl_g
   d. update expl_f := expl'_f
3. φ_{t+1} is the result of 2 together with o
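The update loop can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: actions are taken to be deterministic, non-conditional, and always executable, so step 2a adds nothing and is omitted. The proposition names ("a causes f", "a causes ~f", "init f") and all function names are hypothetical, and the final brute-force model enumeration stands in for a SAT solver.

```python
from itertools import product

def var(name): return ("var", name)
def neg(x): return ("not", x)
def conj(*xs): return ("and",) + xs
def disj(*xs): return ("or",) + xs

def evaluate(node, assignment):
    op = node[0]
    if op == "var":
        return assignment[node[1]]
    if op == "not":
        return not evaluate(node[1], assignment)
    if op == "and":
        return all(evaluate(child, assignment) for child in node[1:])
    return any(evaluate(child, assignment) for child in node[1:])  # "or"

def variables(node, found):
    if node[0] == "var":
        found.add(node[1])
    else:
        for child in node[1:]:
            variables(child, found)
    return found

def observe(kb, expl, observation):
    # An observed fluent is asserted through its current explanation node.
    for fluent, value in observation.items():
        kb.append(expl[fluent] if value else neg(expl[fluent]))

def dag_step(kb, expl, action, observation):
    new_expl = {}
    for fluent, old in expl.items():
        sets_true = var(f"{action} causes {fluent}")
        sets_false = var(f"{action} causes ~{fluent}")
        # Assumed base axiom: an action cannot cause both f and ~f.
        kb.append(neg(conj(sets_true, sets_false)))
        # f holds next if the action causes it, or f held and was not undone.
        # "old" is a pointer to the previous explanation, not a copy.
        new_expl[fluent] = disj(sets_true, conj(neg(sets_false), old))
    observe(kb, new_expl, observation)
    return new_expl

def dag_slaf(fluents, initial_observation, steps):
    expl = {fluent: var(f"init {fluent}") for fluent in fluents}
    kb = []
    observe(kb, expl, initial_observation)
    for action, observation in steps:
        expl = dag_step(kb, expl, action, observation)
    return kb

def models(kb):
    names = sorted(set().union(*(variables(c, set()) for c in kb)))
    result = []
    for values in product([False, True], repeat=len(names)):
        assignment = dict(zip(names, values))
        if all(evaluate(c, assignment) for c in kb):
            result.append(assignment)
    return result

# The bulb is off, the agent executes sw-up, and then sees the bulb on.
kb = dag_slaf(["on"], {"on": False}, [("sw-up", {"on": True})])
learned = models(kb)
```

Each step adds a fixed number of nodes per fluent and never copies an old explanation, which is the property that keeps the structure small in this sketch.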

17
Fast Update: DAG Action Model
DAG-update algorithm takes constant time (using hash table) to update formula
Algorithm is exact
Result DAG has size O(T·n^k + |φ_0|)
–T steps, n features, k features in action preconditions
–Still only n features/variables
Use φ_t with a DAG-DPLL SAT-solver

18
Experiments: DAG Update

19
Experiments: DAG Update

20
Experiments: DAG Queries

21
Rest of This Talk
1. Actions in partially observed domains
2. Efficient algorithms
   1. Updating a Directed Acyclic Graph (DAG)
   2. Factored update (flat formula repn.)
3. Related Work & Conclusions
4. [Theory behind Algorithms]

22
Distribution for Some Actions
[Formulas: distribution of Project[a] over the logical connectives; the last one involves Project[a](TRUE)]
Compute update for literals in the formula separately, and combine the results
Known Success/Failure
1:1 Actions

23
Actions that map states 1:1
Reason for distribution over
[Figure: Project[a] applied to a 1:1 action and to a non-1:1 action]
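Why 1:1 matters can be illustrated with plain sets. The sketch below assumes a simplified reading in which Project[a] is the image of a set of states under one known deterministic action, ignoring the unknown transition relation. The actions flip and reset and the state sets A and B are made up for the illustration. The image of a union always equals the union of the images, while the image of an intersection is only guaranteed to be contained in the intersection of the images, with equality when the action is 1:1.

```python
def image(states, action):
    """States reachable in one step from any state in the given set."""
    return {action[s] for s in states}

# Hypothetical 1:1 action: a permutation of four states.
flip = {0: 1, 1: 0, 2: 3, 3: 2}
# Hypothetical non-1:1 action: states 0 and 1 collapse, as do 2 and 3.
reset = {0: 0, 1: 0, 2: 2, 3: 2}

A = {0, 2}
B = {1, 2}

union_ok = image(A | B, reset) == image(A, reset) | image(B, reset)
one_to_one_ok = image(A & B, flip) == image(A, flip) & image(B, flip)
collapsed = image(A & B, reset)                 # only from the shared state 2
separate = image(A, reset) & image(B, reset)    # also admits state 0
```

With reset, updating the two conjuncts separately admits state 0 even though no state in both A and B leads there, which is the kind of over-approximation a factored update can produce for non-1:1 actions.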

24
Algorithm: Factored Learning
Given: action a, observation o, transition-belief formula φ_t
1. Precompute update for every literal
2. Decompose φ_t recursively, update every literal separately, and combine the results
3. Conjoin the result of 2. with o, producing φ_{t+1}
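The three steps can be sketched as a recursive traversal. This is only a structural illustration: the formula encoding (nested tuples over "lit", "and", "or") and the function names are assumptions, and the entries in literal_update are placeholders standing in for the real per-literal update formulas of step 1, which this sketch does not compute.

```python
def factored_update(formula, literal_update):
    """Step 2: recurse through the formula and replace each literal."""
    op = formula[0]
    if op == "lit":
        return literal_update[formula]
    if op in ("and", "or"):
        parts = tuple(factored_update(p, literal_update) for p in formula[1:])
        return (op,) + parts
    raise ValueError(f"unsupported connective: {op}")

def factored_step(formula, literal_update, observation):
    """Step 3: conjoin the updated formula with the observation."""
    return ("and", factored_update(formula, literal_update), observation)

# Step 1 (placeholder): a precomputed table from literals to their updates.
literal_update = {
    ("lit", "E"): ("lit", "updated(E)"),
    ("lit", "~up"): ("lit", "updated(~up)"),
}

phi_t = ("and", ("lit", "E"), ("lit", "~up"))
phi_next = factored_step(phi_t, literal_update, ("lit", "~on"))
```

The traversal visits every node of the formula once and does one table lookup per literal, so its cost grows linearly with the size of the formula.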

25
Fast Update of Action Model
Factored Learning algorithm takes time O(|φ_t|) to update formula
Algorithm is exact when
–We know that actions are 1:1 mappings between states
–Actions’ effects are always the same
Otherwise, approximate result: includes exact action model, but also others
Resulting representation is flat (CNF)

26
Compact Flat Representation: How?
Keep some property invariant, e.g.,
–k-CNF (CNF with k literals per clause)
–#clauses bounded
Factored Learning: compact repn. if
–We know if action succeeded, or
–Action failure leaves affected propositions in a specified nondeterministic state, or
–Approximate: We discard large clauses (allows more states)

27
Compact Representation in CNF
Action affects and depends on ≤ k features: |φ_{t+1}| ≤ |φ_t|·n^(k(k+1))
Actions always have same effect: |φ_{t+1}| ≤ O(t·n)
–If also every feature observed every ≤ k steps: |φ_{t+1}| ≤ O(n^(k+1))
–If (instead) the number of actions ≤ k: |φ_{t+1}| ≤ O(n·2^(k log k))

28
Experiments: Factored Learning

29
Summary
Learning of effects and preconditions of actions in partially observable domains
Showed in this talk:
–Exact DAG update for any action
–Exact CNF update, if actions 1:1 or w/o conditional effects
–Can update model efficiently without increase in #variables in belief state
–Compact representation
Adventure games, virtual worlds, robots

30
Innovation in this Research
First scalable learning algorithm for partially observable dynamic domains
Algorithm (DAG)
–Always exact and optimal
–Takes constant update time
Algorithm (Factored)
–Exact for actions that always have same effect
–Takes polynomial update time
Can solve problems with n>1000 domain features (>2^1000 states)

31
Current Approaches and Work
Reinforcement Learning & HMMs
–[Chrisman ’92], [McCallum ’95], [Boyen & Koller ’98], [Murphy et al. ’00], [Kearns, Mansour, Ng ’00]
–Maintain probability distribution over current state
–Problem: Exact solution intractable for domains of high (>100) dimensionality
–Problem: Approximate solutions have unbounded errors, or make strong mixing assumptions
Learning AI-Planning operators
–[Wang ’95], [Benson ’95], [Pasula et al. ’04],…
–Problem: Assume fully observable domain

32
Open Problems
Efficient inference with learned formula
Compact, efficient stochastic learning
Average case of formula size?
Dynamic observation models, filtering in expanding worlds
Software: http://www.cs.uiuc.edu/~eyal

33
Acknowledgements
Dafna Shahaf
Megan Nance
Brian Hlubocky
Allen Chang
… and the rest of my incredible group of students

34
THE END

35
Talk Outline
1. Actions in partially observed domains
2. Representation and update of models
3. Efficient algorithms
4. Related Work & Conclusions

38
Example: Light Switch
Initial belief state (time 0) = set of pairs:
{, } all transition rels.
Space = O(2^(2^n))
New encoding:
E ~up
Space = 2
Question: how to update new representation?

39
Updating Action Model
Transition belief state represented by φ
Action-Definition(a)_{t,t+1} ≡ ∧_f [ ((a_t ∧ (a^f ∨ (a^f_f ∧ f_t))) ⇒ f_{t+1}) ∧ ((a_t ∧ f_{t+1}) ⇒ (a^f ∨ (a^f_f ∧ f_t))) ]
(effect axioms + explanation closure)
Update: Project[a](φ_t) = logical results at t+1 of φ_t ∧ Action-Definition(a)_{t,t+1}

40
Example Update: Light Switch
Transition belief state: φ_t = E_t ∧ ~up_t
Project[sw-on](φ_t) =
(E t+1 sw-on E E sw-on E ) (~up t+1 sw-on ~up ~up sw-on ~up ) …
Update: Project[a](φ_t) = logical results at t+1 of φ_t ∧ Action-Definition(a)_{t,t+1}

41
Updating Action Model
Transition belief state represented by φ
φ_{t+1} = Update[o](Project[a](φ_t))
Actions: Project[a](φ_t) = logical results at t+1 of φ_t ∧ Action-Definition(a)_{t,t+1}
Observations: Update[o](φ) = φ ∧ o
Theorem: formula filtering equivalent to ⟨state, transition⟩-set semantics
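The set semantics that the formula update is compared against can be written out directly for a tiny domain. The sketch below is an illustration under assumptions, not the talk's code: it uses a single fluent, a single action, and deterministic transition relations only, and the function names project, update, and filter_step are chosen to mirror the slide.

```python
from itertools import product

STATES = (False, True)  # value of one fluent, e.g. "locked"

# Every deterministic transition relation for one action over two states,
# written as (next state from False, next state from True).
TRANSITIONS = list(product(STATES, repeat=2))

def project(belief):
    """Advance each <state, transition> pair using its own transition."""
    return {(transition[int(state)], transition) for state, transition in belief}

def update(belief, observed_value):
    """Keep only the pairs whose state agrees with the observation."""
    return {(s, tr) for s, tr in belief if s == observed_value}

def filter_step(belief, observed_value):
    return update(project(belief), observed_value)

# Start knowing nothing: all states x all transition relations.
belief = set(product(STATES, TRANSITIONS))
belief = update(belief, True)        # observe that the fluent is true
belief = filter_step(belief, False)  # act, then observe that it is false
surviving = {transition for _, transition in belief}
```

Only the transition relations that send True to False survive, so the belief state has learned part of the action's effect. The set has one entry per state and transition pair, which is what the logical encoding avoids enumerating.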

42
Larger Picture: An Exploration Agent
[Figure labels: World Model Interface Module, Learning Module, Filtering Module, Decision Making Module, Knowledge Base, Commonsense extraction]

43
Example: Light Switch
Initial belief state (time 0) = set of pairs:
{, } all transition rels.
Apply action a = go-W.
Resulting belief state (after action)
{ } x { transitions map to same state }
{ } x { transitions set position to ~E }
….

44
Example: Light Switch
Resulting belief state (after action)
{ } x { transitions map to same state }
{ } x { transitions set position to ~E }
….
Observe: ~E, ~on

45
Experiments w/DAG-Update

46
Some Learned Rules
Pickup(b1) causes Holding(b1)
Stack(b3,b5) causes On(b3,b5)
Pickup() does not cause Arm-Empty
Move(room1,room4) causes At(book5,room4) if In-Briefcase(book5)
Move(room1,room4) does not cause At(book5,room4) if ¬In-Briefcase(book5)

47
Approximate Learning
The result of Factored-Learning(φ_t) always includes the exact action model
Same compactness results apply
Approximation decreases size: Discard clauses >k (allows more action models), |φ_t| = O(n^k)

48
More in the Paper
Algorithm that uses deduction to always update the model representation exactly
Arbitrary preconditions and conditional effects
Formal justification of algorithms and complexity results

49
Experiments

50
DAG-SLAF: The Algorithm
Input: a formula φ, an action-observation sequence ⟨a_i, o_i⟩, i=1..t
Initialize:
  for each fluent f, expl_f := init_f
  kb := φ, where each f is replaced by init_f
Process Sequence:
  for i=1..t do Update-Belief(a_i, o_i)
  return kb ∧ base ∧ (f ↔ expl_f)

51
Experiments: Same-Effect Actions
[Figure: Average space per step; axes: Update step, Update space (literals); series: ~210, ~185, ~160, ~135, ~110 features]

52
Current Game + Translation
LambdaMOO
–MUD code base
–Uses database to store game world
–Emphasis on player-world interaction
–Powerful in-game programming language
Game sends agents logical description of world

56
Present Contribution
Identify (useful) sufficient conditions for efficient computation of action models
–Actions map states 1:1
–Deterministic actions with limited effect
Polynomial-time algorithms for exact update of action model
(Useful) sufficient conditions for compact representation of action model

58
Effect of Concept Expansion

59
Text Translation to Logic
Difficulties:
–Counterintuitive LambdaMOO actions
–Enumerate observations for action result
1. Create small game world
2. Predicates needed to describe world
3. Define STRIPS-like actions required to interact with game world
4. MUD outputs logical descriptions of world

60
Map Propositions to Semantics
E.g., assume that every action a is non-failing, deterministic, non-conditional
–For every domain description D,
–T_D = Rules_D ∧ ∧_f (a^≈f ∨ a^f ∨ a^-f)
–Rules_D = { a^f | “a causes f if g” ∈ D }
–Rules_D = { a^f_g | “a causes f if g” ∈ D }

61
Mapping Theory to Semantics
Every set of state-transition pairs represented using a logical formula
Theorem: Every consistent propositional theory maps to a set of pairs and vice versa
Have formulas for deterministic actions
–Conditional effect, sometimes inexecutable
–Non-conditional, sometimes inexecutable
–Non-conditional, always executable

62
Distribution Properties
[Formulas: distribution of Project[a] over the logical connectives]
Filtering a DNF belief state by factoring

63
Distribution Properties
[Formulas: distribution of Project[a] over the logical connectives; the last one involves Project[a](TRUE)]
Filtering a DNF belief state by factoring

64
Knowledge About Actions
Goal 1: conclude that one of sw-up, go-W, go-E causes on
Goal 2: show that sw-up is possible only when E is true (with some assumptions)

65
Challenges (disregarding NLP for now):
Partially observable world
–E.g., cannot see the light bulb in east room
Incomplete knowledge about state of the world and effects of actions
–E.g., do not know the effect/preconditions of flipping switch

67
Connecting Learning and Logic Eyal Amir, Cambridge Present., May 2006 67 Relating Effect Propositions Action-Definition(a) t,t+1 Action-Definition(a) t,t+1 ((a t (a f v (a f f f t )) f t+1 ) ((a t (a f v (a f f f t )) f t+1 ) f (a t f t+1 (a f v (a f f f t )))

68
Example: Moving Items
Initial belief state:
set of all world states x all transition rels.
Apply action
move(r,closet,room)_t
Resulting belief state
set of all world states x all transition rels.
Apply observation
at(r,closet,room)
Resulting belief state
set of { at(r,closet,room) } x all transition rels.

69
Example: Moving Items
From here: Assume that actions are non-conditional, always succeed
Initial belief state:
set of { at(r,closet,room) } x all transition rels.
Apply action a = move(r,closet,room)_t
Resulting belief state
{ at(r,closet,room) } x { a at(r,closet,room) at(r,closet,room) …}
{ -at(r,closet,room) } x { a -at(r,closet,room) at(r,closet,room)..}
….

70
Example: Moving Items
Initial belief state:
{ at(r,closet,room) } x { a at(r,closet,room) at(r,closet,room) …}
{ -at(r,closet,room) } x { a -at(r,closet,room) at(r,closet,room)..}
….
Apply observation -at(r,closet,room)_t
Resulting belief state
{ -at(r,closet,room) } x { a -at(r,closet,room) at(r,closet,room)..}
….

71
Filtering Stochastic Processes
Dynamic Bayes Nets (DBNs): factored representation
[Figure: DBN over state variables s1..s5]

72
Filtering Stochastic Processes
Dynamic Bayes Nets (DBNs): factored representation
[Figure: DBN over state variables s2..s5]

73
Filtering Stochastic Processes
Dynamic Bayes Nets (DBNs): factored representation
[Figure: DBN over state variables s3..s5]

74
Filtering Stochastic Processes
Dynamic Bayes Nets (DBNs): factored representation
[Figure: DBN over state variables s4, s5]
O(2^n) space, O(2^(2n)) time

75
Filtering Stochastic Processes
Dynamic Bayes Nets (DBNs): factored representation: O(2^n) space, O(2^(2n)) time
Kalman Filter: Gaussian belief state and linear transition model
[Figure: state variables s1..s5]

76
Filtering Stochastic Processes
Dynamic Bayes Nets (DBNs): factored representation: O(2^n) space, O(2^(2n)) time
Kalman Filter: Gaussian belief state and linear transition model
[Figure: state variables s4, s5]
O(n^2) space, O(n^3) time

77
Example: Moving Items
Initial belief state:
set of all world states
Apply action
move(r,closet,room)_t
Resulting belief state
all states that satisfy in(r,closet)_{t+1}
Reason:
–If initially in(r,closet), then still in(r,closet)
–If initially in(r,closet), then now in(r,closet)

78
Example: Filtering a Literal
Initial knowledge (time t): in(r,closet)
Apply move(r,closet,room)
Preconds: in(r,closet) locked(closet)
Effects: in(r,room) in(r,closet)
Resulting knowledge (time t+1):
in(r,room) in(r,closet)
locked(closet) in(r,closet)

79
Example: Filtering a Formula
Initial knowledge (time t):
in(broom,closet) locked(closet)
Apply move(r,closet,room)
Preconds: in(r,closet) locked(closet)
Effects: in(r,room) in(r,closet)
Resulting knowledge (time t+1):
in(r,room) in(r,closet) locked(closet)

80
Brief Outline of Future Effort
Filtering FOL representations
Compact reductions from FOL to Propositional Logic
More classes of filtering that maintain compact representation
Learning world models in partially observable domains
Stochastic filtering

81
Open Problems
More families of actions/observations
–Stochastic conditions on observations
–Different data structures (BDDs? Horn?)
Compact, efficient stochastic filtering
Average case?
Relational / first-order filtering
Dynamic observation models, filtering in expanding worlds

82
STRIPS-Filter: Experimental Results [A. & Russell ’03]

83
STRIPS-Filter: Experimental Results [A. & Russell ’03]

84
Example: Text Adventure Games
You enter Research MUD.
Plain Room [exits: west]
This is a small, bare room. There is an open door in the wall in front of you, a switch next to it, and a small table underneath the switch.
> flip switch
> go west
You go through a corridor and enter:
Plain Room [exits: north, east]
There is a light bulb in the ceiling. It is on.
> go east
You go through a corridor and enter:…
