
1 Towards Model-lite Planning: A Proposal for Learning & Planning with Incomplete Domain Models
Sungwook Yoon, Subbarao Kambhampati
Supported by the DARPA Integrated Learning Program

2 A Planning Problem
Suppose you have a super-fast planner and a target application. What is the first problem you have to solve? Is it a problem from the application?
Domain engineering is hard → Model-lite Planning
Towards Model-lite Planning - Sungwook Yoon

3 Snapshot of the talk
This is a proposal. We formulate learning and planning problems and solution methods for them. We have tested the idea on some problems, but verification is still ongoing.
We propose:
- Representation for model-lite planning
  - Probabilistic logic; incompleteness is quantified
  - Explicit consideration of domain invariants
- Learning of the domain model
  - Updating the probabilities and finding new axioms
- Planning with the model
  - A deterministic planning domain needs probabilistic planning
  - Find the most plausible plan that respects the current domain model

4 Representation
- Precondition axiom: p_{A,i}, A → pre_i
- Effect axiom: e_{A,i}, A → effect_i
- Uncertainty is quantified as a probability
- Facilitates learning

5 Domain Model - Blocksworld
Precondition axioms relate actions to current-state facts:
0.9, Pickup(x) -> armempty()
1, Pickup(x) -> clear(x)
1, Pickup(x) -> ontable(x)
Effect axioms relate actions to next-state facts:
0.8, Pickup(x) -> holding(x)
0.8, Pickup(x) -> not armempty()
0.8, Pickup(x) -> not ontable(x)
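One way to hold such weighted axioms in code is sketched below. The `Axiom` class, its field names, and the `axioms_of` helper are hypothetical names chosen for illustration; only the weights and literals come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Axiom:
    """Hypothetical container for one weighted axiom (not from the talk)."""
    weight: float  # probability attached to the axiom
    action: str    # action schema, e.g. "Pickup(x)"
    fact: str      # literal implied by the action
    kind: str      # "pre" = current-state fact, "eff" = next-state fact


# Weights and literals copied from the Blocksworld slide.
PICKUP_AXIOMS = [
    Axiom(0.9, "Pickup(x)", "armempty()", "pre"),
    Axiom(1.0, "Pickup(x)", "clear(x)", "pre"),
    Axiom(1.0, "Pickup(x)", "ontable(x)", "pre"),
    Axiom(0.8, "Pickup(x)", "holding(x)", "eff"),
    Axiom(0.8, "Pickup(x)", "not armempty()", "eff"),
    Axiom(0.8, "Pickup(x)", "not ontable(x)", "eff"),
]


def axioms_of(action, kind, axioms=PICKUP_AXIOMS):
    """Return the axioms of one kind ("pre" or "eff") for one action."""
    return [a for a in axioms if a.action == action and a.kind == kind]
```

Keeping precondition and effect axioms in one flat list, tagged by kind, mirrors the idea that the two sets can be learned and updated separately.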

6 Representation
One modeling problem: a conjunction of effects has different semantics if the probability of each effect is specified independently.
- Add a hidden variable O with the axiom (e, A → O), then add a deterministic axiom for each effect: (1, O → eff1), (1, O → eff2), ...
- We can also alleviate this problem with explicit domain invariant properties
- Writing explicit domain invariant properties is easier than writing an initial-state generator and a set of operators that respect such properties
Effect axioms relate actions to next-state facts:
0.8, Pickup(x) -> holding(x)
0.8, Pickup(x) -> not armempty()
0.8, Pickup(x) -> not ontable(x)
Static properties relate facts within a state:
1, holding(x) -> not armempty()
1, holding(x) -> not ontable(x)
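The difference in semantics can be seen with a little arithmetic. The sketch below assumes each axiom weight is read as a plain conditional probability and that the effects occur only through the hidden variable O; both are simplifying assumptions for illustration, and the function names are hypothetical.

```python
import math


def joint_independent(effect_probs):
    """Probability that all effects occur when each is specified independently."""
    return math.prod(effect_probs)


def joint_with_hidden_outcome(outcome_prob, n_effects):
    """Probability that all effects occur when they follow O with weight 1."""
    deterministic_weight = 1.0
    return outcome_prob * deterministic_weight ** n_effects


# Three Pickup effects at 0.8 each: the conjunction holds with 0.8^3 = 0.512
# under independence, but with 0.8 when tied together through O.
independent = joint_independent([0.8, 0.8, 0.8])
tied = joint_with_hidden_outcome(0.8, 3)
```

Under this reading, the independent specification understates how often the whole Pickup outcome succeeds, which is the modeling problem the hidden variable addresses.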

7 Learning the domain model
Given a trajectory of states and actions, S_1, A_1, S_2, A_2, ..., S_n, A_n, S_{n+1}:
- We can learn precondition axioms from (S_1, A_1), (S_2, A_2), ..., (S_n, A_n)
- We can learn effect axioms from (A_1, S_2), (A_2, S_3), ..., (A_n, S_{n+1})
- We can learn domain invariant properties from each state S_1, ..., S_{n+1}
- The weights (probabilities) of the axioms can be updated with a simple perceptron update
Packages for weighted logic learning are readily available:
- Alchemy (MLN)
- ProbLog
Structure learning:
- Alchemy provides structure learning too
- We can also enumerate all the possible axioms (very costly for planning)
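The pairing of states and actions described above can be sketched as follows. This uses plain relative-frequency counting as a stand-in; it is not the perceptron update or the Alchemy/ProbLog learners mentioned on the slide, and the function name and the flat-list trajectory format are assumptions. Only positive facts are counted.

```python
from collections import Counter


def estimate_axiom_weights(trajectory):
    """Estimate axiom weights from [S1, A1, S2, ..., Sn, An, Sn+1].

    States are sets of ground facts, actions are strings. Returns two dicts
    keyed by (action, fact): precondition weights and effect weights.
    """
    states = trajectory[0::2]
    actions = trajectory[1::2]
    action_count = Counter()
    pre_count = Counter()
    eff_count = Counter()
    for t, action in enumerate(actions):
        action_count[action] += 1
        for fact in states[t]:        # (S_t, A_t) pairs -> precondition axioms
            pre_count[(action, fact)] += 1
        for fact in states[t + 1]:    # (A_t, S_{t+1}) pairs -> effect axioms
            eff_count[(action, fact)] += 1
    pre = {k: c / action_count[k[0]] for k, c in pre_count.items()}
    eff = {k: c / action_count[k[0]] for k, c in eff_count.items()}
    return pre, eff


# Made-up toy trajectory: the second pickup_a fails and leaves the state unchanged.
table = {"armempty", "clear_a", "ontable_a"}
toy = [table, "pickup_a", {"holding_a"}, "putdown_a", table, "pickup_a", table]
pre_w, eff_w = estimate_axiom_weights(toy)
```

Here `pickup_a` was always applied with `armempty` true, so that precondition axiom gets weight 1.0, while `holding_a` followed only one of two attempts and gets 0.5.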

8 Model-lite Planning → Probabilistic Planning
- As stated before, with incomplete domain knowledge, a deterministic planning domain should be treated as a probabilistic domain
- The resulting plan should be maximally consistent with the current domain model
- We develop a planning technique for this purpose
  - Find a plan that is maximally plausible, given the probabilistic axioms, initial state and goal
  - This is the MPE solution to a Bayes net problem
  - Build on the plangraph

9 Probabilistic Plangraph
[Figure: plangraph for two blocks, A and B. Proposition layers (clear_a, clear_b, armempty, ontable_a, ontable_b, later also holding_a, holding_b, on_a_b, on_b_a) alternate with action layers (pickup_a, pickup_b, stack_a_b, stack_b_a, and a noop for each proposition). Edges carry weights such as 0.8. Red lines indicate mutexes.]
- How do we generate a weighted clause? Example: 0.95, pickup_b' v holding_b
- Domain invariant properties can be asserted too

10 Can we view the probabilistic plangraph as a Bayes net?
[Figure: the same two-block plangraph, with evidence variables marked and edge weights such as 0.5 and 0.8.]
- How do we find a solution? MPE (most probable explanation)
- There are some solvers out there
- Domain invariant properties can be asserted too (0.9 in the figure)

11 MPE as MaxSat
There is prior work by James D. Park (AAAI 2002): set -log(P) as the weight of the clauses.
Conditional probability table:
A  B  P
T  T  0.7
F  T  0.3
T  F  0.2
F  F  0.8
Weighted clauses:
-log 0.7   -A v -B
-log 0.3    A v -B
-log 0.2   -A v  B
-log 0.8    A v  B
Intuitive explanation: violating a clause is cheaper for high-probability instances, so the MaxSat solution gives the highest-probability instantiation.
For a deterministic rule A -> B (T T: 1, T F: 0), the clause -A v B gets infinite weight, which complies with our intuitive understanding.
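The encoding can be checked on the table above with a brute-force search over assignments. This is a minimal sketch, not Park's solver or the one used in the talk: each row becomes a clause that negates the row's assignment, and the search minimizes the total weight of violated clauses. All function names are hypothetical.

```python
import itertools
import math

# Rows of the table from the slide: (A, B) -> P
CPT = {(True, True): 0.7, (False, True): 0.3,
       (True, False): 0.2, (False, False): 0.8}


def cpt_to_clauses(cpt):
    """One clause per row: the negation of the row, with weight -log(P)."""
    clauses = []
    for (a, b), p in cpt.items():
        # A literal (var, value) is satisfied when var takes that value.
        clause = (("A", not a), ("B", not b))
        weight = math.inf if p == 0 else -math.log(p)
        clauses.append((weight, clause))
    return clauses


def violated_weight(assignment, clauses):
    """Total weight of the clauses that the assignment violates."""
    return sum(w for w, clause in clauses
               if not any(assignment[var] == val for var, val in clause))


def solve_maxsat(clauses, evidence=None):
    """Enumerate assignments consistent with evidence; return (cost, assignment)."""
    evidence = evidence or {}
    best = None
    for a, b in itertools.product([True, False], repeat=2):
        assignment = {"A": a, "B": b}
        if any(assignment[v] != val for v, val in evidence.items()):
            continue
        cost = violated_weight(assignment, clauses)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


clauses = cpt_to_clauses(CPT)
cost, mpe = solve_maxsat(clauses)
cost_ev, mpe_ev = solve_maxsat(clauses, evidence={"B": True})
```

Each assignment violates exactly the clause built from its own row, so `exp(-cost)` recovers that row's probability: 0.8 for the unconstrained optimum (A and B both false) and 0.7 once B is fixed to true as evidence.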

12 Probabilistic Plangraph to MaxSat
[Figure: the same two-block plangraph with evidence variables marked; edge weights are now written as -log 0.8 and -log 0.5.]
- For each probabilistic weight p, we give the clause the weight -log(1-p). That's it.
- Domain invariant properties can be asserted too (-log 0.9 in the figure)
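A sketch of the per-axiom conversion follows. The reading used here is an interpretation rather than something the slide states: the clause "not action OR effect" is violated exactly when the action occurs without its effect, an event of probability 1-p, so it gets weight -log(1-p). The function name and the literal format are hypothetical.

```python
import math


def axiom_to_clause(p, action, effect):
    """Turn the axiom (p, action -> effect) into a weighted clause.

    Returns (weight, literals), where the literals stand for
    (not action) OR effect.
    """
    # A certain axiom (p = 1) can never be violated: infinite weight.
    weight = math.inf if p >= 1.0 else -math.log(1.0 - p)
    return weight, [("neg", action), ("pos", effect)]


w_soft, clause = axiom_to_clause(0.8, "pickup_b", "holding_b")
w_hard, _ = axiom_to_clause(1.0, "pickup_b", "clear_b")
```

With this choice a deterministic axiom becomes a hard clause, which matches the infinite weight given to -A v B for the rule A -> B, and more reliable axioms are more expensive to violate.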

13 Exploding Blocksworld

14 Current Status (ongoing)
Learning test
- Generated Blocksworld random-wandering data and fed it to Alchemy with correct and incorrect axioms
- Alchemy assigned higher weights to the correct axioms and lower weights to the incorrect ones
Planning test
- Tested on probabilistic planning problems
- Hand-tested on a couple of instances of the Slippery Gripper domain
  - Hand-encoded the clauses and assigned the weights
  - Passed the resulting clauses to a MaxSat solver
  - Got the desired results
- On Exploding Blocksworld
  - Implemented a generic MaxSat encoder for probabilistic planning problems
  - Tested on a couple of problems from Exploding Blocksworld
  - Finds the desired output frequently (not always)

15 Summary
- We can learn precondition axioms and effect axioms separately
  - A -> Prec, A -> Effect
  - This facilitates learning
- Domain axioms or invariant properties can be provided, learned, and used explicitly
  - This is better for the domain modeler
- For planning, we can apply the probabilistic plangraph approach
  - We proposed using MaxSat to solve probabilistic planning problems
  - There is an interesting parallel to solving deterministic planning as SAT

16 Domain Learning - Related Work
- Logical Filtering (Chang & Amir, ICAPS'06)
  - Updates the belief state and the domain transition model
  - Experiments involved planning
- Probabilistic operator learning (Zettlemoyer, Pasula and Kaelbling, AAAI'05)
  - Experiments involved planning
- ARMS (Yang, Wu and Jiang, ICAPS'05)
  - No observations besides the initial state and goal

17 Probabilistic Planning in Plangraph - Related Work
PGraphplan and Paragraph both search for plans in the Graphplan framework.
- PGraphplan searches for a consistent plan that maximizes the goal-reaching probability
  - Forward probability propagation
- Paragraph searches for a plan that minimizes the cost to reach the goal
  - Backward plan search

