A: A Unified Brand-name-Free Introduction to Planning Subbarao Kambhampati Jan 28 th My lab was hacked and the systems are being rebuilt.. Homepage is.


1 A: A Unified Brand-name-Free Introduction to Planning. Subbarao Kambhampati. Jan 28th. My lab was hacked and the systems are being rebuilt. The homepage is up now, but may be down intermittently.

2 Agenda
- Action representation
  - PDDL
- Planning strategies
  - Progression/regression proofs
  - Progression/regression planning strategies
  - Causal proof and associated planning algorithm (sketch)
- Heuristics
  - Set-difference
  - Planning graph heuristics
    - Mutex analysis
  - Using the planning graph itself as a basis for planning
    - SAT/CSP/IP encodings
- Refinement planning framework to look at them all

3 State-Variable Models
- States are modeled in terms of (binary) state variables
  -- Complete initial state, partial goal state
- Actions are modeled as state transformation functions
  -- Syntax: ADL language (Pednault)
  -- Apply(A,S) = (S \ eff(A)) + eff(A) (if Precond(A) holds in S); that is, the variables mentioned in eff(A) drop their old values and take the values eff(A) assigns
Example (Apollo 13):
  Action      Prec.                  Effects
  Load(o1)    At(o1,l1), At(R,l1)    In(o1)
  Unload(o1)  In(o1)                 ¬In(o1)
  Fly()       At(R,E)                At(R,M), ¬At(R,E), ∀x In(x) ⇒ At(x,M) & ¬At(x,E)
  Initial state (Earth): At(A,E), At(B,E), At(R,E)
  Goal: At(A,M), At(B,M), ¬In(A), ¬In(B)

4 Blocks world
State variables: Ontable(x), On(x,y), Clear(x), hand-empty, holding(x)
Stack(x,y)
  Prec: holding(x), clear(y)
  Eff: on(x,y), ~clear(y), ~holding(x), hand-empty
Unstack(x,y)
  Prec: on(x,y), hand-empty, clear(x)
  Eff: holding(x), ~clear(x), clear(y), ~hand-empty
Pickup(x)
  Prec: hand-empty, clear(x), ontable(x)
  Eff: holding(x), ~ontable(x), ~hand-empty, ~clear(x)
Putdown(x)
  Prec: holding(x)
  Eff: ontable(x), hand-empty, clear(x), ~holding(x)
Initial state: complete specification of T/F values to state variables
  -- By convention, variables with F values are omitted
Goal state: a partial specification of the desired state variable/value combinations
Init: Ontable(A), Ontable(B), Clear(A), Clear(B), hand-empty
Goal: ~clear(B), hand-empty

5 Why is this more compact? (than explicit transition systems)
- In explicit transition systems, actions are represented as state-to-state transitions, wherein each action is represented by an incidence matrix of size |S|x|S|
- In the state-variable model, actions are represented only in terms of the state variables whose values they care about and whose values they affect.
- Consider a state space of 1024 states. It can be represented by log2 1024 = 10 state variables. If an action needs variable v1 to be true and makes v7 false, it can be represented by just 2 bits (instead of a 1024x1024 matrix)
  - Of course, if the action has a complicated mapping from states to states, in the worst case the action representation will be just as large
  - The assumption being made here is that actions will have effects on a small number of state variables.
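The arithmetic on this slide can be checked with a few lines of Python. This is only an illustrative sketch: the dictionary layout for the action is an assumption made here, and the "2 bits" count follows the slide in ignoring the cost of naming the variables.

```python
import math

num_states = 1024
num_vars = int(math.log2(num_states))        # binary state variables needed
matrix_entries = num_states * num_states     # explicit |S| x |S| matrix for ONE action

# State-variable view of the slide's action: it needs v1 to be true
# and makes v7 false. (Hypothetical dict layout, for illustration only.)
action = {"pre": {"v1": True}, "eff": {"v7": False}}
bits_needed = len(action["pre"]) + len(action["eff"])

print(num_vars, matrix_entries, bits_needed)
```

The gap between `matrix_entries` and `bits_needed` is the whole argument: it holds only as long as actions touch few variables.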

6 Some notes on action representation
- STRIPS assumption: actions must specify all the state variables whose values they change...
- No disjunction allowed in effects
  - Conditional effects are NOT disjunctive
    - (the antecedent refers to the previous state and the consequent refers to the next state)
- Quantification is over finite universes
  - essentially syntactic sugar
- All actions can be compiled down to a canonical representation where preconditions and effects are propositional
  - Exponential blow-up may occur (e.g., when removing conditional effects)
    - We will assume the canonical representation

7 PDDL: a standard for representing actions

8 PDDL Domains

9 Problems

10 Gripper World

11 Gripper Actions


13 How do we do planning?
- Obvious idea
  - Think of planning as search in the space of states of the transition graph
    - Go “forward” in the graph (progression)
    - Go “backward” in the graph (regression)
- More general idea
  - Think of planning as a search in the space of “partial plans”
    - Progression corresponds to searching in the space of “prefix” plans
    - Regression corresponds to searching in the space of “suffix” plans
    - We can also search in the space of “precedence-constrained” plans (plan-space refinement)
      - “Refinement planning” is my idea of trying to think of all of this from one unified perspective

14 Checking correctness of a plan: The State-based approaches
- Progression proof: progress the initial state over the action sequence, and see if the goals are present in the result
  {At(A,E), At(R,E), At(B,E)} --Load(A)--> {At(B,E), At(R,E), In(A)} --Load(B)--> {In(A), At(R,E), In(B)}
- Regression proof: regress the goal state over the action sequence, and see if the initial state subsumes the result
  {At(A,E), At(R,E), At(B,E)} <--Load(A)-- {At(B,E), At(R,E), In(A)} <--Load(B)-- {In(A), In(B)}
Easy to verify resource capacity constraints
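The two state-based proofs can be sketched in Python as follows. This is a minimal sketch under stated assumptions: states are sets of true propositions, actions carry separate add and delete sets (a common STRIPS encoding, not something the slide prescribes), and `Action`, `load`, `progression_proof` and `regression_proof` are names invented here.

```python
from typing import FrozenSet, List, NamedTuple


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]      # propositions that must hold before the action
    add: FrozenSet[str]      # propositions made true
    delete: FrozenSet[str]   # propositions made false


def progression_proof(init: FrozenSet[str], plan: List[Action],
                      goal: FrozenSet[str]) -> bool:
    """Push the initial state forward; the goals must hold in the result."""
    state = init
    for action in plan:
        if not action.pre <= state:
            return False                      # action not applicable
        state = (state - action.delete) | action.add
    return goal <= state


def regression_proof(init: FrozenSet[str], plan: List[Action],
                     goal: FrozenSet[str]) -> bool:
    """Pull the goal backward; the initial state must subsume the result."""
    subgoal = goal
    for action in reversed(plan):
        if subgoal & action.delete:
            return False                      # action destroys a needed condition
        subgoal = (subgoal - action.add) | action.pre
    return subgoal <= init


def load(obj: str) -> Action:
    # Assumed encoding of Load on Earth: loading removes the object from Earth.
    return Action(f"Load({obj})",
                  pre=frozenset({f"At({obj},E)", "At(R,E)"}),
                  add=frozenset({f"In({obj})"}),
                  delete=frozenset({f"At({obj},E)"}))


init = frozenset({"At(A,E)", "At(B,E)", "At(R,E)"})
goal = frozenset({"In(A)", "In(B)"})
plan = [load("A"), load("B")]
print(progression_proof(init, plan, goal), regression_proof(init, plan, goal))
```

Both checks walk the whole action sequence, which is why they need a totally ordered plan.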

15 Checking correctness of a plan: The Causal Approach (contd.)
- Causal proof: check if each of the goals and preconditions of the actions is
  - “established”: there is a preceding step that gives it
  - “unclobbered”: no possibly intervening step deletes it
    - Or, for every preceding step that deletes it, there exists another step that precedes the condition, follows the deleter, and adds it back.
Causal proof is
- “local” (checks correctness one condition at a time)
- “state-less” (does not need to know the states preceding actions)
  - Easy to extend to durative actions
- “incremental” with respect to action insertion
  - Great for replanning
Figure: the initial state {At(A,E), At(B,E), At(R,E)} supports Load(A) (needs At(A,E), At(R,E); gives In(A), ~At(A,E)) and Load(B) (needs At(B,E), At(R,E); gives In(B), ~At(B,E)), which in turn support the goals In(A), In(B).
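A causal proof can be sketched in Python for the simple case of a totally ordered plan. The restriction to a sequence, the add/delete encoding, and the names `Action`, `load` and `causal_proof` are assumptions made for this illustration; the slide's definition also covers partially ordered steps.

```python
from typing import FrozenSet, List, NamedTuple


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def causal_proof(init: FrozenSet[str], plan: List[Action],
                 goal: FrozenSet[str]) -> bool:
    """Check every precondition and goal one condition at a time."""
    empty: FrozenSet[str] = frozenset()
    # Model the initial state as a step that adds it, the goal as a step that needs it.
    steps = ([Action("Init", empty, init, empty)] + plan
             + [Action("Goal", goal, empty, empty)])
    for j, consumer in enumerate(steps):
        for cond in consumer.pre:
            supported = False
            # Walk backward: the nearest step that mentions cond decides.
            for i in range(j - 1, -1, -1):
                if cond in steps[i].add:
                    supported = True      # established, and nothing after it clobbers
                    break
                if cond in steps[i].delete:
                    break                 # clobbered and never added back
            if not supported:
                return False
    return True


def load(obj: str) -> Action:
    return Action(f"Load({obj})",
                  pre=frozenset({f"At({obj},E)", "At(R,E)"}),
                  add=frozenset({f"In({obj})"}),
                  delete=frozenset({f"At({obj},E)"}))


init = frozenset({"At(A,E)", "At(B,E)", "At(R,E)"})
goal = frozenset({"In(A)", "In(B)"})
print(causal_proof(init, [load("A"), load("B")], goal))
```

Note that no intermediate state is ever built; only producers and deleters of each individual condition are inspected.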

16 1/30. Homework 1: additional questions added and the homework is closed; due next Tuesday. **No office hours today; I dash from class.

17 Operator expressiveness

18 Partial Order Plan

19 Progression
An action A can be applied to state S iff its preconditions are satisfied in the current state.
The resulting state S’ is computed as follows:
-- every variable that occurs in the action's effects gets the value that the action said it should have
-- every other variable gets the value it had in the state S where the action is applied
Example:
  S: Ontable(A), Ontable(B), Clear(A), Clear(B), hand-empty
  Pickup(A) gives: holding(A), ~Clear(A), ~Ontable(A), Ontable(B), Clear(B), ~hand-empty
  Pickup(B) gives: holding(B), ~Clear(B), ~Ontable(B), Ontable(A), Clear(A), ~hand-empty
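The progression rule above can be sketched in Python with states as variable/value maps. The dictionary encoding, the convention that missing variables are false, and the names `Action`, `holds`, `progress` and `pickup` are assumptions made for this illustration.

```python
from typing import Dict, NamedTuple

State = Dict[str, bool]


class Action(NamedTuple):
    name: str
    pre: State   # required variable values
    eff: State   # variable values after the action


def holds(conditions: State, state: State) -> bool:
    # Variables omitted from a state are treated as False (the slide's convention).
    return all(state.get(var, False) == val for var, val in conditions.items())


def progress(state: State, action: Action) -> State:
    if not holds(action.pre, state):
        raise ValueError(f"{action.name} is not applicable")
    successor = dict(state)        # every other variable keeps its value
    successor.update(action.eff)   # affected variables take the action's values
    return successor


def pickup(x: str) -> Action:
    return Action(
        f"Pickup({x})",
        pre={"hand-empty": True, f"clear({x})": True, f"ontable({x})": True},
        eff={f"holding({x})": True, f"ontable({x})": False,
             "hand-empty": False, f"clear({x})": False},
    )


init = {"ontable(A)": True, "ontable(B)": True,
        "clear(A)": True, "clear(B)": True, "hand-empty": True}
after = progress(init, pickup("A"))
print(after)
```

Applying `pickup("B")` to `after` raises `ValueError`, since the hand is no longer empty.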

20 Regression
A state S can be regressed over an action A (or A is applied in the backward direction to S) iff:
-- There is no variable v such that v is given different values by the effects of A and the state S
-- There is at least one variable v’ such that v’ is given the same value by the effects of A as well as state S
The resulting state S’ is computed as follows:
-- every variable that occurs in S, and does not occur in the effects of A, is copied over to S’ with its value as in S
-- every variable that occurs in the precondition list of A is copied over to S’ with the value it has in the precondition list
Example, regressing S = {~clear(B), hand-empty}:
  over Putdown(A): {~clear(B), holding(A)}
  over Stack(A,B): {holding(A), clear(B)}
  over Putdown(B): ?? (its effect clear(B) conflicts with ~clear(B))
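The regression rule can be sketched in Python over partial variable/value maps. The encoding and the names `Action`, `regress`, `stack` and `putdown` are assumptions made here; the extra check for a precondition that contradicts a remaining subgoal is a safeguard added in this sketch, not something the slide states.

```python
from typing import Dict, NamedTuple, Optional

Partial = Dict[str, bool]


class Action(NamedTuple):
    name: str
    pre: Partial
    eff: Partial


def regress(state: Partial, action: Action) -> Optional[Partial]:
    """Return the regressed partial state, or None if A cannot be used."""
    shared = state.keys() & action.eff.keys()
    if any(state[v] != action.eff[v] for v in shared):
        return None      # A gives some variable a different value than S wants
    if not shared:
        return None      # A gives nothing that S needs (irrelevant)
    result = {v: val for v, val in state.items() if v not in action.eff}
    for v, val in action.pre.items():
        if v in result and result[v] != val:
            return None  # added safeguard: precondition contradicts a remaining subgoal
        result[v] = val
    return result


def stack(x: str, y: str) -> Action:
    return Action(f"Stack({x},{y})",
                  pre={f"holding({x})": True, f"clear({y})": True},
                  eff={f"on({x},{y})": True, f"clear({y})": False,
                       f"holding({x})": False, "hand-empty": True})


def putdown(x: str) -> Action:
    return Action(f"Putdown({x})",
                  pre={f"holding({x})": True},
                  eff={f"ontable({x})": True, "hand-empty": True,
                       f"clear({x})": True, f"holding({x})": False})


goal = {"clear(B)": False, "hand-empty": True}
for act in (putdown("A"), stack("A", "B"), putdown("B")):
    print(act.name, regress(goal, act))
```

The three calls correspond to the three branches on the slide; the last one returns `None`.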

21 Plan Space Planning: Terminology
- Step: a step in the partial plan, which is bound to a specific action
- Orderings: s1<s2 means s1 must precede s2
- Open conditions: preconditions of the steps (including the goal step)
- Causal link (s1--p--s2): a commitment that the condition p, needed at s2, will be made true by s1
  - Requires s1 to “cause” p
    - Either have an effect p
    - Or have a conditional effect p which is FORCED to happen
      - By adding a secondary precondition to s1
- Unsafe link (s1--p--s2; s3): if s3 can come between s1 and s2 and undo p (has an effect that deletes p).
- Empty plan: { S:{I,G}; O:{I<G}, OC:{g1@G;g2@G..}, CL:{}; US:{}}

22 Partial plan representation (POP background)
P = (A, O, L, OC, UL)
- A: set of action steps in the plan: S0, S1, S2, …, Sinf
- O: set of action orderings: Si < Sj, …
- L: set of causal links Si --p--> Sj
- OC: set of open conditions (subgoals that remain to be satisfied)
- UL: set of unsafe links Si --p--> Sj where p is deleted by some action Sk
Figure: a partial plan over steps S0, S1, S2, S3, Sinf with I = {q1, q2} and G = {g1, g2}, a causal link on p threatened by a step with effect ~p, and open conditions oc1, oc2.
Flaw: open condition OR unsafe link
Solution plan: a partial plan with no remaining flaw
- Every open condition must be satisfied by some action
- No unsafe links should exist (i.e., the plan is consistent)

23 Algorithm (POP background)
1. Let P be an initial plan
2. Flaw selection: choose a flaw f (either an open condition or an unsafe link)
3. Flaw resolution:
   - If f is an open condition, choose an action S that achieves f
   - If f is an unsafe link, choose promotion or demotion
   - Update P
   - Return NULL if no resolution exists
4. If there is no flaw left, return P; else go to 2.
Choice points:
- Flaw selection (open condition? unsafe link?)
- Flaw resolution (how to select (rank) partial plans?)
  - Action selection (backtrack point)
  - Unsafe link selection (backtrack point)
Figure: 1. Initial plan: S0 and Sinf, with open goals g1, g2 at Sinf. 2. Plan refinement (flaw selection and resolution): steps S1, S2, S3 added between S0 and Sinf, with a link on p threatened by ~p and open conditions oc1, oc2, q1.
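The unsafe-link part of flaw selection and resolution can be sketched in Python as below. This is a partial sketch under stated assumptions: open conditions and action selection are left out, the `PartialPlan` layout and the function names are invented here, and "demotion"/"promotion" follow one common naming convention (threat ordered before the producer, or after the consumer).

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set, Tuple

Link = Tuple[str, str, str]          # (producer, condition, consumer)


@dataclass
class PartialPlan:
    deletes: Dict[str, FrozenSet[str]]   # step name -> conditions it deletes
    orderings: Set[Tuple[str, str]]      # (a, b) means a must precede b
    links: List[Link]


def precedes(plan: PartialPlan, a: str, b: str) -> bool:
    """True if a < b follows from the orderings (transitively)."""
    frontier, seen = [a], {a}
    while frontier:
        cur = frontier.pop()
        for x, y in plan.orderings:
            if x == cur and y not in seen:
                if y == b:
                    return True
                seen.add(y)
                frontier.append(y)
    return False


def unsafe_links(plan: PartialPlan) -> List[Tuple[str, str, str, str]]:
    """All (s1, p, s2, s3) where s3 deletes p and may fall between s1 and s2."""
    flaws = []
    for s1, p, s2 in plan.links:
        for s3, deleted in plan.deletes.items():
            if s3 in (s1, s2) or p not in deleted:
                continue
            if precedes(plan, s3, s1) or precedes(plan, s2, s3):
                continue                 # already forced outside the link
            flaws.append((s1, p, s2, s3))
    return flaws


def resolutions(plan: PartialPlan, flaw: Tuple[str, str, str, str]) -> List[PartialPlan]:
    """Candidate refinements: demotion (s3 < s1) and promotion (s2 < s3)."""
    s1, _p, s2, s3 = flaw
    options = []
    for a, b in ((s3, s1), (s2, s3)):
        if not precedes(plan, b, a):     # adding a < b must not create a cycle
            options.append(PartialPlan(plan.deletes, plan.orderings | {(a, b)}, plan.links))
    return options


plan = PartialPlan(
    deletes={"S3": frozenset({"p"})},
    orderings={("I", "S1"), ("I", "S3"), ("S1", "G"), ("S3", "G")},
    links=[("S1", "p", "G")],
)
flaws = unsafe_links(plan)
options = resolutions(plan, flaws[0])
print(flaws, len(options))
```

In this toy plan only demotion survives, because ordering the goal step before S3 would contradict S3 < G; an empty `options` list is the "return NULL" case in step 3.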

24 Feb 4th. Least Commitment Planning. Heuristics for controlling planners

25 Example of PO Planning
- Sussman anomaly example from Weld’s intro to least commitment planning
- An example about the use of confrontation using the rocket domain
  - Both A and B are in the rocket, and all of them are on Earth. We want B on the moon, and A on Earth.

26 Partially instantiated actions
In regression planning
-- “states” contain
   - partially instantiated actions
   - equality/disequality constraints on variables
-- The regression operation needs to be extended such that
   a. An action is considered possibly relevant to a state S if it has an effect E that “unifies” with some literal in S
      -- We then add the unifier as the binding constraints
   b. An action is possibly illegal if it has an effect E that “unifies” with the negation of some literal in S
      -- We add the negation of the unifier as the “non-codesignation (disequality) constraints”
-- Checking the consistency of the states is much harder
   -- Inconsistency can occur either because two literals cannot hold together in any legal state (checking this, in the worst case, is as hard as planning)
   -- Or because the set of equality/disequality constraints cannot be satisfied (checking this, in the worst case, is as hard as CSP: NP-complete)
In least-commitment planning
-- Partial plans will now have (in addition to steps, (partially instantiated) actions, precedence constraints, causal links and flaws (open conditions/unsafe links)) equality/disequality constraints on variables
-- When establishing a causal link, we add equality constraints (to make an effect unify with some condition)
-- When resolving an unsafe link, in addition to promotion, demotion and confrontation, we also need to handle “SEPARATION”, which involves adding a disequality constraint such that the offending effect will not unify with the condition being supported by a causal link
In both regression and least-commitment planning, using partially instantiated actions corresponds to an additional form of least commitment. It has the same sort of tradeoffs, i.e., it reduces the branching factor but increases the per-node cost.

27 Possible (dis)advantages of partial order planning
- B-D arguments
  - PO planners branch in terms of ways of supporting open conditions
    - Regression planners consider action position and relevance together
    - Progression planners don’t even consider relevance
      - They tend to have higher branching factors
  - PO planners will have search trees that are as deep as the number of open conditions + number of unsafe links
    - Progression/regression have search trees whose solution depth is equal to the length of the plan (# actions)
      - For many problems the former may be larger than the latter. But consider a durative action where effects may occur in as little as 1 msec durations. An action that is 20 min long will have to be handled as 1,200,000 action slices. So, here the # of action slices in the plan can be larger than the # of open conditions
- Potential utility for more expressive planning problems
  - Maintenance goals (e.g., keep something on all during the plan) can be handled by just putting a causal link from S0 to Sinfty
  - Temporal planning
    - Durative actions don’t need any special treatment
    - Arbitrary concurrency between actions is allowed by PO planning
  - The causal link/flaw resolution methodology lends itself to all sorts of creative extensions
    - There are PO extensions for conditional planning, conformant planning and metric temporal planning
The Heuristic Angle
Estimating the distance of a partial plan from a flaw-less solution plan is conceptually harder than estimating the distance of a set of states from the init state, which in turn is harder than estimating the cost of a single state from the goal state
The Commitment Angle
Progression/regression planners commit to both position and relevance. PS planners only commit to relevance.
-- Unnecessary commitments increase the chance of backtracking
   >> But also make it easier to validate/evaluate the partial plan

28 Tradeoffs among Basic Strategies
Progression/regression must commit to both position and relevance of actions (regression can judge relevance, sort of, but handles sets of states)
  + Give state information (easier plan validation)
  - Leads to premature commitment
    > but better heuristic guidance
  - Too many states when actions have durations
Plan-space refinement (PSR) avoids constraining position
  + Reduces commitment (large candidate set / branch)
    > But harder to get heuristic estimates
  - Increases plan-validation costs
  + Easily extendible to actions with duration
The Tastes great / Less Filling holy wars

29 A recent (turbulent) history of planning
- 1970s-1995 (UCPOP): UCPOP, Zeno [Penberthy & Weld], IxTeT [Ghallab et al]. The whole world believed in POP and was happy to stack 6 blocks!
- 1995: Advent of CSP-style compilation approach: Graphplan [Blum & Furst], SATPLAN [Kautz & Selman]. Use of reachability analysis and disjunctive constraints
- 1997 (UNPOP): Domination of heuristic state search approach: HSP/R [Bonet & Geffner], UNPOP [McDermott]: POP is dead! Importance of good domain-independent heuristics
- 2000- (RePOP): Hoffmann’s FF, a state search planner, won the AIPS-00 competition! … but NASA’s highly publicized RAX still a POP dinosaur! POP believed to be a good framework to handle temporal and resource planning [Smith et al, 2000]

30 In the beginning it was all POP. Then it was cruelly UnPOPped. The good times return with Re(vived)POP.

31 Spare Tire Example

32 Spare Tire Example

33 Plan-space Planning

34 Plan-space planning: Example

35 How do we “GUIDE” these planners? Or HEURISTICS/HEURISTICS/HEURISTICS AHOY

36 Metrics for Plan Optimality
- # actions in the solution
- Make-span of the solution
- Cumulative cost of the actions in the solution
- Others?
- Default metric is # actions in the solution
- But, really, speed is more important…


38 Admissibility/Informedness (figure placing heuristics h1, h2, h3, h4, h5 and Max(h2,h3) relative to h*)

39 Where do heuristics (bounds) come from?
From relaxed problems (the more relaxed, the easier it is to compute the heuristic, but the less accurate it is)
- For path planning on the plane (with obstacles)?
  Assume away obstacles. The distance will then be the straight-line distance
- For the 8-puzzle problem?
  Assume ability to move the tile directly to its place: distance = # misplaced tiles
  Assume ability to move only one position at a time: distance = sum of Manhattan distances
- For traveling salesperson?
  Relax the “circuit” requirement: minimum spanning tree


41 Heuristics to guide Progression/Regression
Set difference heuristic
Intuition: the cost of a state is the number of goals that are not yet present in it.
- Progression: the cost of a state S is | G \ S |, the number of state-variable/value pairs in G which are not present in S
- Regression: the cost of a state S is | S \ I |, the number of state-variable/value pairs in S that are not present in the initial state
Problems with the set difference heuristic:
1. Every literal is given the same cost. Some literals are harder to achieve than others!
2. It is assumed that the cost of achieving n literals together is n. This ignores the interactions between literals (“subgoals”).
   -- It may be easier to achieve a set of literals together than to achieve each of them separately (+ve interactions)
   -- It may be harder to achieve a set of literals together than to achieve them separately (-ve interactions)
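The set-difference heuristic is small enough to sketch directly in Python. States are variable/value maps with missing variables read as false; the function name `set_difference` and the example states are assumptions made for this illustration.

```python
from typing import Dict

Assignment = Dict[str, bool]


def set_difference(target: Assignment, reference: Assignment) -> int:
    """Count variable/value pairs in `target` that `reference` does not contain.

    Variables missing from `reference` are treated as False.
    """
    return sum(1 for var, val in target.items()
               if reference.get(var, False) != val)


init = {"ontable(A)": True, "ontable(B)": True,
        "clear(A)": True, "clear(B)": True, "hand-empty": True}
goal = {"clear(B)": False, "hand-empty": True}

# Progression: cost of a state S is |G \ S|
h_prog = set_difference(goal, init)

# Regression: cost of a (partial) state S is |S \ I|
regressed = {"holding(A)": True, "clear(B)": True}
h_regr = set_difference(regressed, init)

print(h_prog, h_regr)
```

Every missing literal contributes exactly 1, which is the uniform-cost weakness listed as problem 1 on the slide.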

42 Subgoal interactions
Suppose we have a set of subgoals G_1, …, G_n
Suppose the lengths of the shortest plans for achieving the subgoals in isolation are l_1, …, l_n
We want to know the length of the shortest plan for achieving the n subgoals together, l_1..n
- If subgoals are independent: l_1..n = l_1 + l_2 + … + l_n
- If subgoals have +ve interactions alone: l_1..n < l_1 + l_2 + … + l_n
- If subgoals have -ve interactions alone: l_1..n > l_1 + l_2 + … + l_n

43 Estimating the cost of achieving individual literals (subgoals)
Idea: unfold a data structure called the “planning graph” as follows:
1. Start with the initial state. This is called the zeroth-level proposition list.
2. In the next level, called the first-level action list, put all the actions whose preconditions are true in the initial state.
   -- Have links between actions and their preconditions
3. In the next level, called the first-level proposition list, put:
   3.1. All the effects of all the actions in the previous level. Link the effects to the respective actions. (If multiple actions give a particular effect, have multiple links to that effect from all those actions.)
   3.2. All the conditions in the previous proposition list (in this case the zeroth proposition list). Put persistence links between the corresponding literals in the previous proposition list and the current proposition list.
   Note: a literal appears at most once in a proposition list.
4. Repeat steps 2 and 3 until there is no difference between two consecutive proposition lists. At that point the graph is said to have “leveled off”.
The next 2 slides show this expansion up to two levels.
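The unfolding steps above can be sketched in Python as follows. This sketch keeps only the proposition lists (no links, no mutexes); the literal names mimic the abbreviations used in the deck (`onT-A`, `cl-A`, `he`, `h-A`), and `Action`, `expand` and `blocks_actions` are names invented here. Negative literals are stored as separate strings with a `~` prefix.

```python
from typing import FrozenSet, Iterable, List, NamedTuple


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]
    eff: FrozenSet[str]


def expand(init: Iterable[str], actions: List[Action],
           max_levels: int = 50) -> List[FrozenSet[str]]:
    """Return proposition lists 0, 1, 2, ... until the graph levels off."""
    levels = [frozenset(init)]
    while len(levels) <= max_levels:
        current = levels[-1]
        nxt = set(current)                     # step 3.2: persistence
        for action in actions:
            if action.pre <= current:          # step 2: applicable actions
                nxt |= action.eff              # step 3.1: their effects
        if nxt == current:
            break                              # leveled off
        levels.append(frozenset(nxt))
    return levels


def blocks_actions(blocks: List[str]) -> List[Action]:
    acts = []
    for x in blocks:
        acts.append(Action(f"Pick-{x}", frozenset({"he", f"cl-{x}", f"onT-{x}"}),
                           frozenset({f"h-{x}", f"~onT-{x}", "~he", f"~cl-{x}"})))
        acts.append(Action(f"Ptdn-{x}", frozenset({f"h-{x}"}),
                           frozenset({f"onT-{x}", "he", f"cl-{x}", f"~h-{x}"})))
        for y in blocks:
            if x != y:
                acts.append(Action(f"St-{x}-{y}", frozenset({f"h-{x}", f"cl-{y}"}),
                                   frozenset({f"on-{x}-{y}", f"~cl-{y}", f"~h-{x}", "he"})))
                acts.append(Action(f"Unst-{x}-{y}", frozenset({f"on-{x}-{y}", "he", f"cl-{x}"}),
                                   frozenset({f"h-{x}", f"~cl-{x}", f"cl-{y}", "~he"})))
    return acts


levels = expand({"onT-A", "onT-B", "cl-A", "cl-B", "he"}, blocks_actions(["A", "B"]))
for i, props in enumerate(levels):
    print(i, sorted(props))
```

Because literals are never removed from a proposition list, the lists grow monotonically and the loop is guaranteed to stop on a finite set of literals.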

44 Planning graph, one level (figure)
Proposition list 0: onT-A, onT-B, cl-A, cl-B, he
Action list 1: Pick-A, Pick-B
Proposition list 1: onT-A, onT-B, cl-A, cl-B, he, h-A, h-B, ~cl-A, ~cl-B, ~he

45 Planning graph, two levels (figure)
Proposition list 0: onT-A, onT-B, cl-A, cl-B, he
Action list 1: Pick-A, Pick-B
Proposition list 1: onT-A, onT-B, cl-A, cl-B, he, h-A, h-B, ~cl-A, ~cl-B, ~he
Action list 2: St-A-B, St-B-A, Ptdn-A, Ptdn-B, Pick-A, Pick-B
Proposition list 2: onT-A, onT-B, cl-A, cl-B, he, h-A, h-B, ~cl-A, ~cl-B, ~he, on-A-B, on-B-A

46 Using the planning graph to estimate the cost of single literals
1. We can say that the cost of a single literal is the index of the first proposition level in which it appears.
   -- If the literal does not appear in any of the levels in the currently expanded planning graph, then the cost of that literal is:
      -- l+1 if the graph has been expanded to l levels, but has not yet leveled off
      -- Infinity, if the graph has been expanded until it leveled off (basically, the literal cannot be achieved from the current initial state)
Examples: h({~he}) = 1, h({On(A,B)}) = 2, h({he}) = 0
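The single-literal cost rule can be sketched in Python over hand-copied proposition lists. The lists below are transcribed from the two-level expansion shown in the deck; the function name `literal_cost` and the `leveled_off` flag are assumptions made for this illustration.

```python
import math
from typing import FrozenSet, List

# Proposition lists 0..2, copied from the expansion figure.
level0 = frozenset({"onT-A", "onT-B", "cl-A", "cl-B", "he"})
level1 = level0 | {"h-A", "h-B", "~cl-A", "~cl-B", "~he"}
level2 = level1 | {"on-A-B", "on-B-A"}
levels: List[FrozenSet[str]] = [level0, level1, level2]


def literal_cost(literal: str, levels: List[FrozenSet[str]],
                 leveled_off: bool) -> float:
    """Index of the first proposition level containing the literal."""
    for index, props in enumerate(levels):
        if literal in props:
            return index
    # Not found: l+1 if expansion could continue, infinity otherwise.
    # With levels 0..l stored, l+1 equals len(levels).
    return math.inf if leveled_off else len(levels)


for lit in ("~he", "on-A-B", "he"):
    print(lit, literal_cost(lit, levels, leveled_off=False))
```

A literal that never shows up, such as a hypothetical `on-A-C`, gets 3 while the graph can still grow and infinity once it has leveled off.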

47 Estimating the cost of a set of literals (e.g., a state in regression search)
Idea 0. [Max Heuristic] h_max({p,q,r,...}) = max{h(p), h(q), …}
  Admissible, but very weak in practice
Idea 2. [Sum Heuristic] Make the subgoal independence assumption: h_ind({p,q,r,...}) = h(p) + h(q) + h(r) + …
  Much better than the set-difference heuristic in practice.
  -- Ignores +ve interactions
     h({~he, h-A}) = h(~he) + h(h-A) = 1 + 1 = 2
     But we can achieve both literals with just a single action, Pickup(A). So the real cost is 1
  -- Ignores -ve interactions
     h({~cl(B), he}) = 1 + 0 = 1
     But no single action achieves these two literals together: Pickup(B) gives ~cl(B) only by giving up he. With the operators given earlier, the shortest plan is Pickup(A), Stack(A,B), so the real cost is 2. In general, when the literals cannot hold together at all, the real cost is infinity!
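The max and sum heuristics can be sketched in Python on top of a table of single-literal costs. The cost table below is filled in by hand with the level indices quoted in the deck; `h_max` and `h_sum` are names chosen here.

```python
from typing import Dict, Iterable

# Single-literal costs (first proposition level in which each literal appears).
cost: Dict[str, int] = {"he": 0, "~he": 1, "h-A": 1, "~cl-B": 1, "on-A-B": 2}


def h_max(literals: Iterable[str]) -> int:
    """Max heuristic: the most expensive single literal."""
    return max((cost[lit] for lit in literals), default=0)


def h_sum(literals: Iterable[str]) -> int:
    """Sum heuristic: assumes the subgoals are independent."""
    return sum(cost[lit] for lit in literals)


for subgoals in ({"~he", "h-A"}, {"~cl-B", "he"}):
    print(sorted(subgoals), h_max(subgoals), h_sum(subgoals))
```

For `{~he, h-A}` the sum overshoots the one-action cost while the max matches it, which illustrates why the sum heuristic is informed but not admissible.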

48 We can do a better job of accounting for +ve interactions if we define the cost of a set of literals in terms of the level:
h_lev({p,q,r}) = the index of the first level of the PG where p, q, r appear together
so, h({~he, h-A}) = 1
Interestingly, h_lev is an admissible heuristic, even though h_ind is not! (Prove)
To better account for -ve interactions, we need to start looking into the feasibility of subsets of literals actually being true together in a proposition level. Specifically, in each proposition level, we want to mark not just which individual literals are feasible, but also which pairs, which triples, which quadruples, and which n-tuples are feasible. (It is quite possible that two literals are independently feasible in level k, but not feasible together in that level.)
-- The idea then is to say that the cost of a set S of literals is the index of the first level of the planning graph where no subset of S is marked infeasible
-- The full-scale mark-up is very costly, and makes the cost of planning graph construction equal the cost of enumerating the full progression search tree.
-- Since we only want estimates, it is okay if we talk of feasibility of up to k-tuples
-- For the special case of k=2 (2-sized subsets), there are some very efficient marking and propagation procedures. This is the idea of marking and propagating mutual exclusion relations.
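The level heuristic with pairwise infeasibility marks can be sketched in Python as follows. This sketch does not compute mutexes: the proposition lists are copied from the expansion figure, and the single mutex pair at level 1 is marked by hand for illustration (Pickup(B) is the only level-1 action giving `~cl-B`, and it deletes `he`). The names `h_lev`, `levels` and `mutexes` are assumptions made here.

```python
import math
from itertools import combinations
from typing import FrozenSet, List, Set

level0 = frozenset({"onT-A", "onT-B", "cl-A", "cl-B", "he"})
level1 = level0 | {"h-A", "h-B", "~cl-A", "~cl-B", "~he"}
level2 = level1 | {"on-A-B", "on-B-A"}
levels: List[FrozenSet[str]] = [level0, level1, level2]

# Hand-marked infeasible pairs per level; only the pair used below is listed.
mutexes: List[Set[FrozenSet[str]]] = [
    set(),
    {frozenset({"~cl-B", "he"})},
    set(),
]


def h_lev(literals: Set[str], levels: List[FrozenSet[str]],
          mutexes: List[Set[FrozenSet[str]]]) -> float:
    """First level where all literals appear and no pair is marked infeasible."""
    for index, props in enumerate(levels):
        if not literals <= props:
            continue
        pairs = (frozenset(pair) for pair in combinations(sorted(literals), 2))
        if not any(pair in mutexes[index] for pair in pairs):
            return index
    return math.inf


no_marks: List[Set[FrozenSet[str]]] = [set(), set(), set()]
print(h_lev({"~he", "h-A"}, levels, mutexes))
print(h_lev({"~cl-B", "he"}, levels, no_marks), h_lev({"~cl-B", "he"}, levels, mutexes))
```

Without marks the pair `{~cl-B, he}` looks reachable at level 1; with the level-1 mark the estimate moves to level 2, where Stack(A,B) can supply both literals.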

