18th Feb: Using reachability heuristics for PO planning; Planning using Planning Graphs

In the beginning it was all POP. Then it was cruelly UnPOPped. The good times return with Re(vived)POP.

A recent (turbulent) history of planning
–1970s-1995: The whole world believed in POP and was happy to stack 6 blocks! UCPOP, Zeno [Penberthy & Weld], IxTeT [Ghallab et al].
–1995: Advent of the CSP-style compilation approach: Graphplan [Blum & Furst], SATPLAN [Kautz & Selman]. Use of reachability analysis and disjunctive constraints.
–1997: Domination of the heuristic state-search approach: HSP/R [Bonet & Geffner], UNPOP [McDermott]: POP is dead! Importance of good domain-independent heuristics.
–Hoffmann's FF, a state-search planner, won the AIPS-00 competition! ... but NASA's highly publicized RAX was still a POP dinosaur! POP believed to be a good framework to handle temporal and resource planning [Smith et al, 2000]. Enter RePOP.

Outline: RePOP, a revival for partial order planning
Goal: to show that POP can be made very efficient by exploiting the same ideas that scaled up state-search and Graphplan planners:
–Effective heuristic search control
–Use of reachability analysis
–Handling of disjunctive constraints
RePOP, implemented on top of UCPOP:
–Dramatically better than all known partial-order planners
–Outperforms Graphplan and is competitive with state-search planners in many (parallel) domains

POP background: partial plan representation
P = (A, O, L, OC, UL)
–A: set of action steps in the plan (S0, S1, S2, ..., Sinf)
–O: set of action orderings (Si < Sj, ...)
–L: set of causal links (Si --p--> Sj)
–OC: set of open conditions (subgoals that remain to be satisfied)
–UL: set of unsafe links (Si --p--> Sj where p is deleted by some action Sk)
Flaw: an open condition OR an unsafe link.
Solution plan: a partial plan with no remaining flaw. Every open condition must be satisfied by some action, and no unsafe links may exist (i.e. the plan is consistent).
[Figure: example partial plan with steps S0, S1, S2, S3, Sinf, goals G = {g1, g2}, initial state I = {q1, q2}, open conditions oc1, oc2, and a step with effect ~p threatening a causal link for p.]

POP background: algorithm
1. Let P be an initial plan (just S0 and Sinf, with the goals g1, g2 as open conditions).
2. Flaw selection: choose a flaw f (either an open condition or an unsafe link).
3. Flaw resolution:
–If f is an open condition, choose an action S that achieves f.
–If f is an unsafe link, choose promotion or demotion.
–Update P. Return NULL if no resolution exists.
4. If there is no flaw left, return P; else go to 2.
Choice points:
–Flaw selection (open condition? unsafe link?)
–Flaw resolution (how to select/rank partial plans?)
–Action selection (backtrack point)
–Unsafe-link resolution (backtrack point)
[Figure: 1. the initial plan; 2. plan refinement via flaw selection and resolution.]
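The refinement loop above can be sketched as a small backtracking search. This is a minimal illustrative sketch, not UCPOP's or RePOP's actual code: the toy domain (grocery-style action names) is made up, and since it has no delete effects, unsafe-link handling (promotion/demotion) never triggers and is omitted.

```python
from itertools import count

# Hypothetical toy domain: action -> (preconditions, effects).
ACTIONS = {
    "go-store": ({"at-home"}, {"at-store"}),
    "buy-milk": ({"at-store"}, {"has-milk"}),
}

def pop_search(init, goals):
    """Return [(step_id, action), ...] or None if no plan exists."""
    ids = count(1)
    # step_id -> (name, preconditions, effects); 0 = START, -1 = END
    steps = {0: ("START", set(), set(init)), -1: ("END", set(goals), set())}
    open_conds = [(g, -1) for g in goals]   # flaws: (subgoal, consumer step)
    orderings, links = set(), []
    def refine():
        if not open_conds:                  # no flaws left -> solution
            return True
        p, consumer = open_conds.pop()      # flaw selection (LIFO)
        # resolvers: existing steps first, then new action instances
        resolvers = [(sid, None) for sid, (_, _, eff) in steps.items()
                     if p in eff]
        resolvers += [(None, act) for act, (_, eff) in ACTIONS.items()
                      if p in eff]
        for sid, act in resolvers:
            new_step = sid is None
            if new_step:
                sid = next(ids)
                pre, eff = ACTIONS[act]
                steps[sid] = (act, set(pre), set(eff))
                open_conds.extend((q, sid) for q in pre)
            links.append((sid, p, consumer))
            orderings.add((sid, consumer))
            if refine():
                return True
            links.pop()                     # backtrack: undo this resolver
            orderings.discard((sid, consumer))
            if new_step:
                for q in steps[sid][1]:
                    open_conds.remove((q, sid))
                del steps[sid]
        open_conds.append((p, consumer))    # restore the flaw
        return False
    if refine():
        return [(sid, name) for sid, (name, _, _) in steps.items()
                if sid not in (0, -1)]
    return None

plan = pop_search({"at-home"}, {"has-milk"})
```

Each recursive call resolves one flaw; the backtrack points correspond exactly to the action-selection choice point on the slide.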

Our approach (main ideas)
1. Ranking partial plans: use an effective distance-based heuristic estimator (a state-space idea).
2. Exploit reachability analysis: use invariants to discover implicit conflicts in the plan (a CSP idea: consistency enforcement).
3. Resolve unsafe links by posting disjunctive ordering constraints into the partial plan: this avoids the unnecessary, exponential multiplication of failures due to promotion/demotion splitting.

1. Ranking partial plans using a distance-based heuristic
Ranking function: f(P) = g(P) + w * h(P)
–g(P): number of actions in P
–h(P): estimate of the number of new actions needed to refine P into a solution plan
–w: weight that increases the greediness of the heuristic search
Estimating h(P): h(P) ~ |O'|, where O' is the set of new actions needed to complete the plan.
Difficulty: how to account for positive and negative interactions
–among actions in O'
–among actions in P
–between O' and P
[Figure: partial plan P extended with new actions S4, S5 forming O'; here h(P) ~ |O'| = 2.]
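The ranking function can be used to drive a best-first frontier over partial plans. A minimal sketch, assuming a precomputed h-value per partial plan (the dict layout is illustrative, not RePOP's representation):

```python
import heapq

def rank(plan, w=2.0):
    """f(P) = g(P) + w*h(P) as on the slide; w > 1 makes search greedier."""
    return len(plan["actions"]) + w * plan["h"]

# Three hypothetical partial plans on the frontier.
frontier = []
for p in ({"actions": ["a1"], "h": 4},
          {"actions": ["a1", "a2", "a3"], "h": 0},
          {"actions": [], "h": 2}):
    heapq.heappush(frontier, (rank(p), id(p), p))  # id() breaks rank ties

best = heapq.heappop(frontier)[2]   # partial plan with the lowest f-value
```

With w = 2, the f-values are 9, 3 and 4, so the plan that needs no further refinement is expanded first.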

Estimating h(P)
Assumption: negative effects of actions are relaxed (they are dealt with later, via the unsafe-link set). Then:
–P has no unsafe-link flaws
–no negative interactions among actions in P
–no negative interactions between O' and P
So |O'| ~ cost(S): the cost needed to achieve the set of open conditions S = {p, q, r, ...} from the initial state. Any state-space distance heuristic can be adapted, and the informedness of the heuristic estimate can be improved by using weaker relaxation assumptions.

Distance-based heuristic estimate using the length of relaxed plans (adapted from state-space heuristics extracted from planning graphs [Nguyen & Kambhampati 2000], [Hoffmann 2000], ...)
Estimate h(P) = cost(S):
1. Build a planning graph PG from the initial state.
2. cost(S) := 0 if all subgoals in S are at level 0.
3. Let p be a subgoal in S that appears last in PG.
4. Pick an action a in the graph that first achieves p.
5. Update cost(S) := cost(a) + cost(S + Prec(a) - Eff(a)), where cost(a) = 0 if a is in P, and 1 otherwise.
6. Replace S with S + Prec(a) - Eff(a); go to 2.
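The extraction procedure above can be sketched on a relaxed planning graph built from "first level each fact appears" information. A simplified illustration, not the paper's implementation: the toy domain is hypothetical, and every selected action counts 1 (in RePOP, actions already in the partial plan count 0, as step 5 says).

```python
# Hypothetical toy domain: action -> (preconditions, add effects).
ACTIONS = {
    "go-store": ({"at-home"}, {"at-store"}),
    "buy-milk": ({"at-store"}, {"has-milk"}),
}

def relaxed_plan_cost(init, actions, goals, horizon=10):
    """Length of a relaxed plan (delete effects ignored)."""
    level = {p: 0 for p in init}            # first level each fact appears
    for k in range(1, horizon + 1):
        new = {}
        for name, (pre, add) in actions.items():
            if all(p in level and level[p] < k for p in pre):
                for q in add:
                    if q not in level:
                        new[q] = k
        if not new:                         # graph has leveled off
            break
        level.update(new)
    if any(g not in level for g in goals):
        return float("inf")                 # some goal is unreachable
    cost, agenda = 0, set(goals)
    while any(level[p] > 0 for p in agenda):
        p = max(agenda, key=lambda q: level[q])   # subgoal appearing last
        # pick an achiever whose preconditions appear earliest
        # ("an action that first achieves p")
        _, (pre, add) = min(
            ((name, pa) for name, pa in actions.items()
             if p in pa[1] and all(q in level and level[q] < level[p]
                                   for q in pa[0])),
            key=lambda item: max((level[q] for q in item[1][0]), default=0))
        cost += 1
        agenda = (agenda - set(add)) | set(pre)   # regress: S + Prec - Eff
    return cost

h = relaxed_plan_cost({"at-home"}, ACTIONS, {"has-milk"})
```

On the toy problem the relaxed plan needs two actions, so h = 2.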

2. Handling unsafe-link flaws
1. For each unsafe link Si --p--> Sj threatened by another step Sk, add a disjunctive ordering constraint to O: Sk < Si V Sj < Sk.
2. Whenever a new ordering constraint is introduced to O (or whenever you feel like it), perform the constraint propagations:
–S1 < S2 V S3 < S4, together with S4 < S3, gives S1 < S2
–S1 < S2 and S2 < S3 give S1 < S3
–S1 < S2 and S2 < S1 give False
This avoids the unnecessary exponential multiplication of failing partial plans.
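The three propagation rules above can be sketched as a fixpoint loop over ordering pairs and disjunctions. An illustrative sketch (quadratic closure, not an efficient incremental propagator); steps are represented by integers and an ordering Si < Sj by the tuple (i, j):

```python
def propagate(orderings, disjunctions):
    """Apply transitivity, unit propagation of disjunctive orderings,
    and cycle detection until fixpoint. Returns (orderings, consistent)."""
    ords = set(orderings)
    djs = [tuple(d) for d in disjunctions]   # each d = ((a,b), (c,d))
    changed = True
    while changed:
        changed = False
        # transitivity: a < b and b < c  =>  a < c
        for (a, b) in list(ords):
            for (c, d) in list(ords):
                if b == c and (a, d) not in ords:
                    ords.add((a, d))
                    changed = True
        # inconsistency: a < b and b < a  =>  False
        if any((b, a) in ords for (a, b) in ords):
            return ords, False
        # unit propagation: (x < y  V  u < v) and v < u  =>  x < y
        for d1, d2 in djs:
            for forced, other in ((d1, d2), (d2, d1)):
                if (other[1], other[0]) in ords and forced not in ords:
                    ords.add(forced)
                    changed = True
    return ords, True

# S1 < S2 < S3, plus the disjunction (S3 < S1) V (S0 < S1):
# transitivity gives S1 < S3, which kills the first disjunct
# and forces S0 < S1.
ords, ok = propagate({(1, 2), (2, 3)}, [((3, 1), (0, 1))])
```

The same loop detects inconsistent plans: feeding it {(1, 2), (2, 1)} returns consistent = False.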

3. Detecting indirect conflicts using reachability analysis
1. Use reachability analysis to detect inconsistencies, e.g. on(a,b) and clear(b) can never hold together.
2. How do we get state information in a partial plan? Cutsets.
3. Cutset: a set of literals that must be true at some point during the execution of the plan. For each action Sk:
–pre-C(Sk) = Prec(Sk) U {p | Si --p--> Sj is a causal link and Si < Sk < Sj}
–post-C(Sk) = Eff(Sk) U {p | Si --p--> Sj is a causal link and Si < Sk < Sj}
4. If there exists a cutset that violates an invariant, the partial plan is invalid and should be pruned.
Disadvantage: this inconsistency checking is passive and may be expensive.
[Figure: step Sk between two causal links for p and q; its cutsets are Prec(Sk) + p + q and Eff(Sk) + p + q.]

Detecting indirect conflicts using reachability analysis (continued)
1. Generalizing unsafe links: Sk threatens the causal link Si --p--> Sj iff p is mutually exclusive (mutex) with either Prec(Sk) or Eff(Sk).
2. The unsafe link is resolved by posting disjunctive constraints (as before): Sk < Si V Sj < Sk.
This detects indirect conflicts early and derives more disjunctive constraints to be propagated.

Experiments on RePOP
RePOP is implemented on top of the UCPOP planner using the three ideas presented.
–Written in Lisp; runs on Linux (500 MHz, 250 MB).
–RePOP deals with sets of totally instantiated actions, and thus avoids binding constraints.
Compared RePOP against UCPOP, Graphplan and AltAlt in a number of benchmark domains.
–Performance metrics: time and solution quality.

Comparing planning time (time in seconds): RePOP vs. UCPOP, Graphplan, AltAlt
[Table: columns Problem, UCPOP, RePOP, Graphplan, AltAlt; rows Gripper-8/10/20, Rocket-a/b, Logistics-a/b/c/d, Bw-large-a/b/c; '-' marks failures. The numeric entries were garbled in transcription.]

Comparing planning time (summary)
1. RePOP is very good in parallel domains (Gripper, Logistics, Rocket, parallel blocks world):
–completely dominates UCPOP
–outperforms Graphplan in many domains
–competitive with AltAlt
2. RePOP is still inefficient in serial domains: Travel, Grid, 8-puzzle.

Some solution quality metrics
1. Number of actions
2. Makespan: minimum completion time (number of time steps)
3. Flexibility: average number of actions that do not have ordering constraints with other actions
[Figure: three 4-action example plans, with makespans 2, 2 and 4 and decreasing flexibility (the fully serialized plan has flexibility 0).]

Comparing solution quality
[Table: columns for RePOP, Graphplan and AltAlt, giving number of actions / time steps plus a flexibility degree per planner; rows Gripper-8/10/20, Rocket-a/b, Logistics-a/b/c/d, Bw-large-a/b/c; '-' marks failures. The numeric entries were garbled in transcription.]

Comparing solution quality (summary)
–RePOP generates partially ordered plans.
–Number of actions: RePOP typically returns the shortest plans.
–Number of time steps (makespan): Graphplan produces the optimal number of time steps (strictly, only when all actions have the same duration); RePOP comes close.
–Flexibility: RePOP typically returns the most flexible plans.

Ablation studies
[Table: columns UCPOP, UCPOP+CE, UCPOP+HP, UCPOP+CE+HP (= RePOP); rows Gripper-8/10/12/20, Rocket-a/b, Logistics-a/b/c/d; '*' marks failures. The paired numeric entries were garbled in transcription.]
–CE: consistency-enforcement techniques (reachability analysis and disjunctive constraint handling)
–HP: distance-based heuristic

Conclusion
Developed effective techniques for improving partial-order planners:
–heuristics for ranking partial plans
–a disjunctive representation for unsafe links
–use of reachability analysis
Presented and evaluated RePOP:
–brings POP into the realm of effective planning algorithms
–we can now exploit the flexibility of POP without too much of an efficiency penalty
Moral? State-space vs. CSP vs. POP.

Future Work
Improve the efficiency of RePOP in serial domains.
–Serial domains may be an inherent weakness of POP. Thankfully, real-world domains tend to admit partially ordered plans (or there wouldn't be any scheduling separate from planning!).
Devise effective admissible heuristics for POP.
Extend RePOP to deal with:
–partially instantiated actions
–time and resource constraints
ReBuridan? ReZeno? ReIxTeT?

VHPOP: a successor to RePOP

Flaw Selection
RePOP doesn't particularly concentrate on flaw selection order. Any order will guarantee completeness, but different orders have different efficiency. In RePOP, unsafe links are basically handled by disjunctive ordering constraints, so we need an order for open conditions. Ideas:
–LIFO/FIFO
–Pick open conditions with the least number of resolution choices (LCFR)
–Pick open conditions that have the highest cost (in terms of reachability)
–Try a whole bunch in parallel! (This is what VHPOP does, although it doesn't use reachability-based ordering.)

Summary (till now)
Progression/regression/partial-order planners, and reachability heuristics for focusing them. In practice, for classical planning, progression planners with reachability heuristics (e.g. FF) seem to do best, assuming that we care mostly about "finding" a plan that is (sort of) cheapest in terms of number of actions. Open issues include:
–handling lifted actions (i.e. considering partially instantiated actions)
–handling optimality criteria other than number of actions: minimal cost (assuming actions have non-uniform costs), minimal makespan, maximal flexibility

Disjunctive planning/ Bounded length plan finding

PGs can be used as a basis for finding plans directly. If there exists a k-length plan, it will be a subgraph of the k-length planning graph (see the highlighted subgraph of the PG for our example problem).

20th Feb

The idea behind Graphplan
Consider extracting the plan from the PG directly, by finding the subgraphs that correspond to valid solutions:
–Can use specialized graph traversal techniques.
–Start from the end: put the vertices corresponding to the goals in. If they are mutex, there is no solution.
–Else, put in at least one of the supports of each of those goals, and make sure the supports are not mutex. If they are mutex, backtrack and choose another set of supports. (No backtracking is needed if we have no mutexes; this is the basis for "relaxed plans".)
–At the next level, subgoal on the preconditions of the support actions we chose.
–The recursion ends at the initial level.
This search can also be cast as a CSP, SAT or IP problem.
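The backward extraction above can be sketched as a short recursive search. An illustrative sketch, not Graphplan's actual implementation: noop (persistence) actions are assumed to be listed among the achievers where needed, and mutexes are given as a precomputed set of action pairs.

```python
def extract(goals, k, init, achievers, preconds, mutex):
    """Backward solution extraction from a k-level planning graph.
    achievers[(p, level)]: actions at that level that give p;
    preconds[a]: precondition set of action a;
    mutex: set of frozenset action pairs marked mutually exclusive.
    Returns a list of action sets, one per level, or None."""
    if k == 0:
        return [] if set(goals) <= init else None   # recursion ends at init
    def choose(pending, chosen):
        if not pending:                  # all goals at this level supported:
            subgoals = {q for a in chosen for q in preconds[a]}
            rest = extract(subgoals, k - 1, init, achievers, preconds, mutex)
            return None if rest is None else rest + [set(chosen)]
        p, *more = pending
        for a in achievers.get((p, k), ()):
            # the support must not be mutex with supports already chosen
            if all(frozenset((a, b)) not in mutex for b in chosen):
                plan = choose(more, chosen | {a})
                if plan is not None:
                    return plan
        return None                      # backtrack: try other supports
    return choose(sorted(goals), frozenset())

# Hypothetical one-level graph: a1 gives g1, a2 gives g2, both need p.
achievers = {("g1", 1): ["a1"], ("g2", 1): ["a2"]}
preconds = {"a1": {"p"}, "a2": {"p"}}
plan = extract({"g1"}, 1, {"p"}, achievers, preconds, set())
```

Marking a1 and a2 mutex makes the joint goal {g1, g2} unextractable, which is exactly the backtracking case on the slide.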

Backward search in Graphplan
[Animated figure: backward search from goals G1..G4 through action levels A1..A11 and proposition levels P1..P6 back to the initial level I1..I3; X marks choices pruned by mutexes.]

Graphplan "History"
Avrim Blum & Merrick Furst (1995) first came up with the Graphplan idea, when the planning community was mostly enamored with PO planning.
–Their original motivation was to develop a planner based on "max-flow" ideas: think of preconditions and effects as pipes and actions as valves; you want to cause maximal fluid flow from the initial state to a certain set of literals in the goal level. Max-flow is polynomial, but planning isn't, because of the nonlinearity caused by actions: unless ALL preconditions are in, the "action valve" won't activate the effect pipes. So they wound up finding a backward-search idea instead.
–Check out the animation...

The Story Behind Memos...
Memos essentially tell us that a particular set S of conditions cannot be achieved at a particular level k in the PG.
–We may as well remember this information, so in case we wind up subgoaling on any set S' of conditions at that level, where S' is a superset of S, we can immediately declare failure. This is "nogood" learning: storage/matching cost vs. the benefit of reduced search, generally in our favor.
But just because a set S = {C1, ..., C100} cannot be achieved together doesn't necessarily mean that the reason for the failure has to do with ALL of those 100 conditions; some of them may be innocent bystanders.
–Suppose we can "explain" the failure as being caused by a subset U of S (say U = {C45, C97}); then U is more powerful in pruning later failures.
–This idea, called "explanation-based learning" (EBL), improves Graphplan performance significantly [Rao, IJCAI-99; JAIR 2000].

Explaining Failures with Conflict Sets
Whenever P can't be given a value v because it conflicts with the assignment of Q, add Q to P's conflict set.
[Figure: backward-search example over P1..P6 and A5..A11 in which the conflict set for P4 accumulates P2 and P1.]

DDB & Memoization (EBL) with Conflict Sets
When we reach a variable V with conflict set C during backtracking:
–Skip other values of V if V is not in C (DDB).
–Absorb C into the conflict set of V if V is in C.
–Store C as a memo if V is the first variable at this level.
[Figure: conflict set for P3 = {P3, P2}, so P3 is skipped when backtracking from P4; conflict set for P4 = {P4, P2, P1} is passed up and absorbed into P2's conflict set {P4, P2, P1} and P1's conflict set {P4, P2, P1, P3}; finally {P1, P2, P3, P4} is stored as a memo.]

Regressing Conflict Sets
Regression: what is the minimum set of goals at the previous level whose chosen action supports generate a subgoal set that covers the memo?
–Use a minimal set.
–When there is a choice, choose a goal that was assigned earlier; this supports more DDB.
[Figure: {P1, P2, P3, P4} regresses to {G1, G2}. P1 could have been regressed to G4, but G1 was assigned earlier, so we can skip over G4 and G3 (DDB).]

Using EBL Memos
If any stored memo is a subset of the current goal set, backtrack immediately and return the memo as the conflict set. Smaller memos are more general and thus prune more failing branches, at the price of a costlier memo-matching strategy.
–Clever indexing techniques are available: set-enumeration trees [Rymon, KRR-92], UB-trees [Hoffmann & Koehler, IJCAI-99].
Returning the memo as the conflict set allows the generation of more effective memos at higher levels, which is not possible with normal memoization.
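The subset-matching memo check above can be sketched with a naive per-level store. This is a simple stand-in for the set-enumeration-tree / UB-tree indexes cited on the slide, which avoid the linear scan:

```python
class MemoStore:
    """Per-level nogood store with subset matching (illustrative)."""
    def __init__(self):
        self.memos = {}                    # level -> list of frozensets
    def add(self, level, goal_set):
        self.memos.setdefault(level, []).append(frozenset(goal_set))
    def match(self, level, goal_set):
        """Return a stored memo that is a subset of goal_set, if any:
        the current subgoal set is then known to fail at this level,
        and the memo itself serves as the conflict set."""
        s = frozenset(goal_set)
        for m in self.memos.get(level, ()):
            if m <= s:                     # subset test: memo covered
                return m
        return None

store = MemoStore()
store.add(3, {"C45", "C97"})               # learned failure explanation
hit = store.match(3, {"C12", "C45", "C97"})
```

A superset of the stored explanation {C45, C97} at level 3 is pruned immediately; a goal set missing either condition, or the same set at another level, is not.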

Speedups are correlated with memo-length reduction

Goal (variable) / action (value) selection heuristics
Pick the hardest-to-satisfy variables (goals) first; pick the easiest-to-satisfy values (actions) first.
–Hardness as cardinality: goals that are supported by 15 actions are harder than those that can be supported by 17 actions.
–Hardness as cost: the level of the goal (or set of action preconditions) in the PG, or the length of the relaxed plan for supporting that goal in the PG. [Romeo, AIPS-2000; also the second part of the AltAlt paper]

Level heuristics help on solution-bearing levels.

Level heuristics tend to be insensitive to the length of the PG.

More stuff on Graphplan
Graphplan differentiates between static interference and mutex:
–Two actions interfere statically if one's effects are inconsistent with the other action's preconditions or effects.
–Two actions are mutex if they are either statically interfering or have been marked mutex by the mutex-propagation procedure.
As long as the static interference relations are marked, we are guaranteed to find a solution with backward search! Mutex propagation only IMPROVES the efficiency of the backward search; it is thus very similar to consistency enforcement in CSP. Memoization improves efficiency further, and efficient memoization improves it further still.
The original Graphplan algorithm used "parallel planning graphs" rather than serial planning graphs:
–Not every pair of non-noop actions is marked mutex, so you can get multiple actions per time step.
–A serial PG has more mutex relations: apart from the interferences that come from preconditions/effects, we are essentially adding "resource-based" mutexes, saying the agent doesn't have the resources to do more than one action per level.

Optimality of Graphplan
The original Graphplan produces "step-optimal" plans:
–NOT optimal w.r.t. the number of actions (you can get that with serial Graphplan)
–NOT cost-optimal (you need Multi-PEGG, according to Terry)

Graphplan and Termination
Suppose we grew the graph until it leveled off and still did not find a solution. Is the problem unsolvable?
Example: actions A1...A100 give goals G1...G100, but we can't do more than one action at a level (assume we are using a serial PG). At what level are G1..G100 all true? What is the length of the plan?
One can see the process of extracting the plan as verifying that at least one execution thread is devoid of n-ary mutexes.
–The problem is unsolvable only if the memos also do not change from level to level.

Conversion to CSP
This search can also be cast as a CSP:
–Variables: literals in the proposition lists
–Values: the actions supporting them
–Constraints: mutex and activation constraints
Example (blocks world, level 2):
Variables/domains:
–~cl-B-2: {#, St-A-B-2, Pick-B-2}
–he-2: {#, St-A-B-2, St-B-A-2, Ptdn-A-2, Ptdn-B-2}
–h-A-1: {#, Pick-A-1}
–h-B-1: {#, Pick-B-1}
–...
Constraints:
–he-2 = St-A-B-2 => h-A-1 != #   {activation}
–On-A-B-2 = St-A-B-2 => On-B-A-2 != St-B-A-2   {mutex constraints}
Goals: ~cl-B-2 != #, he-2 != #

CSP encodings can be faster than Graphplan backward search [Do & Kambhampati, 2000]. But WHY? We are paying the cost of converting the PG into a CSP (and we also tend to lose the ability to reuse the previous level's search). The answer: there is NO reason why the search for the valid subgraph has to go level-by-level and back-to-front, and the CSP solver isn't hobbled by that level-by-level, back-to-front regime.

Mutex propagation as CSP pre-processing
Suppose we start with a PG that only marks every pair of "interfering" actions as mutex:
–any pair of non-noop actions is interfering
–any pair of actions is interfering if one gives P and the other gives or requires ~P
and no propagation is done. Converting this PG to a CSP and solving it will still give a valid solution (if there is one). So what is mutex propagation doing? It is "explicating" implicit constraints: a special subset of 3-consistency enforcement.
–Recall that enforcing k-consistency involves adding (k-1)-ary constraints.
–It is *not* full 3-consistency (which can be much costlier), so enforcing this consistency on the PG is cheaper than enforcing it after conversion to a CSP.
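One round of the propagation being discussed can be sketched directly: mark action pairs mutex (static interference or competing needs), then mark proposition pairs mutex when every pair of their achievers is mutex. An illustrative sketch for a single level; noop actions are omitted, and in a serial PG every non-noop pair would additionally be mutex.

```python
def propagate_mutexes(actions, prev_prop_mutex):
    """One level of Graphplan mutex propagation (the partial
    3-consistency discussed above).
    actions: name -> (preconditions, add effects, delete effects);
    prev_prop_mutex: frozenset pairs of mutex propositions at the
    previous proposition level."""
    names = list(actions)
    amutex = set()
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            pa, aa, da = actions[a]
            pb, ab, db = actions[b]
            # static interference: one deletes a precond/effect of the other
            interfere = (da & (pb | ab)) or (db & (pa | aa))
            # competing needs: some pair of preconditions is mutex below
            competing = any(frozenset((p, q)) in prev_prop_mutex
                            for p in pa for q in pb)
            if interfere or competing:
                amutex.add(frozenset((a, b)))
    # propositions are mutex if every pair of their achievers is mutex
    achievers = {}
    for a, (_, add, _) in actions.items():
        for p in add:
            achievers.setdefault(p, set()).add(a)
    pmutex = set()
    props = list(achievers)
    for i, p in enumerate(props):
        for q in props[i + 1:]:
            if all(a != b and frozenset((a, b)) in amutex
                   for a in achievers[p] for b in achievers[q]):
                pmutex.add(frozenset((p, q)))
    return amutex, pmutex

# Hypothetical level: a1 deletes y, which a2 adds -> static interference.
actions = {"a1": ({"p"}, {"x"}, {"y"}),
           "a2": ({"q"}, {"y"}, set())}
am, pm = propagate_mutexes(actions, set())
```

Here the interference between a1 and a2 propagates to make their sole effects x and y mutex at the next proposition level.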

Alternative encodings
The problem of finding a valid plan from the planning graph can be encoded on any combinatorial substrate. Alternatives:
–CSP [GP-CSP]
–SAT [Blackbox; SATPLAN]
–IP [Vossen et al.]

Compilation to CSP [Do & Kambhampati, 2000]
CSP: given a set of discrete variables, the domains of the variables, and constraints on the specific values a set of variables can take in combination, FIND an assignment of values to all the variables which respects all constraints.
Goals: In(A), In(B)
Variables: propositions (In-A-1, In-B-1, ..., At-R-E-0, ...)
Domains: the actions supporting that proposition in the plan
–In-A-1: {Load-A-1, #}
–At-R-E-1: {P-At-R-E-1, #}
Constraints:
–Mutual exclusion: ~[(In-A-1 = Load-A-1) & (At-R-M-1 = Fly-R-1)]; etc.
–Activation: In-A-1 != # & In-B-1 != # (goals must have action assignments); In-A-1 = Load-A-1 => At-R-E-0 != #, At-A-E-0 != # (subgoal activation constraints)
[This corresponds to a regression-based proof.]
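The variable/domain/activation structure above can be sketched as an encoding generator. This is an illustrative data layout, not GP-CSP's actual code; '#' plays the same "not needed" role as on the slide:

```python
def encode_gpcsp(supports, preconds, goals, k):
    """Build a GP-CSP-style encoding.
    supports[(p, level)]: actions at that level that can give p;
    preconds[a]: preconditions of action a.
    Returns: variable domains, activation constraints mapping
    (p, level, a) -> the level-(k-1) variables that must then be
    non-'#', and the goal variables that must be non-'#'."""
    variables = {v: ["#"] + list(acts) for v, acts in supports.items()}
    activation = {(p, lvl, a): {(q, lvl - 1) for q in preconds[a]}
                  for (p, lvl), acts in supports.items() for a in acts}
    goal_vars = {(g, k) for g in goals}        # these must be != '#'
    return variables, activation, goal_vars

# The rocket example from the slide, level 1.
supports = {("In-A", 1): ["Load-A"], ("In-B", 1): ["Load-B"]}
preconds = {"Load-A": ["At-R-E", "At-A-E"],
            "Load-B": ["At-R-E", "At-B-E"]}
variables, activation, goal_vars = encode_gpcsp(
    supports, preconds, ["In-A", "In-B"], 1)
```

Assigning In-A-1 = Load-A then activates the level-0 variables At-R-E and At-A-E, mirroring the subgoal-activation constraint on the slide.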

Compilation to SAT [Kautz & Selman]
SAT is CSP with Boolean variables.
Init: At-R-E-0 & At-A-E-0 & At-B-E-0
Goal: In-A-1 & In-B-1
Graph ("a condition at level k implies one of its supporting actions at level k-1"):
–In-A-1 => Load-A-1
–In-B-1 => Load-B-1
–At-R-M-1 => Fly-R-1
–At-R-E-1 => P-At-R-E-1
Actions imply their preconditions:
–Load-A-1 => At-R-E-0 & At-A-E-0
–Load-B-1 => At-R-E-0 & At-B-E-0
–P-At-R-E-1 => At-R-E-0
Mutexes:
–~In-A-1 V ~At-R-M-1
–~In-B-1 V ~At-R-M-1
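The clause schemas above can be generated mechanically from the same support/precondition data. An illustrative generator producing clauses as lists of string literals ('p@k' atoms); real encoders such as Blackbox and SATPLAN use more refined encodings:

```python
def sat_clauses(supports, preconds, mutexes, goals, k):
    """Emit CNF clauses for the three schemas on the slide.
    supports[(p, level)]: achievers of p at that level;
    mutexes: list of ((p, q), level) mutex pairs."""
    clauses = [[f"{g}@{k}"] for g in goals]              # goals must hold
    for (p, lvl), acts in supports.items():
        # p@lvl => one of its supporting actions at lvl
        clauses.append([f"~{p}@{lvl}"] + [f"{a}@{lvl}" for a in acts])
        for a in acts:
            for q in preconds[a]:                        # action => preconds
                clauses.append([f"~{a}@{lvl}", f"{q}@{lvl - 1}"])
    for (p, q), lvl in mutexes:                          # mutex pairs
        clauses.append([f"~{p}@{lvl}", f"~{q}@{lvl}"])
    return clauses

# Fragment of the rocket example from the slide.
supports = {("In-A", 1): ["Load-A"]}
preconds = {"Load-A": ["At-R-E", "At-A-E"]}
clauses = sat_clauses(supports, preconds,
                      [(("In-A", "At-R-M"), 1)], ["In-A"], 1)
```

Each implication on the slide becomes one clause, e.g. In-A-1 => Load-A-1 becomes (~In-A@1 V Load-A@1).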

Compilation to Integer Linear Programming [Walser & Kautz; Vossen et al.; Bockmayr & Dimopoulos]
ILP: given a set of real-valued variables, a linear objective function on the variables, a set of linear inequalities on the variables, and a set of integrality restrictions on the variables, find the values of the feasible variables for which the objective function attains the maximum value. (0/1 integer programming corresponds closely to the SAT problem.)
Motivations:
–Ability to handle numeric quantities, and to do optimization
–Heuristic value of the LP relaxation of ILP problems
Conversion:
–Convert a SAT/CSP encoding to ILP inequalities, e.g. X v ~Y v Z becomes x + (1 - y) + z >= 1.
–Explicitly set up tighter ILP inequalities (cutting constraints): if X, Y, Z are pairwise mutex, we can write x + y + z <= 1 (instead of x + y <= 1; y + z <= 1; z + x <= 1).
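Both conversion steps above are purely mechanical and can be sketched in a few lines. Inequalities are represented here as (coefficient dict, right-hand side) over 0/1 variables, with the convention sum >= rhs for clauses and sum <= rhs for clique cuts (an illustrative layout, not any particular ILP library's API):

```python
def clause_to_inequality(clause):
    """Convert a clause like ['X', '~Y', 'Z'] into the inequality on
    the slide: x + (1 - y) + z >= 1, returned as (coeffs, rhs)."""
    coeffs, rhs = {}, 1
    for lit in clause:
        if lit.startswith("~"):
            coeffs[lit[1:]] = -1   # (1 - y) contributes -y ...
            rhs -= 1               # ... and moves the constant 1 to the rhs
        else:
            coeffs[lit] = 1
    return coeffs, rhs             # read as: sum(coeffs * vars) >= rhs

def mutex_clique_cut(variables):
    """Tighter cutting constraint for pairwise-mutex 0/1 variables:
    x + y + z <= 1 instead of three pairwise inequalities."""
    return {v: 1 for v in variables}, 1   # read as: sum <= 1
```

So X v ~Y v Z becomes x - y + z >= 0 after moving the constant, and a three-variable mutex clique collapses into a single cut.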

Compilation to Binary Decision Diagrams (BDDs) [Cimatti et al., Fourman, Hans-Peter]
BDDs support compact representation and direct manipulation of Boolean formulae over a finite set of propositions (popular in the CAD community). Standard algorithms exist for converting a Boolean formula into a BDD and for supporting standard Boolean operations on BDDs [Bryant et al.].
Idea: represent disjunctive plans as BDDs and plan extension as BDD operations.
–The proposition list at level k is an approximation to the set of states reachable in k steps; the set can be represented compactly as a BDD.
–Plan growth can be modeled as direct manipulations on the BDD; operations such as "action projection" need to be modeled as BDD modifications.
[Figure: BDD for X1 & X2.]

Relative tradeoffs offered by the various compilation substrates
–CSP encodings support implicit representations: more compact encodings [Do & Kambhampati, 2000] and easier integration with scheduling techniques.
–ILP encodings support numeric quantities: seamless integration of numeric resource constraints [Walser & Kautz, 1999], but not competitive with CSP/SAT for problems without numeric constraints.
–SAT encodings support axioms in propositional logic form: may be more natural to add (for whom? ;-)

CSP encodings can be more compact: GP-CSP [Do & Kambhampati, 2000]

Advantages of CSP encodings over SAT encodings: GP-CSP
–Size of learning: k = 10 for both size-based and relevance-based learning
–Speedup over plain GP-CSP: up to 10x
–Faster than SAT in most cases, up to 70x over Blackbox

Direct vs. compiled solution extraction
Direct:
– Need to adapt CSP/SAT techniques
+ Can exploit approaches for compacting the plan
+ Can make the search incremental across iterations
Compiled:
+ Can exploit the latest advances in SAT/CSP solvers
– The compilation stage can be time-consuming and leads to memory blow-up
– Makes it harder to exploit search from previous iterations
+ Makes it easier to add declarative control knowledge