Presentation is loading. Please wait.

Presentation is loading. Please wait.

Planning Chapter 11 Yet another popular formulation for AI – Logic-based language – One of the most structured formulations Can be translate into less.

Similar presentations


Presentation on theme: "Planning Chapter 11 Yet another popular formulation for AI – Logic-based language – One of the most structured formulations Can be translate into less."— Presentation transcript:

1 Planning Chapter 11 Yet another popular formulation for AI – Logic-based language – One of the most structured formulations Can be translate into less structured formulations such as state-space, CSP, SAT, etc.

2 What is Planning Generate sequences of actions to perform tasks and achieve objectives. – States, actions and goals Objective: John wants to go to NYC Plan: taxi(John,home,STL) board(John,plane) fly(plane,STL,JFK) exit(John,plane) taxi(John,JFK,5 th Ave) Classical planning environment: fully observable, deterministic, finite, and discrete.

3 Difficulty of real world problems What’s wrong with state-space search, e.g. A*? – Which actions are relevant? Too many irrelevant actions Board(Mary,plane), board(Tim,plane), taxi(John, home, six flags), fly(plane, LAX, SFO), …. – What is a good heuristic functions? It is difficult to come up with domain-independent heuristics if only a state-space description is given – How to decompose the problem? Most real-world problems are nearly decomposable. E.g. John, Mary, and Tim want to go to different cities

4 Planning language What is a good language? – Expressive enough to describe a wide variety of problems. – Restrictive enough to allow efficient algorithms to operate on it. – Planning algorithm should be able to take advantage of the logical structure of the problem. STRIPS

5 Example: air cargo transport Init(At(C1, SFO)  At(C2,JFK)  At(P1,SFO)  At(P2,JFK)  Cargo(C1)  Cargo(C2)  Plane(P1)  Plane(P2)  Airport(JFK)  Airport(SFO)) Goal(At(C1,JFK)  At(C2,SFO)) Action(Load(c,p,a) PRECOND: At(c,a)  At(p,a)  Cargo(c)  Plane(p)  Airport(a) EFFECT: ¬At(c,a)  In(c,p)) Action(Unload(c,p,a) PRECOND: In(c,p)  At(p,a)  Cargo(c)  Plane(p)  Airport(a) EFFECT: At(c,a)  ¬In(c,p)) Action(Fly(p,from,to) PRECOND: At(p,from)  Plane(p)  Airport(from)  Airport(to) EFFECT: ¬ At(p,from)  At(p,to)) [Load(C1,P1,SFO), Fly(P1,SFO,JFK),Unload(C1,P1,JFK), Load(C2,P2,JFK), Fly(P2,JFK,SFO), Unload(C2,P2,SFO)]

6 Example: air cargo transport Init(At(C1, SFO)  At(C2,JFK)  At(P1,SFO)  At(P2,JFK)  Cargo(C1)  Cargo(C2)  Plane(P1)  Plane(P2)  Airport(JFK)  Airport(SFO)) Goal(At(C1,JFK)  At(C2,SFO)  At(P1,SFO)) Action(Load(c,p,a) PRECOND: At(c,a)  At(p,a)  Cargo(c)  Plane(p)  Airport(a) EFFECT: ¬At(c,a)  In(c,p)) Action(Unload(c,p,a) PRECOND: In(c,p)  At(p,a)  Cargo(c)  Plane(p)  Airport(a) EFFECT: At(c,a)  ¬In(c,p)) Action(Fly(p,from,to) PRECOND: At(p,from)  Plane(p)  Airport(from)  Airport(to) EFFECT: ¬ At(p,from)  At(p,to)) [Load(C1,P1,SFO), Fly(P1,SFO,JFK),Unload(C1,P1,JFK), Load(C2,P2,JFK), Fly(P2,JFK,SFO), Unload(C2,P2,SFO), Fly(P1,JFK,SFO)]

7 Example: air cargo transport Init(At(C1, SFO)  At(C2,JFK)  At(P1,SFO)  At(P2,JFK)  Cargo(C1)  Cargo(C2)  Plane(P1)  Plane(P2)  Airport(JFK)  Airport(SFO)) Goal(At(C1,JFK)  At(C2,SFO)  At(P1,SFO)) Action(Load(c,p,a) PRECOND: At(c,a)  At(p,a)  Cargo(c)  Plane(p)  Airport(a) EFFECT: ¬At(c,a)  In(c,p)) Action(Unload(c,p,a) PRECOND: In(c,p)  At(p,a)  Cargo(c)  Plane(p)  Airport(a) EFFECT: At(c,a)  ¬In(c,p)) Action(Fly(p,from,to) PRECOND: At(p,from)  Plane(p)  Airport(from)  Airport(to) EFFECT: ¬ At(p,from)  At(p,to)) [Load(C2,P2,JFK), Fly(P2,JFK,SFO), Unload(C2,P2,SFO), Load(C1,P2,SFO), Fly(P2,SFO,JFK),Unload(C1,P2,JFK) ]

8 General STRIPS features Representation of states – Represent the state of the world as a conjunction of positive literals. Propositional literals: Poor  Unknown First order predicates: At(Plane1, Melbourne)  At(Plane2, Sydney) – Closed world assumption (anything not explicitly claimed true is assumed false) Representation of goals – Partially specified state and represented as a conjunction of positive ground literals At(John, NYC)  At(Mary, LA) – A goal is satisfied if the state contains all literals in goal

9 General STRIPS features Representations of actions – Action = PRECOND + EFFECT Action(Fly(p,from, to), PRECOND: At(p,from)  Plane(p)  Airport(from)  Airport(to) EFFECT: ¬AT(p,from)  At(p,to)) (p, from, to) need to be instantiated e.g. Fly(plane, STL, JFK) – Add-list vs delete-list in Effect

10 Language semantics? How do actions affect states? – An action is applicable in any state that satisfies the preconditions Suppose current state: At(P1,JFK)  At(P2,SFO)  Plane(P1)  Plane(P2)  Airport(JFK)  Airport(SFO) Satisfies PRECOD: At(p,from)  Plane(p)  Airport(from)  Airport(to) Thus the action fly(p1,JFK, SFO) is applicable.

11 Language semantics? The result of executing action a in state s is the state s’ – s’ is same as s except Any positive literal P in the effect of a is added to s’ Any negative literal ¬P is removed from s’ EFFECT: ¬AT(p,from)  At(p,to): At(P1,SFO)  At(P2,SFO)  Plane(P1)  Plane(P2)  Airport(JFK)  Airport(SFO) – STRIPS assumption: every literal NOT in the effect remains unchanged

12 Expressiveness and extensions STRIPS is simplified – Important limit: function-free literals Allows for propositional representation Function symbols lead to infinitely many states and actions Extension: Action Description language (ADL) Action(Fly(p:Plane, from: Airport, to: Airport), PRECOND: At(p,from)  (from  to) EFFECT: ¬At(p,from)  At(p,to)) Standardization : Planning domain definition language (PDDL)

13 Example: air cargo transport Init(At(C1, SFO)  At(C2,JFK)  At(P1,SFO)  At(P2,JFK)  Cargo(C1)  Cargo(C2)  Plane(P1)  Plane(P2)  Airport(JFK)  Airport(SFO)) Goal(At(C1,JFK)  At(C2,SFO)) Action(Load(c,p,a) PRECOND: At(c,a)  At(p,a)  Cargo(c)  Plane(p)  Airport(a) EFFECT: ¬At(c,a)  In(c,p)) Action(Unload(c,p,a) PRECOND: In(c,p)  At(p,a)  Cargo(c)  Plane(p)  Airport(a) EFFECT: At(c,a)  ¬In(c,p)) Action(Fly(p,from,to) PRECOND: At(p,from)  Plane(p)  Airport(from)  Airport(to) EFFECT: ¬ At(p,from)  At(p,to)) [Load(C1,P1,SFO), Fly(P1,SFO,JFK), Load(C2,P2,JFK), Fly(P2,JFK,SFO)]

14 Example: Spare tire problem Init(At(Flat, Axle)  At(Spare,trunk)) Goal(At(Spare,Axle)) Action(Remove(Spare,Trunk) PRECOND: At(Spare,Trunk) EFFECT: ¬At(Spare,Trunk)  At(Spare,Ground)) Action(Remove(Flat,Axle) PRECOND: At(Flat,Axle) EFFECT: ¬At(Flat,Axle)  At(Flat,Ground)) Action(PutOn(Spare,Axle) PRECOND: At(Spare,Groundp)  ¬At(Flat,Axle) EFFECT: At(Spare,Axle)  ¬At(Spare,Ground)) Action(LeaveOvernight PRECOND: EFFECT: ¬ At(Spare,Ground)  ¬ At(Spare,Axle)  ¬ At(Spare,trunk)  ¬ At(Flat,Ground)  ¬ At(Flat,Axle) ) This example goes beyond STRIPS: negative literal in pre-condition (ADL description)

15 Example: Blocks world Predicates: on (B, table), clear(B), on(D,A), on(C,D), on(A, table), clear(C) Goal: on(A,B) and on(B,C)

16 Example: Blocks world Init(On(A, Table)  On(B,Table)  On(C,Table)  Clear(A)  Clear(B)  Clear(C)) Goal(On(A,B)  On(B,C)) Action(Move(b,x,y) PRECOND: On(b,x)  Clear(b)  Clear(y) EFFECT: On(b,y)  Clear(x)  ¬ On(b,x)  ¬ Clear(y))

17 Example: Blocks world Init(On(A, Table)  On(B,Table)  On(C,Table)  Clear(A)  Clear(B)  Clear(C)) Goal(On(A,B)  On(B,C)) Action(Move(b,x,y) PRECOND: On(b,x)  Clear(b)  Clear(y) EFFECT: On(b,y)  Clear(x)  ¬ On(b,x)  ¬ Clear(y)) Action(MoveToTable(b,x) PRECOND: On(b,x)  Clear(b) EFFECT: On(b,Table)  Clear(x)  ¬ On(b,x)) Spurious actions are possible: Move(B,C,C), Move(B,C,Table)

18 Example: Blocks world Init(On(A, Table)  On(B,Table)  On(C,Table)  Block(A)  Block(B)  Block(C)  Clear(A)  Clear(B)  Clear(C)) Goal(On(A,B)  On(B,C)) Action(Move(b,x,y) PRECOND: On(b,x)  Clear(b)  Clear(y)  Block(b)  Block(x)  Block(y) EFFECT: On(b,y)  Clear(x)  ¬ On(b,x)  ¬ Clear(y)) Action(MoveToTable(b,x) PRECOND: On(b,x)  Clear(b)  Block(b)  Block(x) EFFECT: On(b,Table)  Clear(x)  ¬ On(b,x)) Spurious actions are possible: Move(B,C,C)

19 State-Space vs. Planning Formalisms State-Space FormalismSTRIPS Planning Atomic objectState, actionpropositional object StateAtomicConjunction of predicates ActionImplicitly defined by (state, state) transitions Explicitly defined by preconditions & effects Initial stateAtomicConjunction of predicates (full) Goal stateGoal testConjunction of predicates (partial) ExpressivenessHigherLower HeuristicDomain dependentDomain independent

20 Solving STRIPS planning First of all, look at the goals – Blocksworld: Goals are on(A,B) and on(B,C) Can be do linear planning? – Achieve goal1 – Achieve goal2 while keeping goal1 – Achieve goal3 while keeping goal1, goal2 ….. -Achieve goal-n while keeping goals 1 to n-1 -If so, a large planning task can be decomposed

21 The Sussman Anomaly

22 The Rocket Problem (Veloso) Objects: n boxes, Positions (Earth, Moon), one Rocket Actions: load/unload a box, move the Rocket (oneway: only from earth to moon, no way back!) Linear planning: to reach the goal, that Box1 is on the Moon, load it, shoot the Rocket, unload it, now no other Box can be transported!

23 Interleaving of Goals Linear planning is generally not viable Non-linear planning allows that a sequence of planning steps dealing with one goal is interrupted to deal with another goal. For the Sussman Anomaly, that means that after block C is put on the table pursuing goal on(A, B), the planner switches to the goal on(B, C). Non-linear planning corresponds to dealing with goals organized in a set.

24 Planning by state-space search Both forward and backward search possible – Using a state-space formulation for the planning problem Progression planners – forward state-space search – Consider the effect of all possible actions in a given state Regression planners – backward state-space search – To achieve a goal, what must have been true in the previous state.

25 Progression and regression

26 Progression algorithm Formulation as state-space search problem: – Initial state = initial state of the planning problem Literals not appearing are false – Actions = those whose preconditions are satisfied Add positive effects, delete negative – Goal test = does the state satisfy the goal – Step cost = each action costs 1 Any graph search that is complete is a complete planning algorithm. – E.g. A* Inefficient: – (1) irrelevant action problem – (2) good heuristic required for efficient search

27 Regression algorithm How to determine predecessors? – What are the states from which applying a given action leads to the goal? Goal state = At(C1, B)  At(C2, B)  …  At(C20, B) Relevant action for first conjunct: Unload(C1,p,B) Works only if pre-conditions are satisfied. Previous state= In(C1, p)  At(p, B)  At(C2, B)  …  At(C20, B) Subgoal At(C1,B) should not be present in this state. Actions must not undo desired literals (consistent) Load(C2,p,B) is not consistent Main advantage: only relevant actions are considered. – Often much lower branching factor than forward search.

28 Regression algorithm General process for predecessor construction – Give a goal description G – Let A be an action that is relevant and consistent – The predecessors is as follows: Any positive effects of A that appear in G are deleted. Each precondition literal of A is added, unless it already appears. Any standard search algorithm can be added to perform the search. Termination when predecessor satisfied by initial state.

29 Heuristics for state-space search Neither progression or regression are very efficient without a good heuristic. – How many actions are needed to achieve the goal? – Exact solution is NP hard, find a good estimate – Could use the optimal solution to the relaxed problem. Remove all preconditions from actions Remove all delete effects from actions

30 Planning graphs Blum & Furst (1996) – A significant breakthrough in planning research A structure that gives rich information of the planning domain Can be used to 1) directly extract solutions, or 2) achieve better heuristic estimates.

31 Planning graphs Consists of a sequence of levels that correspond to time steps in the plan. – Level 0 is the initial state. – Each level consists of a set of literals and a set of actions. Literals = all those that could be true at that time step, depending upon the actions executed at the preceding time step Actions = all those actions that could have their preconditions satisfied at that time step, depending on which of the literals actually hold

32 Planning graphs “Could”? – Records only a restricted subset of possible negative interactions among actions. Example: Init(Have(Cake)) Goal(Have(Cake)  Eaten(Cake)) Action(Eat(Cake), PRECOND: Have(Cake) EFFECT: ¬Have(Cake)  Eaten(Cake)) Action(Bake(Cake), PRECOND: ¬ Have(Cake) EFFECT: Have(Cake))

33 Cake example Start at level S0 and determine action level A0 and next level S1. – A0: all actions whose preconditions are satisfied in the previous level. – Connect precond and effect of actions S0 --> S1 – Inaction is represented by persistence actions. Level A0 contains the actions that could occur – Conflicts between actions are represented by mutex links

34 Cake example Level S1 contains all literals that could result from picking any subset of actions in A0 – Conflicts between literals that can not occur together (as a consequence of the selection action) are represented by mutex links. – S1 defines multiple states and the mutex links are the constraints that define this set of states. Continue until two consecutive levels are identical: leveled off

35 Cake example A mutex relation holds between two actions when: – Inconsistent effects: one action negates the effect of another. – Interference: one of the effects of one action is the negation of a precondition of the other. – Competing needs: one of the preconditions of one action is mutually exclusive with the precondition of the other. A mutex relation holds between two literals when (inconsistent support): – If one is the negation of the other OR – if each possible action pair that could achieve the literals is mutex.

36 Mutex Mutex in Graphplan is not complete (two actions that not mutex may not always be compatible) – But useful Original Mutex is level-dependent – But expensive to compute Used by GRAPHPLAN – Level independent mutex (called persistent mutex) is a subset of mutex but efficient to compute Used by some other planners such as LPG, SATPLAN

37 Example: Spare tire problem Init(At(Flat, Axle)  At(Spare,trunk)) Goal(At(Spare,Axle)) Action(Remove(Spare,Trunk) PRECOND: At(Spare,Trunk) EFFECT: ¬At(Spare,Trunk)  At(Spare,Ground)) Action(Remove(Flat,Axle) PRECOND: At(Flat,Axle) EFFECT: ¬At(Flat,Axle)  At(Flat,Ground)) Action(PutOn(Spare,Axle) PRECOND: At(Spare,Groundp)  ¬At(Flat,Axle) EFFECT: At(Spare,Axle)  ¬At(Spare,Ground)) Action(LeaveOvernight PRECOND: EFFECT: ¬ At(Spare,Ground)  ¬ At(Spare,Axle)  ¬ At(Spare,trunk)  ¬ At(Flat,Ground)  ¬ At(Flat,Axle) )

38 Another GRAPHPLAN example PG are monotonically increasing or decreasing: – Literals increase monotonically – Actions increase monotonically – Mutexes decrease monotonically Because of these properties and because there is a finite number of actions and literals, every PG will eventually level off !

39 The GRAPHPLAN Algorithm How to extract a solution directly from the PG function GRAPHPLAN(problem) return solution or failure graph  INITIAL-PLANNING-GRAPH(problem) goals  GOALS[problem] loop do if goals all non-mutex in last level of graph then do solution  EXTRACT-SOLUTION(graph, goals, LENGTH(graph)) if solution  failure then return solution else if NO-SOLUTION-POSSIBLE(graph) then return failure graph  EXPAND-GRAPH(graph, problem)

40 Example: Spare tire problem Init(At(Flat, Axle)  At(Spare,trunk)) Goal(At(Spare,Axle)) Action(Remove(Spare,Trunk) PRECOND: At(Spare,Trunk) EFFECT: ¬At(Spare,Trunk)  At(Spare,Ground)) Action(Remove(Flat,Axle) PRECOND: At(Flat,Axle) EFFECT: ¬At(Flat,Axle)  At(Flat,Ground)) Action(PutOn(Spare,Axle) PRECOND: At(Spare,Groundp)  ¬At(Flat,Axle) EFFECT: At(Spare,Axle)  ¬At(Spare,Ground)) Action(LeaveOvernight PRECOND: EFFECT: ¬ At(Spare,Ground)  ¬ At(Spare,Axle)  ¬ At(Spare,trunk)  ¬ At(Flat,Ground)  ¬ At(Flat,Axle) )

41 GRAPHPLAN example Initially the plan consist of 5 literals from the initial state and the CWA literals (S0). Add actions whose preconditions are satisfied by EXPAND-GRAPH (A0) Also add persistence actions and mutex relations. Add the effects at level S1 Repeat until goal is in level Si

42 GRAPHPLAN example EXPAND-GRAPH also looks for mutex relations – Inconsistent effects E.g. Remove(Spare, Trunk) and LeaveOverNight due to At(Spare,Ground) and not At(Spare, Ground) – Interference E.g. Remove(Flat, Axle) and LeaveOverNight At(Flat, Axle) as PRECOND and not At(Flat,Axle) as EFFECT – Competing needs E.g. PutOn(Spare,Axle) and Remove(Flat, Axle) due to At(Flat.Axle) and not At(Flat, Axle) – Inconsistent support E.g. in S2, At(Spare,Axle) and At(Flat,Axle)

43 GRAPHPLAN example In S2, the goal literals exist and are not mutex with any other – Solution might exist and EXTRACT-SOLUTION will try to find it EXTRACT-SOLUTION : – Initial state = last level of PG and goal goals of planning problem – Actions = select any set of non-conflicting actions that cover the goals in the state – Goal = reach level S0 such that all goals are satisfied – Cost = 1 for each action.

44 GRAPHPLAN example Termination? YES PG are monotonically increasing or decreasing: – Literals increase monotonically – Actions increase monotonically – Mutexes decrease monotonically Because of these properties and because there is a finite number of actions and literals, every PG will eventually level off !

45 PG and heuristic estimation PG’s provide information about the problem – A literal that does not appear in the final level of the graph cannot be achieved by any plan. Useful for backward search (cost = inf). – Level of appearance can be used as cost estimate of achieving any goal literals = level cost. – Small problem: several actions can occur Restrict to one action using serial PG (add mutex links between every pair of actions, except persistence actions). – Cost of a conjunction of goals? Max-level, sum-level, set-level heuristics – The Fast Forward heuristic (Hoffmann’00) Extract a plan from a relaxed planning graph (RPG) Not admissible but performs well

46 June 7th, 2006 ICAPS'06 Tutorial T6 46 Rover Domain

47 June 7th, 2006 ICAPS'06 Tutorial T6 47 Level Based Heuristics The distance of a proposition is the index of the first proposition layer in which it appears – Proposition distance changes when we propagate cost functions – described later What is the distance of a Set of propositions?? – Set-Level: Index of first proposition layer where all goal propositions appear Admissible Gets better with mutexes, otherwise same as max – Max: Maximum distance proposition – Sum: Summation of proposition distances

48 June 7th, 2006 ICAPS'06 Tutorial T6 48 Example of Level Based Heuristics set-level(s I, G) = 3 max(s I, G) = max(2, 3, 3) = 3 sum(s I, G) = 2 + 3 + 3 = 8

49 June 7th, 2006 ICAPS'06 Tutorial T6 49 How do Level-Based Heuristics Break? g A p1p1 p2p2 p3p3 p 99 p 100 p1p1 p2p2 p3p3 p 99 p 100 B1 q B2 B3 B99 B100 qq B1 B2 B3 B99 B100 A1A1 P1P1 P2P2 A0A0 P0P0 The goal g is reached at level 2, but requires 101 actions to support it.

50 June 7th, 2006 ICAPS'06 Tutorial T6 50 Relaxed Plan Heuristics When Level does not reflect distance well, we can find a relaxed plan. A relaxed plan is subgraph of the planning graph, where: – Every goal proposition is in the relaxed plan at the level where it first appears – Every proposition in the relaxed plan has a supporting action in the relaxed plan – Every action in the relaxed plan has its preconditions supported. Relaxed Plans are not admissible, but are generally effective. Finding the optimal relaxed plan is NP-hard, but finding a greedy one is easy.

51 June 7th, 2006 ICAPS'06 Tutorial T6 51 Example of Relaxed Plan Heuristic Count Actions RP(s I, G) = 8 Identify Goal Propositions Support Goal Propositions Individually

52 June 7th, 2006 ICAPS'06 Tutorial T6 52 Results Relaxed Plan Level-Based

53 June 7th, 2006 ICAPS'06 Tutorial T6 53 Results (cont’d) Relaxed Plan Level-Based

54 June 7th, 2006 ICAPS'06 Tutorial T6 54 Rover Cost Model

55 June 7th, 2006 ICAPS'06 Tutorial T6 55 Cost Propagation at(  ) sample(soil,  ) drive( ,  ) drive( ,  ) avail(soil,  ) avail(rock,  ) avail(image,  ) at(  ) avail(soil,  ) avail(rock,  ) avail(image,  ) at(  ) at(  ) have(soil) sample(soil,  ) drive( ,  ) drive( ,  ) at(  ) avail(soil,  ) avail(rock,  ) avail(image,  ) at(  ) at(  ) have(soil) drive( ,  ) drive( ,  ) drive( ,  ) commun(soil) sample(rock,  ) sample(image,  ) drive( ,  ) have(image) comm(soil) have(rock) 20 10 30 20 10 30 20 35 10 30 35 25 15 35 40 20 35 10 25 A1A1 A0A0 P1P1 P0P0 P2P2 Cost Reduces because Of different supporter At a later level

56 June 7th, 2006 ICAPS'06 Tutorial T6 56 at(  ) Cost Propagation (cont’d) avail(soil,  ) avail(rock,  ) avail(image,  ) at(  ) at(  ) have(soil) have(image) comm(soil) have(rock) commun(image) commun(rock) comm(image) comm(rock) sample(soil,  ) drive( ,  ) drive( ,  ) at(  ) avail(soil,  ) avail(rock,  ) avail(image,  ) at(  ) at(  ) have(soil) drive( ,  ) drive( ,  ) drive( ,  ) commun(soil) sample(rock,  ) sample(image,  ) drive( ,  ) have(image) comm(soil) have(rock) commun(image) commun(rock) comm(image) comm(rock) sample(soil,  ) drive( ,  ) drive( ,  ) at(  ) avail(soil,  ) avail(rock,  ) avail(image,  ) at(  ) at(  ) have(soil) drive( ,  ) drive( ,  ) drive( ,  ) commun(soil) sample(rock,  ) sample(image,  ) drive( ,  ) have(image) comm(soil) have(rock) 20 10 30 20 10 30 20 35 10 25 30 35 40 25 30 40 15 25 30 25 15 35 A2A2 P2P2 P3P3 A3A3 P4P4 1-lookahead

57 June 7th, 2006 ICAPS'06 Tutorial T6 57 Terminating Cost Propagation Stop when: – goals are reached (no-lookahead) – costs stop changing (∞-lookahead) – k levels after goals are reached (k-lookahead)

58 June 7th, 2006 ICAPS'06 Tutorial T6 58 Guiding Relaxed Plans with Costs Start Extract at first level goal proposition is cheapest

59 Planning with propositional logic Convert STRIPS encoding to SAT Here: test the satisfiability of a logical sentence: Sentence contains propositions for every action occurrence. – If the planning is unsolvable the sentence will be unsatisfiable. Benefit: take advantage of SAT solvers, e.g. DPLL algorithms, which can be pretty fast Limitation: expressiveness

60 SATPLAN algorithm function SATPLAN(problem, T max ) return solution or failure inputs: problem, a planning problem T max, an upper limit to the plan length for T= 0 to T max do cnf, mapping  TRANSLATE-TO_SAT(problem, T) assignment  SAT-SOLVER(cnf) if assignment is not null then return EXTRACT-SOLUTION(assignment, mapping) return failure Overall idea: try T=0,1,2, 3, …, until the first T that gives a solution

61 cnf, mapping  TRANSLATE-TO_SAT(problem, T) Distinct propositions for assertions about each time step. – Superscripts denote the time step At(P1,SFO) 0  At(P2,JFK) 0 – specify which propositions are not true ¬At(P1,SFO) 0  ¬At(P2,JFK) 0 – Unknown propositions are left unspecified. The goal is associated with a particular time-step – But which one?

62 cnf, mapping  TRANSLATE-TO_SAT(problem, T) How to determine the time step where the goal will be reached? – Start at T=0 Assert At(P1,JFK) 0  At(P2,SFO) 0 – Failure.. Try T=1 Assert At(P1,JFK) 1  At(P2,SFO) 1 – … – Repeat this until some minimal path length is reached. – Termination is ensured by T max

63 cnf, mapping  TRANSLATE-TO_SAT(problem, T) How to encode actions into PL? – Propositional versions of successor-state axioms At(P1,JFK) 1  (At(P1,JFK) 0  ¬(Fly(P1,JFK,SFO) 0 )  (Fly(P1,SFO,JFK) 0  At(P1,SFO) 0 ) – Such an axiom is required for each plane, airport and time step Once all these axioms are in place, the satisfiability algorithm can start to find a plan.

64 assignment  SAT-SOLVER(cnf) Multiple models can be found Are they all good?

65 assignment  SAT-SOLVER(cnf) Multiple models can be found They are NOT satisfactory: (for T=1) Fly(P1,SFO,JFK) 0  Fly(P1,JFK,SFO) 0  Fly(P2,JFK,SFO) 0 The second action is infeasible Yet the plan IS a model of the sentence Avoiding illegal actions: pre-condition axioms Fly(P1,SFO,JFK) 0  At(P1,JFK) 0 Exactly one model now satisfies all the axioms where the goal is achieved at T=1.

66 assignment  SAT-SOLVER(cnf) What if there is an LAX in the world?

67 assignment  SAT-SOLVER(cnf) What if there is an LAX in the world? A plane can fly to two destinations at once They are NOT satisfactory: (for T=1) Fly(P1,SFO,JFK) 0  Fly(P2,JFK,SFO) 0  Fly(P2,JFK,LAX) 0 The second action is infeasible Avoid spurious solutions: action mutual exclusion axioms ¬( Fly(P2,JFK,SFO) 0  Fly(P2,JFK,LAX))

68 SATPLAN Convert STRIPS to planning graph Convert planning graph to SAT Hybrid encoding – Literals for both actions and facts Action-based encoding: – Literals for actions only – Preconditions are implied by action relations e.g. a t  f t and f t  b 1 t-1  b 2 t-1  b 3 t-1 convert to: a t  b 1 t-1  b 2 t-1  b 3 t-1

69 Optimality of SATPLAN Optimal in terms of – The makespan (the length of a parallel plan) – An example solution plan Step 1: move(a,b,g), move(b,c,h) Step 2: move(a, g,table), move(c,s,d), move(e,f,m) Step 3: move(s,e,g) …. Step N: move(b,c,n), move(g,e,table) – SATPLAN finds the shortest parallel plan

70 Optimality of SATPLAN However, SATPLAN is not optimal for other preference metrics, such as – Total number of actions – Total action costs – Functional preferences f(a 1,a 2,…,a n ) A solution – Completely solve the SAT instance to optimize the given metric – Use bounding and pruning techniques to speedup search

71 Extend SATPLAN to Temporally Expressive Problems What is “Temporally Expressive” – Two actions need to be executed in parallel, in any valid plan. (required concurrency) adds condition f action2 deletes condition f Time action1 ff requires condition f

72 Facts about Temporally Expressive Planning Many existing temporal planners are not able to handle temporally expressive problems Most planning benchmarks (e.g. in IPC3-IPC5) are not temporally expressive [W. Cushing et al. :ICAPS-2007] Required concurrency is important in the real world – PDDL2.1 actually supports temporally expressive

73 Some Exsiting Temporally Expressive Planners State space search planners – Crikey2 and Crikey3, and variants Partial order planner – VHPOP Others – TLP-GP

74 SAT-Based Temporally Expressive Planning Discrete time Optimal in terms of overall timespan Transform the original temporal planning problem: – Model a durative action as a linked pair of simple actions – For each action, a new fact is used to indicate that it is executing [Long & Fox 2003] Formulate the transformed temporal planning problem into a SAT encoding Solve a sequence of SAT instances from time= 1 to time = N, with an increment of δ, until a solution is found [Ray & Ginsberg 2008]

75 SAT Encoding for Temporal Planning Variables

76 SAT Encoding for Temporal Planning Clauses

77 SAT Encoding for Temporal Planning  Clauses

78 – Large room to improve the overall network throughput Under topology constraints, file availability constraints, upload/download speed constraints Modeling P2P Network Optimization

79 Represent the problem using PDDL2.1 Optimize the overall time to get all the peers’ data delivered – Optimize overall network throughput Main constraints – To download a file, a computer must be routed to a serving computer that has a copy of the file and is serving the file – Each computer can serve at most one file at any time

80 Properties Different topologies of the networks Objects: – Files, with different sizes (takes different time to transfer) – Computers/Peers Initial State: – Each computer may have its own collection of files Goal: – Each computer want to download a set of files

81 Required Concurrency in P2P network Required Concurrency – The serving computers – The downloading computers Start to serve A Serving file A Serving ends Peer Downloading A Time Peer Downloading A

82 Temporal Dependencies

83 Results – (Original Match-lift Domain) Experimental results on the Match-lift domain. The numbers in Columns L, M, R, U represent the numbers of floors, matches, rooms and fuses. Low concurrency as M=U.

84 Results (Match-lift with more concurrencies) Experimental results on Match-Lift Variant, which has more concurrencies (less match, long duration). Look at #12.

85 Results – P2P Network Optimization Domain Experimental results on the P2P domain. Column ‘P’ is the instance ID. Columns ‘C’ and ‘F’ are the numbers of peers and files, respectively, in the network. Columns ‘Time’ and ‘QAL’ are the running time and time span of solutions, respectively. 7-12 are much harder (topology, files)

86 Driverslogshift Domain Long durative actions Crikey2 sometimes generates flawed plans

87 Search based planner can be very efficiently when the problems are of lower degree of required concurrencies Results  SAT-based planner is suitable for problems requiring a high degree of concurrency action1 a b action2 action1 a d action2 b c e f g e.g. original Match-lift domain e.g. P2P and Match-lift domain with more required concurrencies


Download ppt "Planning Chapter 11 Yet another popular formulation for AI – Logic-based language – One of the most structured formulations Can be translate into less."

Similar presentations


Ads by Google