# An LP-Based Heuristic for Optimal Planning


An LP-Based Heuristic for Optimal Planning
Menkes van den Briel, Department of Industrial Engineering, Arizona State University, menkes@asu.edu
Subbarao Kambhampati, Department of Computer Science, Arizona State University, rao@asu.edu
Thomas Vossen, Leeds School of Business, University of Colorado at Boulder, vossen@colorado.edu
J. Benton, Department of Computer Science, Arizona State University, bentonj@asu.edu
http://rakaposhi.eas.asu.edu/yochan/

What is automated planning?
– Initial state $s_0 \in S$
– Goal $s_* \in S$
– Action $a = \langle \text{pre}, \text{post}, \text{prevail} \rangle$
– Plan $P = \langle a_1, \ldots, a_n \rangle$
[Figure: locations loc1 and loc2]

Motivation. Why heuristics? –Heuristic state-space search has been very successful in solving automated planning problems. Why optimal planning? –Real-world planning applications require optimal or near-optimal solutions. The difference between a (near-)optimal solution and a feasible solution may be the difference between winning and losing the interest of an investor or strategic partner.

LP-based heuristic –Relax the ordering of the actions –Set up an integer programming formulation –Solve the LP relaxation and use the objective function value as an admissible distance estimate –Strengthen the formulation by adding valid inequalities

Action selection formulation Represent the planning problem as a set of loosely coupled network flow problems –Each state variable defines one network flow problem –Nodes correspond to the state variable values –Arcs correspond to state variable transitions

Action selection formulation
Variables
– $x_a \in \mathbb{Z}_+$, for $a \in A$; $x_a$ is equal to the number of times action $a$ is executed
Objective function
– $\min \sum_{a \in A} x_a$
Constraints, for all $c \in C$, $f \in V_c$

$$\sum_{e \in V_c^+(f):\, a \in A_c^E(e)} x_a \;-\; \sum_{e \in V_c^-(f):\, b \in A_c^E(e)} x_b \;\ge\; \begin{cases} 1 & \text{if } f \ne s_0[c],\ f = s_*[c] \\ -1 & \text{if } f = s_0[c],\ f \ne s_*[c] \\ 0 & \text{otherwise} \end{cases}$$

$$x_a \;\le\; M \sum_{e \in V_c^+(f):\, b \in A_c^E(e)} x_b \qquad \text{for all } f \ne s_0[c],\ a \in A_c^V(f)$$

Properties
– No time indices
– No upper bound

Preliminary results

Strengthening techniques
Composition of state variables (i.e. fluent merging)
– Given the domain transition graphs (DTGs) of two state variables $c_1$, $c_2$, the composition of $DTG_{c_1}$ and $DTG_{c_2}$ is the domain transition graph $DTG_{c_1 \| c_2} = (V_{c_1 \| c_2}, E_{c_1 \| c_2})$ where
– $V_{c_1 \| c_2} = V_{c_1} \times V_{c_2}$
– $((f_1, g_1), (f_2, g_2)) \in E_{c_1 \| c_2}$ if $f_1, f_2 \in V_{c_1}$, $g_1, g_2 \in V_{c_2}$ and there exists an action $a \in A$ such that one of the following conditions holds:
  – $\text{pre}[c_1] = f_1$, $\text{post}[c_1] = f_2$, and $\text{pre}[c_2] = g_1$, $\text{post}[c_2] = g_2$
  – $\text{pre}[c_1] = f_1$, $\text{post}[c_1] = f_2$, and $\text{prevail}[c_2] = g_1$, $g_1 = g_2$
  – $\text{pre}[c_1] = f_1$, $\text{post}[c_1] = f_2$, and $g_1 = g_2$
The term composition is also used in model checking to define the parallel composition or the synchronized product of automata [Cassandras & Lafortune, 1999]
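The composition rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the dictionary encoding of actions (`eff` maps a state variable to its `(pre, post)` pair, `prevail` maps a variable to the value it must hold), the function name `compose`, and the truck/package instance are all hypothetical, modeled on the deck's simple logistics example. The slide lists only the cases where the action changes $c_1$; applying the same rule with the roles of $c_1$ and $c_2$ swapped is an assumption made here.

```python
from itertools import product

def compose(c1, vals1, c2, vals2, actions):
    """Compose the DTGs of c1 and c2; edges are ((f1, g1), (f2, g2), action)."""
    nodes = set(product(vals1, vals2))
    edges = set()
    for name, act in actions.items():
        eff, prevail = act["eff"], act["prevail"]
        if c1 in eff and c2 in eff:
            # Action changes both variables (first condition on the slide).
            (f1, f2), (g1, g2) = eff[c1], eff[c2]
            edges.add(((f1, g1), (f2, g2), name))
        elif c1 in eff:
            # Action changes c1 only; c2 is fixed by a prevail condition
            # (second condition) or is unconstrained (third condition).
            f1, f2 = eff[c1]
            gs = [prevail[c2]] if c2 in prevail else vals2
            edges.update(((f1, g), (f2, g), name) for g in gs)
        elif c2 in eff:
            # Assumed symmetric case: action changes c2 only.
            g1, g2 = eff[c2]
            fs = [prevail[c1]] if c1 in prevail else vals1
            edges.update(((f, g1), (f, g2), name) for f in fs)
    return nodes, edges

# Hypothetical encoding of a one-truck, one-package logistics task.
ACTIONS = {
    "Drive(l1,l2)": {"eff": {"truck": ("l1", "l2")}, "prevail": {}},
    "Drive(l2,l1)": {"eff": {"truck": ("l2", "l1")}, "prevail": {}},
    "Load(p1,t1,l1)": {"eff": {"package": ("l1", "T")}, "prevail": {"truck": "l1"}},
    "Load(p1,t1,l2)": {"eff": {"package": ("l2", "T")}, "prevail": {"truck": "l2"}},
    "Unload(p1,t1,l1)": {"eff": {"package": ("T", "l1")}, "prevail": {"truck": "l1"}},
    "Unload(p1,t1,l2)": {"eff": {"package": ("T", "l2")}, "prevail": {"truck": "l2"}},
}

nodes, edges = compose("truck", ["l1", "l2"], "package", ["l1", "l2", "T"], ACTIONS)
print(len(nodes), len(edges))
```

Under this encoding each Drive action appears once per package value, while each Load or Unload action appears only at the truck location its prevail condition requires.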

Example: Two DTGs and their composition –Small in-arcs denote the initial state –Double circles denote the goal [Figure: DTG c1 over values f1, f2, f3; DTG c2 over values g1, g2; composed DTG c1 || c2 over (f, g) pairs; arcs labeled with actions a, b, c, d]

Simple logistics example [Figure: locations loc1 and loc2; DTG Truck1 || Package1 with nodes (1,1), (1,T), (2,T), (2,2), (1,2), (2,1) and arcs labeled Drive(l1,l2), Drive(l2,l1), Load(p1,t1,l1), Load(p1,t1,l2), Unload(p1,t1,l1), Unload(p1,t1,l2)]

Simple logistics example
LP solution on DTG Truck1 || Package1 (objective value 4):
– $x_{\text{Drive(l2,l1)}} = 1$
– $x_{\text{Load(p1,t1,l1)}} = 1$
– $x_{\text{Drive(l1,l2)}} = 1$
– $x_{\text{Unload(p1,t1,l2)}} = 1$
[Figure: DTG Truck1 || Package1 with the arcs Drive(l2,l1), Load(p1,t1,l1), Drive(l1,l2), Unload(p1,t1,l2) highlighted]

Another example: Two DTGs and their composition
–Solution to the individual state variables
–Solution to the individual state variables represented in the composed state variable
–The represented solution violates balance of flow constraints
–Adding new balance of flow constraints strengthens the formulation
[Figure: DTG c1 over values f1, f2, f3; DTG c2 over values g1, g2, g3; composed DTG c1 || c2 over the nine (f, g) pairs; arcs labeled with actions a, b, c, d, e]

Identifying mergeable fluents
When should we create a composition of two or more state variables?
–Look at the causal graph
–Look at the actions that introduce dependencies in the causal graph
[Figure: two causal graphs over the nodes Person 1, Person 2, Airplane 1, Airplane 2, Fuel 1, Fuel 2]
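One way to read these two bullets is sketched below; it is not taken from the deck. The causal graph has an arc from variable u to variable v when some action changes v and also has a precondition, prevail condition, or effect on u. The merge criterion used here (variables linked by arcs in both directions, which happens when one action changes both) is an assumption, as are the function names and the two-action instance loosely modeled on Zenotravel.

```python
def causal_graph(actions):
    """Return arcs (u, v): some action changes v and has a condition or effect on u."""
    arcs = set()
    for act in actions.values():
        touched = set(act["eff"]) | set(act["prevail"])
        for v in act["eff"]:
            arcs.update((u, v) for u in touched if u != v)
    return arcs

def merge_candidates(arcs):
    """Assumed criterion: pairs of variables with arcs in both directions."""
    return {frozenset((u, v)) for (u, v) in arcs if (v, u) in arcs}

# Hypothetical actions: boarding needs the airplane at the city (prevail),
# flying changes the airplane's location and its fuel level together.
ACTIONS = {
    "board(person1,airplane1,cityA)": {
        "eff": {"person1": ("cityA", "airplane1")},
        "prevail": {"airplane1": "cityA"},
    },
    "fly(airplane1,cityA,cityB)": {
        "eff": {"airplane1": ("cityA", "cityB"), "fuel1": (2, 1)},
        "prevail": {},
    },
}

arcs = causal_graph(ACTIONS)
print(sorted(arcs))
print(merge_candidates(arcs))
```

In this toy instance the only candidate pair is the airplane and its fuel level, because `fly` changes both; the person depends on the airplane in one direction only.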

Experimental setup
Objective
–Minimize number of actions
Domains
–Selected domains from the International Planning Competition: Logistics, Freecell, Driverlog, Zenotravel, TPP, Blocksworld
Resources
–2.67 GHz Linux machine
–1 GB memory
–15 minutes runtime
–CPLEX 10.0

Experimental setup
Distance estimates
–LP: Action selection formulation with strengthening
–LP⁻: Action selection formulation without strengthening
–Lplan: Step-based integer programming formulation by Lplan [Bylander, 1997]
–h⁺: Optimal relaxed plan when the delete effects are ignored
–h^FF: Inadmissible but efficient relaxed plan heuristic by FF [Hoffmann and Nebel, 2001]
–Optimal: Optimal distance estimate given by Satplanner using the -opt flag [Rintanen, Heljanko, and Niemelä, 2005]

Experimental results

Distance estimates from the initial state to the goal (highlighted values equal the optimal distance)

Experimental results: Heuristic calculation time [Figure: one panel per domain: Logistics, Freecell, Driverlog, Zenotravel, TPP, Blocks]

Conclusions and future work
An LP-based heuristic that respects delete effects but ignores action ordering shows very promising results
–Finds the optimal distance estimate in several problem instances
–Can be used to calculate admissible distance estimates for various optimization problems in planning
–Ongoing work has successfully incorporated our LP-based heuristic in a search algorithm that solves oversubscription planning
Interesting directions for future work
–Apply fluent merging more aggressively
–Extend the formulation into a complete planning system

