1Nau: Univ of Alberta, 2004 Forward-Chaining Planning in Nondeterministic Domains Ugur Kuter and Dana Nau Department of Computer Science and Institute.

Presentation transcript:

1 Nau: Univ of Alberta, 2004
Forward-Chaining Planning in Nondeterministic Domains
Ugur Kuter and Dana Nau
Department of Computer Science and Institute for Systems Research
University of Maryland, College Park, Maryland

2 Generating Plans of Action
- Programs to aid human planners
  - Project management (consumer software)
  - Plan storage and retrieval (e.g., variant process planning)
  - Automatic schedule generation (various OR and AI techniques)
- For some problems, we really want to generate plans automatically
  - Much more difficult
  - One source of difficulty: nondeterministic outcomes
    - If I plan to perform some action a, I cannot be sure in advance what outcome a will have

3 Planning with Nondeterminism
- Actions with multiple possible outcomes
  - Action failures (e.g., gripper drops its load)
  - Exogenous events (e.g., road closed)
- Like Markov Decision Processes (MDPs), but without probabilities attached to the outcomes
  - Useful if accurate probabilities aren't available, or if probability calculations would introduce inaccuracies
- Existing approaches
  - Conditional Planning (e.g., Penberthy & Weld, 1992)
  - Conformant Planning (e.g., Smith & Weld, 1998)
  - Symbolic Model Checking (e.g., Cimatti et al., 1998, 2003)
[Figure: "Grasp block c" applied to blocks a, b, c, with an intended outcome and an unintended outcome]

4 Research Motivation
- Algorithms for planning with nondeterminism have very high computational complexity
  - The search space is usually huge
  - Existing algorithms search most of the space
- Classical planning
  - Lots of work on generating plans quickly
  - Techniques for pruning large parts of the search space
  - Can we generalize any of these techniques for use in nondeterministic domains?

5 Our Results
- A way to nondeterminize any forward-chaining planner for deterministic planning domains
  - Rewrite it so that it works in nondeterministic domains
- Theoretical analysis
  - Under the appropriate conditions, some nondeterminized planners can run exponentially faster than the best previous planners for nondeterministic domains
- Experimental verification of the theoretical results

6 Forward-Chaining Planners
- Some of the most capable existing planners use forward chaining
  - Backtracking state-space search starting at the initial state
  - e.g., HSP, TLPlan, TALplanner, SHOP2
- FCP: abstract model of forward-chaining planners
- Among different forward-chaining planners, the main difference is the action-generation function, which maps a state s to a subset of {actions applicable to s}
- Can classify them based on the action-generation function
  - Domain-specific
  - Domain-independent
  - Domain-configurable

Procedure FCP(s0, g)
    π := the empty plan; s := s0
    loop
        if s satisfies g then return π
        else if s isn't in ancestors(s) then
            A := the actions generated for s
            if A is empty then return failure
            nondeterministically choose a ∈ A
            π := π.a; s := γ(s,a)
        else return failure
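The FCP procedure on this slide can be simulated deterministically by replacing the nondeterministic choice with backtracking. The following is a minimal Python sketch under that assumption, not the authors' implementation; the names `fcp`, `gen_actions`, and `apply_action`, and the toy counter domain, are illustrative and do not come from the slides.

```python
def fcp(s0, satisfies_goal, gen_actions, apply_action):
    """Backtracking simulation of FCP; returns a list of actions or None."""

    def search(s, plan, ancestors):
        if satisfies_goal(s):
            return plan
        if s in ancestors:              # s repeats an ancestor: fail this branch
            return None
        for a in gen_actions(s):        # stands in for the nondeterministic choice
            found = search(apply_action(s, a), plan + [a], ancestors | {s})
            if found is not None:
                return found
        return None                     # no generated action leads to the goal

    return search(s0, [], frozenset())


# Toy domain (assumed): a counter that can move between 0 and 3.
def gen_actions(s):
    return (["dec"] if s > 0 else []) + (["inc"] if s < 3 else [])


def apply_action(s, a):
    return s + 1 if a == "inc" else s - 1


plan = fcp(0, lambda s: s == 3, gen_actions, apply_action)
print(plan)
```

Swapping in a different `gen_actions` is what distinguishes one planner from another in this model; the search loop itself stays the same.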

7 Classification of Forward-Chaining Planners
- Domain-specific: the action-generation function is designed or tuned for one specific domain
  - Several application-oriented planners work this way
    - e.g., EDAPS (process planning), Tignum 2 (used in Bridge Baron)
    - Good performance in the given domain, but hard to generalize
- Domain-independent: the action-generation function works in any domain within some class
  - Usually, it works in any classical planning domain
  - Focus of most research on AI planning
  - So far, not practical for real-world planning
- Domain-configurable: …

8 Classification (continued)
- Domain-configurable
  - The action-generation function has a domain-independent computational engine
  - Give it domain-specific information as part of the domain description
    - How to prune some of the actions it would otherwise generate
      1. Control rules written in temporal logic, used for pruning
      2. Hierarchical Task Networks (HTNs) and ordered decomposition

Procedure FCP(s0, g, K)
    π := the empty plan; s := s0
    loop
        if s satisfies g then return π
        else if s isn't in ancestors(s) then
            A := the actions generated for s using K
            if A is empty then return failure
            nondeterministically choose a ∈ A
            π := π.a; s := γ(s,a)
        else return failure

9 Control Rules in Temporal Logic
- Depth-first forward search, with control rules written in temporal logic
  - For each state s, there is a control rule f
    - Prune s if it doesn't satisfy f
  - Control rules for successors of s are computed via logical progression
- TLPlan (Bacchus & Kabanza, Artificial Intelligence 2000)
- TALplanner (Doherty & Kvarnstrom, AMAI 2001)
  - Both work the same way, but they use different temporal logics
- Example (next slide):
  - A trivial blocks-world planning problem
  - LTL (the logic used in TLPlan)

10 Example
- State s and goal {on(b,a)} [Figure: blocks a and b in state s and in the goal]
- Control rule f: never pick up block x from the table unless x needs to be on top of another block
- Progressed formula f+ (must be true in all children of s)
  - If we pick up a, f+ will not be satisfied: prune this state
  - If we pick up b, f+ will be satisfied: keep searching below this state
- Can write rules to prune huge parts of the search space

11 HTN Planning
Task: travel(x,y)
- Method air-travel(x,y): get-ticket(a(x), a(y)); travel(x, a(x)); fly(a(x), a(y)); travel(a(y), y)
- Method taxi-travel(x,y): get-taxi; ride-taxi(x,y); pay-driver
Example decomposition of travel(UMD, U-of-Alberta):
- get-ticket(DCA, YEG)
  - go to Orbitz
  - find-flights(DCA, YEG)
  - buy-ticket(DCA, YEG)
- travel(UMD, DCA)
  - get-taxi
  - ride-taxi(UMD, DCA)
  - pay-driver
- fly(DCA, YEG)
- travel(YEG, U-of-Alberta)
  - get-taxi
  - ride-taxi(YEG, U-of-Alberta)
  - pay-driver
How the planner works:
- Decompose tasks into subtasks
- Handle constraints (e.g., taxi not good for long distances)
- Resolve interactions (e.g., take taxi early enough to catch plane)
- If necessary, backtrack and try other decompositions

12 Ordered Decomposition
- Decompose tasks in the same order in which they'll be executed
- Whenever we want to plan the next task
  - We've already planned everything that comes before it
  - Thus, we know the current state of the world
- SHOP2 (Nau et al., IJCAI 2001, JAIR 2003)
[Figure: tasks t0, …, tm, …, tn decomposed in order into operators op1, op2, …, opi, with the states s0, s1, s2, …, si–1 they produce]
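Ordered decomposition can be illustrated with a small total-order HTN planner. The sketch below is a simplified illustration, not SHOP2 itself: the state is just the traveler's current location, and the `AIRPORT` and `SHORT` tables, the operator and method functions, and `plan_tasks` are all assumed names introduced for this example.

```python
AIRPORT = {"UMD": "DCA", "U-of-Alberta": "YEG"}      # assumed mapping a(x)
SHORT = {("UMD", "DCA"), ("YEG", "U-of-Alberta")}    # assumed taxi-range pairs


def move(loc, x, y):
    """Operator: applicable only if we are at x; the new state is y."""
    return y if loc == x else None


def taxi_travel(loc, x, y):
    return [("ride-taxi", x, y)] if (x, y) in SHORT else None


def air_travel(loc, x, y):
    if x in AIRPORT and y in AIRPORT:
        ax, ay = AIRPORT[x], AIRPORT[y]
        return [("travel", x, ax), ("fly", ax, ay), ("travel", ay, y)]
    return None


OPERATORS = {"ride-taxi": move, "fly": move}
METHODS = {"travel": [taxi_travel, air_travel]}


def plan_tasks(state, tasks):
    """Plan tasks left to right, so the current state is always known."""
    if not tasks:
        return []
    (name, *args), rest = tasks[0], tasks[1:]
    if name in OPERATORS:                       # primitive task: apply it now
        new_state = OPERATORS[name](state, *args)
        if new_state is None:
            return None
        tail = plan_tasks(new_state, rest)
        return None if tail is None else [tasks[0]] + tail
    for method in METHODS.get(name, []):        # compound task: try each method
        subtasks = method(state, *args)
        if subtasks is not None:
            found = plan_tasks(state, subtasks + rest)
            if found is not None:
                return found                    # otherwise backtrack to next method
    return None


plan = plan_tasks("UMD", [("travel", "UMD", "U-of-Alberta")])
print(plan)
```

Because subtasks are expanded in execution order, each operator is applied to a fully known state, which is what lets preconditions be checked by ordinary code.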

13 Performance
- Using control rules and HTNs
  - Can encode domain-specific problem-solving knowledge
  - Highly focused search
    - Go almost directly toward a near-optimal solution, with very little backtracking
- TLPlan, TALplanner, and SHOP2 have been the best performers in the International Planning Competitions
  - Several orders of magnitude faster than the domain-independent planners
  - Solved many more problems

14 Expressivity
- Forward-chaining planners always know the current state
  - This makes it easy to do things that would be difficult otherwise
  - States can be arbitrary data structures
  - Preconditions and effects can include
    - logical inference
    - complex numeric computations
    - interactions with other software packages
- Applications:
  - SHOP2 is open-source freeware and has been used in dozens of applications (Nau et al., 2004)
  - Bacchus and Kabanza are attempting to commercialize TLPlan
[Figure: bridge example. Us: East declarer, West dummy. Opponents: defenders, South & North. Contract: East, 3NT. On lead: West at trick 3. East: KJ74. West: A2. Out: QT98653]

15 How to Nondeterminize Forward-Chaining Planners
- Two steps:
  1. Modify FCP to generate policies rather than plans
  2. Modify FCP to solve problems in which actions have multiple outcomes
- Want to do this in such a way that it will work for all instances of FCP
  - Nondeterminized versions of HSP, TLPlan, TALplanner, SHOP2, etc.

16 Plans Versus Policies
- In classical domains, a solution is a plan (a sequence of actions)
- For nondeterministic domains, that's not sufficient
  - An action may lead to more than one possible state
  - What to do next depends on what state we're in
  - Instead of a plan, use a policy: a partial function from states to actions
[Figure: a plan π = (a0, a1, a2) from the initial state to the goal state, and a policy π = {(s1,a0), (s1,a1), (s2,a3)}]

17 Execution Graphs
- An action a has more than one possible outcome, so a policy π has more than one possible execution path
- Execution graph E(π) = the graph of all of π's possible execution paths
  - Sπ = {all states in E(π)}
[Figure: execution graph over states s0, …, s5, from initial states to goal states, for π = {(s0,a0), (s1,a1), (s2,a1), (s3,a2)}]
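The set of states in an execution graph can be computed by a simple reachability search. The sketch below assumes a policy stored as a dictionary and a transition table `TRANS` giving the possible outcomes of each state-action pair; the transitions are invented for illustration, since the slide's figure is not recoverable, and only the policy itself comes from the slide.

```python
from collections import deque


def execution_states(policy, trans, initial_states):
    """Return every state reachable from the initial states by following the policy."""
    seen = set(initial_states)
    queue = deque(initial_states)
    while queue:
        s = queue.popleft()
        if s not in policy:             # goal or dead end: nothing to execute here
            continue
        for t in trans[(s, policy[s])]: # every possible outcome of the chosen action
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen


# Policy from the slide; the outcome sets below are assumed.
POLICY = {"s0": "a0", "s1": "a1", "s2": "a1", "s3": "a2"}
TRANS = {
    ("s0", "a0"): {"s1", "s2"},
    ("s1", "a1"): {"s3"},
    ("s2", "a1"): {"s3", "s4"},
    ("s3", "a2"): {"s5"},
}

states = execution_states(POLICY, TRANS, ["s0"])
print(sorted(states))
```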

18 Nondeterminization (Step 1)
- Rewrite FCP so that it generates solution policies rather than solution plans
- Changes from FCP(s0, g, K): π starts as ∅ instead of the empty plan, the test against ancestors(s) becomes a test against Sπ, and π := π.a becomes π := π ∪ {(s,a)}

Procedure Policy-FCP(s0, g, K)
    π := ∅; s := s0
    loop
        if s satisfies g then return π
        else if s isn't in Sπ then
            A := the actions generated for s using K
            if A is empty then return failure
            nondeterministically choose a ∈ A
            π := π ∪ {(s,a)}; s := γ(s,a)
        else return failure

19 Types of Solutions (Cimatti et al., Artificial Intelligence, 2003)
- Weak solution: at least one execution path reaches a goal
- Strong solution: every execution path reaches a goal
- Strong-cyclic solution: every fair execution path reaches a goal
  - Don't stay in a cycle forever if there's a state-transition out of it
[Figure: three example policies over states s0, s1, s2, s3 and actions a0, …, a3, one for each type of solution]
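The three solution types can be told apart by two fixpoint computations over the execution graph. The following sketch is one way to do this for a single initial state, assuming a dictionary policy and an outcome table; `classify`, `TRANS`, and the three example policies are assumptions made for illustration and are not the examples drawn on the slide.

```python
def classify(policy, trans, s0, goals):
    """Classify a policy as 'strong', 'strong-cyclic', 'weak', or 'none'."""
    # Execution graph: successors of every state reachable from s0 under the policy.
    succ, stack = {}, [s0]
    while stack:
        s = stack.pop()
        if s in succ:
            continue
        stop = s in goals or s not in policy
        succ[s] = set() if stop else set(trans[(s, policy[s])])
        stack.extend(succ[s])

    def fixpoint(accept):
        """Start from the goal states and add every state that `accept` admits."""
        reach = {s for s in succ if s in goals}
        changed = True
        while changed:
            changed = False
            for s, ts in succ.items():
                if s not in reach and ts and accept(ts, reach):
                    reach.add(s)
                    changed = True
        return reach

    can_reach = fixpoint(lambda ts, reach: bool(ts & reach))  # some path reaches a goal
    must_reach = fixpoint(lambda ts, reach: ts <= reach)      # every path reaches a goal
    if s0 in must_reach:
        return "strong"
    if can_reach == set(succ):          # every reachable state can still reach a goal
        return "strong-cyclic"
    return "weak" if s0 in can_reach else "none"


# Assumed outcome table; "g" is the only goal state.
TRANS = {
    ("s0", "a0"): {"s1", "s2"},
    ("s0", "b0"): {"s0", "s1"},
    ("s1", "a1"): {"g"},
    ("s2", "a2"): {"g"},
}
print(classify({"s0": "a0", "s1": "a1"}, TRANS, "s0", {"g"}))
print(classify({"s0": "a0", "s1": "a1", "s2": "a2"}, TRANS, "s0", {"g"}))
print(classify({"s0": "b0", "s1": "a1"}, TRANS, "s0", {"g"}))
```

In the first policy s2 is a dead end, in the second every path reaches the goal, and in the third the only way to miss the goal is to repeat the s0 self-loop forever.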

20 Nondeterminization (Step 2)
- Modify Policy-FCP to generate strong-cyclic solutions
  - Can also modify it to generate strong and weak solutions (won't discuss details)

Procedure ND-FCP(S0, g, K)
    π := ∅; S := S0; solved := ∅
    loop
        if S = ∅ then return π
        select s in S and remove it from S
        if s satisfies g then put s into solved
        else if s isn't in Sπ then
            A := the actions generated for s using K
            if A is empty then return failure
            nondeterministically choose a ∈ A
            π := π ∪ {(s,a)}; S := S ∪ γ(s,a)
        else if s has no descendants in (S ∪ solved) – Sπ then
            return failure

21 Bookkeeping
- Bookkeeping to generate graphs rather than paths
  - S = {nodes that have been generated but not yet explored}
  - solved = {nodes from which we know we can get to a solution}

22 Failure Detection
- A node s is unsolvable in the following cases:
  - s is a dead end,
  - s is part of a cycle from which there is no escape,
  - every descendant of s is unsolvable
- This happens if s has no descendants in (S ∪ solved) – Sπ
[Figure: example graph over states s0, …, s6 and actions a0, …, a3]
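The ND-FCP procedure, together with its bookkeeping sets and failure test, can be simulated with backtracking in place of the nondeterministic choice. The sketch below is a reading of the pseudocode rather than the authors' code. It assumes that the test "s isn't in Sπ" means "π does not yet assign an action to s", and the door domain, `OUTCOMES`, `gen_actions`, and `gamma` are invented for illustration.

```python
def nd_fcp(S0, satisfies_goal, gen_actions, gamma):
    """Backtracking simulation of ND-FCP; returns a policy dict or None."""

    def descendants(s, pi):
        """States reachable from s in one or more steps by following pi."""
        seen, stack = set(), [s]
        while stack:
            u = stack.pop()
            for t in (gamma(u, pi[u]) if u in pi else ()):
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    def search(pi, S, solved):
        if not S:
            return pi
        s = min(S)                      # any selection rule works; min is repeatable
        rest = S - {s}
        if satisfies_goal(s):
            return search(pi, rest, solved | {s})
        if s not in pi:                 # assumption: pi has no action for s yet
            for a in gen_actions(s):    # stands in for the nondeterministic choice
                found = search({**pi, s: a}, rest | set(gamma(s, a)), solved)
                if found is not None:
                    return found
            return None                 # dead end, or every choice failed
        open_or_solved = (rest | solved) - set(pi)
        if not descendants(s, pi) & open_or_solved:
            return None                 # s is stuck in a cycle with no escape
        return search(pi, rest, solved)

    return search({}, frozenset(S0), frozenset())


# Assumed toy domain: a robot in a room whose door may be closed again by a kid.
OUTCOMES = {
    ("closed", "open-door"): {"open"},
    ("open", "wait"): {"open"},             # loops forever, so it must be rejected
    ("open", "go"): {"hall", "closed"},     # may succeed, or the door closes first
}


def gen_actions(s):
    return [a for (state, a) in OUTCOMES if state == s]


def gamma(s, a):
    return OUTCOMES[(s, a)]


policy = nd_fcp({"closed"}, lambda s: s == "hall", gen_actions, gamma)
print(policy)
```

The `wait` action shows the failure test at work: choosing it closes a cycle with no open or solved descendant, so the search backtracks and picks `go`, which yields a strong-cyclic policy.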

23 Formal Properties
- Several planning algorithms are instances of FCP
  - TLPlan, TALplanner, SHOP2, etc.
  - Only difference: what the action-generation function is
- Nondeterminizing FCP preserves the action-generation function, so it works on any instance of FCP
  - ND-TLPlan, ND-TALplanner, ND-SHOP2, etc.
- Nondeterminizing them preserves soundness, completeness, and time complexity
  - Details on the next few slides

24 Nondeterministic Versions of Operators and Domains
- Nondeterministic version of an operator o
  - Same as o except that it may have additional possible outcomes
  - Failures, exogenous events, etc.
- Nondeterministic version of a domain D
  - The operators are nondeterministic versions of the ones in D
[Figure: "Grasp block c" applied to blocks a, b, c, with an intended outcome and an unintended outcome]
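A nondeterministic version of an operator keeps the intended outcome and adds further possible ones. The sketch below shows this with a wrapper function, assuming states are plain strings; `nondeterminize`, `grasp_c`, and the state names are hypothetical and chosen only to mirror the grasping example.

```python
def nondeterminize(op, extra_outcomes):
    """Wrap deterministic `op` so it returns a set of possible successor states."""

    def nd_op(state):
        intended = op(state)
        if intended is None:            # operator not applicable in this state
            return set()
        return {intended} | set(extra_outcomes(state))

    return nd_op


def grasp_c(state):
    """Deterministic operator: picking up c succeeds whenever c sits on a."""
    return "holding-c" if state == "c-on-a" else None


# Unintended outcome (assumed): the gripper drops c onto the table.
nd_grasp_c = nondeterminize(grasp_c, lambda state: {"c-on-table"})
print(nd_grasp_c("c-on-a"))
```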

25 Formal Properties
- Nondeterminizing an algorithm preserves its soundness and completeness
  - Let P be any planning algorithm that's an instance of FCP
  - Let ND-P be the nondeterminization of P
  - Let D be any classical planning domain
  - Let D′ be any nondeterministic version of D
- If P is sound/complete on D, then ND-P is sound/complete on D′
- Nondeterminizing an algorithm preserves its time complexity (as a function of its output)
  - Let T_P(n) and T_ND-P(n) be the running times of P and ND-P, where n = size of the solution found
  - Then T_ND-P(n) is polynomially bounded by T_P(n)
    - (Details on next slide)

26 Time-Complexity Theorem
- P = an instance of FCP; D = a classical domain
- Suppose P's time complexity is O(f(|π|)), where π is the solution plan and f is monotonic
- D′ = a nondeterministic version of D
  - ND-P's time complexity is O(p(f(|π′|))), where π′ is the solution policy and p is a polynomial
- Caveat: π′ may be exponentially larger than π
[Figure: a plan from the initial state s0 to the goal state s3, and an execution graph over s0, …, s5 from initial states to goal states]

27 Special Case
- Suppose that P runs in polynomial time and ND-P produces solutions of polynomial size
- Then ND-P runs in polynomial time
- Example: Blocks World
  - Given the appropriate domain knowledge
    - TALplanner, TLPlan, and SHOP2 solve Blocks-World problems in polynomial time
    - ND-TALplanner, ND-TLPlan, and ND-SHOP2 produce solutions of polynomial size
  - With this domain knowledge,
    - ND-TALplanner, ND-TLPlan, and ND-SHOP2 solve nondeterministic-BW problems in polynomial time

28 Experimental Verification
- Implementation of ND-SHOP2
- Compare with MBP (Bertoli et al., 2001)
  - The best-known planner for nondeterministic domains
  - Based on symbolic model-checking
- Two experimental domains
  - Robot-Navigation (Kabanza et al., 1997)
    - The E. coli of research on planning with nondeterminism
  - Nondeterministic Blocks-World

29 Robot Navigation Domain
- Adapted from (Kabanza et al., 1997)
  - Rooms, doors, hallway
  - Robot can open/close doors and move packages to other rooms
  - Objective: move packages to their destinations
  - A kid runs around and randomly opens/closes doors
    - Robot may need to re-open a door repeatedly to go through
- Experimental Setup
  - Kid doors: k = 1, …, 7
  - Packages: n = 1, …, 5
  - 20 randomly-generated problems for each combination of n, k

30 Varying the problem size

31 Varying the amount of nondeterminism

32 Nondeterministic Blocks World
- Traditional Blocks-World operators:
  - pickup, putdown, stack, unstack
- Actions may have unintended outcomes
  - e.g., drop a block on the table
- Experimental Setup
  - Vary number of blocks from 3 to 10
  - 20 randomly-generated problems for each case
[Figure: "Grasp block c" applied to blocks a, b, c, with an intended outcome and an unintended outcome]

33 Varying the problem size

34 Complexity Analysis
- Complexity analysis shows MBP running in exponential time and ND-SHOP2 running in time O(n^5)
- To see why, we need to understand how MBP and ND-SHOP2 work

35 Representing Policies
- A policy π is a partial function from states into actions
    π(s0) = a0, π(s1) = a1, π(s2) = a1, π(s3) = a2
- Can use a symbolic representation roughly like this:
    if in(r4) and holding(b) and door-closed(r4) then π(s) = open-door(r4)
    if in(r4) and holding(b) and door-open(r4) then π(s) = go(r4, hall)
  - Each state description ignores all doors other than d4
  - Includes an exponential number of states
- Both MBP and ND-SHOP2 use symbolic representations of policies
  - Can write polynomial-size policies for exponentially large state spaces
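The compactness of a symbolic policy can be seen by storing each rule as a set of required atoms and matching it against full states. The sketch below is only an illustration of the idea, not the representation used by MBP or ND-SHOP2; the encoding of states as sets of strings, the helper `states_with_other_doors`, and the extra rooms r1 to r3 are assumptions.

```python
from itertools import product

# Each rule: (atoms that must hold, action to take). Atoms about other doors are ignored.
RULES = [
    (frozenset({"in(r4)", "holding(b)", "door-closed(r4)"}), "open-door(r4)"),
    (frozenset({"in(r4)", "holding(b)", "door-open(r4)"}), "go(r4, hall)"),
]


def policy(state):
    """Return the action of the first rule whose condition the state satisfies."""
    for condition, action in RULES:
        if condition <= state:
            return action
    return None                         # the policy is partial


def states_with_other_doors(base, rooms):
    """Enumerate every open/closed combination for the doors of `rooms`."""
    for flags in product(["door-open", "door-closed"], repeat=len(rooms)):
        yield frozenset(base) | {f"{flag}({r})" for flag, r in zip(flags, rooms)}


base = {"in(r4)", "holding(b)", "door-closed(r4)"}
covered = [s for s in states_with_other_doors(base, ["r1", "r2", "r3"])
           if policy(s) == "open-door(r4)"]
print(len(covered))
```

With k other doors, one rule covers 2^k states, so the number of rules stays fixed while the number of covered states grows exponentially.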

36 How MBP Generates Policies
- MBP uses model-checking techniques
  - e.g., computing pre-images of sets of states
  - Roughly like a breadth-first backward search
- MBP may need to explore exponentially many states that are unreachable from the initial state
  - Exponentially many states => exponential time
  - That's what happens in the robot navigation and nondeterminized blocks world domains

37 How ND-SHOP2 Generates Policies
- ND-SHOP2 takes domain knowledge in the form of HTN methods
  - Method m1
      Task: take-package(p, r, hall)
      Precond: in(r), holding(p), door-open(r)
      Subtasks: go(r, hall)
  - Method m2
      Task: take-package(p, r, hall)
      Precond: in(r), holding(p), door-closed(r)
      Subtasks: open-door(r), go(r, hall)
- Consider the task take-package(b, r4, hall)
- ND-SHOP2 can very quickly develop the policy
    if in(r4) and holding(b) and door-closed(r4) then π(s) = open-door(r4)
    if in(r4) and holding(b) and door-open(r4) then π(s) = go(r4, hall)

38 Conclusions
- A technique for "nondeterminization" of forward-chaining classical planners
- Theoretical analysis
  - Nondeterminization preserves soundness/completeness
  - Time complexity of the generalized planners is polynomially bounded by the time complexity of the original ones
- Experimental verification of the results

39 Future Work
- Nondeterministic planning domains are just like MDPs except that there are no probabilities
- We are quite confident that
  - We can generalize our approach to work in MDPs too
  - Our "MDP-ized" algorithms will be able to run exponentially faster than traditional MDP algorithms
- Preliminary implementation and experiments
  - So far, very encouraging

40 Related Work
- M. Ghallab, D. Nau, and P. Traverso, Automated Planning: Theory and Practice (Morgan Kaufmann, May 2004)
- First comprehensive textbook on automated planning
  - Models, techniques, algorithms
  - Case studies of applications
- Web site:
  - Lecture slides available online