Expressive and Efficient Frameworks for Partial Satisfaction Planning Subbarao Kambhampati Arizona State University (Proposal submitted for consideration.

Presentation transcript:

Expressive and Efficient Frameworks for Partial Satisfaction Planning Subbarao Kambhampati Arizona State University (Proposal submitted for consideration to Behzad Kamgar-Parsi/ONR)

Partial Satisfaction/Over-Subscription Planning
- Traditional planning problems
  – Find the (lowest cost) plan that satisfies all the given goals
- PSP Planning
  – Find the highest utility plan given the resource constraints
  – Goals have utilities and actions have costs
- …arises naturally in many real-world planning scenarios
  – Mars rovers attempting to maximize scientific return, given resource constraints
  – UAVs attempting to maximize reconnaissance returns, given fuel and other constraints
  – Logistics problems with resource constraints
- …due to a variety of reasons
  – Constraints on agent’s resources
  – Conflicting goals
    – With complex inter-dependencies between goal utilities
  – Soft constraints
  – Limited time

Supporting PSP planning
- PSP planning changes planning from a “satisficing” to an “optimizing” problem
  – It is trivial to find a plan; hard to find a good one!
  – Rich connections to OR (IP)/MDP
- Requires selecting “objectives” in addition to “actions”
  – Which subset of goals to achieve
  – To what degree to satisfy individual goals
    – E.g. collect as much soil sample as possible; get done as close to 2pm as possible
- Currently, the objective selection is left to humans
  – Leads to highly suboptimal plans since objective selection cannot be done independently of planning
- We propose to develop scalable methods for synthesizing plans in such over-subscribed scenarios

Proposal Overview
- Preliminary work
  – Simple formal model: PSP-Net Benefit
  – MDP-based, IP-based, and heuristic-planning-based approaches
- Proposed directions
  – Improving expressiveness of PSP planners
    – Handling goals needing degree of satisfaction (e.g. numeric goals)
    – Handling goals with soft deadlines (where utility of the delayed goals is reduced)
    – Handling complex interactions between objectives
      – Interactions between the plans of the goals
      – Interactions between the utilities of the goals
  – Improving search in PSP planners
    – More powerful heuristics for PSP planning (which take interactions into account)
    – More flexible search frameworks (non-combinable costs and utilities)
      – Multi-objective search
  – Applications
    – Replanning as a PSP planning problem

Formulation
- PSP Net Benefit:
  – Given a planning problem $P = (F, A, I, G)$, a “cost” $c_a \ge 0$ for each action $a$, a “utility” $u_f \ge 0$ for each goal fluent $f \in G$, and a positive number $k$: is there a finite sequence of actions $\Delta = (a_1, a_2, \ldots, a_n)$ that, starting from $I$, leads to a state $S$ that has net benefit $\sum_{f \in (S \cap G)} u_f - \sum_{a \in \Delta} c_a \ge k$?
- Maximize the net benefit: actions have execution costs, goals have utilities, and the objective is to find the plan that has the highest net benefit
  – Easy enough to extend to a mixture of soft and hard goals
[Figure: hierarchy of related problems: PLAN EXISTENCE, PLAN LENGTH, PSP GOAL, PSP GOAL LENGTH, PLAN COST, PSP UTILITY, PSP UTILITY COST, PSP NET BENEFIT]
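The net-benefit definition on this slide can be sketched in a few lines of Python. This is only an illustration under assumed names: the `Action` record, the rover fluents, and all costs and utilities are hypothetical and do not come from the proposal.

```python
from typing import Dict, FrozenSet, Iterable, List, NamedTuple


class Action(NamedTuple):
    """STRIPS-style action with an execution cost (field names are illustrative)."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]
    cost: float


def net_benefit(init: Iterable[str], plan: List[Action], utility: Dict[str, float]) -> float:
    """Utility of the goal fluents holding in the final state minus total action cost."""
    state = set(init)
    total_cost = 0.0
    for action in plan:
        if not action.pre <= state:
            raise ValueError(f"{action.name} is not applicable")
        state = (state - action.delete) | action.add
        total_cost += action.cost
    achieved = sum(u for fluent, u in utility.items() if fluent in state)
    return achieved - total_cost


# Hypothetical rover instance (all numbers are made up for illustration).
navigate = Action("navigate_x_y", frozenset({"at_x"}), frozenset({"at_y"}), frozenset({"at_x"}), 3.0)
sample = Action("sample_soil_y", frozenset({"at_y"}), frozenset({"have_soil"}), frozenset(), 2.0)
goal_utility = {"have_soil": 10.0, "have_rock": 8.0}

benefit = net_benefit({"at_x"}, [navigate, sample], goal_utility)
meets_threshold = benefit >= 4.0  # decision version of the problem with k = 4
```

Note that the empty plan always has net benefit 0 when no goal holds initially, which is why finding some plan is trivial and finding a good one is the hard part.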

A spectrum of approaches for PSP-Net Benefit
EXACT METHODS
- Deterministic MDPs
  – Model the problem as a deterministic MDP with action costs, where a state has a reward equal to the utility of the goals that hold in it
  – A special action “Done” takes the agent from any state S to a sink state S_d
  – Guaranteed optimal, but very slow (using SPUDD, a state-of-the-art MDP solver)
- Optiplan
  – Integer-programming-based STRIPS planner
  – Optimal for a given plan length
  – Equivalent to a bounded-horizon MDP
HEURISTIC METHODS
- AltAlt^ps
  – Heuristic planner that selects the “objectives” up front heuristically
  – Novel use of planning-graph-based reachability analysis to pick objectives
  – Not optimal, but quite fast
- Sapa^ps
  – Models PSP as heuristic search; can be optimal given admissible heuristics
  – Can be thought of as a search-based solution to the deterministic MDP
[AAAI 2004; KBCS 2004]
Source of strength: planning-graph-based reachability heuristics for PSP

Comparison of approaches [AAAI 2004]
- Exact algorithms based on MDPs don’t scale at all

Adapting PG heuristics for PSP
- Challenges:
  – Need to propagate costs on the planning graph
  – The exact set of goals is not clear
    – Interactions between goals
    – Obvious approach of considering all 2^n goal subsets is infeasible
- Idea: Select a subset of the top-level goals upfront
- Challenge: Goal interactions
  – Approach: Estimate the net benefit of each goal in terms of its utility minus the cost of its relaxed plan
  – Bias the relaxed plan extraction to (re)use the actions already chosen for other goals
[optional]
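A rough sketch of the up-front goal selection idea follows. It assumes that a relaxed plan (here simply a set of action names) has already been extracted for each goal, and it models the reuse bias by charging only for actions not yet selected. The greedy loop, the function name `select_goals`, and all numbers are illustrative assumptions, not the planner's actual procedure.

```python
from typing import Dict, List, Set, Tuple


def select_goals(
    utility: Dict[str, float],
    relaxed_plan: Dict[str, Set[str]],
    cost: Dict[str, float],
) -> Tuple[List[str], Set[str]]:
    """Greedily add the goal with the best marginal net-benefit estimate."""
    chosen_goals: List[str] = []
    chosen_actions: Set[str] = set()
    remaining = set(utility)

    def marginal(goal: str) -> float:
        # Actions already chosen for other goals are reused at no extra cost.
        new_actions = relaxed_plan[goal] - chosen_actions
        return utility[goal] - sum(cost[a] for a in new_actions)

    while remaining:
        best = max(sorted(remaining), key=marginal)
        if marginal(best) <= 0:
            break  # no remaining goal is estimated to pay for itself
        chosen_goals.append(best)
        chosen_actions |= relaxed_plan[best]
        remaining.remove(best)
    return chosen_goals, chosen_actions


# Hypothetical rover numbers: "rock" alone looks unprofitable (8 - 9 < 0),
# but becomes worthwhile once the navigation action is shared with "soil".
utility = {"soil": 10.0, "rock": 8.0, "image": 2.0}
relaxed_plan = {
    "soil": {"nav_xy", "sample_soil"},
    "rock": {"nav_xy", "sample_rock"},
    "image": {"nav_xz", "take_picture"},
}
cost = {"nav_xy": 6.0, "sample_soil": 2.0, "sample_rock": 3.0, "nav_xz": 5.0, "take_picture": 1.0}

goals, actions = select_goals(utility, relaxed_plan, cost)
```

The example is chosen to show why goal interactions matter: evaluating each goal in isolation would drop "rock", while accounting for shared actions keeps it.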

Sapa^ps: A forward A* approach for PSP
A*: f(S) = g(S) + h(S)
Example actions: A1: Navigate(X,Y); A2: SampleSoil(Y); A3: TakePicture; A4: Navigate(Y,Z); A5: SampleRock(Y)
- g(S) is the net benefit of the plan that got us from the initial state to S: the difference between the utility of the goals holding in S and the cost of the actions that took us from I to S
- h*(S) is the additional net benefit of the best plan P starting from S (if S’ is the result of applying P to S, then we want to maximize [U(S’) – U(S)] – C(P))
- h(S) is the estimate of h*(S)
- Anytime A* algorithm: search through the best beneficial nodes
[optional]

Sapa^ps: Modeling A* search for PSP
Many state-of-the-art planners use best-first A* search. How to model A* search for PSP Net Benefit?
Traditional planning:
- Search node evaluation (f = g + h): lowest expected total number of actions
- Candidate plans: qualifying plans, which achieve all goals
- Search termination criterion: achieving all goals
PSP Net Benefit:
- Search node evaluation (f = g + h): highest expected total “benefit” (goal utility – action cost)
- Candidate plans: “beneficial” plans, where total achieved goal utility > total action cost
- Search termination criterion: no search node appears to be extendable to be more beneficial than the best beneficial plan found
[optional]
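The search scheme described on these two slides can be sketched as an anytime best-first search that keeps the best beneficial plan found so far and stops when no open node has an f-value above it. This is a minimal sketch and not the Sapa^ps implementation: the heuristic used here is a deliberately crude optimistic bound (all missing goals achieved at zero cost) standing in for the planning-graph heuristics, and the `Action` record and rover instance are hypothetical.

```python
import heapq
from itertools import count
from typing import Dict, FrozenSet, Iterable, List, NamedTuple, Tuple


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]
    cost: float


def held_utility(state: FrozenSet[str], utility: Dict[str, float]) -> float:
    return sum(u for f, u in utility.items() if f in state)


def optimistic_h(state: FrozenSet[str], utility: Dict[str, float]) -> float:
    # Upper bound on additional net benefit: every missing goal achieved for free.
    return sum(u for f, u in utility.items() if f not in state)


def anytime_search(init: Iterable[str], actions: List[Action],
                   utility: Dict[str, float]) -> Tuple[float, List[str]]:
    start = frozenset(init)
    best_value, best_plan = held_utility(start, utility), []  # the empty plan
    tie = count()  # tie-breaker so the heap never compares states
    frontier = [(-(best_value + optimistic_h(start, utility)), next(tie), 0.0, start, [])]
    cheapest = {start: 0.0}  # lowest cost found so far for each state
    while frontier:
        neg_f, _, cost_so_far, state, plan = heapq.heappop(frontier)
        if -neg_f <= best_value:
            break  # no open node can beat the best beneficial plan found
        for a in actions:
            if not a.pre <= state:
                continue
            nxt = frozenset((state - a.delete) | a.add)
            new_cost = cost_so_far + a.cost
            if cheapest.get(nxt, float("inf")) <= new_cost:
                continue  # same state already reached at least as cheaply
            cheapest[nxt] = new_cost
            g = held_utility(nxt, utility) - new_cost
            new_plan = plan + [a.name]
            if g > best_value:
                best_value, best_plan = g, new_plan  # anytime improvement
            f = g + optimistic_h(nxt, utility)
            if f > best_value:
                heapq.heappush(frontier, (-f, next(tie), new_cost, nxt, new_plan))
    return best_value, best_plan


def act(name, pre, add, delete, cost):
    return Action(name, frozenset(pre), frozenset(add), frozenset(delete), cost)


# Hypothetical rover domain; costs and utilities are made up.
rover_actions = [
    act("navigate_x_y", {"at_x"}, {"at_y"}, {"at_x"}, 6.0),
    act("sample_soil", {"at_y"}, {"have_soil"}, set(), 2.0),
    act("sample_rock", {"at_y"}, {"have_rock"}, set(), 3.0),
    act("navigate_y_z", {"at_y"}, {"at_z"}, {"at_y"}, 5.0),
    act("take_picture", {"at_z"}, {"have_image"}, set(), 1.0),
]
rover_utility = {"have_soil": 10.0, "have_rock": 8.0, "have_image": 2.0}

value, plan = anytime_search({"at_x"}, rover_actions, rover_utility)
```

On this toy instance the search keeps the soil and rock goals and skips the picture, since driving to Z costs more than the image is worth. A tighter heuristic would prune far more of the frontier; the zero-cost bound is used only because it is obviously optimistic.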

Proposal Overview
- Preliminary work
  – Simple formal model: PSP-Net Benefit
  – MDP-based, IP-based, and heuristic-planning-based approaches
- Proposed directions
  – Improving expressiveness of PSP planners
    – Handling goals needing degree of satisfaction (e.g. numeric goals)
    – Handling goals with soft deadlines (where utility of the delayed goals is reduced)
    – Handling complex interactions between objectives
      – Interactions between the plans of the goals
      – Interactions between the utilities of the goals
  – Improving search in PSP planners
    – More powerful heuristics for PSP planning (which take interactions into account)
    – More flexible search frameworks (non-combinable costs and utilities)
      – Multi-objective search
  – Applications
    – Replanning as a PSP planning problem

Search & Heuristic Improvements
- Make objective selection more sensitive to goal (achievement) interactions
  – Consider group interactions
  – Consider negative interactions
  – Preliminary work in ICAPS 2005 (with Sanchez Nigenda)
- Consider faster techniques for exact methods
  – Leverage our recent work on novel IP encodings
  – Based on loosely coupled network flow problems; highly competitive with SAT methods
  – ICAPS 2005 (with van den Briel)
- Consider adapting directed and anytime MDP techniques

Degree & Delay of Satisfaction
- In metric temporal domains, PSP will involve
  – Partial degree of satisfaction
    – If you can’t give me $1000, give me half at least
    – Need to track costs for various intervals of a numeric quantity
  – Delayed satisfaction
    – If you submit the homework past the deadline, you will get penalty points
- Preliminary work on degree of satisfaction in [IJCAI 2005]
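As a small illustration of the two notions on this slide, the sketch below defines one possible utility function for each. The linear shapes, the function names, and the numbers are assumptions made for the example; the slide does not specify how utility varies with degree or delay.

```python
def degree_utility(amount: float, target: float, full_utility: float) -> float:
    """Utility proportional to how much of a numeric goal is met (assumed linear)."""
    fraction = max(0.0, min(amount / target, 1.0))
    return full_utility * fraction


def delayed_utility(finish_time: float, deadline: float,
                    full_utility: float, penalty_per_unit: float) -> float:
    """Full utility up to the soft deadline, then a linear penalty, never below zero."""
    lateness = max(0.0, finish_time - deadline)
    return max(0.0, full_utility - penalty_per_unit * lateness)


half_payment = degree_utility(500.0, 1000.0, 20.0)  # half of the requested $1000
on_time = delayed_utility(13.0, 14.0, 10.0, 3.0)    # finished before the deadline
two_late = delayed_utility(16.0, 14.0, 10.0, 3.0)   # two time units late
```

With functions like these, a goal's contribution to net benefit depends on the amount delivered or the finish time rather than being all-or-nothing.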

Utility interactions between goals
- PSP-Net Benefit considers goal achievement interactions, but assumes an additive model of goal utilities
  – U(G1,G2) = U(G1) + U(G2)
- Additive utility model is often unrealistic
  – Utility of having two shoes is much more than the sum of the utilities of having either one of them
  – Utility of having two cars is less than the sum of the utilities of having either one of them
- Challenges:
  – Elicit utility models (preference elicitation)
  – Model utility interactions
- Adapt and extend CP-nets for modeling goal utilities
  – Can also consider qualitative preference models
  – Extend the reachability heuristics to consider both plan interactions and goal interactions
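The shoes and cars examples can be made concrete with a simple non-additive utility function. The sketch below adds a bonus or penalty whenever a whole group of goals is achieved together. This is an illustrative stand-in chosen for brevity, not the CP-net model the slide proposes, and the goal names and numbers are made up.

```python
from typing import Dict, FrozenSet, Set


def set_utility(achieved: Set[str], base: Dict[str, float],
                interactions: Dict[FrozenSet[str], float]) -> float:
    """Sum of individual utilities plus an adjustment for each fully achieved group."""
    total = sum(base[g] for g in achieved)
    total += sum(adj for group, adj in interactions.items() if group <= achieved)
    return total


# Complementary goals: the pair is worth more than the sum of its parts.
shoe_base = {"left_shoe": 1.0, "right_shoe": 1.0}
shoe_interactions = {frozenset({"left_shoe", "right_shoe"}): 8.0}
both_shoes = set_utility({"left_shoe", "right_shoe"}, shoe_base, shoe_interactions)

# Substitutable goals: the pair is worth less than the sum of its parts.
car_base = {"car_a": 10.0, "car_b": 10.0}
car_interactions = {frozenset({"car_a", "car_b"}): -6.0}
both_cars = set_utility({"car_a", "car_b"}, car_base, car_interactions)
```

With an empty `interactions` table the function reduces to the additive model U(G1,G2) = U(G1) + U(G2).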

Non-combinable costs/utilities
- PSP Net Benefit assumes costs and utilities are in the same units
  – …often does not hold
  – E.g. different types of resource costs (fuel, manpower); different types of utilities
- Solution: Multi-objective search
  – Either elicit utility models
    – Alpha * manpower + Beta * mission utility
  – …or search for the highest utility plans given a specific resource bound
  – …or provide the Pareto (non-dominated) set of solution plans and let the user choose
- Challenge: Need to adapt reachability heuristics to separately track the various types of costs and utilities
  – We plan to build on our work on multi-objective temporal planning in SAPA
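The third option on this slide, returning the Pareto (non-dominated) set, can be sketched as a simple filter over candidate plans. The `PlanSummary` record, the choice of fuel and manpower as the two cost dimensions, and the candidate values are assumptions made for the example.

```python
from typing import List, NamedTuple


class PlanSummary(NamedTuple):
    name: str
    fuel: float      # to be minimized
    manpower: float  # to be minimized
    utility: float   # to be maximized


def dominates(p: PlanSummary, q: PlanSummary) -> bool:
    """p dominates q if it is no worse on every objective and better on at least one."""
    no_worse = p.fuel <= q.fuel and p.manpower <= q.manpower and p.utility >= q.utility
    better = p.fuel < q.fuel or p.manpower < q.manpower or p.utility > q.utility
    return no_worse and better


def pareto_set(plans: List[PlanSummary]) -> List[PlanSummary]:
    """Keep only the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


candidates = [
    PlanSummary("A", fuel=10.0, manpower=2.0, utility=8.0),
    PlanSummary("B", fuel=12.0, manpower=2.0, utility=8.0),   # dominated by A
    PlanSummary("C", fuel=15.0, manpower=1.0, utility=8.0),
    PlanSummary("D", fuel=20.0, manpower=3.0, utility=12.0),
]
front = [p.name for p in pareto_set(candidates)]
```

The quadratic pairwise filter is fine for a handful of candidate plans; producing those candidates in the first place is the part that would need the multi-objective heuristics the slide calls for.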

Combining uncertainty and partial satisfaction
(Note: Not in the proposal draft)
- Time permitting, we hope to extend our PSP framework to handle stochastic domains
  – Planning in stochastic domains already has many natural affinities to PSP
  – If the planner wants to ensure that its plan reaches goals with higher probability, it often needs to go for longer (costlier) plans
  – …Many challenges remain in selecting objectives in stochastic domains
- We expect to leverage our significant work in extending reachability heuristics for stochastic and non-deterministic domains
  – [UAI 2005; AAAI 2005; ICAPS 2004; JAIR in review]

Explaining the planner’s decisions in mixed-initiative scenarios
- In mixed-initiative scenarios, humans would like to get explanations of the selected objectives
  – Anecdotal evidence suggests that in military planning applications, human users are not willing to accept a plan when the objectives selected by the planner do not match the human’s intuition
- Challenge: Explaining the “optimality” of the planner’s decisions is technically hard
  – In contrast, explaining correctness is much simpler
- Proposed approach: Modify the reachability heuristic computations to leave a trace of their reasoning
  – Intent would be to explain at least the Pareto-optimality of the selected set of objectives
  1. When a subgoal cannot be included because of cost-based or preference-based interactions with other selected subgoals, annotate this fact
  2. Summarize the Pareto set (in multi-objective optimization cases) in terms of conditional plans explaining which member of the set is “optimal” under what conditions
  3. Support sensitivity analysis on the stability of the selected objectives (i.e., under what conditions will they no longer be optimal)

Modeling Replanning as a PSP problem
- Traditionally, replanning has been cast as a “procedure” rather than a problem
  – Modify the old plan to handle the new situations
- …we take the stance that replanning is a “problem”
  – Achieve the original goals of the agent from the current initial situation
  – Subject to various constraints that were imposed by the partial execution of the original plan
    – Reservations, commitments: these are, however, soft constraints
- …Replanning can be best modeled as a PSP problem!
  – We propose to do this

Summary and Impact
- PSP planning problems are ubiquitous and extend the modeling power of planning frameworks
  – …by foregrounding user preferences among different objectives
- They pose interesting technical challenges to the state of the art
  – …by emphasizing plan-quality considerations
- We have already made significant progress in handling PSP problems
  – AAAI 2004; ICAPS 2005 (2); IJCAI 2005
- …and propose to extend our framework significantly
- …as well as demonstrate its power through applications