
1 A: A Unified Brand-name-Free Introduction to Planning Subbarao Kambhampati CSE 574 Planning & Learning (which is actually more of the former and less of the latter) Subbarao Kambhampati http://rakaposhi.eas.asu.edu/cse574

2 The Indian Standard Time
• Right now, it is 3:45 AM in India
  – Where I was for the whole break
    » And only got back yesterday
• And my body thinks it is still in India
  – I could never stay awake after 3 AM
  – And the greedy Marriott closed the only half-decent coffee shop around here.
So… wake me up if you see me dozing off

3 Most everything will be on the homepage

4 Logistics
• Office hours: after class, 4:30-5:30, and by appointment
• No “official TA”
  – Romeo Sanchez (rsanchez@asu.edu) and Binh Minh Do (binhminh@asu.edu) will kindly provide unofficial TA support
• Caveats
  – Graduate-level class. No textbook; you will read papers. Participation required and essential
• Evaluation (subject to change)
  – Participation (~20%)
    » Do readings before classes. Attend classes. Take part in discussions. Be scribes for class discussions.
  – Projects/Homeworks (~35%)
    » May involve using existing planners, writing new domains
  – Semester project (~20%)
    » Either a term paper or a code-based project
  – Mid-term and Final (~25%)

5 Your introductions
• Name
• Standing
• Area(s) of interest
• Reasons, if any, for taking the course
• Do you prefer
  – Homeworks/class projects OR
  – Semester-long individual project?

6 Planning: The big picture
• Synthesizing goal-directed behavior
• Planning involves
  – Action selection; handling causal dependencies
  – Action sequencing and handling resource allocation
    » typically called SCHEDULING
  – Depending on the problem, plans can be
    » action sequences
    » or “policies” (action trees, state-action mappings etc.)

7 The Many Complexities of Planning
[Figure: agent-environment loop labeled Environment, action, perception, and Goals, centered on the question “What action next?” (The $$$$$$ Question). The annotations on the diagram give the dimensions along which planning problems vary:]
  (Static vs. Dynamic)
  (Observable vs. Partially Observable)
  (Perfect vs. Imperfect)
  (Deterministic vs. Stochastic)
  (Instantaneous vs. Durative)
  (Full vs. Partial satisfaction)

8 Planning & (Classical Planning)
[Figure: the agent-environment loop (Environment, action, perception, Goals; “What action next?”) restricted to the classical case: (Static) (Observable) (perfect) (deterministic)]
I = initial state; G = goal state
O_i: (prec) (effects)
[ I ] O_i O_j O_k O_m [ G ]
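The classical setting above (an initial state I, a goal G, and operators with preconditions and effects) can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes a STRIPS-style encoding in which a state is a set of true propositions and effects are split into add and delete lists. The `Operator` type, the helper names, and the two-operator toy domain are hypothetical and do not come from the slides.

```python
from typing import FrozenSet, NamedTuple, Optional, Sequence

State = FrozenSet[str]  # assumption: a state is the set of propositions that hold


class Operator(NamedTuple):
    name: str
    prec: FrozenSet[str]    # propositions that must hold before the operator runs
    add: FrozenSet[str]     # propositions made true (assumed STRIPS-style add list)
    delete: FrozenSet[str]  # propositions made false (assumed STRIPS-style delete list)


def apply(state: State, op: Operator) -> Optional[State]:
    """Return the successor state, or None if the preconditions do not hold."""
    if not op.prec <= state:
        return None
    return (state - op.delete) | op.add


def is_valid_plan(initial: State, goal: FrozenSet[str], plan: Sequence[Operator]) -> bool:
    """Check that executing the operators in order from `initial` reaches `goal`."""
    state: Optional[State] = initial
    for op in plan:
        state = apply(state, op)
        if state is None:
            return False
    return goal <= state


# Hypothetical toy domain: pick a block up from the table and put it in a box.
pickup = Operator("pickup", frozenset({"hand_empty", "on_table"}),
                  frozenset({"holding"}), frozenset({"hand_empty", "on_table"}))
put_in_box = Operator("put_in_box", frozenset({"holding"}),
                      frozenset({"in_box", "hand_empty"}), frozenset({"holding"}))

I = frozenset({"hand_empty", "on_table"})
G = frozenset({"in_box"})

valid = is_valid_plan(I, G, [pickup, put_in_box])
invalid = is_valid_plan(I, G, [put_in_box, pickup])
```

Here `valid` is true because each operator's preconditions hold when it is applied and the goal holds at the end, while the reversed sequence fails on its first step.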

9 “Classical Planning”: Static, Deterministic, Observable, Instantaneous, Propositional
[Figure: each classical assumption is shown next to its relaxation and the techniques that relaxation calls for. Labels, in the order they appear:]
  Dynamic: Replanning / Situated Plans
  Durative: Temporal Reasoning
  Continuous: Numeric Constraint reasoning (LP/ILP)
  Stochastic: Contingent/Conformant Plans, Interleaved execution
  MDP Policies
  POMDP Policies
  Partially Observable: Contingent/Conformant Plans, Interleaved execution
  Semi-MDP Policies

10 Class of 23rd January
I am less jet-lagged (waking up only at 3 AM)
I discovered Side-bar café (near law-library)
Even started sadism (homework assignments)
In short: general sweetness and light all around

11 Applications (Current & Potential)
• Scheduling problems with action choices as well as resource handling requirements
  – Problems in supply chain management
  – HSTS (Hubble Space Telescope scheduler)
  – Workflow management
• Autonomous agents
  – RAX/PS (The NASA Deep Space planning agent)
• Software module integrators
  – VICAR (JPL image enhancing system); CELWARE (CELCorp)
  – Test case generation (Pittsburgh)
• Interactive decision support
  – Monitoring subgoal interactions
    » Optimum AIV system
• Plan-based interfaces
  – E.g. NLP to database interfaces
  – Plan recognition
• Web-service composition

12 Lots of activity...
• Significant scale-up in the last 4-5 years
  – Before, we could synthesize plans of about 5-6 actions in minutes
  – Now, we can synthesize 100-action plans in minutes
    » Further scale-up with domain-specific control
• Significant strides in our understanding
  – Rich connections between planning and CSP (SAT), OR (ILP)
    » Vanishing separation between planning & scheduling
  – New ideas for heuristic control of planners
  – Wide array of approaches for customizing planners with domain-specific knowledge
New people. Conferences. Workshops. Competitions. Inter-planetary explorations.
So, why the increased interest?

13 Broad Aims & Biases of the First Part
AIM: We will concentrate on planning in deterministic, quasi-static and fully observable worlds
  – Will start with “classical” domains, but discuss handling durative actions and numeric constraints, as well as replanning
  – Neo-Classical Planning
BIAS: To the extent possible, we shall shun brand-names and concentrate on unifying themes
  – Better understanding of existing planners
    » Normalized comparisons between planners
    » Evaluation of trade-offs provided by various design choices
  – Better understanding of inter-connections
    » Hybrid planners using multiple refinements
    » Explication of the connections between planning, CSP, SAT and ILP

14 Overview for the first part
• The Planning problem
  – Our focus
  – Modeling, proving correctness
• Refinement Planning: Formal Framework
• Conjunctive refinement planners
• Disjunctive refinement planners
  – Refinement of disjunctive plans
  – Solution extraction from disjunctive plans
    » Direct, Compiled (SAT, CSP, ILP, BDD)
• Heuristics/Optimizations
• Customizing Planners
  – User-assisted customization
  – Automated customization
• Support for non-classical worlds

15 Why Care about “classical” Planning?
• Most of the recent advances occurred in neo-classical planning
• Many stabilized environments satisfy neo-classical assumptions
  – It is possible to handle minor assumption violations through replanning and execution monitoring
    » “This form of solution has the advantage of relying on widely-used (and often very efficient) classical planning technology” (Boutilier, 2000)
• Techniques developed for neo-classical planning often shed light on effective ways of handling non-classical planning worlds
  – Currently, most of the efficient techniques for handling non-classical scenarios are still based on ideas/advances in classical planning

16 “...As such, the classical model can be viewed as a way of approximating the solution of the underlying POMDP. […] This form of solution has the advantage of relying on widely-used (and often very efficient) classical planning technology”
Also put some of the classification stuff?

17 The (too) many brands of classical planners
Planning as Search
  – Search in the space of States (progression, regression, MEA): STRIPS, PRODIGY, TOPI, HSP, HSP-R, UNPOP, FF
  – Search in the space of Plans (total order, partial order, protections, MTC): Interplan, SNLP, TOCL, UCPOP, TWEAK
  – Search in the space of Task networks (reduction of non-primitive tasks): NOAH, NONLIN, O-Plan, SIPE
Planning as CSP/ILP/SAT/BDD: Graphplan, IPP, STAN, SATPLAN, Blackbox, GP-CSP, BDDPlan
Planning as Theorem Proving: Green’s planner
Planning as Model Checking

18 A Unifying View
[Figure: map of the course material in three parts. Labels recovered from the diagram:]
PART I (1.0, 1.1, 1.2): Refinement Planning
  – What are Plans? Refinements? Candidate set semantics
  – Conjunctive Refinement Planning: SEARCH; FSS, BSS, PS
  – Disjunctive Refinement Planning: How are sets of plans represented compactly? How are they refined? How are they searched? Graph-based, SAT, CSP, ILP, BDD
PART 2: CONTROL (Heuristics/Optimizations)
  – Reachability; Relevance; Relax Subgoal interactions; Directed Partial Consistency enforcement
PART 3: Domain-customization
  – Hand-coded: HTN Schemas, TL Formulas, Cutting Planes
  – Learned: Case-based, Abstraction-based, Failure-based, Domain Analysis*

19 Modeling Planning Problems: Actions, States, Correctness (PART I.0)

20 Transition System Perspective
• We can think of the agent-environment dynamics in terms of transition systems
  – A transition system is a 2-tuple <S, A> where
    » S is a set of states
    » A is a set of actions, with each action a being a subset of S × S
  – Transition systems can be seen as graphs, with states corresponding to nodes and actions corresponding to edges
    » If transitions are not deterministic, then the edges will be “hyper-edges”, i.e. they will connect sets of states to sets of states
  – The agent may know that its initial state is some subset S’ of S
    » If the environment is not fully observable, then |S’| > 1
    » |S’| can be > 1 even in fully observable domains (if we want to find policies rather than plans)
  – It may consider some subset Sg of S as desirable states
  – Finding a plan is equivalent to finding (shortest) paths in the graph corresponding to the transition system
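The last point, that finding a plan amounts to finding a (shortest) path in the transition graph, can be illustrated with a breadth-first search. The sketch below makes several simplifying assumptions that are not in the slides: the transition system is given explicitly as a dictionary from action names to sets of (state, next state) pairs, actions are deterministic, and the agent starts in a single known state. The function names and the four-state example are hypothetical.

```python
from collections import deque
from typing import Dict, Hashable, List, Optional, Set, Tuple

Pairs = Set[Tuple[Hashable, Hashable]]  # an action as a subset of S x S


def shortest_plan(actions: Dict[str, Pairs], start: Hashable,
                  goals: Set[Hashable]) -> Optional[List[str]]:
    """Breadth-first search; returns a shortest list of action names, or None."""
    if start in goals:
        return []
    # parent maps each discovered state to (previous state, action name used)
    parent: Dict[Hashable, Optional[Tuple[Hashable, str]]] = {start: None}
    frontier = deque([start])
    while frontier:
        s = frontier.popleft()
        for name, pairs in actions.items():
            for src, dst in pairs:
                if src == s and dst not in parent:
                    parent[dst] = (s, name)
                    if dst in goals:
                        return _reconstruct(parent, dst)
                    frontier.append(dst)
    return None


def _reconstruct(parent, state) -> List[str]:
    """Walk parent links back to the start and return the actions in order."""
    plan: List[str] = []
    while parent[state] is not None:
        state, name = parent[state]
        plan.append(name)
    plan.reverse()
    return plan


# Hypothetical explicit transition system over states s0..s3.
actions = {
    "a": {("s0", "s1"), ("s1", "s2")},
    "b": {("s0", "s2"), ("s2", "s3")},
}
plan = shortest_plan(actions, "s0", {"s3"})
no_plan = shortest_plan(actions, "s0", {"s9"})
```

In this example the two-step plan `["b", "b"]` is found before the three-step alternative through `s1`, and an unreachable goal yields `None`.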

21 Transition System Models
A transition system is a 2-tuple <S, A>, where
  S is a set of “states”
  A is a set of “transitions”; each transition a is a subset of S × S
    -- If a is a (partial) function, then it is a deterministic transition
    -- Otherwise, it is a “non-deterministic” transition
    -- It is a stochastic transition if there are probabilities associated with each state a takes s to
    -- Finding plans is equivalent to finding “paths” in the transition system
Transition system models are called “explicit state-space” models. In general, we would like to represent transition systems more compactly, e.g. with a state-variable representation of states. These latter are called “factored” models.
Each action in this model can be represented by an incidence matrix. The set of all possible transitions will then simply be the SUM of the individual incidence matrices.
[Figure: example incidence matrices]
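As a rough illustration of the incidence-matrix view, the sketch below builds one 0/1 matrix per action, tests whether an action is a partial function (deterministic), and sums the matrices to obtain the overall transition structure. States are assumed to be numbered 0..n-1, and the two example actions are made up for the illustration; plain nested lists are used so that no external library is needed.

```python
from typing import Iterable, List, Tuple

Matrix = List[List[int]]


def incidence_matrix(n: int, pairs: Iterable[Tuple[int, int]]) -> Matrix:
    """Entry [s][t] is 1 exactly when the action can take state s to state t."""
    m = [[0] * n for _ in range(n)]
    for s, t in pairs:
        m[s][t] = 1
    return m


def is_deterministic(m: Matrix) -> bool:
    """An action is a (partial) function if every row has at most one 1."""
    return all(sum(row) <= 1 for row in m)


def sum_matrices(ms: List[Matrix]) -> Matrix:
    """Element-wise sum; a nonzero entry means some action makes that transition."""
    n = len(ms[0])
    return [[sum(m[i][j] for m in ms) for j in range(n)] for i in range(n)]


# Hypothetical actions over three states 0, 1, 2.
a = incidence_matrix(3, {(0, 1), (1, 2)})  # each state has at most one successor
b = incidence_matrix(3, {(0, 1), (0, 2)})  # state 0 has two possible successors
total = sum_matrices([a, b])
```

With these inputs `a` is deterministic, `b` is not, and `total` has a 2 in row 0, column 1 because both actions can take state 0 to state 1.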

22 Manipulating Transition Systems

23 MDPs as general cases of transition systems
• An MDP (Markov Decision Process) is a general (deterministic or non-deterministic) transition system where the states have “Rewards”
  – In the special case, only a certain set of “goal states” will have high rewards, and everything else will have no rewards
  – In the general case, all states can have varying amounts of reward
• Planning, in the context of MDPs, will be to find a “policy” (a mapping from states to actions) that has the maximal expected reward
• We will talk about MDPs later in the semester
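To make “a policy with maximal expected reward” concrete, here is a small value-iteration sketch. It is one standard way to compute such a policy, not necessarily the method the course uses later. The discount factor `gamma`, the fixed number of sweeps, and the three-state example (where only the goal state carries reward, as in the special case above) are all assumptions introduced for the illustration.

```python
from typing import Dict, List, Tuple

Transition = Dict[Tuple[str, str], List[Tuple[float, str]]]  # (s, a) -> [(prob, s')]


def expected_value(values: Dict[str, float], outcomes: List[Tuple[float, str]]) -> float:
    """Expected value of the successor state under the given outcome distribution."""
    return sum(p * values[t] for p, t in outcomes)


def value_iteration(states: List[str], actions: List[str], transition: Transition,
                    reward: Dict[str, float], gamma: float = 0.9, sweeps: int = 100):
    """Return (values, policy); gamma is an assumed discount factor."""
    values = {s: 0.0 for s in states}
    for _ in range(sweeps):
        values = {
            s: reward[s] + gamma * max(expected_value(values, transition[(s, a)])
                                       for a in actions)
            for s in states
        }
    # Greedy policy: in each state pick the action with the best expected successor value.
    policy = {
        s: max(actions, key=lambda a: expected_value(values, transition[(s, a)]))
        for s in states
    }
    return values, policy


# Hypothetical MDP: "safe" moves one step toward the goal for sure;
# "risky" jumps to the goal with probability 0.8 and otherwise stays put.
states = ["start", "mid", "goal"]
actions = ["safe", "risky"]
transition = {
    ("start", "safe"): [(1.0, "mid")],
    ("start", "risky"): [(0.8, "goal"), (0.2, "start")],
    ("mid", "safe"): [(1.0, "goal")],
    ("mid", "risky"): [(0.8, "goal"), (0.2, "mid")],
    ("goal", "safe"): [(1.0, "goal")],
    ("goal", "risky"): [(1.0, "goal")],
}
reward = {"start": 0.0, "mid": 0.0, "goal": 1.0}

values, policy = value_iteration(states, actions, transition, reward)
```

Under these assumed numbers the computed policy takes the risky jump from `start` (the sure route needs two steps) but the safe step from `mid` (where the goal is one sure step away).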

24 Problems with transition systems
• Transition systems are a great conceptual tool to understand the differences between the various planning problems
• …However, direct manipulation of transition systems tends to be too cumbersome
  – The size of the explicit graph corresponding to a transition system is often very large (see Homework 1 problem 1)
  – The remedy is to provide “compact” representations for transition systems
    » Start by explicating the structure of the “states”
      · e.g. states specified in terms of state variables
    » Represent actions not as incidence matrices but rather as functions specified directly in terms of the state variables
      · An action will work in any state where some state variables have certain values. When it works, it will change the values of certain (other) state variables
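The compact, state-variable view of actions can be sketched as follows. An action is described only by the variable values it requires and the values it sets; the explicit set of (state, next state) pairs is enumerated here purely to show how much larger it is than the compact description. The `Action` type, the helper names, and the door/light/robot variables are hypothetical examples, not part of the course material.

```python
from itertools import product
from typing import Dict, List, NamedTuple, Tuple

State = Dict[str, str]  # assumption: a state assigns one value to each state variable


class Action(NamedTuple):
    name: str
    precond: Dict[str, str]  # variable values required for the action to work
    effect: Dict[str, str]   # variable values the action sets when it works


def applicable(state: State, action: Action) -> bool:
    """The action works in any state that agrees with its preconditions."""
    return all(state[var] == val for var, val in action.precond.items())


def apply_action(state: State, action: Action) -> State:
    """Copy the state and overwrite only the variables mentioned in the effect."""
    new_state = dict(state)
    new_state.update(action.effect)
    return new_state


def all_states(domains: Dict[str, List[str]]) -> List[State]:
    """Enumerate the explicit state space (the product of the variable domains)."""
    names = list(domains)
    return [dict(zip(names, combo)) for combo in product(*(domains[n] for n in names))]


def explicit_pairs(action: Action, domains: Dict[str, List[str]]) -> List[Tuple[State, State]]:
    """Expand the compact action into its explicit (state, next state) pairs."""
    return [(s, apply_action(s, action)) for s in all_states(domains) if applicable(s, action)]


# Hypothetical domain with three two-valued state variables.
domains = {"door": ["open", "closed"], "light": ["on", "off"], "robot": ["room1", "room2"]}
open_door = Action("open_door", precond={"door": "closed"}, effect={"door": "open"})

total_states = len(all_states(domains))
pairs = explicit_pairs(open_door, domains)
```

The action itself mentions one variable, while its explicit form already needs four transitions over eight states; both counts double with every additional two-valued variable the action does not mention.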

