Refinement Planning CSE 574 April 15, 2003 Dan Weld.


1 Refinement Planning CSE 574 April 15, 2003 Dan Weld

2 Planning Applications
- RAX/PS (the NASA Deep Space planning agent)
- HSTS (Hubble Space Telescope scheduler); similar work in planning earth observation (satellite / plane)
- Spacecraft repair / workflow: shuttle refurbishment, Optimum AIV system
- Elevator control: Koehler's Miconic domain, fielded in skyscrapers
- Airport ground-traffic control: company of Wolfgang Hatzack

3 Planning Applications 2
- Diagnosis and reconfiguration of power distribution systems (Sylvie Thiebaux / EDF)
- Data transformation: VICAR (JPL image enhancing system); CELWARE (CELCorp)
- Online games & training: "intelligent" characters
- Classics (fielded?): robotics, NLP-to-database interfaces

4 Planning Applications 3
- Control of an Australian brewery (SIPE)
- Desert Storm logistics planning: "DART saved more $$ during this campaign than the whole DARPA budget for the past 25 years"

5 More Administrivia
- No class Fri 4/18. But: read pp. 1-30 of Boutilier, Dean & Hanks; no review necessary.
- Experiment with ≥ 1 planner; write a review of the planner.
- Plan project: what; who (1-3 person groups).

6 Project 1: Goal Selection
- Input: init state; action schemata, each with an associated resource cost = f(act); goals, each with an associated utility = f(g); a resource bound.
- Output: a plan maximizing utility subject to the resource bound.
(A sketch of the underlying optimization follows below.)
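Under the simplifying assumption that each goal has an independent, precomputed plan cost, the selection subproblem reduces to a 0/1 knapsack; a minimal sketch (all names hypothetical) follows. In the real project, goals share actions, so costs interact and this is only a starting point.

```python
# Hypothetical sketch: goal selection as 0/1 knapsack.
# Assumes each goal has a precomputed standalone plan cost;
# in practice goals share actions, so costs interact.

def select_goals(goals, bound):
    """goals: list of (name, utility, cost); bound: resource bound.
    Returns (best_utility, chosen_names) via dynamic programming."""
    best = {0: (0.0, [])}  # used cost -> (utility, chosen goals)
    for name, utility, cost in goals:
        for used, (u, chosen) in list(best.items()):
            new_used = used + cost
            if new_used <= bound:
                cand = (u + utility, chosen + [name])
                if cand[0] > best.get(new_used, (-1.0, []))[0]:
                    best[new_used] = cand
    return max(best.values())

print(select_goals([("g1", 5, 3), ("g2", 4, 2), ("g3", 3, 2)], bound=4))
# -> (7.0, ['g2', 'g3'])
```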

7 Project 2: Embedded Agent
- Implement a simulator: takes init state + action schemata as input; communicates with the agent.
- Integrate an MDP agent: SPUDD, GPT, or ??
- Extensions: augment to take user-specified goals; incremental policy changes; one-shot vs. recurring reward; real-time issues.
(A minimal agent-simulator loop is sketched below.)
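A minimal sketch of the simulator/agent handshake the slide asks for; the class names and message format are assumptions, not the SPUDD or GPT interface.

```python
import random

# State is a dict of state variables; the protocol and class names
# below are assumptions, invented for illustration.

class Simulator:
    def __init__(self, init_state, schemata):
        self.state = dict(init_state)
        self.schemata = schemata  # name -> (precond, effects)

    def step(self, action):
        precond, effects = self.schemata[action]
        if all(self.state[v] == val for v, val in precond.items()):
            self.state.update(effects)   # action applies
        return dict(self.state)          # observation back to the agent

class RandomAgent:                        # stand-in for a real MDP agent
    def choose(self, observation, actions):
        return random.choice(list(actions))

sim = Simulator({"door_open": False},
                {"open": ({"door_open": False}, {"door_open": True})})
agent = RandomAgent()
for _ in range(3):
    print(sim.step(agent.choose(sim.state, sim.schemata)))
```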

8 Project 3: Incomplete Info & Time
- Extend SAPA or another temporal planner with sensory effects.
- Handle information-gathering goals @time.
- Interleaved execution or conditional plans (likely adopting ideas from Petrick & Bacchus).

9 A recent (turbulent) history of planning
- 1970s-1995: UCPOP, Zeno [Penberthy & Weld], IxTeT [Ghallab et al]. The whole world believed in POP and was happy to stack 6 blocks!
- 1995: advent of the CSP-style compilation approach: Graphplan [Blum & Furst], SATPLAN [Kautz & Selman]. Use of reachability analysis and disjunctive constraints.
- 1997: domination of the heuristic state-search approach: HSP/R [Bonet & Geffner], UNPOP [McDermott]: "POP is dead!" Importance of good domain-independent heuristics.
- 2000: Hoffmann's FF, a state-search planner, won the AIPS-00 competition! ... but NASA's highly publicized RAX is still a POP dinosaur! POP believed to be a good framework for handling temporal and resource planning [Smith et al, 2000]: RePOP.

10 In the beginning it was all POP. Then it was cruelly UnPOPped. The good times return with Re(vived)POP.

11 Too many brands of classical planners
- Planning as search:
  - Search in the space of states (progression, regression, MEA): STRIPS, PRODIGY, TOPI, HSP, HSP-R, UNPOP, FF
  - Search in the space of plans (total order, partial order, protections, MTC): Interplan, SNLP, TOCL, UCPOP, TWEAK
  - Search in the space of task networks (reduction of non-primitive tasks): NOAH, NONLIN, O-Plan, SIPE
- Planning as CSP/ILP/SAT/BDD: Graphplan, IPP, STAN, SATPLAN, Blackbox, GP-CSP, BDDPlan
- Planning as theorem proving: Green's planner
- Planning as model checking

12 A Unifying View
(Overview diagram.)
- Part 1, search (FSS, BSS, PS) with candidate-set semantics: what are plans? Refinements? How are sets of plans represented compactly? How are they refined? How are they searched?
- Part 2, control: heuristics/optimizations (reachability, relevance, relaxing subgoal interactions, directed partial consistency enforcement) and domain customization (HTN schemas, TL formulas, cutting planes; case-based, abstraction-based, failure-based, domain analysis; hand-coded or learned).
- Part 3, refinement planning: disjunctive vs. conjunctive, with compact representations via graph-based, SAT, CSP, ILP, and BDD encodings.

13 Main Points
- Framework
- Types of refinements: presatisfaction, preordering, tractability
- Refinement vs. solution extraction
- Splitting as a way to decrease solution-extraction time ... at a cost
- Use of disjunctive representations

14 Tradeoffs among Basic Strategies
State space: progression/regression must commit to both position and relevance of actions (regression can judge relevance, sort of, but handles sets of states).
  + Gives state information (easier plan validation)
  - Leads to premature commitment, but better heuristic guidance
  - Too many states when actions have durations
Plan space: plan-space refinement (PSR) avoids constraining position.
  + Reduces commitment (large candidate set per branch), but harder to get heuristic estimates
  - Increases plan-validation costs
  + Easily extendible to actions with duration

15 (Dis)advantages of partial order planning
- The heuristic angle: estimating the distance of a partial plan from a flaw-less solution plan is conceptually harder than estimating the distance of a set of states from the init state, which in turn is harder than estimating the cost of a single state from the goal state.
- The commitment angle: progression/regression planners commit to both position and relevance; PS planners commit only to relevance. Unnecessary commitments increase the chance of backtracking, but also make it easier to validate/evaluate the partial plan.
- Dimensions of comparison: action position and relevance, branching factor, depth of search tree, maintenance goals, durative actions.

16 Weaknesses
- Numerous terms, far from first use.
- Is this interesting? (A runnable sketch of this loop follows below.)
    While (I still have candidate plans)
      If I have a solution plan, return it
      Else, improve the existing plans
    EndWhile
- Interleaving different strategies: dynamically determine which strategy to use.
- Exploiting learning, symmetry in planning.
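A runnable rendering of the loop above on a toy search space; the plan-set abstraction and the extract/refine callbacks are assumptions for illustration, not a specific planner's API.

```python
from collections import deque

# Generic refinement loop: a "plan set" is abstracted as a node;
# extract_solution and refine stand in for candidate-set operations.

def refinement_plan(initial_plan_set, extract_solution, refine):
    """Search over plan sets: return a solution or None."""
    fringe = deque([initial_plan_set])
    while fringe:                        # "while I still have candidate plans"
        plan_set = fringe.popleft()
        solution = extract_solution(plan_set)
        if solution is not None:         # "if I have a solution plan, return it"
            return solution
        fringe.extend(refine(plan_set))  # "else, improve the existing plans"
    return None

# Toy usage: plan sets are action-sequence prefixes; the solution is "ab".
sol = refinement_plan(
    "",
    lambda p: p if p == "ab" else None,
    lambda p: [p + a for a in "ab"] if len(p) < 2 else [],
)
print(sol)  # -> "ab"
```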

17 Future Work
- Filling in holes: can unsupervised learning be used? For supervised learning, can sufficient training samples be obtained?
- Can one extend refinement strategies, e.g. to planning under uncertainty?

18 Transition System Perspective
- Model agent-environment dynamics as transition systems.
- A transition system is a 2-tuple ⟨S, A⟩ where S is a set of states and A is a set of actions, each action a being a subset of S x S.
- Graphs with states as nodes and actions as edges; if transitions are not deterministic, the edges will be "hyper-edges".
- The agent may know only that its initial state is in a subset S' of S; if the environment is not fully observable, then |S'| > 1.
- Consider some subset Sg of S as desirable.
- Finding a plan is equivalent to finding a (shortest) path in the graph corresponding to the transition system. (A small executable version appears below.)
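A minimal executable version of this view, assuming a hand-made four-state system: plans are action-label paths, and breadth-first search returns a shortest one.

```python
from collections import deque

# Explicit transition system: S = states, A maps action -> set of (s, s') pairs.
S = {"s0", "s1", "s2", "s3"}
A = {
    "a": {("s0", "s1")},
    "b": {("s1", "s2"), ("s1", "s3")},  # non-deterministic action; here we
    "c": {("s2", "s3")},                # plan as if we may pick the edge
}

def shortest_plan(init, goals):
    """BFS over the transition graph: a plan is a shortest path of labels."""
    fringe = deque([(init, [])])
    seen = {init}
    while fringe:
        s, plan = fringe.popleft()
        if s in goals:
            return plan
        for act, edges in A.items():
            for src, dst in edges:
                if src == s and dst not in seen:
                    seen.add(dst)
                    fringe.append((dst, plan + [act]))
    return None

print(shortest_plan("s0", {"s3"}))  # -> ['a', 'b']
```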

19 Transition System Models
- A transition system is a 2-tuple ⟨S, A⟩ where S is a set of "states" and A is a set of "transitions"; each transition a is a subset of S x S.
- If a is a (partial) function, it is a deterministic transition; otherwise it is a "non-deterministic" transition. It is a stochastic transition if there are probabilities associated with the states a takes s to.
- Finding plans becomes equivalent to finding "paths" in the transition system.
- Transition system models are called "explicit state-space" models. Each action in this model can be represented by an incidence matrix; the set of all possible transitions will then simply be the SUM of the individual incidence matrices. (A sketch follows below.)
- In general, we would like to represent transition systems more compactly, e.g. with a state-variable representation of states. These are called "factored" models.
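A small sketch of the incidence-matrix idea with invented states and actions: each action is a 0/1 matrix over S x S, and summing (then clipping) the matrices yields the full transition relation.

```python
import numpy as np

# States indexed 0..3; each action is a 0/1 incidence matrix:
# M[i, j] = 1 iff the action can take state i to state j.
n = 4
a = np.zeros((n, n), dtype=int); a[0, 1] = 1
b = np.zeros((n, n), dtype=int); b[1, 2] = 1; b[1, 3] = 1  # non-deterministic
c = np.zeros((n, n), dtype=int); c[2, 3] = 1

# The set of all possible transitions: the sum of the incidence matrices,
# clipped to 0/1 since we only care whether a transition exists.
T = np.clip(a + b + c, 0, 1)

# One step of reachability from state 0: a row vector times T.
reach = np.zeros(n, dtype=int); reach[0] = 1
print(np.flatnonzero(reach @ T))      # reachable in one step  -> [1]
print(np.flatnonzero(reach @ T @ T))  # reachable in two steps -> [2 3]
```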

20 Manipulating Transition Systems

21 MDPs = general transition systems
- A Markov Decision Process (MDP) is a general (deterministic or not) transition system where the states have "rewards". In the general case, all states can have varying amounts of reward.
- Planning is defined as finding a "policy": a mapping from states to actions which has the maximal expected reward. (A tiny value-iteration sketch follows below.)
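A tiny value-iteration sketch for such a reward-bearing transition system; the MDP below (probabilities, rewards, discount) is invented, and value iteration is just one standard way to compute a reward-maximizing policy.

```python
# Toy MDP: P[s][a] = list of (prob, next_state); R = state rewards.
P = {
    0: {"stay": [(1.0, 0)], "go": [(0.8, 1), (0.2, 0)]},
    1: {"stay": [(1.0, 1)], "go": [(1.0, 0)]},
}
R = {0: 0.0, 1: 1.0}
gamma = 0.9  # discount factor (assumed)

# Value iteration: repeatedly back up expected values, then read off
# the greedy policy (state -> action with maximal expected value).
V = {s: 0.0 for s in P}
for _ in range(100):
    V = {s: R[s] + gamma * max(
            sum(p * V[s2] for p, s2 in outcomes)
            for outcomes in P[s].values())
         for s in P}

policy = {s: max(P[s], key=lambda a: sum(p * V[s2] for p, s2 in P[s][a]))
          for s in P}
print(policy)  # -> {0: 'go', 1: 'stay'}
```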

22 Problems with transition systems
- Transition systems are a great conceptual tool... however, direct manipulation of transition systems tends to be too cumbersome: the size of the explicit graph corresponding to a transition system is often very large.
- The remedy is to provide "compact" representations:
  - Start by explicating the structure of the "states", e.g. states specified in terms of state variables.
  - Represent actions not as incidence matrices but as functions specified directly in terms of the state variables: an action will work in any state where some state variables have certain values, and when it works it will change the values of certain (other) state variables. (A sketch follows below.)
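A hedged sketch of the factored view, in the spirit of STRIPS: states as variable assignments, actions as precondition/effect pairs over those variables. Variable and action names are invented.

```python
# Factored model: a state is an assignment to state variables;
# an action has preconditions (required values) and effects (new values).

def applicable(state, precond):
    return all(state.get(v) == val for v, val in precond.items())

def apply_effects(state, effects):
    new_state = dict(state)
    new_state.update(effects)
    return new_state

state = {"robot_at": "room1", "door_open": False}
open_door = ({"door_open": False}, {"door_open": True})
move = ({"door_open": True, "robot_at": "room1"}, {"robot_at": "room2"})

for name, (pre, eff) in [("open_door", open_door), ("move", move)]:
    if applicable(state, pre):
        state = apply_effects(state, eff)
        print(name, "->", state)
```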

23 Factoring States
3 propositional variables P, Q, R give 2^3 = 8 world states.
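Making the count concrete (pure illustration):

```python
from itertools import product

# 3 propositional variables -> 2**3 = 8 world states.
states = [dict(zip("PQR", values)) for values in product([False, True], repeat=3)]
print(len(states))  # -> 8
```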

24 Boolean Functions
A Boolean function maps variable assignments to truth values: {P, Q, R} -> {T/F}. (The slide illustrated one such function over P and Q with its truth table and BDD.) BDDs represent Boolean functions, and hence sets of states, compactly.
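A tiny sketch of representing a state set by its characteristic Boolean function; the formula (not P) or Q is an assumed example, not necessarily the slide's.

```python
from itertools import product

# Represent a set of states by its characteristic Boolean function.
# Example formula (assumed for illustration): (not P) or Q.
def f(state):
    return (not state["P"]) or state["Q"]

states = [dict(zip("PQR", v)) for v in product([False, True], repeat=3)]
satisfying = [s for s in states if f(s)]
print(len(satisfying))  # 6 of the 8 states satisfy (not P) or Q
```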

