1
Goals, plans, and planning
Northwestern University
CS 395 Behavior-Based Robotics
Ian Horswill
2
Modal logic
Need to reason about
  States of knowledge
  Goals
These aren’t propositions about objects …
… but rather about other propositions

    (define-signal front-sonar …
      (mode (know (< front-sonar 2000))))
    …
    (define-signal fspace
      (min front-sonar front-left-sonar front-right-sonar))
    (define-signal advance
      (behavior (know fspace)
                (rt-vector 0 fspace)))
3
Modalities in GRL
In GRL, a modality is a special kind of signal procedure
  The signal it returns is just a default
  You can override it with a mode declaration
  It’s memoized so that it always returns the same signal object when called on the same signal object

    (define-signal-modality (mymode x)
      … compute default …)

    (define-signal sig expr
      (mode (mymode expr)))
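The memoization and override behavior can be sketched outside GRL. The Python below is only an illustration, not GRL's implementation: the `Signal` class, the `modes` dictionary, and the `modality` decorator are hypothetical names, and the default rule for `know` (known when all inputs are known) is an assumption chosen for the example.

```python
class Signal:
    """Hypothetical stand-in for a GRL signal: a name, inputs, and mode overrides."""

    def __init__(self, name, inputs=(), modes=None):
        self.name = name
        self.inputs = tuple(inputs)
        # Explicit mode declarations, keyed by modality name (assumed representation).
        self.modes = dict(modes or {})


def modality(compute_default):
    """Wrap a default-computing function as a memoized, overridable modality."""
    cache = {}  # Signal objects hash by identity, so each signal gets one result

    def apply(sig):
        if sig not in cache:
            override = sig.modes.get(compute_default.__name__)
            cache[sig] = override if override is not None else compute_default(sig)
        return cache[sig]

    return apply


@modality
def know(sig):
    # Assumed default: a signal is known when all of its inputs are known.
    return Signal(f"know({sig.name})", inputs=[know(i) for i in sig.inputs])


# front-sonar declares its own know mode; front-left-sonar falls back to the default.
front = Signal("front-sonar", modes={"know": Signal("front-sonar-valid")})
left = Signal("front-left-sonar")
fspace = Signal("fspace", inputs=[front, left])
```

Calling `know(fspace)` twice returns the same object, and its first input is the override that `front-sonar` declared rather than a freshly computed default.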
4
Simplified modality definitions

    (define-signal-modality (know x)
      (define inputs (signal-inputs x))
      (signal-expression (apply and (know inputs))))

    (define-signal-modality (goal x)
      (define the-mode (signal-expression (accumulate or)))
      (define (forward-goal y)
        (drive-signal! x y))
      (for-each forward-goal (signal-inputs x))
      the-mode)
5
GRL modal logic API
(know x)
  Whether x’s value is known
(goal x)
  True if x is a goal of achievement
  Robot “wants” to make it true and move on
(maintain-goal x)
  True if x is a maintenance goal
  Robot “wants” to make it true and keep it true
(know-goal x)
  True if x is a knowledge goal
  Robot “wants” to determine the value of x
6
Built-in inference axioms
  (know (operator arg …))  =  (and (know arg) …)
  (goal (know x))          =  (know-goal x)
  (goal (maintain x))      =  (maintain-goal x)
  (know (know x))          =  true
  (know (goal x))          =  true
7
Goal reduction API

    (define-signal s (and a b c …))
    (define-reduction s parallel)

  When s is a goal, all its inputs are goals
  This is what was shown three slides ago

    (define-signal s (and a b c …))
    (define-reduction s serial)

  When s is a goal, a is a goal
  When s is a goal and a is true, b is a goal
  When s is a goal and both a and b are true, c is a goal
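The difference between the two reduction modes can be made concrete with a small Python sketch. This is not GRL code: `reduce_goal` and its arguments are hypothetical, and it assumes the conjunction s is currently a goal.

```python
def reduce_goal(inputs, truth, mode):
    """Return the inputs that become goals when their conjunction is a goal.

    inputs: ordered list of input names
    truth:  dict mapping each input name to its current boolean value
    mode:   "parallel" or "serial"
    """
    if mode == "parallel":
        # Every input is a goal at once.
        return list(inputs)
    if mode == "serial":
        goals = []
        for name in inputs:
            goals.append(name)
            if not truth[name]:
                break  # later inputs wait until this one is true
        return goals
    raise ValueError(f"unknown reduction mode: {mode}")


inputs = ["a", "b", "c"]
nothing_true = {"a": False, "b": False, "c": False}
a_true = {"a": True, "b": False, "c": False}

parallel_goals = reduce_goal(inputs, nothing_true, "parallel")  # a, b and c
serial_start = reduce_goal(inputs, nothing_true, "serial")      # only a
serial_after_a = reduce_goal(inputs, a_true, "serial")          # a and b
```

Serial reduction turns an and gate into a sequence: the robot works on one conjunct at a time, in the order written.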
8
Useful functions
(know-that x)
  True if (know x) and x
(satisfied-goal x)
  True if x is a goal and is true
(unsatisfied-goal x)
  True if x is a goal and is false
(parallel-and a b c …)
  And gate with parallel goal reduction
(serial-and a b c …)
  And gate with serial goal reduction
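The first three functions are simple combinations of a signal's value with its know and goal modes. As a rough Python sketch over plain booleans (the function and parameter names are hypothetical, and real GRL signals are not plain booleans):

```python
def know_that(known, value):
    """(know-that x): x's value is known and x is true."""
    return known and value


def satisfied_goal(is_goal, value):
    """(satisfied-goal x): x is a goal and is currently true."""
    return is_goal and value


def unsatisfied_goal(is_goal, value):
    """(unsatisfied-goal x): x is a goal and is currently false."""
    return is_goal and not value
```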
9
Planning
Given
  Goal (desired state of the environment)
  Current state of the environment
  Set of actions
  Descriptions of how actions change the state of the environment
  Actions are essentially functions from states to states
Find a series of actions (called a plan) that will result in the desired goal state
10
A bad planning algorithm
Key idea: simulate every possible series of actions until your simulation finds the goal

    Plan(s, g) {
      for each action a {
        let s’ = a(s)            // the state after running a
        if s’ == g
          return a               // one-step plan
        else
          try { return a + Plan(s’, g) }
          catch backtrack {}     // Try another action
      }
      throw backtrack;
    }
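A runnable Python sketch of the same search follows. Two things are assumptions added for the example: a depth limit, without which the recursion never terminates on domains with unbounded or cyclic state spaces, and a toy domain whose state is a single integer.

```python
class Backtrack(Exception):
    """Raised when no action sequence from this state reaches the goal."""


def plan(state, goal, actions, depth):
    """Depth-limited naive search; returns a list of action names."""
    if depth == 0:
        raise Backtrack
    for name, act in actions.items():
        next_state = act(state)  # simulate the action
        if next_state == goal:
            return [name]
        try:
            return [name] + plan(next_state, goal, actions, depth - 1)
        except Backtrack:
            continue  # try another action
    raise Backtrack


# Toy domain (hypothetical): the state is an integer.
ACTIONS = {
    "inc": lambda s: s + 1,
    "double": lambda s: s * 2,
}

result = plan(1, 6, ACTIONS, depth=4)
print(result)  # ['inc', 'inc', 'double']
```

The plan found is the first one in depth-first order, not necessarily the shortest, and a goal that is unreachable within the depth limit surfaces as a `Backtrack` exception at the top level.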
11
Complexity
Have to search a tree of plans
  If there are n possible actions, there are n^m possible m-step plans
Naïve algorithm is exponential
Clever optimizations possible, but it’s still basically an exponential problem
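The n^m count can be checked by brute-force enumeration for small cases. The action names below are made up for the example.

```python
from itertools import product


def count_plans(actions, steps):
    """Count every sequence of `steps` actions by explicit enumeration."""
    return sum(1 for _ in product(actions, repeat=steps))


actions = ["forward", "left", "right", "stop"]  # hypothetical action set, n = 4
counts = {m: count_plans(actions, m) for m in (1, 2, 5)}
print(counts)  # {1: 4, 2: 16, 5: 1024}
```

With the same four actions, a 20-step plan space already has 4^20 = 1,099,511,627,776 candidates.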
12
Generalizations
Conditional planning
  Allow ifs inside of the plan to handle contingencies
  More robust
  More expensive to plan
Automatic programming
  Plans can be arbitrary programs
  Fully undecidable
13
Generalizations (2)
Markov Decision Problems (MDPs)
  Actions aren’t deterministic
  Only know a probability distribution on the possible result states for each action
  Actions are now functions from probability distributions to probability distributions
  Plan can’t be a program anymore (how do you know what the output state is?)
  Payoff function that tells you how good a state is
  Find the policy that gives you the best expected (i.e. average over the state probability distribution) payoff
  Really really expensive
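One standard way to compute such a policy is value iteration, sketched below in Python. The slide does not name an algorithm, so this choice is an assumption, as are the discount factor `gamma`, the fixed number of sweeps, and the two-state toy problem.

```python
def expected_value(value, dist):
    """Average of `value` over a {next_state: probability} distribution."""
    return sum(p * value[s2] for s2, p in dist.items())


def value_iteration(states, actions, transition, payoff, gamma=0.9, sweeps=100):
    """transition[s][a] is {next_state: probability}; payoff[s] is a number."""
    value = {s: 0.0 for s in states}
    for _ in range(sweeps):
        value = {
            s: payoff[s]
            + gamma * max(expected_value(value, transition[s][a]) for a in actions)
            for s in states
        }
    # The policy maps each state to the action with the best expected value.
    policy = {
        s: max(actions, key=lambda a: expected_value(value, transition[s][a]))
        for s in states
    }
    return value, policy


# Hypothetical toy problem: being near the target pays 1, being far pays 0.
STATES = ["far", "near"]
MOVES = ["go", "stay"]
TRANSITION = {
    "far": {"go": {"near": 0.8, "far": 0.2}, "stay": {"far": 1.0}},
    "near": {"go": {"far": 1.0}, "stay": {"near": 0.9, "far": 0.1}},
}
PAYOFF = {"far": 0.0, "near": 1.0}

value, policy = value_iteration(STATES, MOVES, TRANSITION, PAYOFF)
print(policy)  # {'far': 'go', 'near': 'stay'}
```

The result is a table from states to actions rather than a fixed action sequence, which is why a plan in the earlier sense no longer suffices.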
14
Generalizations (3)
Partially Observable MDPs (POMDPs)
  Actions aren’t deterministic
  Don’t know what state you’re in
  Sensors only give us a probability distribution on states
    Not states
  Policy has to map probability distributions (called “belief states”) to actions
    Not states to actions
  Payoff function that tells you how good a state is
  Find the policy that gives you the best expected (i.e. average over the state probability distribution) payoff
  Really really really expensive
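A belief state is maintained by a Bayes update after each action and observation. The Python sketch below shows only that update, not a POMDP solver. The sensor model, the state names, and the observation names are hypothetical.

```python
def update_belief(belief, action, observation, transition, sensor):
    """Return the new belief state after taking `action` and seeing `observation`.

    belief:     {state: probability}
    transition: transition[s][a] is {next_state: probability}
    sensor:     sensor[s][obs] is the probability of seeing obs in state s
    """
    # Prediction step: push the belief through the action model.
    predicted = {s: 0.0 for s in belief}
    for s, p in belief.items():
        for s2, q in transition[s][action].items():
            predicted[s2] += p * q
    # Correction step: weight each state by how well it explains the observation.
    weighted = {s: sensor[s].get(observation, 0.0) * p for s, p in predicted.items()}
    total = sum(weighted.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: w / total for s, w in weighted.items()}


# Hypothetical models: "stay" leaves the state unchanged; the sonar is noisy.
STAY = {"far": {"stay": {"far": 1.0}}, "near": {"stay": {"near": 1.0}}}
SENSOR = {
    "far": {"close": 0.2, "clear": 0.8},
    "near": {"close": 0.9, "clear": 0.1},
}

prior = {"far": 0.5, "near": 0.5}
posterior = update_belief(prior, "stay", "close", STAY, SENSOR)
# posterior["near"] is 0.45 / 0.55, roughly 0.82
```

A POMDP policy takes a dictionary like `posterior` as its input, which is what makes the policy space so much larger than in the fully observable case.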
15
Generalizations (4)
Can you detect a pattern here?
How to get tenure
  Find a complicated instance of a problem that current technology can’t handle
  Devise an elegant yet prohibitively expensive technology to solve it
  Write a paper that starts with “To survive in complex dynamic worlds, an agent must …”
  Add a description of your technique
  Prove a lot of theorems about how your technique will solve all instances of the problem given more CPU time than the lifetime of the universe
  Write: “Future work: make it fast”