
1 11/5: Bayes Nets project due; Prolog project assigned. Today: FOPC, resolution theorem proving, situation calculus, leading to planning.


4 Efficiency can be improved by re-ordering subgoals adaptively, e.g., try to prove Pet before Small on Lilliput Island, and Small before Pet in a pet store.

5 Forward (bottom-up) vs. backward (top-down) chaining

Suppose we have: P => Q; Q & R => S; S => Z; Z & Q => W; Q => J; and the facts P and R. We want to prove J.

Forward chaining fires rules starting from the facts:
–Using P, derive Q
–Using Q & R, derive S
–Using S, derive Z
–Using Z & Q, derive W
–Using Q, derive J
–No more inferences. Check if J holds. It does, so J is proved.

Backward chaining starts from the theorem to be proved:
–We want to prove J
–Using Q => J, we can subgoal on Q
–Using P => Q, we can subgoal on P
–P holds. We are done.

Forward chaining allows parallel derivation of many facts together, but it may derive facts that are not relevant to the theorem. Backward chaining concentrates on proving subgoals that are relevant to the theorem, but it proves theorems one at a time. There is some similarity with progression vs. regression…
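The forward-chaining derivation above can be sketched directly in code. Below is a minimal Python sketch (the rule and fact names are the slide's; the naive fixed-point loop is for illustration, not an optimized implementation):

```python
# Forward chaining over the slide's definite clauses.
# Each rule is (body, head): fire any rule whose body is fully
# satisfied by the current facts, until no new facts can be derived.
rules = [
    ({"P"}, "Q"),
    ({"Q", "R"}, "S"),
    ({"S"}, "Z"),
    ({"Z", "Q"}, "W"),
    ({"Q"}, "J"),
]
facts = {"P", "R"}

changed = True
while changed:
    changed = False
    for body, head in rules:
        if body <= facts and head not in facts:
            facts.add(head)
            changed = True

print(sorted(facts))  # J is among the derived facts, so J is proved
```

Note that the loop also derives S, Z, and W, which are irrelevant to proving J; this is exactly the non-goal-directedness the slide describes.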

6 Datalog and deductive databases

A deductive database is a generalization of a relational database: in addition to the relational store, we also have a set of "rules".
–The rules are in definite clause form: universally quantified implications with one non-negated head and a conjunction of non-negated tail literals.
When a query is asked, answers are retrieved both from the relational store and by deriving new facts using the rules. Inference in deductive databases thus uses the GMP rule.
Since deductive databases have to derive all answers to a query, top-down evaluation winds up being too inefficient, so bottom-up (forward-chaining) evaluation is used, even though it tends to derive non-relevant facts.
A neat idea called magic sets allows us to temporarily rewrite the rules (given a specific query) so that forward chaining on the modified rules avoids deriving some of the irrelevant facts.
Example: base facts P(a,b), Q(b), R(c); rule P(x,y), Q(y) => R(y); the query ?R(z) returns R(c) from the RDBMS and R(b) via the rule.
(Connection: this parallels progression becoming goal-directed w.r.t. planning-graph reachability heuristics.)
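A naive bottom-up evaluator for the slide's example can be sketched as follows (Python for illustration; real deductive databases use semi-naive evaluation plus magic-sets rewriting rather than this brute-force loop):

```python
# Naive bottom-up (forward-chaining) evaluation of the slide's rule
#   P(x,y), Q(y) => R(y)
# over the base facts. Facts are tuples (predicate, arg, ...).
facts = {("P", "a", "b"), ("Q", "b"), ("R", "c")}

changed = True
while changed:
    changed = False
    for (_, x, y) in [f for f in facts if f[0] == "P"]:
        if ("Q", y) in facts and ("R", y) not in facts:
            facts.add(("R", y))
            changed = True

# Query ?R(z): collect all bindings for z
answers = sorted(z for (pred, *args) in facts if pred == "R" for z in args)
print(answers)  # ['b', 'c']
```

R(c) comes straight from the relational store; R(b) is derived by the rule, matching the slide's answer set.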

7 Similar to “Integer Programming” or “Constraint Programming”

8 Generate compilable matchers for each pattern, and use them


13 Example of FOPC resolution

–Everyone is loved by someone: after Skolemization, Loves(SK(x), x).
–If x loves y, x will give a valentine card to y: in clause form, ~Loves(x, y) V Gives(x, y).
–Query: will anyone give Rao a valentine card? Negate it: ~Gives(z, Rao).
–Resolving the negated query against the second clause (x/z, y/Rao) yields ~Loves(z, Rao).
–Resolving that against Loves(SK(x'), x') (z/SK(Rao), x'/Rao) yields the empty clause: someone, namely SK(Rao), gives Rao a valentine card.

14 Finding where you left your key

–Clause 1: Atkey(Home) V Atkey(Office)
–Where is the key? Ex Atkey(x). Negate: Forall x ~Atkey(x); in CNF, clause 2: ~Atkey(x)
–Resolve 2 and 1 with x/Home: you get clause 3: Atkey(Office)
–Resolve 3 and 2 with x/Office: you get the empty clause
So resolution refutation "found" that there does exist a place where the key is. Where is it? What is x bound to? x is bound to Home once and Office once, so the key is either at Home or at the Office.
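Since there are only two places, the negated query grounds to two unit clauses, and a tiny propositional resolution loop suffices to reproduce the refutation (a hedged sketch; the saturation loop is the naive one, with a literal represented as an (atom, sign) pair):

```python
# Resolution refutation for the key example, on ground literals.
# Clauses: Atkey(Home) v Atkey(Office), plus the negated query
# ~Atkey(x) grounded to ~Atkey(Home) and ~Atkey(Office).
from itertools import combinations

def resolve(c1, c2):
    """Return all resolvents of two clauses (literal = (atom, sign))."""
    out = []
    for atom, sign in c1:
        if (atom, not sign) in c2:
            out.append((c1 - {(atom, sign)}) | (c2 - {(atom, not sign)}))
    return out

clauses = {
    frozenset({("Atkey(Home)", True), ("Atkey(Office)", True)}),
    frozenset({("Atkey(Home)", False)}),   # x/Home
    frozenset({("Atkey(Office)", False)}), # x/Office
}

empty_found = False
while not empty_found:
    new = set()
    for c1, c2 in combinations(clauses, 2):
        for r in resolve(c1, c2):
            if not r:
                empty_found = True
            new.add(r)
    if new <= clauses:
        break
    clauses |= new

print(empty_found)  # True: refutation succeeds, so the key is somewhere
```

The intermediate resolvent Atkey(Office) appears in the clause set along the way, mirroring clause 3 on the slide.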

15 Existential proofs

The previous example shows that resolution refutation is powerful enough to model existential proofs; in contrast, generalized modus ponens can only model constructive proofs. (We also discussed a cute example of an existential proof: is it possible for an irrational number raised to an irrational power to be rational? We proved it is possible, without actually giving an example.)

16 Existential proofs

Are there irrational numbers p and q such that p^q is rational? This and the previous example show that resolution refutation is powerful enough to model existential proofs, while generalized modus ponens can only model constructive proofs.
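The classic non-constructive argument alluded to here runs as follows:

```latex
\text{Let } x = \sqrt{2}^{\sqrt{2}}.
\begin{itemize}
\item If $x$ is rational, take $p = q = \sqrt{2}$ (both irrational) and we are done.
\item If $x$ is irrational, take $p = x$ and $q = \sqrt{2}$; then
  \[ p^q = \bigl(\sqrt{2}^{\sqrt{2}}\bigr)^{\sqrt{2}}
         = \sqrt{2}^{\sqrt{2}\cdot\sqrt{2}} = \sqrt{2}^{2} = 2, \]
  which is rational.
\end{itemize}
\text{Either way such $p$ and $q$ exist, but the proof does not tell us which case holds.}
```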

17 GMP vs. resolution refutation

While resolution refutation is a complete inference procedure for FOPC, it is computationally semi-decidable, a far cry from the polynomial cost of GMP inferences. So most common uses of FOPC involve GMP-style reasoning rather than full theorem proving. There is a controversy in the community as to the right way to handle the computational complexity:
–a. Develop "tractable subclasses" of languages and require the expert to write all their knowledge in the procrustean beds of those subclasses (so we can claim "complete and tractable inference" for that class), OR
–b. Let users write their knowledge in fully expressive FOPC, but do only incomplete (though sound) inference.
–See Doyle & Patil's "Two Theses of Knowledge Representation".

18 11/7: Homework 4 due 11/14. Make-up class 11/9, same time, same room + Raspberry Bars. Final exam is scheduled for 12/10, 12:20-2:10pm.

19 Situation Calculus: time & change in FOPC

SitCalc is a special class of FOPC with:
–Special terms called "situations", which can be thought of as referring to snapshots of the universe at various times.
–Special terms called "actions": Putdown(A), Stack(B,x), etc. (A, B constants).
–A special function Result, which returns a situation: Result(action-term, situation-term), e.g. Result(putdown(A), S).
World properties can be modeled as predicates with an extra situation argument, e.g. Clear(B, S0).
Actions are modeled in terms of what needs to be true in the situation where the action takes place, and what will be true in the situation that results. You can also have intra-situation axioms.


21 …So, is Planning = Theorem Proving? …Yes, BUT:
–Consider the previous problem, except you now have another block B which is already on the table and is clear. Your goal is to get A onto the table while leaving B clear.
–Sounds like a no-brainer, right?
–…but the theorem prover won't budge: it has no axiom telling it that B will remain clear in the situation Result(Putdown(A), S0).
–Big deal… we will throw in an axiom saying that Clear(x) continues to hold in the situation after Putdown(A).
–But WAIT: we are now writing axioms about properties that DO NOT CHANGE. There may be too many axioms like this: if there are K properties and M actions, we need K*M frame axioms, AND we have to resolve against them, increasing the depth of the proof (and thus exponentially increasing the complexity).
–There are ways to reduce the number of frame axioms from K*M to just K: write, for each property P, the only conditions under which it transitions between True and False across situations. These are called successor-state axioms.
–But we still have to explicitly prove to ourselves that everything that has not changed has actually not changed, unless we make additional assumptions, e.g. the STRIPS assumption: if a property is not mentioned in an action's effects, it is assumed to remain the same.
Sphexishness…
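A successor-state axiom for Clear can be sketched in Python (an illustrative translation of the FOPC axiom, assuming only the blocks-world actions stack, unstack, and putdown; the helper names are mine, not from the slides):

```python
# Successor-state axiom for Clear, stated once per fluent instead of
# once per (fluent, action) pair:
#   Clear(x, Result(a, s))  iff
#     a makes x clear, or Clear(x, s) and a does not cover x.
def makes_clear(a, x):
    # unstack(y, x) uncovers x; putdown(x) sets x clear on the table
    return (a[0] == "unstack" and a[2] == x) or \
           (a[0] == "putdown" and a[1] == x)

def covers(a, x):
    # stack(y, x) puts a block on top of x
    return a[0] == "stack" and a[2] == x

def clear(x, situation):
    if situation == "S0":
        return x in ("A", "B")          # both blocks clear initially
    a, s = situation                    # situation = Result(a, s)
    return makes_clear(a, x) or (clear(x, s) and not covers(a, x))

s1 = (("putdown", "A"), "S0")           # Result(putdown(A), S0)
print(clear("B", s1))  # True: B stays clear, no per-action frame axiom needed
```

The point is that one axiom per fluent settles all K*M (fluent, action) questions, which is the slide's K-frame-axioms claim.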

22 One kind of determinism, genetic fixity, is illustrated powerfully by the example of the digger wasp, Sphex ichneumoneus: When the time comes for egg laying, the wasp Sphex builds a burrow for the purpose and seeks out a cricket which she stings in such a way as to paralyze but not kill it. She drags the cricket into the burrow, lays her eggs alongside, closes the burrow, then flies away, never to return. In due course, the eggs hatch and the wasp grubs feed off the paralyzed cricket, which has not decayed, having been kept in the wasp equivalent of deep freeze. To the human mind, such an elaborately organized and seemingly purposeful routine conveys a convincing flavor of logic and thoughtfulness, until more details are examined. For example, the wasp's routine is to bring the paralyzed cricket to the burrow, leave it on the threshold, go inside to see that all is well, emerge, and then drag the cricket in. If the cricket is moved a few inches away while the wasp is inside making her preliminary inspection, the wasp, on emerging from the burrow, will bring the cricket back to the threshold, but not inside, and will then repeat the preparatory procedure of entering the burrow to see that everything is all right. If again the cricket is removed a few inches while the wasp is inside, once again she will move the cricket up to the threshold and re-enter the burrow for a final check. The wasp never thinks of pulling the cricket straight in. On one occasion this procedure was repeated forty times, always with the same result. (Wooldridge, 1963, p. 82)

23 Deterministic planning

Given an initial state I, a goal state G, and a set of actions A: {a1…an}, find a sequence of actions that, when applied from the initial state, leads the agent to the goal state.
Qn: why is this not just a search problem (with actions being operators)?
–Answer: we have "factored" representations of states and actions, and we can use this internal structure to our advantage in formulating the search (forward/backward/inside-out) and in deriving more powerful heuristics.

24 Problems with transition systems

Transition systems are a great conceptual tool for understanding the differences between the various planning problems. However, direct manipulation of transition systems tends to be too cumbersome:
–The size of the explicit graph corresponding to a transition system is often very large (see Homework 1, problem 1).
–The remedy is to provide "compact" representations for transition systems:
–Start by explicating the structure of the "states", e.g. states specified in terms of state variables.
–Represent actions not as incidence matrices but as functions specified directly in terms of the state variables: an action will work in any state where some state variables have certain values, and when it works, it will change the values of certain (other) state variables.

25 Blocks world

State variables: Ontable(x), On(x,y), Clear(x), hand-empty, holding(x)

Stack(x,y): Prec: holding(x), clear(y); Eff: on(x,y), ~clear(y), ~holding(x), hand-empty
Unstack(x,y): Prec: on(x,y), hand-empty, clear(x); Eff: holding(x), ~clear(x), clear(y), ~hand-empty
Pickup(x): Prec: hand-empty, clear(x), ontable(x); Eff: holding(x), ~ontable(x), ~hand-empty, ~clear(x)
Putdown(x): Prec: holding(x); Eff: ontable(x), hand-empty, clear(x), ~holding(x)

Initial state: a complete specification of T/F values to the state variables (by convention, variables with F values are omitted).
Goal state: a partial specification of the desired state-variable/value combinations; desired values can be both positive and negative.

Init: Ontable(A), Ontable(B), Clear(A), Clear(B), hand-empty
Goal: ~Clear(B), hand-empty

All the actions here have only positive preconditions, but this is not necessary.
STRIPS ASSUMPTION: if an action changes a state variable, this must be explicitly mentioned in its effects.

26 What do we lose with STRIPS actions?

We need to write all effects explicitly:
–We can't depend on derived effects, which leads to a loss of modularity. Instead of saying that "Clear" holds when nothing is "On" the block, we have to write Clear effects everywhere. If the blocks now become bigger and can hold two other blocks, you will have to rewrite all the action descriptions.
Then again, the state-variable (STRIPS) model is a step up from the even lower-level "state transition model", where actions are just mappings from states to states (and so must be seen as S×S matrices).
A very loose analogy: state-transition models are like assembly language; (factored) state-variable models are like C; (first-order) sit-calc models are like Lisp.

27 Progression

An action A can be applied to a state S iff its preconditions are satisfied in S. The resulting state S' is computed as follows:
–Every variable that occurs in the action's effects gets the value the action says it should have.
–Every other variable keeps the value it had in the state S where the action is applied.
From Init (Ontable(A), Ontable(B), Clear(A), Clear(B), hand-empty):
–Pickup(A) yields: holding(A), ~Clear(A), ~Ontable(A), Ontable(B), Clear(B), ~hand-empty
–Pickup(B) yields: holding(B), ~Clear(B), ~Ontable(B), Ontable(A), Clear(A), ~hand-empty
STRIPS ASSUMPTION: if an action changes a state variable, this must be explicitly mentioned in its effects.
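With the slide's all-positive preconditions and states as sets of true atoms, the progression rule is essentially a one-liner (a sketch; `dele` holds the atoms the effects make false):

```python
# Progression under the STRIPS assumption: variables mentioned in the
# effects take the effect value; every other variable keeps its value.
# A state is the set of true atoms (false ones omitted, as on slide 25).
def progress(state, prec, add, dele):
    assert prec <= state, "action not applicable"
    return (state - dele) | add

init = {"ontable(A)", "ontable(B)", "clear(A)", "clear(B)", "hand-empty"}

# Pickup(A): Prec: hand-empty, clear(A), ontable(A)
s1 = progress(
    init,
    prec={"hand-empty", "clear(A)", "ontable(A)"},
    add={"holding(A)"},
    dele={"ontable(A)", "hand-empty", "clear(A)"},
)
print(sorted(s1))  # ['clear(B)', 'holding(A)', 'ontable(B)']
```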

28 Generic (progression) planner

Goal-test(S, G): check that every state variable in S that is mentioned in G has the value G gives it.
Child-generator(S, A):
–For each action a in A: if every variable mentioned in Prec(a) has the same value in Prec(a) and in S, then return Progress(S, a) as one of the children of S.
–Progress(S, a) is a state S' where each state variable v has the value v[Eff(a)] if it is mentioned in Eff(a), and the value v[S] otherwise.
Search starts from the initial state.
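Putting the goal test and child generator together with plain BFS gives a minimal progression planner (a sketch over the grounded blocks-world actions of slide 25 for blocks A and B; Unstack is omitted since this particular goal never needs it):

```python
# Generic progression planner: BFS over complete states, with the
# goal-test and child-generator from the slide. Each action maps a
# name to (preconditions, add effects, delete effects).
from collections import deque

def actions_for(blocks):
    acts = {}
    for x in blocks:
        acts[f"pickup({x})"] = (
            {"hand-empty", f"clear({x})", f"ontable({x})"},
            {f"holding({x})"},
            {f"ontable({x})", "hand-empty", f"clear({x})"},
        )
        acts[f"putdown({x})"] = (
            {f"holding({x})"},
            {f"ontable({x})", "hand-empty", f"clear({x})"},
            {f"holding({x})"},
        )
        for y in blocks:
            if x != y:
                acts[f"stack({x},{y})"] = (
                    {f"holding({x})", f"clear({y})"},
                    {f"on({x},{y})", "hand-empty"},
                    {f"holding({x})", f"clear({y})"},
                )
    return acts

def plan(init, goal_pos, goal_neg, acts):
    frontier, seen = deque([(frozenset(init), [])]), {frozenset(init)}
    while frontier:
        s, p = frontier.popleft()
        if goal_pos <= s and not (goal_neg & s):   # goal test
            return p
        for name, (prec, add, dele) in acts.items():
            if prec <= s:                          # child generator
                s2 = frozenset((s - dele) | add)
                if s2 not in seen:
                    seen.add(s2)
                    frontier.append((s2, p + [name]))
    return None

init = {"ontable(A)", "ontable(B)", "clear(A)", "clear(B)", "hand-empty"}
print(plan(init, {"hand-empty"}, {"clear(B)"}, actions_for(["A", "B"])))
# ['pickup(A)', 'stack(A,B)']
```

Note that the search algorithm itself is ordinary BFS; only the goal test and child generator exploit the factored state representation, exactly as slide 30 argues.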

29 State-variable models

The world is made up of states, which are defined in terms of state variables.
–State variables can be boolean (or multi-valued, or continuous).
States are complete assignments over state variables.
–So, k boolean state variables can represent how many states? 2^k.
Actions change the values of the state variables.
–The applicability conditions of actions are also specified in terms of partial assignments over state variables.

30 Planning vs. search: what is the difference?

Search assumes that there are child-generator and goal-test functions which know how to make sense of the states and generate new states. Planning makes the additional assumptions that:
–States can be represented in terms of state variables and their values. Initial and goal states are specified as assignments over state variables, which means the goal test doesn't have to be a black-box procedure.
–Actions modify these state-variable values. The preconditions and effects of the actions are partial assignments over state variables.
Given these assumptions, certain generic goal-test and child-generator functions can be written. Specifically, we discussed one child generator called "progression", another called "regression", and a third called "partial order".
Notice that the additional assumptions made by planning do not change the search algorithms (A*, IDDFS, etc.); they only change the child-generator and goal-test functions. In particular, search still happens over search nodes with parent pointers, etc. The "state" part of the search node corresponds to:
–complete state-variable assignments in the case of progression;
–partial state-variable assignments in the case of regression;
–a collection of steps, orderings, causal commitments, and open conditions in the case of partial-order planning.

31 Why is this more compact (than explicit transition systems)?

In explicit transition systems, actions are represented as state-to-state transitions, where each action is an incidence matrix of size |S|×|S|. In the state-variable model, actions are represented only in terms of the state variables whose values they care about and whose values they affect.
Consider a state space of 1024 states. It can be represented by log2(1024) = 10 state variables. If an action needs variable v1 to be true and makes v7 false, it can be represented by just 2 bits (instead of a 1024×1024 matrix).
–Of course, if the action has a complicated mapping from states to states, in the worst case the action representation will be just as large.
–The assumption being made here is that actions have effects on only a small number of state variables.

32 Regression

A state S can be regressed over an action A (i.e., A is applied in the backward direction to S) iff:
–There is no variable v that is given different values by the effects of A and by the state S (consistency), and
–There is at least one variable v' that is given the same value by the effects of A and by the state S (relevance).
The resulting state S' is computed as follows:
–Every variable that occurs in S and does not occur in the effects of A is copied over to S' with its value as in S.
–Every variable that occurs in the precondition list of A is copied over to S' with the value it has in the precondition list.
Example: regressing [~Clear(B), hand-empty] over Stack(A,B) gives [holding(A), Clear(B)]; over Putdown(A) it gives [~Clear(B), holding(A)]; Putdown(B)?? is not regressable (its effect Clear(B) conflicts with ~Clear(B)).
Termination test: stop when the state S' is entailed by the initial state S_I (same entailment direction as before).
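The two regressability conditions and the computation of S' can be sketched as follows (a Python illustration; goals and effects are atom-to-truth-value maps, which lets goals mention negative values):

```python
# Regression of a (partial) goal state over an action, per the slide:
# the action must not conflict with the goal, must give at least one
# goal literal its value, and the regressed state keeps the untouched
# goal literals and adds the action's preconditions.
def regress(goal, prec, eff):
    if any(eff.get(a) is not None and eff[a] != v for a, v in goal.items()):
        return None          # conflict: not regressable
    if not any(eff.get(a) == v for a, v in goal.items()):
        return None          # irrelevant: gives no goal literal its value
    new = {a: v for a, v in goal.items() if a not in eff}
    new.update(prec)         # preconditions must hold before the action
    return new

goal = {"clear(B)": False, "hand-empty": True}
# Stack(A,B): Prec: holding(A), clear(B);
# Eff: on(A,B), ~clear(B), ~holding(A), hand-empty
prec = {"holding(A)": True, "clear(B)": True}
eff = {"on(A,B)": True, "clear(B)": False, "holding(A)": False,
       "hand-empty": True}
print(regress(goal, prec, eff))
# {'holding(A)': True, 'clear(B)': True}
```

Regressing the same goal over Putdown(B) returns None, since Putdown(B)'s effect Clear(B) conflicts with the goal's ~Clear(B).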

33 On the asymmetry of init/goal states

The goal state is partial.
–It is a (seemingly) good thing: if only m of the k state variables are mentioned in a goal specification, then up to 2^(k-m) complete states of the world can satisfy our goals! I say "seemingly" because sometimes a more complete goal state may provide hints to the agent as to what the plan should be. In the blocks-world example, if we also state On(A,B) as part of the goal (in addition to ~Clear(B) & hand-empty), then it is quite easy to see what the plan should be.
The initial state is complete.
–If the initial state is partial, then we have "partial observability" (i.e., the agent doesn't know where it is!). If only m of the k state variables are known, the agent is in one of 2^(k-m) states! In such cases, the agent needs a plan that will take it from any of these states to a goal state:
–Either a single sequence of actions that works in all the states (e.g. the bomb-in-the-toilet problem),
–or a "conditional plan" that does some limited sensing and based on that decides what action to do. More on all this during the third class.
Because of the asymmetry between init and goal states, progression searches in the space of complete states, while regression searches in the space of "partial" states (sets of states). Specifically, for k state variables there are 2^k complete states and 3^k "partial" states (a state variable may be present positively, present negatively, or not present at all in the goal specification!).

34 Regression vs. reversibility

Notice that regression doesn't require that the actions be reversible in the real world:
–We only think of actions in the reverse direction during simulation,
–just as we think of them in terms of their individual effects during partial-order planning.
The normal blocks world is reversible (if you don't like the effects of Stack(A,B), you can do Unstack(A,B)). However, if the blocks world has a "bomb the table" action, then normally there won't be a way to reverse its effects.
–But even with that action we can do regression. For example, we can reason that the best way to make the table go away is to add the "bomb" action as the last action of the plan, although it might also make you go away.

35 Progression vs. regression: the never-ending war, part 1

Progression has a higher branching factor; it searches in the space of complete (and consistent) states.
Regression has a lower branching factor; it searches in the space of partial states.
–There are 3^n partial states (as against 2^n complete states).
You can also do bidirectional search: stop when a (leaf) state in the progression tree entails a (leaf) state (formula) in the regression tree.

36 Domain model for Have-Cake and Eat-Cake problem

37 The three different child-generator functions (progression, regression, and partial-order planning) correspond to three different ways of proving the correctness of a plan. Notice that the proof of causal correctness is akin to the proof of n-queens correctness: if there are no conflicts, it is a solution.

38 Plan-space planning: terminology

Step: a step in the partial plan, which is bound to a specific action.
Orderings: s1 < s2 means s1 must precede s2.
Open conditions: preconditions of the steps (including the goal step).
Causal link (s1 -p-> s2): a commitment that the condition p, needed at s2, will be made true by s1.
–This requires s1 to "cause" p: either have an effect p, or have a conditional effect p which is FORCED to happen (by adding a secondary precondition to s1).
Unsafe link: (s1 -p-> s2; s3), if s3 can come between s1 and s2 and undo p (has an effect that deletes p).
Empty plan: { S:{I,G}; O:{I<G}; OC:{g1@G, g2@G, …}; CL:{}; US:{} }

39 POP background: the algorithm

1. Initial plan: P = { S:{S0, S∞}; O:{S0 < S∞}; OC:{g1@S∞, g2@S∞, …}; CL:{}; US:{} }.
2. Flaw selection: choose a flaw f (either an open condition or an unsafe link).
3. Flaw resolution (plan refinement):
–If f is an open condition, choose an action S that achieves f.
–If f is an unsafe link, choose promotion or demotion.
–Update P. Return NULL if no resolution exists.
4. If there is no flaw left, return P; else go to 2.

Choice points:
–Flaw selection (open condition? unsafe link?)
–Flaw resolution (how to select/rank the partial plan?)
–Action selection (backtrack point)
–Unsafe-link resolution (backtrack point)


44 S_infty < S2

45 If it helps take away some of the pain, you may note that the Remote Agent used a form of partial-order planner!

46 Relevance, reachability & heuristics

Progression takes "applicability" of actions into account.
–Specifically, it guarantees that every state in its search queue is reachable, but it has no idea whether those states are relevant (constitute progress towards the top-level goals). So heuristics for progression need to help it estimate the "relevance" of the states in the search queue.
Regression takes "relevance" of actions into account.
–Specifically, it makes sure that every state in its search queue is relevant, but it has no idea whether the states (more accurately, state sets) in its search queue are reachable. So heuristics for regression need to help it estimate the "reachability" of the states in the search queue.
Reachability: given a problem [I,G], a (partial) state S is called reachable if there is a sequence [a1, a2, …, ak] of actions which, when executed from state I, leads to a state where S holds.
Relevance: given a problem [I,G], a state S is called relevant if there is a sequence [a1, a2, …, ak] of actions which, when executed from S, leads to a state satisfying G. (Relevance is reachability from the goal state.)
Since relevance is nothing but reachability from the goal state, reachability analysis can form the basis for good heuristics.

47 Subgoal interactions

Suppose we have a set of subgoals G1, …, Gn, and the lengths of the shortest plans for achieving the subgoals in isolation are l1, …, ln. We want to know the length l(1..n) of the shortest plan for achieving the n subgoals together.
–If the subgoals are independent: l(1..n) = l1 + l2 + … + ln
–If the subgoals have +ve interactions alone: l(1..n) < l1 + l2 + … + ln
–If the subgoals have -ve interactions alone: l(1..n) > l1 + l2 + … + ln
If you make the "independence" assumption and add up the individual costs of the subgoals, the resulting heuristic will be:
–perfect if the goals are actually independent;
–inadmissible (over-estimating) if the goals have +ve interactions;
–uninformed (hugely under-estimating) if the goals have -ve interactions.

48 Scalability of planning

Before, planning algorithms could synthesize plans of about 6-10 actions in minutes. There has been significant scale-up in the last 6-7 years: now we can synthesize 100-action plans in seconds, on realistic encodings of the Munich airport! The primary revolution in planning in recent years has been domain-independent heuristics for scaling up plan synthesis. The problem is search control!!! …and now for a ring-side retrospective.

49 Planning graph basics

–A planning graph is an envelope of the progression tree (a relaxed progression): it grows linearly where the progression tree grows exponentially.
–Reachable states correspond to subsets of the proposition lists, BUT not all subsets are states.
–It can be used for estimating non-reachability: if a state S is not a subset of the k-th level proposition list, then it is definitely not reachable in k steps. [ECP, 1997]
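The "envelope of progression" idea can be sketched by growing proposition layers while ignoring delete effects (a relaxed-reachability sketch over the blocks-world actions for A and B; `prop_levels` is my name for the routine, not standard terminology):

```python
# Relaxed progression: grow proposition layers while ignoring delete
# effects. The level at which an atom first appears lower-bounds the
# number of steps needed to reach it; if an atom is absent from the
# k-th layer, no k-step plan can achieve it.
def prop_levels(init, acts, max_level=10):
    level = {p: 0 for p in init}
    props = set(init)
    for k in range(1, max_level + 1):
        new = set()
        for prec, add in acts:
            if prec <= props:
                new |= add - props
        if not new:
            break                 # fixpoint: nothing else is reachable
        for p in new:
            level[p] = k
        props |= new
    return level

acts = [  # grounded blocks-world actions for A, B (preconds, adds only)
    ({"hand-empty", "clear(A)", "ontable(A)"}, {"holding(A)"}),
    ({"hand-empty", "clear(B)", "ontable(B)"}, {"holding(B)"}),
    ({"holding(A)", "clear(B)"}, {"on(A,B)", "hand-empty"}),
    ({"holding(B)", "clear(A)"}, {"on(B,A)", "hand-empty"}),
    ({"holding(A)"}, {"ontable(A)", "hand-empty", "clear(A)"}),
    ({"holding(B)"}, {"ontable(B)", "hand-empty", "clear(B)"}),
]
init = {"ontable(A)", "ontable(B)", "clear(A)", "clear(B)", "hand-empty"}
levels = prop_levels(init, acts)
print(levels["on(A,B)"])  # 2: on(A,B) first appears in the second layer
```

Because deletes are ignored, the layers over-approximate the reachable states, which is exactly what makes the resulting level numbers usable as admissible non-reachability estimates.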

