
1 4/22: Unexpected Hanging and other sadistic pleasures of teaching  Today: Probabilistic Plan Recognition  Tomorrow: Web Service Composition (BY 510; 11AM)  Thursday: Continual Planning for Printers (in-class)  Tuesday 4/29: (Interactive) Review

2 Oregon State University Approaches to plan recognition  Consistency-based  Hypothesize & revise  Closed-world reasoning  Version spaces  Probabilistic  Stochastic grammars  Pending sets  Dynamic Bayes nets  Layered hidden Markov models  Policy recognition  Hierarchical hidden semi-Markov models  Dynamic probabilistic relational models  Example application: Assisted Cognition The two can be complementary: first pick the consistent plans, then check which of them is most likely (tricky if the agent can make errors)

3 Oregon State University Agenda (as actually realized in class)  Plan recognition as probabilistic (max weight) parsing  On the connection between dynamic Bayes nets and plan recognition, with a detour on the special inference tasks for DBNs  Examples of plan recognition techniques based on setting up DBNs and doing MPE inference on them  Discussion of the Decision-Theoretic Assistance paper

4 Stochastic grammars Huber, Durfee, & Wellman, "The Automated Mapping of Plans for Plan Recognition", 1994. Darnell Moore and Irfan Essa, "Recognizing Multitasked Activities from Video using Stochastic Context-Free Grammar", AAAI-02, 2002. CF grammar w/ probabilistic rules. Chart parsing + Viterbi. Successful for highly structured tasks (e.g. playing cards). Problems: errors, context
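To make "chart parsing + Viterbi" concrete, here is a minimal sketch with an invented toy grammar (the rules, symbols, and probabilities are illustrative, not from the papers above): CKY-style Viterbi parsing over a probabilistic context-free grammar in Chomsky normal form, where the terminals are observed low-level actions.

```python
from collections import defaultdict

# Hypothetical PCFG in Chomsky normal form: binary rules A -> B C and
# lexical rules A -> action, each with a probability.
binary_rules = {  # (B, C) -> list of (A, prob)
    ("PickCard", "PlayCard"): [("Turn", 0.9)],
    ("Turn", "Turn"): [("Game", 0.5), ("Turn", 0.5)],
}
lexical_rules = {  # observed action -> list of (A, prob)
    "pick": [("PickCard", 1.0)],
    "play": [("PlayCard", 1.0)],
}

def viterbi_parse(actions):
    """Return, for each nonterminal spanning the whole observed action
    sequence, the probability of its best derivation (CKY + Viterbi)."""
    n = len(actions)
    best = defaultdict(lambda: defaultdict(dict))  # best[i][j][A] = max prob
    for i, act in enumerate(actions):
        for (A, p) in lexical_rules.get(act, []):
            best[i][i + 1][A] = p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for B, pb in best[i][k].items():
                    for C, pc in best[k][j].items():
                        for (A, pr) in binary_rules.get((B, C), []):
                            cand = pr * pb * pc
                            if cand > best[i][j].get(A, 0.0):
                                best[i][j][A] = cand
    return dict(best[0][n])

print(viterbi_parse(["pick", "play", "pick", "play"]))
```

Each chart cell keeps, for every nonterminal, only the probability of its best derivation; this max-product bookkeeping is exactly the Viterbi variant of ordinary chart parsing.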

5 Probabilistic State-dependent grammars

6 Connection with DBNs

7 Time and Change in Probabilistic Reasoning

8 Temporal (Sequential) Process A temporal process is the evolution of system state over time Often the system state is hidden, and we need to reconstruct the state from the observations Relation to Planning: –When you are observing a temporal process, you are observing the execution trace of someone else’s plan…
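The core inference task for such a process is filtering: reconstructing the posterior over the hidden state from the observations so far. For reference, the standard recursive update under the usual Markov and sensor assumptions (a textbook identity, stated here for orientation) is:

```latex
P(X_{t+1} \mid e_{1:t+1}) \;\propto\; P(e_{t+1} \mid X_{t+1}) \sum_{x_t} P(X_{t+1} \mid x_t)\, P(x_t \mid e_{1:t})
```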

9

10

11

12 Dynamic Bayes Networks are “templates” for specifying the relation between the values of a random variable across time-slices  e.g. How is Rain at time t related to Rain at time t+1? We call them templates because they need to be expanded (unfolded) to the required number of time steps to reason about the connection between variables at different time points
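A minimal sketch of what "unfolding the template" means, using the familiar Rain/Umbrella example (the CPT numbers are the usual textbook values, assumed here for illustration): a two-slice specification is simply copied once per time step to produce explicit per-slice factors.

```python
# Hypothetical 2-slice template for the umbrella/rain world:
# P(Rain_t+1 | Rain_t) and P(Umbrella_t | Rain_t).
transition = {True: 0.7, False: 0.3}   # P(Rain_t+1 = true | Rain_t)
sensor = {True: 0.9, False: 0.2}       # P(Umbrella_t = true | Rain_t)

def unroll(n_steps):
    """Expand the template into an explicit list of factors,
    one transition and one sensor factor per time slice."""
    factors = []
    for t in range(n_steps):
        factors.append(("P(Rain_%d | Rain_%d)" % (t + 1, t), transition))
        factors.append(("P(Umb_%d | Rain_%d)" % (t + 1, t + 1), sensor))
    return factors

for name, cpd in unroll(3):
    print(name, cpd)
```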

13 Normal LW takes each sample through the network one by one. Idea 1: take all the samples from t to t+1 in lock-step; the samples are the distribution. Normal LW doesn't do well when the evidence is downstream (the sample weights become too small). In a DBN, none of the evidence affects the sampling, so this is even more of an issue!
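A sketch of the failure mode on an assumed two-state toy model: running LW lock-step, each particle's next state is sampled from the transition prior and its weight is multiplied by the evidence likelihood, so the weights shrink geometrically and the evidence never steers which states get sampled.

```python
import random

# Assumed toy model: hidden state in {0, 1}, sticky transitions,
# observation equals the hidden state with probability 0.9.
def sample_next(s):
    return s if random.random() < 0.8 else 1 - s

def obs_likelihood(obs, s):
    return 0.9 if obs == s else 0.1

def lockstep_lw(observations, n=1000):
    particles = [random.randint(0, 1) for _ in range(n)]
    weights = [1.0] * n
    for obs in observations:
        # Sample every particle forward one step in lock-step; the
        # evidence never influences which states get sampled.
        particles = [sample_next(s) for s in particles]
        weights = [w * obs_likelihood(obs, s)
                   for w, s in zip(weights, particles)]
    return particles, weights

_, w = lockstep_lw([1] * 20)
print(max(w))  # weights decay geometrically; most particles become negligible
```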

14 Special cases of DBNs are well known in the literature. Restrict the number of variables per slice: –Markov chain: DBN with one variable that is fully observable –Hidden Markov model: DBN with only one state variable, which is hidden and can be estimated through evidence variable(s). Restrict the type of CPD: –Kalman filters: DBN where both the system transition model and the observation model are linear Gaussian. The advantage of Gaussians is that the posterior distribution remains Gaussian.
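For the Kalman-filter case, here is a one-dimensional sketch of the predict/correct cycle (generic textbook equations; the noise values in the demo are made up), showing that a Gaussian belief stays Gaussian: only a mean and a variance are ever stored.

```python
def kalman_step(mu, var, obs, a=1.0, q=0.1, h=1.0, r=0.5):
    """One predict/correct cycle of a 1-D Kalman filter.
    a: transition coefficient, q: process noise variance,
    h: observation coefficient, r: observation noise variance."""
    # Predict: push the Gaussian belief through the linear dynamics.
    mu_pred = a * mu
    var_pred = a * a * var + q
    # Correct: condition on the new observation; result is still Gaussian.
    k = var_pred * h / (h * h * var_pred + r)   # Kalman gain
    mu_new = mu_pred + k * (obs - h * mu_pred)
    var_new = (1 - k * h) * var_pred
    return mu_new, var_new

mu, var = 0.0, 1.0
for z in [0.9, 1.1, 1.0]:
    mu, var = kalman_step(mu, var, z)
    print(round(mu, 3), round(var, 3))
```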

15

16

17 Plan Recognition Approaches based on setting up DBNs

18 Dynamic Bayes nets (I) E. Horvitz, J. Breese, D. Heckerman, D. Hovel, and K. Rommelse. The Lumiere Project: Bayesian User Modeling for Inferring the Goals and Needs of Software Users. Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, July 1998. Albrecht, Zukermann, Nicholson, and Bud. Towards a Bayesian Model for Keyhole Plan Recognition in Large Domains. Models relationship between user's recent actions and goals (help needs). Probabilistic goal persistence. Programming in machine language?

19 Excel help (partial)

20 Dynamic Bayesian Nets Learning and Inferring Transportation Routines. Lin Liao, Dieter Fox, and Henry Kautz, Nineteenth National Conference on Artificial Intelligence, San Jose, CA, 2004. [Figure: a two-slice DBN from time k-1 to time k, with layers for GPS reading (z); edge, velocity, position (x); data (edge) association; transportation mode (m); trip segment (t); goal (g); and cognitive mode (c), with values { normal, error }.]

21 Pending sets Goldman, Geib, and Miller. A New Model of Plan Recognition. Geib and Goldman. Probabilistic Plan Recognition for Hostile Agents. Explicitly models the agent's "plan agenda" using Poole's "probabilistic Horn abduction" rules. Handles multiple concurrent interleaved plans & negative evidence. Number of different possible pending sets can grow exponentially. Context problematic? Metric time? Happen(X,T+1) ← Pending(P,T), X in P, Pick(X,P,T+1). Pending(P',T+1) ← Pending(P,T), Leaves(L), Progress(L, P, P', T+1).

22 Layered hidden Markov models N. Oliver, E. Horvitz, and A. Garg. Layered Representations for Recognizing Office Activity. Proceedings of the Fourth IEEE International Conference on Multimodal Interaction (ICMI 2002). Cascade of HMMs, operating at different temporal granularities. Inferential output at layer K is "evidence" for layer K+1.

23 Policy recognition Bui, H. H., Venkatesh, S., and West, G. Tracking and Surveillance in Wide-Area Spatial Environments Using the Hidden Markov Model. Bui, H. H., Venkatesh, S., and West, G. (2000). On the Recognition of Abstract Markov Policies. Seventeenth National Conference on Artificial Intelligence (AAAI-2000), Austin, Texas. Model the agent using a hierarchy of abstract policies (e.g. abstraction by spatial decomposition). Compute the conditional probability of the top-level policy given observations. Compiled into a DBN.

24 Hierarchical hidden semi-Markov models Kevin Murphy. Hidden Semi-Markov Models (Segment Models). November 2002. Deibel & Kautz. HSSM: Theory into Practice, forthcoming. Combine hierarchy (function call semantics) with metric time. Compile to a DBN. Time nodes represent a distribution over the time of the next state "switch". "Linear time" smoothing. Research issues: parametric time nodes, varying granularity.

25 Dynamic probabilistic relational models Friedman, N., L. Getoor, D. Koller, and A. Pfeffer. Learning Probabilistic Relational Models. IJCAI-99, Stockholm, Sweden (July 1999). Anderson, Domingos, and Weld. Relational Markov Models and their Application to Adaptive Web Navigation. 2002. Anderson, Domingos, and Weld. Dynamic Probabilistic Relational Models, forthcoming. PRM: reasons about classes of objects and relations. The lattice of classes can capture plan abstraction. DPRM: efficient approximate inference by Rao-Blackwellized particle filtering. Open: approximate smoothing?

26 Assisted cognition Computer systems that improve the independence and safety of people suffering from cognitive limitations by… Understanding human behavior from low-level sensory data. Using commonsense knowledge. Learning individual user models. Actively offering prompts and other forms of help as needed. Alerting human caregivers when necessary. http://www.cs.washington.edu/assistcog/

27 Activity Compass Don Patterson, Oren Etzioni, and Henry Kautz. The Activity Compass. 2003. Zero-configuration personal guidance system. Learns a model of the user's travel on foot, by public transit, by bike, by car. Predicts the user's next destination; offers proactive help if lost or late. Integrates user data with external constraints: maps, bus schedules, calendars, … EM approach to clustering & segmenting data.

28 Activity of daily living monitor & prompter Kautz, Etzioni, Fox, Weld, and Shastri. Foundations of Assisted Cognition Systems. 2003.

29 Recognizing unexpected events using online model selection User errors, abnormal behavior. Select the model that maximizes the likelihood of the data: generic model, user-specific model, corrupt (impaired) user model. Neurologically-plausible corruptions: repetition, substitution, stalling. Fox, Kautz, & Shastri (forthcoming). [Figure: example action traces, e.g. "fill kettle, put kettle on stove" vs. "fill kettle, put kettle in closet".]
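A sketch of the online model-selection idea (the model names, per-action probabilities, and smoothing constant are placeholders, not from the Fox, Kautz, & Shastri paper): score each candidate user model by the log-likelihood it assigns to the observed action sequence and report the best one; a corrupt-user model winning can flag abnormal behavior.

```python
import math

def select_model(models, observed_actions):
    """models: dict name -> function mapping an action sequence to a
    log-likelihood. Returns the model that best explains the data."""
    scores = {name: ll(observed_actions) for name, ll in models.items()}
    return max(scores, key=scores.get), scores

# Placeholder likelihoods: independent per-action probabilities under
# each model, with a small smoothing constant for unseen actions.
def make_ll(action_probs, default=1e-3):
    return lambda acts: sum(math.log(action_probs.get(a, default)) for a in acts)

models = {
    "generic": make_ll({"fill kettle": 0.5, "put kettle on stove": 0.5}),
    "corrupt": make_ll({"fill kettle": 0.4, "put kettle in closet": 0.3}),
}
print(select_model(models, ["fill kettle", "put kettle in closet"]))
```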

30 Decision-Theoretic Assistance Don't just recognize! Jump in and help. Also allows us to talk about POMDPs.

31 Oregon State University Intelligent Assistants  Many examples of AI techniques being applied to assistive technologies  Intelligent Desktop Assistants  Calendar Apprentice (CAP) (Mitchell et al. 1994)  Travel Assistant (Ambite et al. 2002)  CALO Project  TaskTracer  Electric Elves (Hans Chalupsky et al. 2001)  Assistive Technologies for the Disabled  COACH System (Boger et al. 2005)

32 Oregon State University Not So Intelligent  Most previous work uses problem-specific, hand-crafted solutions  Lack ability to offer assistance in ways not planned for by designer  Our goal: provide a general, formal framework for intelligent-assistant design  Desirable properties:  Explicitly reason about models of the world and user to provide flexible assistance  Handle uncertainty about the world and user  Handle variable costs of user and assistive actions  We describe a model-based decision-theoretic framework that captures these properties

33 Oregon State University An Episodic Interaction Model [Figure: an episode unfolds as a sequence of world states W1 … W9, starting from an initial state and a goal, with user actions (action set U) and assistant actions (action set A) interleaved until the goal is achieved.] Each user and assistant action has a cost. Objective: minimize the expected cost of episodes.

34 Oregon State University Example: Grid World Domain World states: (x,y) location and door status Possible goals: Get wood, gold, or food User actions: Up, Down, Left, Right, noop Open a door in current room (all actions have cost = 1) Assistant actions: Open a door, noop (all actions have cost = 0)

35 Oregon State University World and User Models [Figure: a two-slice network over the goal G, world states W t and W t+1, user action U t, and assistant action A t.] Model world dynamics as a Markov decision process (MDP) with transition model P(W t+1 | W t, U t, A t ). Model the user as a stochastic policy: an action distribution conditioned on goal and world state, P(U t | G, W t ). Goal distribution P(G). Given: model, action sequence. Output: assistant action.

36 Oregon State University Optimal Solution: Assistant POMDP [Figure: the same network, annotated with the goal distribution P(G), the user action distribution P(U t | G, W t ), and the transition model P(W t+1 | W t, U t, A t ).]  Can view this as a POMDP, called the assistant POMDP  Hidden state: the user goal  Observations: user actions and world states  The optimal policy gives a mapping from observation sequences to assistant actions  Represents the optimal assistant  Typically intractable to solve exactly

37 Oregon State University Approximate Solution Approach [Figure: architecture in which the Assistant, comprising a Goal Recognizer and an Action Selection module, interacts with the User and the Environment via U t, A t, O t, W t, and P(G).]  Online action selection cycle: 1) Estimate the posterior goal distribution given the observations 2) Select an action via myopic heuristics

39 Oregon State University Goal Estimation [Figure: the goal posterior P(G | O t ) at the current state W t is updated to P(G | O t+1 ) after a new observation of user action U t and world state W t+1.]  Given  P(G | O t ): the goal posterior at time t, initially equal to the prior P(G)  P(U t | G, W t ): the stochastic user policy  O t+1 : the new observation of user action and world state it is straightforward to update the goal posterior at time t+1; the user policy, however, must be learned
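A minimal sketch of this update (function and variable names invented): multiply the old posterior by the user-policy likelihood of the observed action in the observed state, then renormalize.

```python
def update_goal_posterior(posterior, user_policy, w, u):
    """posterior: dict goal -> P(G | O_t).
    user_policy: function (u, g, w) -> P(U_t = u | G = g, W_t = w).
    Returns dict goal -> P(G | O_t+1) after seeing action u in state w."""
    new_post = {g: p * user_policy(u, g, w) for g, p in posterior.items()}
    z = sum(new_post.values())
    if z == 0:
        return posterior  # action impossible under all goals; keep old belief
    return {g: p / z for g, p in new_post.items()}
```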

40 Oregon State University Learning User Policy  Use Bayesian updates to update the user policy P(U | G, W) after each episode  Problem: it can converge slowly, leading to poor goal estimation  Solution: use a strong prior on the user policy, derived via planning  Assume that the user behaves nearly rationally  Take the prior distribution on P(U | G, W) to be biased toward optimal user actions  Let Q(U,W,G) be the value of the user taking action U in state W given goal G  Can compute via MDP planning  Use prior P(U | G, W) ∝ exp(Q(U,W,G))
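A sketch of that prior as code (a standard softmax/Boltzmann policy over Q-values; the temperature knob is an addition for illustration, not on the slide):

```python
import math

def user_policy_prior(Q, actions, w, g, temperature=1.0):
    """Prior P(U | G=g, W=w) proportional to exp(Q(U, w, g)):
    near-optimal actions get most of the mass, but no action gets zero."""
    weights = {u: math.exp(Q(u, w, g) / temperature) for u in actions}
    z = sum(weights.values())
    return {u: wt / z for u, wt in weights.items()}
```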

41 Oregon State University Q(U,W,G) for Grid World

42 Oregon State University Approximate Solution Approach [Figure: the same architecture as before.]  Online action selection cycle: 1) Estimate the posterior goal distribution given the observations 2) Select an action via myopic heuristics

43 Oregon State University Action Selection: Assistant POMDP [Figure: the assistant MDP unrolled over W t, W t+1, W t+2, with assistant actions A' t, user actions U, and goal G.]  Assume we know the user goal G and policy  Can create a corresponding assistant MDP over assistant actions  Can compute Q(A, W, G), giving the value of taking assistive action A when the user's goal is G  Select the action that maximizes the expected (myopic) value If you just want to recognize, you only need P(G | O t ); if you just want to help (and know the goal), you just need Q(A,W,G)
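A sketch of the myopic rule, combining the two quantities above (names invented): weight each goal-specific Q-value by the current goal posterior and take the argmax over assistant actions.

```python
def select_assistant_action(posterior, Q, actions, w):
    """posterior: dict goal -> P(G | O_t); Q: function (a, w, g) -> value.
    Returns argmax over a of sum_g P(g | O_t) * Q(a, w, g)."""
    def expected_value(a):
        return sum(p * Q(a, w, g) for g, p in posterior.items())
    return max(actions, key=expected_value)
```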

44 Oregon State University Experiments: Grid World Domain

45 Oregon State University Experiments: Kitchen Domain

46 Oregon State University Experimental Results  Experiment: 12 human subjects, two domains  Subjects were asked to achieve a sequence of goals  Compared average cost of performing tasks with assistant to optimal cost without assistant  Assistant reduced cost by over 50%

47 Oregon State University Summary of Assumptions  Model Assumptions:  World can be approximately modeled as an MDP  User and assistant interleave actions (no parallel activity)  User can be modeled as a stationary, stochastic policy  Finite set of known goals  Assumptions Made by the Solution Approach:  Access to a practical algorithm for solving the world MDP  User does not reason about the existence of the assistant  Goal set is relatively small and known to the assistant  User is close to "rational"

48 While DBNs are special cases of BNs, there are certain inference tasks that are particularly useful for them (notice that all of them involve estimating posterior probability distributions, as is done in any BN inference)

49 Can do much better if we exploit the repetitive structure. Both exact and approximate BN inference methods can be made to take the temporal structure into account.  Specialized variable-elimination method  Unfold the (t+1)th level, and roll up the t-th level by variable elimination  Specialized likelihood-weighting methods that take evidence into account  Particle filtering techniques
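A minimal sketch of one particle-filtering step, assuming generic sample and likelihood functions (the standard bootstrap propagate-weight-resample cycle, not any specific paper's variant); the resampling step is what fixes the weight-collapse problem of plain likelihood weighting:

```python
import random

def particle_filter_step(particles, obs, sample_next, obs_likelihood):
    """One step of a bootstrap particle filter: propagate each particle,
    weight by the evidence, then resample so that the particle set
    itself represents the posterior and weights stay uniform."""
    proposed = [sample_next(p) for p in particles]
    weights = [obs_likelihood(obs, p) for p in proposed]
    total = sum(weights)
    if total == 0:
        return proposed  # degenerate evidence; keep unweighted particles
    # Resample with replacement in proportion to the weights.
    return random.choices(proposed, weights=weights, k=len(particles))
```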

50

52

53

54

55

56

57

58

59

60 Class ended here. Slides beyond this point were not discussed.

61 Belief States If we have k state variables, there are 2^k states. A "belief state" is a probability distribution over states. –Non-deterministic: we just know the states for which the probability is non-zero; there are 2^(2^k) such belief states. –Stochastic: we know the probability distribution over the states; there are infinitely many probability distributions. –A complete state is a special case of belief state where the distribution is a Dirac delta, i.e., non-zero for only one state. In the blocks world, suppose we have blocks A and B, which can be "clear", "on-table", or "on" each other. -A state: A is on the table, B is on the table, both are clear, the hand is empty. -A belief state: A is either on B or on the table; B is on the table; the hand is empty (2 states in the belief state).
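A sketch of the bookkeeping (the blocks-world encoding is invented for illustration): a belief state is just a normalized map from complete states to probabilities, and a complete state is the degenerate case with all mass on one entry.

```python
# Each complete state is a frozenset of true facts; a belief state maps
# complete states to probabilities.
s1 = frozenset({"on(A,Table)", "on(B,Table)", "clear(A)", "clear(B)", "handempty"})
s2 = frozenset({"on(A,B)", "on(B,Table)", "clear(A)", "handempty"})

belief = {s1: 0.5, s2: 0.5}      # "A is either on B or on the table"
complete = {s1: 1.0}             # Dirac delta: a complete state

# With k boolean state variables there are 2**k complete states, hence
# 2**(2**k) possible *sets* of states in the non-deterministic view.
k = 3
print(2 ** k, 2 ** (2 ** k))     # 8 states, 256 sets of states
```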

62 Actions and Belief States Two types of actions. –Standard actions: modify the belief state (the distribution over states). Doing the "C on A" action in the belief state gives us a new belief state (with C on A on B, OR C on A and B clear). Doing a "Shake-the-Table" action converts the previous belief state to (A on table; B on table; A clear; B clear). –Notice that actions reduce the uncertainty! Sensing actions: –Sensing actions observe some aspect of the belief state. –The observations modify the belief state distribution. In the belief state above, if we observed that two blocks are clear, the belief state changes to {A on table; B on table; both clear}. If the observation is noisy (i.e., we are not completely certain), the probability distribution just changes so that more probability mass is centered on the {A on table; B on table} state. (Running example belief state: A is either on B or on the table; B is on the table; the hand is empty.)
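A sketch of the noisy sensing update described above (the 0.9/0.1 sensor model is an assumed number): Bayes' rule shifts probability mass toward states consistent with the observation rather than eliminating states outright.

```python
def sense(belief, obs_likelihood):
    """Condition a belief state on a (possibly noisy) observation.
    obs_likelihood: function state -> P(observation | state)."""
    updated = {s: p * obs_likelihood(s) for s, p in belief.items()}
    z = sum(updated.values())
    return {s: p / z for s, p in updated.items()}

# Belief from the slide: A is either on the table or on B; B is on the table.
belief = {"A_on_table": 0.5, "A_on_B": 0.5}
# Noisy observation "both blocks look clear": true only if A is on the table,
# but the sensor is right only 90% of the time (assumed number).
noisy = sense(belief, lambda s: 0.9 if s == "A_on_table" else 0.1)
print(noisy)  # {'A_on_table': 0.9, 'A_on_B': 0.1}
```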

64

65

66

67

68

69

70

71

