
1 CS 416 Artificial Intelligence — Lecture 17: Reasoning over Time (Chapter 15)

2 Sampling your way to a solution
As time proceeds, you collect information
–X_t – the variables you cannot observe (at time t)
–E_t – the variables you can observe (at time t)
 –A particular observation is e_t
–X_a:b – indicates the set of variables from X_a to X_b

3 Dealing with time
Consider P(x_t | e_0:t)
To construct a Bayes network:
–x_t depends on e_t
–e_t depends on e_{t-1}
–e_{t-1} depends on e_{t-2}
–… a potentially infinite number of parents
Avoid this by making an assumption!

4 Markov assumption
The current state depends only on a finite history of previous states
–First-order Markov process: the current state depends only on the previous state
–Formally: P(X_t | X_0:t-1) = P(X_t | X_{t-1})

5 Stationarity assumption
Changes in the real world are caused by a stationary process
–The laws that cause a state variable to change at time t are exactly the same at all other times
–The variable values may change over time, but the nature of the system doesn't change

6 Models of state transitions
State transition model: P(X_t | X_{t-1})
Sensor model: P(E_t | X_t)
–Evidence variables depend only on the current state
–The actual state of the world causes the evidence values

7 Initial conditions
Specify a prior probability over the states at time 0: P(X_0)

8 A complete joint distribution
We know:
–Initial conditions of the state variables: P(X_0)
–Observations (evidence variables): P(E_t | X_t)
–Transition probabilities: P(X_t | X_{t-1})
Therefore we have a complete model:
P(X_0:t, E_1:t) = P(X_0) ∏_{i=1..t} P(X_i | X_{i-1}) P(E_i | X_i)

9 What might we do with our model?
Filtering
–Given all evidence to date, compute the belief state of the unobserved variables: P(X_t | e_1:t)
Prediction
–Compute the posterior distribution of a future state: P(X_{t+k} | e_1:t), k > 0
Smoothing
–Use later evidence as hindsight to estimate earlier values of the unobserved variables: P(X_k | e_1:t), 0 <= k < t
Most likely explanation
–What sequence of states most likely generated the sequence of observations? argmax_{x_1:t} P(x_1:t | e_1:t)

10 Filtering / Prediction
Given filtering up to t, can we compute the belief at t+1 from new evidence at t+1?
Two steps:
–Project the state at x_t to x_{t+1} using the transition model: P(X_{t+1} | X_t)
–Update that projection using e_{t+1} and the sensor model: P(e_{t+1} | X_{t+1})

11 Filtering/Projection
–Project the state at x_t to x_{t+1} using the transition model: P(X_{t+1} | X_t)
–Update that projection using e_{t+1} and the sensor model: P(e_{t+1} | X_{t+1})
P(X_{t+1} | e_1:t+1) = α P(e_{t+1} | X_{t+1}) Σ_{x_t} P(X_{t+1} | x_t) P(x_t | e_1:t)
 –sensor model × transition model, summed over x_t, which we must solve because we don't know X_t

12 Filtering/Projection
–X_{t+1} is really a function of e_1:t and x_t
–Because we don't know x_t, we sum across all of its possible values
P(X_{t+1} | e_1:t) = Σ_{x_t} P(X_{t+1} | x_t) P(x_t | e_1:t)
 –this is the prediction of X_{t+1}; given x_t, earlier values are not useful
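The predict-then-update step above can be sketched directly in code. This is a minimal illustration, not part of the lecture: the function name and the dict-based model representation are my own choices, for a Boolean (or any discrete) state.

```python
def filter_step(prior, evidence_model, transition_model):
    """One predict-then-update step of forward filtering.

    prior            : dict state -> P(state | e_1:t)
    transition_model : dict (prev_state, next_state) -> P(next | prev)
    evidence_model   : dict state -> P(observed e_{t+1} | state)
    Returns dict state -> P(state | e_1:t+1).
    """
    states = list(prior)
    # Predict: sum over the unknown current state x_t
    predicted = {
        s1: sum(transition_model[(s0, s1)] * prior[s0] for s0 in states)
        for s1 in states
    }
    # Update: weight by the sensor model, then normalize (the alpha term)
    unnorm = {s: evidence_model[s] * predicted[s] for s in states}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}

# Demo with assumed umbrella-world numbers (not given on the slides):
trans = {(True, True): 0.7, (True, False): 0.3,
         (False, True): 0.3, (False, False): 0.7}
sensor_umbrella = {True: 0.9, False: 0.2}  # P(umbrella seen | rain state)
belief = filter_step({True: 0.5, False: 0.5}, sensor_umbrella, trans)
```

The normalization constant z plays the role of α: it rescales the posterior so the probabilities sum to 1.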

13 Filtering example
Is it raining at time t? Based on observation of umbrella_t
–Initial probability: P(R_0) =
–Transition model: P(R_{t+1} | r_t) = , P(R_{t+1} | ~r_t) =
–Sensor model: P(u_t | r_t) = , P(u_t | ~r_t) =
Given U_1 = true, what is P(R_1)?
–First, predict the transition from x_0 to x_1, then update with the evidence

14 Given U_1 = true, what is P(R_1)?
Predict the transition from x_0 to x_1
–Because we don't know x_0, we have to consider both cases:
P(R_1) = P(R_1 | r_0) P(r_0) + P(R_1 | ~r_0) P(~r_0)
 –(it was raining) + (it wasn't raining)

15 Given U_1 = true, what is P(R_1)?
Update with the evidence, using the sensor model:
P(R_1 | u_1) = α P(u_1 | R_1) P(R_1)
–P(u_1 | r_1): the prob. of seeing the umbrella given it was raining
–P(u_1 | ~r_1): the prob. of seeing the umbrella given it wasn't raining

16 Given U_1 and U_2 = true, what is P(R_2)?
–We computed P(R_1 | u_1) in the previous steps
–First, predict R_2 from R_1

17 Given U_1 and U_2 = true, what is P(R_2)?
Second, update R_2 with the evidence u_2, starting from P(R_1 | u_1)
When queried to solve for R_n:
–Use a forward algorithm that recursively solves for R_i for i < n
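The two-step umbrella calculation can be carried out numerically. The slides leave the actual probabilities blank; the constants below are assumed example values (the ones conventionally used with this example: persistence 0.7, sensor 0.9/0.2, uniform prior), so treat them as an illustration rather than the lecture's numbers.

```python
P_RAIN_GIVEN_RAIN = 0.7   # assumed: P(r_{t+1} | r_t)
P_RAIN_GIVEN_DRY  = 0.3   # assumed: P(r_{t+1} | ~r_t)
P_UMB_GIVEN_RAIN  = 0.9   # assumed: P(u_t | r_t)
P_UMB_GIVEN_DRY   = 0.2   # assumed: P(u_t | ~r_t)

def forward(prior_rain, umbrellas):
    """Forward algorithm: filter P(rain) through a list of umbrella sightings."""
    p = prior_rain
    for u in umbrellas:
        # Predict: sum over both possible previous states
        pred = P_RAIN_GIVEN_RAIN * p + P_RAIN_GIVEN_DRY * (1 - p)
        # Update with the evidence and normalize
        lr = P_UMB_GIVEN_RAIN if u else 1 - P_UMB_GIVEN_RAIN
        ld = P_UMB_GIVEN_DRY if u else 1 - P_UMB_GIVEN_DRY
        p = lr * pred / (lr * pred + ld * (1 - pred))
    return p

p_r1 = forward(0.5, [True])        # ~0.818 with these numbers
p_r2 = forward(0.5, [True, True])  # ~0.883 with these numbers
```

Notice the recursion: P(R_2 | u_1, u_2) is computed from P(R_1 | u_1), which is exactly the "forward algorithm that recursively solves for R_i" on the slide.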

18 Prediction
Use evidence 1…t to predict the state at t+k+1
–For all possible states x_{t+k}, consider the transition model to x_{t+k+1}
–For all states x_{t+k}, consider their likelihood given e_1:t
P(X_{t+k+1} | e_1:t) = Σ_{x_{t+k}} P(X_{t+k+1} | x_{t+k}) P(x_{t+k} | e_1:t)

19 Prediction
Limits of prediction:
–As k increases, the prediction converges to a fixed distribution – the stationary distribution
–The time taken to reach the stationary distribution is the mixing time
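Convergence to the stationary distribution is easy to see by iterating the prediction step with no new evidence. A small sketch, again using assumed transition numbers (0.7/0.3, symmetric, so the stationary distribution is 0.5):

```python
def predict_k_steps(p_rain, k, p_rr=0.7, p_rd=0.3):
    """Push a rain belief k steps into the future with no new evidence.

    p_rr = P(rain tomorrow | rain today), p_rd = P(rain tomorrow | dry today);
    both are assumed example values.
    """
    for _ in range(k):
        p_rain = p_rr * p_rain + p_rd * (1 - p_rain)
    return p_rain

# Whatever belief we start from, the prediction forgets it:
far_from_wet = predict_k_steps(0.99, 50)
far_from_dry = predict_k_steps(0.01, 50)
```

Each step shrinks the distance to 0.5 by a factor of (p_rr - p_rd) = 0.4 here, so the mixing is fast; a less symmetric transition model converges to a different stationary point at a different rate.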

20 Smoothing
P(X_k | e_1:t), 0 <= k < t
–Attack this in two parts: P(X_k | e_1:k, e_k+1:t)
–By Bayes' rule: P(X_k | e_1:t) = α P(X_k | e_1:k) P(e_k+1:t | X_k) = α f_1:k × b_k+1:t
 –where b_k+1:t = P(e_k+1:t | X_k)

21 Smoothing
Forward part: what is the probability of X_k given evidence 1…k?
Backward part: what is the probability of observing evidence k+1…t given X_k?
How do we compute the backward part?

22 Smoothing
Computing the backward part, recursively from the end of the sequence:
b_k+1:t = P(e_k+1:t | X_k) = Σ_{x_{k+1}} P(e_{k+1} | x_{k+1}) b_k+2:t(x_{k+1}) P(x_{k+1} | X_k)

23 Whiteboard

24 Example
Probability of r_1 given u_1 and u_2
–The forward part, P(r_1 | u_1), was solved in step one of the forward solution
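Putting the forward and backward passes together gives the full forward–backward smoother. A self-contained sketch for the two-day umbrella example, using the same assumed numbers as earlier (0.7/0.3 transitions, 0.9/0.2 sensor, uniform prior); the function and variable names are mine:

```python
STATES = (True, False)  # raining / not raining
T = {(True, True): 0.7, (True, False): 0.3,
     (False, True): 0.3, (False, False): 0.7}  # assumed transition model

def sensor(u, rain):
    """Assumed sensor model: P(umbrella observation u | rain state)."""
    p = 0.9 if rain else 0.2
    return p if u else 1 - p

def forward_backward(evidence, prior):
    """Return smoothed estimates P(X_k | e_1:t) for k = 1..t."""
    # Forward pass: fs[k] = P(X_k | e_1:k)
    fs = [dict(prior)]
    for u in evidence:
        pred = {s1: sum(T[(s0, s1)] * fs[-1][s0] for s0 in STATES)
                for s1 in STATES}
        unnorm = {s: sensor(u, s) * pred[s] for s in STATES}
        z = sum(unnorm.values())
        fs.append({s: v / z for s, v in unnorm.items()})
    # Backward pass: b = P(e_k+1:t | X_k), starting from b_t+1:t = 1
    b = {s: 1.0 for s in STATES}
    smoothed = [None] * (len(evidence) + 1)
    for k in range(len(evidence), 0, -1):
        unnorm = {s: fs[k][s] * b[s] for s in STATES}
        z = sum(unnorm.values())
        smoothed[k] = {s: v / z for s, v in unnorm.items()}
        b = {s0: sum(sensor(evidence[k - 1], s1) * b[s1] * T[(s0, s1)]
                     for s1 in STATES) for s0 in STATES}
    return smoothed

sm = forward_backward([True, True], {True: 0.5, False: 0.5})
# sm[1][True] is the smoothed P(r_1 | u_1, u_2)
```

With these numbers, seeing a second umbrella on day 2 raises the day-1 rain estimate above the filtered value of ~0.818, because the extra evidence is consistent with rain having persisted.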

25 Viterbi
Consider finding the most likely path through a sequence of states given the observations
–Could enumerate all 2^5 permutations of a five-step rain/~rain sequence and evaluate P(x_1:5 | e_1:5)

26 Viterbi
Could use smoothing to find the posterior distribution of the weather at each time step and build a path through the most probable states – but that treats each step in isolation, not as part of a sequence!

27 Viterbi
Specify a final state and find the previous states that form the most likely path to it
–Let R_5 = true
–Find the R_4 that lies on the optimal path to R_5; consider each value of R_4
 –Evaluate how likely it is to lead to R_5 = true and how easily it is reached
–Find the R_3 that lies on the optimal path to R_4; consider each value…

28 Viterbi – Recursive algorithm

29 Viterbi - Recursive

30 Viterbi
The Viterbi algorithm is just like the filtering algorithm, except for two changes:
–Replace f_1:t = P(X_t | e_1:t)
 –with m_1:t = max_{x_1:t-1} P(x_1:t-1, X_t | e_1:t)
–The summation over x_t is replaced with a max over x_t
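The recursive formulation on slides 28–30 (the formulas were figures) can be sketched as follows: carry the max-product message m forward, remember the argmax pointers, and backtrack from the best final state. The umbrella-world numbers are the same assumed values used in the earlier sketches.

```python
def viterbi(evidence):
    """Most likely rain/~rain sequence for a list of Boolean umbrella sightings.

    Transition, sensor, and prior values are assumed example numbers,
    not taken from the slides.
    """
    T = {(True, True): 0.7, (True, False): 0.3,
         (False, True): 0.3, (False, False): 0.7}

    def sensor(u, rain):
        p = 0.9 if rain else 0.2
        return p if u else 1 - p

    states = (True, False)
    # m[s] = max over paths of P(path, X_t = s, e_1:t); assumed prior 0.5
    m = {s: sensor(evidence[0], s) * 0.5 for s in states}
    back = []  # argmax pointers for backtracking
    for u in evidence[1:]:
        new_m, ptr = {}, {}
        for s1 in states:
            best_prev = max(states, key=lambda s0: T[(s0, s1)] * m[s0])
            new_m[s1] = sensor(u, s1) * T[(best_prev, s1)] * m[best_prev]
            ptr[s1] = best_prev
        back.append(ptr)
        m = new_m
    # Backtrack from the best final state
    last = max(states, key=lambda s: m[s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))
```

With these numbers, the evidence [umbrella, umbrella, no umbrella, umbrella, umbrella] yields the path rain, rain, no rain, rain, rain: the single dry observation flips day 3, unlike per-step smoothing, because the max is taken over whole sequences.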

31 Review
–Forward: f_1:t+1 = α P(e_{t+1} | X_{t+1}) Σ_{x_t} P(X_{t+1} | x_t) f_1:t(x_t)
–Forward/Backward: P(X_k | e_1:t) = α f_1:k × b_k+1:t
–Max: m_1:t+1 = P(e_{t+1} | X_{t+1}) max_{x_t} [ P(X_{t+1} | x_t) m_1:t(x_t) ]

32 Hidden Markov Models (HMMs)
Represent the state of the world with a single discrete variable
–If your state has multiple variables, form one variable whose values are all possible tuples of the multiple variables
–Let the number of states be S
 –The transition model is an S×S matrix T: the probability of transitioning from any state to any other
 –The evidence at time t is an S×S diagonal matrix O_t: the diagonal holds the likelihood of the observation given each state
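In this matrix form the forward update becomes f' = α O_{t+1} Tᵀ f. A minimal sketch with plain lists (no linear-algebra library), using the same assumed umbrella numbers as before:

```python
def hmm_filter_step(f, T, O_diag):
    """One matrix-form filtering step: f' = alpha * O_{t+1} * T^T * f.

    f      : length-S list, current filtered distribution over states
    T      : S x S nested list, T[i][j] = P(state j at t+1 | state i at t)
    O_diag : length-S list, diagonal of O_{t+1} (observation likelihoods)
    """
    S = len(f)
    # T^T f: the predicted distribution over the next state
    pred = [sum(T[i][j] * f[i] for i in range(S)) for j in range(S)]
    # Multiply by the observation diagonal, then normalize (alpha)
    unnorm = [O_diag[j] * pred[j] for j in range(S)]
    z = sum(unnorm)
    return [v / z for v in unnorm]

# Assumed umbrella-world demo: state 0 = rain, state 1 = no rain
T_mat = [[0.7, 0.3], [0.3, 0.7]]
O_umb = [0.9, 0.2]  # umbrella was seen
f1 = hmm_filter_step([0.5, 0.5], T_mat, O_umb)
```

Collapsing the update into two matrix products is what makes HMM filtering cheap: each step is O(S²) regardless of how long the evidence sequence is.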

33 Kalman Filters
Gauss invented least-squares estimation and important parts of statistics around 1795
–When he was 18 and trying to understand the motion of heavy bodies (by collecting data from telescopes)
The Kalman filter was invented by Kalman in 1960
–A means to update predictions of continuous variables given observations (fast and discrete, well suited to computer programs)
 –Critical for getting the Apollo spacecraft to insert into orbit around the Moon
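The slide only motivates the idea; the predict-then-update structure is the same as in the discrete case, with Gaussians replacing probability tables. A minimal one-dimensional sketch under a random-walk motion assumption (a genuine Kalman filter tracks full mean vectors and covariance matrices; all names and numbers here are illustrative):

```python
def kalman_1d(mu, var, z, process_var, sensor_var):
    """One predict-then-update cycle of a 1-D Kalman filter (random-walk motion).

    mu, var     : current Gaussian belief about the state
    z           : new noisy observation
    process_var : variance added by the transition (motion) model
    sensor_var  : variance of the observation noise
    """
    # Predict: the transition spreads the belief out
    mu_pred, var_pred = mu, var + process_var
    # Update: blend prediction and observation, weighted by the Kalman gain
    k = var_pred / (var_pred + sensor_var)
    mu_new = mu_pred + k * (z - mu_pred)
    var_new = (1 - k) * var_pred
    return mu_new, var_new

mu, var = kalman_1d(0.0, 1.0, z=2.0, process_var=1.0, sensor_var=2.0)
```

The gain k plays the role of the normalized sensor term: when the sensor is precise (small sensor_var) the estimate moves almost all the way to z; when it is noisy, the prediction dominates, and the updated variance never exceeds the predicted one.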

