
1 CS498-EA Reasoning in AI Lecture #23 Instructor: Eyal Amir Fall Semester 2011

2 Last Time
Time and uncertainty
Inference: filtering, prediction, smoothing
Hidden Markov Models (HMMs)
–Model
–Exact reasoning
Dynamic Bayesian Networks
–Model
–Exact reasoning

3 Inference Tasks
Filtering:
–Belief state: probability of the current state given the evidence so far
Prediction:
–Like filtering, but without new evidence
Smoothing:
–Better estimate of past states
Most likely explanation:
–The scenario (state sequence) that best explains the evidence
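For reference, in the notation standard for these models (hidden state X_t, evidence e_{1:t} observed up to time t), the four tasks can be written as:

\[
\begin{aligned}
\text{Filtering:} &\quad P(X_t \mid e_{1:t}) \\
\text{Prediction:} &\quad P(X_{t+k} \mid e_{1:t}), \; k > 0 \\
\text{Smoothing:} &\quad P(X_k \mid e_{1:t}), \; k < t \\
\text{Most likely explanation:} &\quad \arg\max_{x_{1:t}} P(x_{1:t} \mid e_{1:t})
\end{aligned}
\]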

4 Filtering (forward algorithm)
Predict: P(X_{t+1} | e_{1:t}) = Σ_{x_t} P(X_{t+1} | x_t) P(x_t | e_{1:t})
Update: P(X_{t+1} | e_{1:t+1}) ∝ P(e_{t+1} | X_{t+1}) P(X_{t+1} | e_{1:t})
Recursive step
[Figure: two-slice HMM with hidden states X_{t-1}, X_t, X_{t+1} and evidence E_{t-1}, E_t, E_{t+1}]
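A minimal sketch of the forward (predict/update) recursion for a discrete HMM, assuming a transition matrix T[i, j] = P(X_{t+1}=j | X_t=i) and an observation model O[j, e] = P(e | X_{t+1}=j); the names and example numbers are illustrative, not taken from the slides:

```python
import numpy as np

def forward_step(belief, T, O, evidence):
    """One predict/update step of the forward algorithm.

    belief:   P(X_t | e_{1:t}) as a vector over states
    T[i, j]:  P(X_{t+1}=j | X_t=i)   (transition model)
    O[j, e]:  P(e | X_{t+1}=j)       (observation model)
    evidence: index of the evidence value observed at t+1
    """
    predicted = belief @ T                 # predict: sum out X_t
    updated = O[:, evidence] * predicted   # update: weight by the likelihood
    return updated / updated.sum()         # normalize

# Example: 2-state chain, 2 possible observations (illustrative numbers)
T = np.array([[0.7, 0.3], [0.3, 0.7]])
O = np.array([[0.9, 0.1], [0.2, 0.8]])
belief = np.array([0.5, 0.5])
for e in [0, 0, 1]:
    belief = forward_step(belief, T, O, e)
print(belief)
```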

5 Smoothing
Forward-backward algorithm
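In formulas (the standard forward-backward decomposition, reconstructed rather than copied from the slide):

\[
P(X_k \mid e_{1:t}) \;\propto\; P(X_k \mid e_{1:k}) \, P(e_{k+1:t} \mid X_k),
\]

where the first factor is computed by the forward pass and the second by a backward pass over the evidence.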

6 Most Likely Explanation
Finding the most likely state path given the evidence
[Figure: two-slice HMM with hidden states X_{t-1}, X_t, X_{t+1} and evidence E_{t-1}, E_t, E_{t+1}]
This is the Viterbi algorithm
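A minimal Viterbi sketch for the same discrete HMM setup as above (the matrices T and O and the initial distribution are illustrative assumptions, not from the slides):

```python
import numpy as np

def viterbi(init, T, O, observations):
    """Most likely hidden-state path for a discrete HMM."""
    n_states = len(init)
    n_steps = len(observations)
    delta = np.zeros((n_steps, n_states))            # best path probability
    backptr = np.zeros((n_steps, n_states), dtype=int)

    delta[0] = init * O[:, observations[0]]
    for t in range(1, n_steps):
        for j in range(n_states):
            scores = delta[t - 1] * T[:, j]          # extend each previous path
            backptr[t, j] = np.argmax(scores)
            delta[t, j] = scores[backptr[t, j]] * O[j, observations[t]]

    # Backtrack from the best final state
    path = [int(np.argmax(delta[-1]))]
    for t in range(n_steps - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]
```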

7 Today
Dynamic Bayesian Networks
–Exact inference
–Approximate inference

8 Dynamic Bayesian Network
A DBN is like a 2-time-slice BN (2-TBN)
–Uses the first-order Markov assumption
[Figure: a standard BN replicated over Time 0 and Time 1]

9 Dynamic Bayesian Network
Basic idea:
–Copy the state and evidence variables for each time step
–X_t: set of unobservable (hidden) variables (e.g. Pos, Vel)
–E_t: set of observable (evidence) variables (e.g. Sens.A, Sens.B)
Notice: time is discrete

10 Example

11 Inference in DBN
Unroll: do inference in the resulting unrolled BN
Not efficient (the cost depends on the sequence length)

12 Exact Inference in DBNs
Variable Elimination:
–Add slice t+1, sum out slice t using variable elimination
[Figure: four chains x_1–x_4 unrolled over time steps 0–3]
No conditional independence remains after a few steps

13 Exact Inference in DBNs
Variable Elimination:
–Add slice t+1, sum out slice t using variable elimination
[Figure: slices with states s1–s5 repeated over four time steps]

14 Variable Elimination
[Figure: s1 summed out; states s2–s5 remain in each slice]

15 Variable Elimination
[Figure: s2 summed out as well; states s3–s5 remain in each slice]

16 Variable Elimination
[Figure: s3 summed out as well; states s4 and s5 remain in each slice]
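For a fully tabulated belief over one slice, the "add slice t+1, sum out slice t" step can be sketched as a single tensor operation; this is an illustrative sketch of the idea, not the clique-tree implementation used later in the lecture, and all names are assumptions:

```python
import numpy as np

def advance_slice(belief_t, transition, evidence_lik=None):
    """One exact DBN step: add slice t+1, then sum out slice t.

    belief_t:     P(slice_t | evidence so far), a vector over the S joint
                  states of one slice (S grows exponentially in the number
                  of per-slice variables -- the source of intractability)
    transition:   P(slice_{t+1} | slice_t) as an (S, S) matrix
    evidence_lik: optional P(e_{t+1} | slice_{t+1}) as a length-S vector
    """
    belief_next = belief_t @ transition          # sum out slice t
    if evidence_lik is not None:
        belief_next = belief_next * evidence_lik  # condition on new evidence
        belief_next = belief_next / belief_next.sum()
    return belief_next
```

The point of the figure sequence above is that after a few such steps all slice variables become correlated, so no factored representation of belief_t survives exactly.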

17 DBN Representation: DelC

f_CR(L_t, CR_t, RHC_t, CR_{t+1}):
  L  CR  RHC | CR_{t+1}=T  CR_{t+1}=F
  O  T   T   |    0.2         0.8
  E  T   T   |    1.0         0.0
  O  F   T   |    0.0         1.0
  E  F   T   |    0.0         1.0
  O  T   F   |    1.0         0.1
  E  T   F   |    1.0         0.0
  O  F   F   |    0.0         1.0
  E  F   F   |    0.0         1.0

f_T(T_t, T_{t+1}):
  T_t | T_{t+1}=T  T_{t+1}=F
   T  |   0.91       0.09
   F  |   0.0        1.0

f_RHM(RHM_t, RHM_{t+1}):
  RHM_t | RHM_{t+1}=T  RHM_{t+1}=F
    T   |    1.0          0.0
    F   |    0.0          1.0

[Figure: 2-TBN with time-t nodes T_t, L_t, CR_t, RHC_t, RHM_t, M_t and their time-(t+1) copies]

18 Benefits of DBN Representation
Pr(Rm_{t+1}, M_{t+1}, T_{t+1}, L_{t+1}, C_{t+1}, Rc_{t+1} | Rm_t, M_t, T_t, L_t, C_t, Rc_t)
  = f_Rm(Rm_t, Rm_{t+1}) * f_M(M_t, M_{t+1}) * f_T(T_t, T_{t+1}) * f_L(L_t, L_{t+1}) * f_Cr(L_t, Cr_t, Rc_t, Cr_{t+1}) * f_Rc(Rc_t, Rc_{t+1})
–Only a few parameters vs. 25440 for the explicit transition matrix
–Removes the global exponential dependence
[Figure: the 2-TBN from the previous slide, next to a fragment of the equivalent 160 x 160 state-transition matrix]
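One plausible reading of the 25440 figure (my reconstruction, not stated on the slide): the DelC domain has 160 joint states, so an explicit state-transition matrix needs

\[
160 \times (160 - 1) = 25440
\]

free probabilities (one row of 159 per state), whereas the factored 2-TBN only needs the small per-variable CPTs shown on the previous slide.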

19 DBN Myth
A Bayesian network is a decomposed structure for representing the full joint distribution
Does that imply an equally easy decomposition of the belief state? No!

20 Tractable, approximate representation
Exact inference in a DBN is intractable
Need approximation
–Maintain an approximate belief state
–E.g. assume Gaussian processes
Boyen-Koller approximation:
–Factored belief state

21 Idea
Use a decomposable representation for the belief state (pre-assume some independence)

22 Problem
What about the approximation errors?
–They might accumulate and grow unbounded…

23 Contraction property
Main properties of the B-K approximation:
–Under reasonable assumptions about the stochasticity of the process, every state transition results in a contraction of the distance between the two distributions by a constant factor
–Since approximation errors from previous steps decrease exponentially, the overall error remains bounded indefinitely

24 Basic framework
Definition 1:
–Prior belief state:
–Posterior belief state:
Monitoring task:
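The formulas on this slide did not survive extraction; in the notation used earlier they presumably read

\[
\text{prior belief state: } P(X^{(t)} \mid e^{(1:t-1)}), \qquad \text{posterior belief state: } P(X^{(t)} \mid e^{(1:t)}),
\]

and the monitoring task is to maintain the posterior belief state as each new observation arrives.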

25 Simple contraction
Distance measure:
–Relative entropy (KL-divergence) between the actual and the approximate belief state
Contraction due to O (observation):
Contraction due to T (transition) — can we do better?
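The contraction statements themselves were lost in extraction; the standard forms from the Boyen-Koller analysis (my reconstruction) are that conditioning on an observation does not increase the expected KL-divergence, and propagating through the transition model never increases it:

\[
E\big[ D(\phi' \,\|\, \psi') \big] \le D(\phi \,\|\, \psi) \quad \text{(due to O)}, \qquad
D(T\phi \,\|\, T\psi) \le D(\phi \,\|\, \psi) \quad \text{(due to T)}.
\]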

26 Simple contraction (cont)
Definition:
–Minimal mixing rate:
Theorem 3 (the single-process contraction theorem):
–For a process Q, anterior distributions φ and ψ, and ulterior distributions φ' and ψ':
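Filling in the missing formulas from [BK98] (to the best of my reading): the minimal mixing rate of a process with transition model Q is

\[
\gamma_Q = \min_{i_1, i_2} \sum_{j} \min\big( Q(j \mid i_1), \, Q(j \mid i_2) \big),
\]

and Theorem 3 states that

\[
D(\phi' \,\|\, \psi') \;\le\; (1 - \gamma_Q)\, D(\phi \,\|\, \psi).
\]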

27 Simple contraction (cont) Proof Intuition:

28 Compound processes
The mixing rate can be very small for large processes
The trick is to assume some independence among subprocesses and factor the DBN along these subprocesses
Fully independent subprocesses:
–Theorem 5 of [BK98]: For L independent subprocesses T_1, …, T_L, let γ_l be the mixing rate of T_l and let γ = min_l γ_l. Let φ and ψ be distributions over S_1^(t), …, S_L^(t), and assume that ψ renders the S_l^(t) marginally independent. Then:

29 Compound processes (cont)
Conditionally independent subprocesses
Theorem 6 of [BK98]:
–For L subprocesses T_1, …, T_L, assume each process depends on at most r others and influences at most q others. Let γ_l be the mixing rate of T_l and let γ = min_l γ_l. Let φ and ψ be distributions over S_1^(t), …, S_L^(t), and assume that ψ renders the S_l^(t) marginally independent. Then:

30 Efficient, approximate monitoring
If each approximation step incurs an error bounded by ε, then
–the total accumulated error remains bounded (the old error contracts while only a bounded new error is added)
Conditioning on observations might introduce momentary errors, but the expected error still contracts
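A standard way to make the boundedness argument concrete (reconstructed, not copied from the slide): if each step contracts the previous error by a factor (1 − γ) and adds at most ε of new projection error, then after t steps

\[
\text{total error} \;\le\; \varepsilon \sum_{k=0}^{t-1} (1-\gamma)^k \;\le\; \frac{\varepsilon}{\gamma},
\]

which is bounded independently of t.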

31 Approximate DBN monitoring
Algorithm (based on standard clique tree inference):
1. Construct a clique tree from the 2-TBN
2. Initialize the clique tree with the conditional probabilities from the CPTs of the DBN
3. For each time step:
   a. Create a working copy Y of the tree. Create σ^(t+1).
   b. For each subprocess l, incorporate the marginal σ^(t)[X_l^(t)] into the appropriate factor in Y.
   c. Incorporate the evidence r^(t+1) in Y.
   d. Calibrate the potentials in Y.
   e. For each l, query Y for the marginal over X_l^(t+1) and store it in σ^(t+1).
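A stripped-down sketch of the same idea without clique trees: the exact propagation step is done on the full joint here purely to keep the code short, whereas the real algorithm does it by clique-tree calibration. The two-step structure (exact step, then projection onto per-subprocess marginals) is the point; all names are illustrative assumptions:

```python
import numpy as np

def bk_step(marginals, transition, evidence_lik, shapes):
    """One Boyen-Koller step with a fully factored belief state.

    marginals:    list of per-subprocess marginals sigma^(t)
    transition:   P(slice_{t+1} | slice_t) over the joint state (exact step)
    evidence_lik: P(e_{t+1} | slice_{t+1}) over the joint state
    shapes:       number of values of each subprocess variable
    """
    # 1. Re-assemble the approximate joint belief as a product of marginals
    joint = marginals[0]
    for m in marginals[1:]:
        joint = np.outer(joint, m).ravel()

    # 2. Exact propagation and conditioning on the new evidence
    joint = joint @ transition
    joint = joint * evidence_lik
    joint = joint / joint.sum()

    # 3. Projection step: marginalize back onto each subprocess
    joint = joint.reshape(shapes)
    new_marginals = []
    for l in range(len(shapes)):
        other_axes = tuple(i for i in range(len(shapes)) if i != l)
        new_marginals.append(joint.sum(axis=other_axes))
    return new_marginals
```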

32 Solution: BK algorithm
With mixing and a bounded projection error, the total error is bounded
[Figure: alternating exact propagation steps and approximation/marginalization steps that break the belief state into smaller clusters]

33 Boyen-Koller Approximation
An example of variational inference with DBNs
Compute the posterior for time t from the (factored) state estimate at time t-1
–Assume the posterior has a factored form
The error is bounded
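Written out, the factored-form assumption is (my notation):

\[
P(X^{(t)} \mid e^{(1:t)}) \;\approx\; \prod_{l=1}^{L} \hat{P}\big(X_l^{(t)} \mid e^{(1:t)}\big),
\]

with one factor per subprocess — exactly the belief state maintained by the algorithm on slide 31.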

