Presentation on theme: "Guidance: Assignment 3 Part 1 matlab functions in statistics toolbox  betacdf, betapdf, betarnd, betastat, betafit."— Presentation transcript:

1 Guidance: Assignment 3 Part 1 MATLAB functions in the Statistics Toolbox: betacdf, betapdf, betarnd, betastat, betafit
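If you don't have the Statistics Toolbox handy, the first few of these functions are easy to approximate. A minimal Python sketch of analogues to betapdf, betacdf, and betastat (the trapezoidal CDF is my own shortcut, valid for shape parameters a, b ≥ 1; betarnd and betafit are not sketched here):

```python
from math import gamma

def betapdf(x, a, b):
    """Beta(a, b) density at x (analogue of MATLAB's betapdf)."""
    norm = gamma(a) * gamma(b) / gamma(a + b)   # the beta function B(a, b)
    return x ** (a - 1) * (1 - x) ** (b - 1) / norm

def betacdf(x, a, b, n=20000):
    """Beta(a, b) CDF at x via trapezoidal integration of the density
    (analogue of MATLAB's betacdf; assumes a, b >= 1 so the density is
    finite at the endpoints)."""
    h = x / n
    total = 0.5 * (betapdf(0.0, a, b) + betapdf(x, a, b))
    total += sum(betapdf(i * h, a, b) for i in range(1, n))
    return total * h

def betastat(a, b):
    """Mean and variance of Beta(a, b) (analogue of MATLAB's betastat)."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var
```

For example, Beta(2, 2) is symmetric about 0.5, so betacdf(0.5, 2, 2) should be 0.5 and betapdf(0.5, 2, 2) should equal 6 · 0.5 · 0.5 = 1.5.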

2 Guidance: Assignment 3 Part 2 You will explore the role of the priors. The Weiss model showed that priors play an important role when: observations are noisy, observations don't provide strong constraints, or there aren't many observations.

3 Guidance: Assignment 3 Part 3 Implement a model somewhat like Weiss et al. (2002). Goal: infer the motion (velocity) of a rigid shape from observations at two instants in time. Assume distinctive features that make it easy to identify each feature's location at successive times.

4 Assignment 2 Guidance
Bx: the x displacement of the blue square (= delta x in one unit of time)
By: the y displacement of the blue square
Rx: the x displacement of the red square
Ry: the y displacement of the red square
These observations are corrupted by measurement noise: Gaussian, mean zero, standard deviation σ.
D: direction of motion (up, down, left, right). Assume the only possibilities are one unit of motion in any of the four directions.

5 Assignment 2: Generative Model Rx conditioned on D=up is drawn from a Gaussian; the same assumptions hold for Bx and By.
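The generative model on this slide can be sketched directly. A minimal Python version (the mapping of "up" to +y and the uniform prior over D are my assumptions; the slide only fixes unit motion plus Gaussian noise of standard deviation σ):

```python
import random

# Unit displacement for each direction; "up" = +y is an assumed convention.
DIRECTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

def sample_observations(sigma, rng=random):
    """Sample D uniformly, then draw the four noisy displacement
    observations Bx, By, Rx, Ry around the true unit displacement."""
    d = rng.choice(list(DIRECTIONS))
    dx, dy = DIRECTIONS[d]
    bx = rng.gauss(dx, sigma)   # blue square, x displacement
    by = rng.gauss(dy, sigma)   # blue square, y displacement
    rx = rng.gauss(dx, sigma)   # red square, x displacement
    ry = rng.gauss(dy, sigma)   # red square, y displacement
    return d, (bx, by, rx, ry)
```

With sigma = 0 the observations collapse onto the true unit displacement, which is a handy sanity check.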

6 Assignment 2 Math By conditional independence, the likelihood factors: P(Bx, By, Rx, Ry | D) = P(Bx | D) P(By | D) P(Rx | D) P(Ry | D).

7 Assignment 2 Implementation Quiz: do we need to worry about the Gaussian density function's normalization term?
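One way to see the answer to the quiz is to write the posterior out. A minimal Python sketch (uniform prior on D assumed, as above): because σ is the same for every candidate direction, the Gaussian normalizer 1/√(2πσ²) multiplies every score identically and cancels when we normalize over directions, so it can be dropped.

```python
from math import exp

DIRECTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

def posterior_over_D(bx, by, rx, ry, sigma):
    """P(D | Bx, By, Rx, Ry) with unnormalized Gaussian likelihoods.
    The 1/sqrt(2*pi*sigma^2) factor is identical for every D, so it
    cancels in the normalization and is omitted."""
    scores = {}
    for d, (dx, dy) in DIRECTIONS.items():
        sq = (bx - dx) ** 2 + (by - dy) ** 2 + (rx - dx) ** 2 + (ry - dy) ** 2
        scores[d] = exp(-sq / (2 * sigma ** 2))   # uniform prior on D assumed
    z = sum(scores.values())
    return {d: s / z for d, s in scores.items()}
```

For observations close to (0, 1) for both squares, "up" should dominate the posterior.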

8 Introduction To Bayes Nets (Stuff stolen from Kevin Murphy, UBC, and Nir Friedman, HUJI)

9 What Do You Need To Do Probabilistic Inference In A Given Domain? The joint probability distribution over all the variables in the domain.

10 Bayes Nets (a.k.a. Belief Nets) A compact representation of joint probability distributions via conditional independence. Together, the two parts define a unique distribution in factored form.
Qualitative part: a directed acyclic graph (DAG). Nodes: random variables. Edges: direct influence. (Example: the family alarm network with nodes Earthquake, Radio, Burglary, Alarm, Call.)
Quantitative part: a set of conditional probability distributions, e.g. P(A | E, B), columns P(A=1) / P(A=0):
  E=1, B=1: 0.9 / 0.1
  E=1, B=0: 0.2 / 0.8
  E=0, B=1: 0.9 / 0.1
  E=0, B=0: 0.01 / 0.99
Figure from N. Friedman

11 What Is A Bayes Net? A node is conditionally independent of its ancestors given its parents. E.g., C is conditionally independent of R, E, and B given A. Notation: C ⊥ R, B, E | A. Quiz: what sort of parameter reduction do we get? From 2^5 − 1 = 31 parameters to 1 + 1 + 2 + 4 + 2 = 10.
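The factored form and the parameter count can both be checked directly. A minimal Python sketch of the five-variable alarm network (the CPT numbers here are illustrative assumptions, not taken from the slide's figure): the ten free parameters are 1 for E, 1 for B, 2 for R|E, 4 for A|E,B, and 2 for C|A, and the factored joint still sums to 1.

```python
from itertools import product

# Illustrative CPTs (assumed numbers): 1 + 1 + 2 + 4 + 2 = 10 free parameters.
pE = {1: 0.002, 0: 0.998}
pB = {1: 0.001, 0: 0.999}
pR_given_E = {1: {1: 0.9, 0: 0.1}, 0: {1: 0.0001, 0: 0.9999}}
pA_given_EB = {(1, 1): 0.95, (1, 0): 0.29, (0, 1): 0.94, (0, 0): 0.001}
pC_given_A = {1: 0.9, 0: 0.05}

def joint(e, b, r, a, c):
    """P(E,B,R,A,C) = P(E) P(B) P(R|E) P(A|E,B) P(C|A)."""
    pa = pA_given_EB[(e, b)]
    pc = pC_given_A[a]
    return (pE[e] * pB[b] * pR_given_E[e][r]
            * (pa if a else 1 - pa)
            * (pc if c else 1 - pc))

# The factored representation defines a proper joint: it sums to 1
# over all 2**5 assignments, using only 10 parameters instead of 31.
total = sum(joint(*bits) for bits in product([0, 1], repeat=5))
```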

12 Conditional Distributions Are Flexible E.g., Earthquake and Burglary might have independent effects on Alarm, a.k.a. noisy-OR, where p_B and p_E are the probabilities that burglary and earthquake alone trigger the alarm. This constraint reduces the number of free parameters to 8.
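The noisy-OR assumption can be stated in a few lines. A minimal Python sketch: each cause that is present independently fails to trigger the alarm with probability 1 − p, and the alarm fires unless every present cause fails. This replaces the four-row CPT for A with just the two numbers p_B and p_E.

```python
def noisy_or(p_b, p_e, b, e):
    """P(Alarm=1 | B=b, E=e) under a noisy-OR: each present cause
    independently fails to trigger the alarm; the alarm fires unless
    all present causes fail."""
    fail = 1.0
    if b:
        fail *= 1 - p_b   # burglary present but fails to trigger
    if e:
        fail *= 1 - p_e   # earthquake present but fails to trigger
    return 1 - fail
```

For example, with p_B = 0.8 and p_E = 0.6, both causes present give 1 − 0.2 · 0.4 = 0.92, and no causes present give 0 (a leak probability could be added as a third parameter).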

13 Why Are Bayes Nets Useful? A factored representation may have exponentially fewer parameters than the full joint:
- lower time complexity (i.e., easier inference)
- lower sample complexity (i.e., less data for learning)
The graph structure supports:
- modular representation of knowledge
- local, distributed algorithms for inference and learning
- intuitive (possibly causal) interpretation
A Bayes net is a strong theory about the nature of cognition or the generative process that produces the observed data: it can't represent arbitrary contingencies among variables, so the theory can be rejected by data.

14 Inference
- Computing posterior probabilities: the probability of hidden events given any evidence (including the "explaining away" effect)
- Most likely explanation: the scenario that best explains the evidence
- Rational decision making: maximize expected utility; value of information
- Effect of intervention: causal analysis
[Figure: the Earthquake/Radio/Burglary/Alarm/Call network, from N. Friedman]
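The "explaining away" effect mentioned above is easy to demonstrate by brute-force enumeration on the alarm network. A minimal Python sketch (the CPT numbers are illustrative assumptions): once the alarm is observed, additionally learning that an earthquake occurred lowers the posterior probability of burglary, because the earthquake already explains the alarm.

```python
from itertools import product

# Illustrative CPT numbers (assumed for this sketch, not from the slides).
P_E, P_B = 0.002, 0.001
P_A = {(1, 1): 0.95, (1, 0): 0.29, (0, 1): 0.94, (0, 0): 0.001}

def joint(e, b, a):
    """P(E=e, B=b, A=a) = P(E) P(B) P(A | E, B)."""
    pe = P_E if e else 1 - P_E
    pb = P_B if b else 1 - P_B
    pa = P_A[(e, b)] if a else 1 - P_A[(e, b)]
    return pe * pb * pa

def p_burglary(a_obs, e_obs=None):
    """P(B=1 | A=a_obs [, E=e_obs]) by brute-force enumeration."""
    num = den = 0.0
    for e, b, a in product([0, 1], repeat=3):
        if a != a_obs or (e_obs is not None and e != e_obs):
            continue
        w = joint(e, b, a)
        den += w
        num += w if b else 0.0
    return num / den

# Explaining away: the earthquake accounts for the alarm, so burglary drops.
p_alarm_only = p_burglary(1)             # P(B=1 | A=1)
p_alarm_quake = p_burglary(1, e_obs=1)   # P(B=1 | A=1, E=1)
```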

15 A Real Bayes Net: Alarm Domain: monitoring intensive-care patients. 37 variables, 509 parameters... instead of 2^37. [Figure: the ALARM network, from N. Friedman]

16 More Real-World Bayes Net Applications "Microsoft's competitive advantage lies in its expertise in Bayesian networks" -- Bill Gates, quoted in the LA Times, 1996
- MS Answer Wizards, (printer) troubleshooters
- Medical diagnosis
- Speech recognition (HMMs)
- Gene sequence/expression analysis
- Turbocodes (channel coding)

17 Conditional Independence A node is conditionally independent of its ancestors given its parents. What about conditional independence between variables that aren't directly connected (e.g., Burglary and Radio)? [Figure: the Earthquake/Radio/Burglary/Alarm/Call network]

18 d-separation A criterion for deciding whether nodes are conditionally independent. A path from node u to node v is d-separated by a set of nodes Z if the path matches one of these templates (z denotes the intermediate node):
- chain u → z → v, with z in Z
- chain u ← z ← v, with z in Z
- common cause u ← z → v, with z in Z
- common effect u → z ← v, with neither z nor any descendant of z in Z

19 Conditional Independence Nodes u and v are conditionally independent given set Z if all (undirected) paths between u and v are d-separated by Z. [Figure: example in which three z nodes block every path between u and v]

20 d-separation Along Paths For paths involving more than one intermediate node, the path is d-separated if the outer two nodes of any triple along it are d-separated. [Figure: example paths from u to v marked d-separated / not d-separated]

21 [Figure: the ALARM network, from N. Friedman]

22 [Figure: the ALARM network, from N. Friedman]

23 Sufficiency For Conditional Independence: Markov Blanket The Markov blanket of node u consists of the parents, children, and children's parents of u. P(u | MB(u), v) = P(u | MB(u)). [Figure: node u with its Markov blanket shaded]

24 Probabilistic Models Among probabilistic models, the graphical models split into:
- Directed (Bayesian belief nets): alarm network, state-space models, HMMs, naïve Bayes classifier, PCA/ICA
- Undirected (Markov nets): Markov random field, Boltzmann machine, Ising model, max-ent model, log-linear models

25 Turning A Directed Graphical Model Into An Undirected Model Via Moralization Moralization: connect ("marry") all parents of each node and drop the arrow directions.
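Moralization is mechanical enough to sketch in a few lines. A minimal Python version operating on a DAG given as a child-to-parents map (the alarm-network example at the bottom is my own illustration): marrying E and B, the co-parents of A, adds the one edge the directed graph lacked.

```python
def moralize(parents):
    """Moralize a DAG given as {node: [parents]}: connect all pairs of
    co-parents, then drop edge directions. Returns undirected edges."""
    edges = set()
    for child, ps in parents.items():
        for p in ps:
            edges.add(frozenset((p, child)))   # undirected version of p -> child
        for i, u in enumerate(ps):             # marry every pair of co-parents
            for v in ps[i + 1:]:
                edges.add(frozenset((u, v)))
    return edges

# Alarm example: E and B are both parents of A, so moralization adds E-B.
dag = {"E": [], "B": [], "R": ["E"], "A": ["E", "B"], "C": ["A"]}
moral = moralize(dag)
```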

26 Toy Example Of A Markov Net [Figure: nodes X1 through X5] E.g., X1 ⊥ X4, X5 | X2, X3; in general, Xi ⊥ Xrest | Xnbrs. The joint is a product of potential functions over cliques, normalized by the partition function Z: P(x) = (1/Z) Π_c ψ_c(x_c). Clique: a (maximal) subset of vertices such that each pair is connected by an edge.

27 A Real Markov Net [Figure: a grid of observed pixels y over latent causes x] Estimate P(x_1, ..., x_n | y_1, ..., y_n). ψ(x_i, y_i) = P(y_i | x_i): local evidence likelihood. ψ(x_i, x_j) = exp(−J(x_i, x_j)): compatibility matrix.

28 Example Of Image Segmentation With MRFs Sziranyi et al. (2000)

29 Graphical Models Are A Useful Formalism E.g., the naïve Bayes model of Assignment 2: D is the parent of Rx, Ry, Bx, By. The posterior over D follows from the definition of conditional probability, marginalizing over D in the normalizer.

30 Graphical Models Are A Useful Formalism E.g., a feedforward neural net with noise, i.e., a sigmoid belief net. [Figure: input layer, hidden layer, output layer]

31 Graphical Models Are A Useful Formalism E.g., the restricted Boltzmann machine (Hinton), also known as a harmony network (Smolensky). [Figure: a layer of hidden units connected to a layer of visible units]

32 Graphical Models Are A Useful Formalism E.g., Gaussian Mixture Model
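As a graphical model, the Gaussian mixture is just a latent component indicator with a Gaussian child, so ancestral sampling is two lines. A minimal 1-D Python sketch (the function name and parameterization are my own):

```python
import random

def sample_gmm(weights, means, stds, rng=random):
    """Ancestral sampling from a mixture of 1-D Gaussians:
    draw component z with P(z) = weights[z], then x ~ N(means[z], stds[z])."""
    z = rng.choices(range(len(weights)), weights=weights)[0]
    return z, rng.gauss(means[z], stds[z])

# e.g., a two-component mixture
z, x = sample_gmm([0.3, 0.7], [0.0, 5.0], [1.0, 0.5])
```

Fitting the parameters (typically by EM) is the inference counterpart of this generative story.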

33 Graphical Models Are A Useful Formalism E.g., dynamical (time-varying) models in which data arrive sequentially or output is produced as a sequence. Dynamic Bayes nets (DBNs) can be used to model such time-series (sequence) data. Special cases of DBNs include hidden Markov models (HMMs) and state-space models.

34 Hidden Markov Model (HMM) [Figure: hidden states X1 → X2 → X3 emitting observations Y1, Y2, Y3; e.g., phones/words generating the acoustic signal, with a transition matrix and Gaussian observations]
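Inference in this chain structure is the classic forward algorithm. A minimal Python sketch for discrete observations (the slide's speech HMMs use Gaussian emissions; a discrete emission table stands in here for simplicity):

```python
def forward(pi, A, B, obs):
    """HMM forward pass. pi[i]: initial state probability,
    A[i][j]: transition probability i -> j, B[i][y]: emission probability
    of symbol y in state i. Returns P(y_1, ..., y_T), the likelihood of
    the observation sequence, by recursively updating
    alpha[i] = P(y_1..t, X_t = i)."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for y in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][y]
                 for j in range(n)]
    return sum(alpha)
```

With a deterministic toy model (two absorbing states, each emitting its own symbol), the likelihood of a consistent sequence is just the initial probability of the matching state, which makes a convenient check.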

35 State-Space Model (SSM) / Linear Dynamical System (LDS) [Figure: the same chain structure as the HMM, with a continuous "true" state X_t and noisy observations Y_t]

36 Example: LDS For 2D Tracking [Figure: tracking a point in the plane from noisy observations; a sparse linear-Gaussian system]

37 Kalman Filtering (Recursive State Estimation In An LDS) Estimate P(X_t | y_1:t) from P(X_t-1 | y_1:t-1) and y_t. Predict: P(X_t | y_1:t-1) = Σ_{X_t-1} P(X_t | X_t-1) P(X_t-1 | y_1:t-1) (an integral for continuous states). Update: P(X_t | y_1:t) ∝ P(y_t | X_t) P(X_t | y_1:t-1).
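The predict/update recursion has a closed form in one dimension. A minimal Python sketch with identity dynamics (the random-walk model X_t = X_{t-1} + noise is my simplifying assumption; a full LDS would include transition and observation matrices):

```python
def kalman_step(mu, var, y, q, r):
    """One predict/update cycle of a 1-D Kalman filter for the model
    X_t = X_{t-1} + N(0, q),  Y_t = X_t + N(0, r).
    (mu, var) parameterize the Gaussian posterior P(X_{t-1} | y_1:t-1)."""
    # Predict: P(X_t | y_1:t-1) -- dynamics noise inflates the variance.
    mu_pred, var_pred = mu, var + q
    # Update: P(X_t | y_1:t) proportional to P(y_t | X_t) P(X_t | y_1:t-1).
    k = var_pred / (var_pred + r)   # Kalman gain
    return mu_pred + k * (y - mu_pred), (1 - k) * var_pred

# Starting from the prior N(0, 1), filter three observations of 1.0
# with no dynamics noise (q = 0) and unit observation noise (r = 1).
mu, var = 0.0, 1.0
for y in [1.0, 1.0, 1.0]:
    mu, var = kalman_step(mu, var, y, q=0.0, r=1.0)
```

With q = 0 this reduces to conjugate Gaussian updating: after three unit-precision observations of 1.0 against a N(0, 1) prior, the posterior is N(3/4, 1/4).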

38 Mike's Project of the Week [Plate diagram: an IRT model with variables G, X, α, P, δ over students, trials, and problems]

39 Mike's Project of the Week [Plate diagram: a BKT model with variables X, L0, T, τ, G, S over students and trials]

40 Mike's Project of the Week [Plate diagram: a combined IRT+BKT model with variables X, γ, σ, L0, T, τ, α, P, δ, η, G, S over students, trials, and problems]

41 Why Are Bayes Nets Useful? A factored representation may have exponentially fewer parameters than the full joint:
- lower time complexity (i.e., easier inference)
- lower sample complexity (i.e., less data for learning)
The graph structure supports:
- modular representation of knowledge
- local, distributed algorithms for inference and learning
- intuitive (possibly causal) interpretation
A Bayes net is a strong theory about the nature of cognition or the generative process that produces the observed data: it can't represent arbitrary contingencies among variables, so the theory can be rejected by data.

