Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 17

Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 17, Oct 19, 2016. Slide sources: D. Koller, Stanford CS - Probabilistic Graphical Models; D. Page, Whitehead Institute, MIT; several figures from "Probabilistic Graphical Models: Principles and Techniques", D. Koller and N. Friedman, 2009. CPSC 422, Lecture 17

Lecture Overview: Finish (directed) temporal models; Probabilistic graphical models; Undirected models (Markov networks). CPSC 422, Lecture 17

Simple but Powerful Approach: Particle Filtering. Idea from exact filtering: we should be able to compute P(Xt+1 | e1:t+1) from P(Xt | e1:t), i.e., one slice from the previous slice. Idea from likelihood weighting: samples should be weighted by the probability of the evidence given the parents. New idea: run multiple samples simultaneously through the network. CPSC 422, Lecture 11

Particle Filtering. Run all N samples together through the network, one slice at a time. STEP 0: Generate a population of N initial-state samples by sampling from the initial state distribution P(X0), e.g., N = 10. This is a form of filtering in which the N samples are the forward message: the samples themselves serve as an approximation of the current state distribution. There is no need to unroll the network; all that is needed is the current and next slice, so the update cost per time step is "constant".

Particle Filtering. STEP 1: Propagate each sample for Xt forward by sampling the next state value xt+1 based on the transition model P(Xt+1 | Xt):

Rt    P(Rt+1 = t | Rt)
t     0.7
f     0.3

Particle Filtering. STEP 2: Weight each sample by the likelihood it assigns to the evidence. E.g., assume we observe not-umbrella at t+1. Sensor model:

Rt    P(ut | Rt)    P(¬ut | Rt)
t     0.9           0.1
f     0.2           0.8

Particle Filtering. STEP 3: Create a new population from the population at Xt+1, i.e., resample the population so that the probability that each sample is selected is proportional to its weight. This is weighted sampling with replacement (i.e., the same sample can be picked multiple times). Then start the particle filtering cycle again from the new sample population.
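A minimal Python sketch of the STEP 1-3 cycle for the rain/umbrella model above, using the transition and sensor tables from the previous slides; the function and variable names are illustrative, not taken from any course code.

```python
import random

# Transition model P(R_{t+1} = true | R_t) and sensor model P(u_t | R_t)
# from the rain/umbrella tables above.
P_RAIN_NEXT = {True: 0.7, False: 0.3}
P_UMBRELLA = {True: 0.9, False: 0.2}

def particle_filter_step(particles, umbrella_observed):
    """One cycle: propagate, weight by the evidence, resample."""
    # STEP 1: propagate each sample by sampling from P(X_{t+1} | x_t)
    propagated = [random.random() < P_RAIN_NEXT[r] for r in particles]

    # STEP 2: weight each sample by the likelihood of the evidence
    weights = [P_UMBRELLA[r] if umbrella_observed else 1 - P_UMBRELLA[r]
               for r in propagated]

    # STEP 3: resample with replacement, proportionally to the weights
    return random.choices(propagated, weights=weights, k=len(particles))

# STEP 0: N initial-state samples from P(X_0), here assumed uniform.
N = 10
particles = [random.random() < 0.5 for _ in range(N)]
particles = particle_filter_step(particles, umbrella_observed=False)
print(sum(particles) / N)  # approximate P(Rain at t+1 | not umbrella)
```

With a larger N, the fraction of "rain" particles approaches the exact filtered probability for this step.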

Is PF Efficient? In practice, the approximation error of particle filtering remains bounded over time, at least empirically; a full theoretical analysis is difficult. It is also possible to prove that the approximation maintains bounded error with high probability, under the specific assumption that the probabilities in the transition and sensor models are strictly greater than 0 and less than 1. (In statistics, efficiency is a term used in the comparison of statistical procedures: essentially, a more efficient estimator, experiment, or test needs fewer observations than a less efficient one to achieve a given performance.)

422 big picture: Where are we? [Overview diagram relating Representation and Reasoning Technique, for Deterministic vs. Stochastic problems and Query vs. Planning tasks.]
Deterministic: Logics, First Order Logics, Ontologies, Temporal representation; Full Resolution, SAT.
Stochastic: Belief Nets (approx. inference: Gibbs), Markov Chains and HMMs (Forward, Viterbi; approx.: Particle Filtering), Undirected Graphical Models (Markov Networks, Conditional Random Fields), Markov Decision Processes and Partially Observable MDPs (Value Iteration, approx. inference), Reinforcement Learning.
Hybrid Det + Sto / StarAI (statistical relational AI): Prob CFG, Prob Relational Models, Markov Logics. Applications of AI.
Notes: First Order Logics will build a little on Prolog and 312; temporal reasoning from NLP 503 lectures; Reinforcement Learning (not done in 340, at least in the latest offering). A Markov decision process has a set of states S, a set of actions A, transition probabilities to next states P(s' | s, a), and reward functions R(s, s', a). RL is based on MDPs, but the transition model and reward model are not known: while for MDPs we can compute an optimal policy, RL learns an optimal policy. CPSC 322, Lecture 34

Lecture Overview: Probabilistic graphical models; Intro example; Markov Networks: Representation (vs. Belief Networks), Inference in Markov Networks (exact and approx.), Applications of Markov Networks. CPSC 422, Lecture 17

Probabilistic Graphical Models. (Note: Viterbi is MAP inference over the hidden state sequence.) From "Probabilistic Graphical Models: Principles and Techniques", D. Koller, N. Friedman 2009. CPSC 422, Lecture 17

Misconception Example. Four students (Alice, Bill, Debbie, Charles) get together in pairs to work on a homework, but only in the following pairs: AB, AD, DC, BC. The professor misspoke and might have generated a misconception; a student might have figured it out later and told their study partner. Undirected graphs (cf. Bayesian networks, which are directed): a Markov network represents the joint probability distribution over events, which are represented by variables; nodes in the network represent variables. CPSC 422, Lecture 17

Example: In/Dependencies. Are A and C independent because they never spoke? No, because A might have figured it out and told B, who then told C. But what if we know the values of B and D? And what if we know the values of A and C? CPSC 422, Lecture 17

Which of these two Bnets captures the two independencies of our example? A Bnet requires that we ascribe directionality to the influence, while in this case the interaction between the variables is symmetrical. A. B,D independent only given A; B. B,D marginally independent; C. Both; D. None. CPSC 422, Lecture 17

Parameterization of Markov Networks. The factors are not conditional probabilities: they capture the affinities between related variables, and are also called clique potentials or just potentials. E.g., A and D, and B and C, tend to agree, while D and C tend to disagree. Table values are typically nonnegative but have no other restrictions: they are not necessarily probabilities and not necessarily < 1. Factors define the local interactions (like CPTs in Bnets). What about the global model? What do you do with Bnets? CPSC 422, Lecture 17

How do we combine local models? As in Bnets, by multiplying them! Up to a normalization constant, this product is the joint distribution, from which I can compute any marginal or conditional probability. CPSC 422, Lecture 17
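As a concrete, hedged illustration of multiplying pairwise factors and normalizing them into a joint, here is a small Python sketch for the A, B, C, D network; the potential values are illustrative numbers in the spirit of the Koller & Friedman misconception example, not values taken from the slides.

```python
from itertools import product

# Pairwise factors over the edges A-B, B-C, C-D, D-A (illustrative values only).
phi_AB = {(0, 0): 30.0, (0, 1): 5.0, (1, 0): 1.0, (1, 1): 10.0}
phi_BC = {(0, 0): 100.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 100.0}
phi_CD = {(0, 0): 1.0, (0, 1): 100.0, (1, 0): 100.0, (1, 1): 1.0}
phi_DA = {(0, 0): 100.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 100.0}

def unnormalized(a, b, c, d):
    # Product of all local factors for one full assignment.
    return phi_AB[(a, b)] * phi_BC[(b, c)] * phi_CD[(c, d)] * phi_DA[(d, a)]

# The partition function Z turns the product into a proper joint distribution.
assignments = list(product([0, 1], repeat=4))
Z = sum(unnormalized(*x) for x in assignments)
joint = {x: unnormalized(*x) / Z for x in assignments}

# Any marginal can now be read off the joint, e.g. P(A, B):
P_AB = {(a, b): sum(p for (a2, b2, _, _), p in joint.items() if (a2, b2) == (a, b))
        for a, b in product([0, 1], repeat=2)}
print(P_AB)
```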

Multiplying Factors (the same operation you have seen in 322 for Variable Elimination). The factors here are deliberately chosen so that they do not correspond to probabilities or to conditional probabilities, to emphasize the generality of this operation. CPSC 422, Lecture 17

Factors do not represent marginal probabilities! Marginal P(A, B), computed from the joint:

A   B   P(A, B)
a0  b0  0.13
a0  b1  0.69
a1  b0  0.14
a1  b1  0.04

The reason for the discrepancy is the influence of the other factors on the distribution: D and C tend to disagree, but A and D, and C and B, tend to agree, so A and B should disagree. This dominates the weaker agreement encoded by the A-B factor. CPSC 422, Lecture 17

Step back: from structure to factors/potentials. In a Bnet the joint is factorized into one CPT per node (the node given its parents). In a Markov network you have one factor for each maximal clique. (A clique in an undirected graph is a subset of its vertices such that every two vertices in the subset are connected by an edge; a maximal clique is one that cannot be extended by adding another vertex.) CPSC 422, Lecture 17
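A quick way to enumerate the maximal cliques that would each receive a factor, sketched with the networkx library (the use of networkx here is an assumption of this sketch, not something from the lecture):

```python
import networkx as nx

# Undirected structure of the misconception example: edges A-B, B-C, C-D, D-A.
G = nx.Graph([("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")])

# Each maximal clique gets its own factor in the Markov network.
for clique in nx.find_cliques(G):
    print(sorted(clique))  # here the maximal cliques are just the four edges
```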

Directed vs. Undirected Independencies Factorization CPSC 422, Lecture 17

General definitions. Two nodes in a Markov network are independent if and only if every path between them is cut off by evidence (e.g., A and C given B and D). So the Markov blanket of a node is…? Its set of direct neighbors (e.g., for C, the neighbors B and D). CPSC 422, Lecture 17
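This separation criterion can be checked mechanically: remove the observed (evidence) nodes and test whether any path still connects the two query nodes. A minimal sketch, again assuming networkx:

```python
import networkx as nx

G = nx.Graph([("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")])

def independent_given(G, x, y, evidence):
    # X and Y are independent given the evidence iff no path between them
    # survives once the observed nodes are removed from the graph.
    H = G.copy()
    H.remove_nodes_from(evidence)
    return not nx.has_path(H, x, y)

print(independent_given(G, "A", "C", {"B", "D"}))  # True: B and D block every path
print(independent_given(G, "A", "C", set()))       # False: paths through B or D remain
```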

Markov Networks Applications (1): Computer Vision. Here they are called Markov Random Fields (MRFs): stereo reconstruction, image segmentation, object recognition. Typically a pairwise MRF: each variable corresponds to a pixel (or superpixel), and edges (factors) correspond to interactions between adjacent pixels in the image. E.g., in segmentation, the factors range from generically penalizing discontinuities between adjacent pixels to encoding richer preferences (e.g., road tends to appear under car). CPSC 422, Lecture 17
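A toy sketch of the pairwise-MRF idea for segmentation: each pixel gets a unary potential from its own appearance, and each pair of adjacent pixels gets a Potts-style smoothness potential that rewards neighbours taking the same label (equivalently, penalizes discontinuities). The numbers and the brute-force MAP search below are purely illustrative, not the models behind the slides' figures.

```python
import itertools

def mrf_score(labels, unary, beta=1.0):
    # Unnormalized log-score of a labeling of a small grid: unary terms plus
    # a Potts smoothness bonus for each pair of equal-labelled neighbours.
    rows, cols = len(labels), len(labels[0])
    score = sum(unary[i][j][labels[i][j]] for i in range(rows) for j in range(cols))
    for i, j in itertools.product(range(rows), range(cols)):
        if i + 1 < rows and labels[i][j] == labels[i + 1][j]:
            score += beta  # vertical neighbour agrees
        if j + 1 < cols and labels[i][j] == labels[i][j + 1]:
            score += beta  # horizontal neighbour agrees
    return score

# 2x2 toy image, labels 0 = background, 1 = foreground; unary[i][j][k] is the
# (made-up) preference of pixel (i, j) for label k.
unary = [[[2.0, 0.5], [1.5, 1.0]],
         [[0.2, 2.5], [0.1, 3.0]]]

# Brute-force MAP labeling for this tiny grid (only 2^4 labelings).
best = max(itertools.product([0, 1], repeat=4),
           key=lambda f: mrf_score([[f[0], f[1]], [f[2], f[3]]], unary))
print(best)
```

Real segmentation models use far larger grids and approximate inference (e.g., graph cuts or belief propagation) instead of brute force.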

Image segmentation CPSC 422, Lecture 17

Markov Networks Applications (2): Sequence Labeling in NLP and Bioinformatics. Conditional random fields (next class, Fri). E.g., a skip-chain CRF for named entity recognition, with connections between adjacent words and long-range connections between multiple occurrences of the same word.

Learning Goals for today's class. You can: justify the need for undirected graphical models (Markov Networks); interpret local models (factors/potentials) and combine them to express the joint; define independencies and the Markov blanket for Markov Networks; perform exact and approximate inference in Markov Networks; describe a few applications of Markov Networks. CPSC 422, Lecture 17

One week to the midterm (Wed, Oct 26; we will start at 9am sharp). How to prepare: keep working on assignment-2; go to office hours; review the learning goals (look at the end of the slides for each lecture; the complete list has been posted); revise all the clicker questions and practice exercises; more practice material has been posted; check questions and answers on Piazza. CPSC 422, Lecture 17

How to acquire factors? CPSC 422, Lecture 17

[Source: WIKIPEDIA] A graph with 23 1-vertex cliques (its vertices), 42 2-vertex cliques (its edges), 19 3-vertex cliques (the light and dark blue triangles), and 2 4-vertex cliques (dark blue areas). The six edges not associated with any triangle and the 11 light blue triangles form maximal cliques. The two dark blue 4-cliques are both maximum and maximal, and the clique number of the graph is 4. CPSC 422, Lecture 17