Reasoning Under Uncertainty: Independence and Inference
CPSC 322 – Uncertainty 5
Textbook §6.3.1 (and §6.5.2 for HMMs)
March 25, 2011

Lecture Overview
Recap: Bayesian Networks and Markov Chains
Inference in a Special Type of Bayesian Network
– Hidden Markov Models (HMMs)
– Rainbow Robot example
Inference in General Bayesian Networks
– Observations and Inference
– Time-permitting: Entailed independencies
– Next lecture: Variable Elimination

Recap: Conditional Independence

Recap: Bayesian Networks, Definition

Recap: Construction of Bayesian Networks
Encoding the joint over {X_1, …, X_n} as a Bayesian network:
– Totally order the variables: e.g., X_1, …, X_n
– For every variable X_i, find the smallest set of parents Pa(X_i) ⊆ {X_1, …, X_(i-1)} such that X_i ╨ {X_1, …, X_(i-1)} \ Pa(X_i) | Pa(X_i), i.e., X_i is conditionally independent of its other predecessors given its parents
– For every variable X_i, construct its conditional probability table P(X_i | Pa(X_i)). This has to specify a conditional probability distribution P(X_i | Pa(X_i) = pa(X_i)) for every instantiation pa(X_i) of X_i's parents.
If a variable has 3 parents, each of which has a domain with 4 values, how many instantiations of its parents are there?

Recap: Construction of Bayesian Networks (continued)
If a variable has 3 parents, each of which has a domain with 4 values, how many instantiations of its parents are there?
– 4 × 4 × 4 = 4^3 = 64
– For each of these 4^3 instantiations we need one probability distribution defined over the values of X_i
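To make the counting concrete, here is a minimal Python sketch (not part of the original slides; the network, variable names, and domain sizes are illustrative assumptions) that counts parent instantiations and CPT entries for each node:

```python
from math import prod

# Hypothetical example: domain sizes and parent sets for a tiny network.
# X has 3 parents (A, B, C), each with a 4-value domain, as in the question above.
domain_size = {"A": 4, "B": 4, "C": 4, "X": 3}
parents = {"A": [], "B": [], "C": [], "X": ["A", "B", "C"]}

for var, pa in parents.items():
    # Number of joint instantiations of the parents (1 if there are none).
    n_instantiations = prod(domain_size[p] for p in pa)
    # One distribution over var's values per parent instantiation.
    n_entries = n_instantiations * domain_size[var]
    print(f"{var}: {n_instantiations} parent instantiations, {n_entries} CPT entries")
    # For X: 4 * 4 * 4 = 64 parent instantiations.
```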

Recap of BN construction with a small example
Joint distribution over Disease D and Symptom S:
  D=t, S=t: P(D,S) = 0.0099
  D=t, S=f: P(D,S) = 0.0001
  D=f, S=t: P(D,S) = 0.0990
  D=f, S=f: P(D,S) = 0.8910
Marginals: P(D=t) = 0.01, P(D=f) = 0.99; P(S=t) = 0.1089, P(S=f) = 0.8911

Recap of BN construction with a small example
Using the variable ordering Disease, Symptom gives the network structure Disease → Symptom (joint and marginals as on the previous slide).

Recap of BN construction with a small example
Which conditional probability tables do we need for the network Disease → Symptom?
  P(D|S)?   P(D)?   P(S|D)?   P(D,S)?

Recap of BN construction with a small example
Which conditional probability tables do we need?
– P(D) and P(S|D)
– In general, for each variable X in the network: P(X | Pa(X))
P(D=t) = 0.01
P(S=t|D=t) = 0.0099/(0.0099+0.0001) = 0.99
P(S=t|D=f) = 0.0990/(0.0990+0.8910) = 0.1

Recap of BN construction with a small example
How about a different ordering? Symptom, Disease
– This gives the network structure Symptom → Disease
– We need the distributions P(S) and P(D|S)
– In general, for each variable X in the network: P(X | Pa(X))
P(S=t) = 0.1089
P(D=t|S=t) = 0.0099/(0.0099+0.0990) ≈ 0.0909
P(D=t|S=f) = 0.0001/(0.0001+0.8910) ≈ 0.000112
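As a sanity check on the two orderings, here is a small Python sketch (an addition, not from the slides) that recovers both sets of conditional probability tables from the joint distribution above; the printed numbers match the ones on the slides.

```python
# Joint distribution P(D, S) from the slides (D = disease, S = symptom).
joint = {
    ("t", "t"): 0.0099,
    ("t", "f"): 0.0001,
    ("f", "t"): 0.0990,
    ("f", "f"): 0.8910,
}

def marginal_D(d):
    return sum(p for (dd, ss), p in joint.items() if dd == d)

def marginal_S(s):
    return sum(p for (dd, ss), p in joint.items() if ss == s)

# Ordering Disease, Symptom: need P(D) and P(S|D).
print("P(D=t) =", marginal_D("t"))                           # 0.01
print("P(S=t|D=t) =", joint[("t", "t")] / marginal_D("t"))   # 0.99
print("P(S=t|D=f) =", joint[("f", "t")] / marginal_D("f"))   # 0.1

# Ordering Symptom, Disease: need P(S) and P(D|S).
print("P(S=t) =", marginal_S("t"))                           # 0.1089
print("P(D=t|S=t) =", joint[("t", "t")] / marginal_S("t"))   # ~0.0909
print("P(D=t|S=f) =", joint[("t", "f")] / marginal_S("f"))   # ~0.000112
```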

Remark: where do the conditional probabilities come from?
The joint distribution is not normally the starting point
– We would have to define exponentially many numbers
First define the Bayesian network structure
– Either by domain knowledge
– Or by machine learning algorithms (see CPSC 540), typically based on local search
Then fill in the conditional probability tables
– Either by domain knowledge
– Or by machine learning algorithms (see CPSC 340, CPSC 422), based on statistics over the observed data

Lecture Overview
Recap: Bayesian Networks and Markov Chains
Inference in a Special Type of Bayesian Network
– Hidden Markov Models (HMMs)
– Rainbow Robot example
Inference in General Bayesian Networks
– Observations and Inference
– Time-permitting: Entailed independencies
– Next lecture: Variable Elimination

Markov Chains
[Chain structure: X_0 → X_1 → X_2 → …]
A Markov chain is a Bayesian network in which each state variable X_t has only the previous state as its parent, so P(X_t | X_0, …, X_(t-1)) = P(X_t | X_(t-1)).
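For concreteness, a minimal Python sketch (illustrative transition numbers, not from the lecture) of how the joint of a Markov chain factorizes into the initial distribution times the transition probabilities:

```python
# P(x_0, ..., x_T) = P(x_0) * prod_t P(x_t | x_{t-1})  -- made-up two-state chain.
P0 = {"a": 0.6, "b": 0.4}                                   # P(X_0)
T = {"a": {"a": 0.7, "b": 0.3}, "b": {"a": 0.2, "b": 0.8}}  # P(X_t | X_{t-1})

def chain_joint(states):
    p = P0[states[0]]
    for prev, cur in zip(states, states[1:]):
        p *= T[prev][cur]                                   # multiply in each transition
    return p

print(chain_joint(["a", "a", "b"]))  # P(X_0=a, X_1=a, X_2=b) = 0.6 * 0.7 * 0.3 = 0.126
```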

Hidden Markov Models (HMMs)
A Hidden Markov Model (HMM) is a stationary Markov chain plus a noisy observation about the state at each time step:
[Structure: X_0 → X_1 → X_2 → …, with an observation O_t attached to each state X_t]

Example HMM: Robot Tracking
Robot tracking as an HMM:
[Structure: Pos_0 → Pos_1 → Pos_2 → …, with observation Sens_t for each Pos_t]
– Robot is moving at random: transition model P(Pos_t | Pos_(t-1))
– Sensor observations of the current state: observation model P(Sens_t | Pos_t)

Filtering in Hidden Markov Models (HMMs)
Filtering problem in HMMs: at time step t, we would like to know P(X_t | o_1, …, o_t)
We will derive simple update equations for this belief state:
– We are given P(X_0) (i.e., P(X_0 | {}))
– We can compute P(X_t | o_1, …, o_t) if we know P(X_(t-1) | o_1, …, o_(t-1))
– A simple example of dynamic programming

HMM Filtering: first time step
P(X_1 | o_1)
 ∝ P(o_1 | X_1) P(X_1)                          [direct application of Bayes rule; by O_1 ╨ X_0 | X_1 the observation model does not depend on X_0]
 = P(o_1 | X_1) Σ_{x_0} P(X_1 | x_0) P(x_0)     [applying marginalization over X_0 "backwards", then the product rule]
Normalizing over the values of X_1 gives the constant 1/P(o_1).

HMM Filtering: general time step t
P(X_t | o_1, …, o_t)
 ∝ P(o_t | X_t) P(X_t | o_1, …, o_(t-1))                                      [direct application of Bayes rule; uses O_t ╨ {X_(t-1), O_1, …, O_(t-1)} | X_t]
 = P(o_t | X_t) Σ_{x_(t-1)} P(X_t | x_(t-1)) P(x_(t-1) | o_1, …, o_(t-1))     [applying marginalization over X_(t-1) "backwards", the product rule, and X_t ╨ {O_1, …, O_(t-1)} | X_(t-1)]

HMM Filtering Summary
P(X_t | o_1, …, o_t) ∝ P(o_t | X_t) Σ_{x_(t-1)} P(X_t | x_(t-1)) P(x_(t-1) | o_1, …, o_(t-1))
– P(o_t | X_t): observation probability
– P(X_t | x_(t-1)): transition probability
– P(x_(t-1) | o_1, …, o_(t-1)): we already know this from the previous step
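The filtering update maps directly onto a few lines of code. Below is a minimal Python sketch of the recursion (an illustration, not code from the course); the two-state model used to exercise it is made up.

```python
def filter_step(belief, transition, observation_prob):
    """One HMM filtering update.

    belief[x]            : P(X_{t-1} = x | o_1, ..., o_{t-1})
    transition[x][x2]    : P(X_t = x2 | X_{t-1} = x)
    observation_prob[x2] : P(o_t | X_t = x2), for the observation actually seen
    Returns P(X_t | o_1, ..., o_t) as a dict over states.
    """
    states = belief.keys()
    # Prediction: sum out the previous state ("marginalization backwards").
    predicted = {x2: sum(belief[x1] * transition[x1][x2] for x1 in states)
                 for x2 in states}
    # Correction: weight by the observation probability (Bayes rule), then normalize.
    unnormalized = {x2: observation_prob[x2] * predicted[x2] for x2 in states}
    z = sum(unnormalized.values())
    return {x2: p / z for x2, p in unnormalized.items()}

# Illustrative two-state example (numbers are made up, not from the lecture).
transition = {"a": {"a": 0.7, "b": 0.3}, "b": {"a": 0.4, "b": 0.6}}
obs_model = {"a": {"o1": 0.9, "o2": 0.1}, "b": {"o1": 0.2, "o2": 0.8}}
belief = {"a": 0.5, "b": 0.5}                      # P(X_0)
for o in ["o1", "o1", "o2"]:                       # observation sequence
    belief = filter_step(belief, transition,
                         {x: obs_model[x][o] for x in belief})
    print(o, belief)
```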

HMM Filtering: Rainbow Robot Summary

Lecture Overview
Recap: Bayesian Networks and Markov Chains
Inference in a Special Type of Bayesian Network
– Hidden Markov Models (HMMs)
– Rainbow Robot example
Inference in General Bayesian Networks
– Observations and Inference
– Time-permitting: Entailed independencies
– Next lecture: Variable Elimination

Bayesian Networks: Incorporating Observations
In the special case of Hidden Markov Models (HMMs):
– we could easily incorporate observations
– and do efficient inference (in particular: filtering)
Back to general Bayesian Networks:
– We can still incorporate observations
– And we can still do (fairly) efficient inference

Bayesian Networks: Incorporating Observations
We denote observed variables as shaded. Examples:
– The HMM X_0 → X_1 → X_2 → … with its observations O_1, O_2, …
– UnderstoodMaterial with children AssignmentGrade and ExamGrade
– Fire and SmokingAtSensor with common child Alarm

Bayesian Networks: Types of Inference
– Diagnostic: People are leaving (L=t). P(F | L=t) = ?   [network: Fire → Alarm → Leaving]
– Predictive: Fire happens (F=t). P(L | F=t) = ?   [network: Fire → Alarm → Leaving]
– Intercausal: Alarm goes off (P(a) = 1.0) and a person smokes next to the sensor (S=t). P(F | A=t, S=t) = ?   [network: Fire → Alarm ← SmokingAtSensor]
– Mixed: People are leaving (L=t) and there is no fire (F=f). P(A | F=f, L=t) = ?   [network: Fire → Alarm → Leaving]
We will use the same reasoning procedure for all of these types.

Lecture Overview
Recap: Bayesian Networks and Markov Chains
Inference in a Special Type of Bayesian Network
– Hidden Markov Models (HMMs)
– Rainbow Robot example
Inference in General Bayesian Networks
– Observations and Inference
– Time-permitting: Entailed independencies
– Next lecture: Variable Elimination

Inference in Bayesian Networks
Given
– a Bayesian network BN,
– observations of a subset of its variables E: E=e, and
– a subset of its variables Y that is queried,
compute the conditional probability P(Y | E=e).
Step 1: Drop all variables X of the Bayesian network that are conditionally independent of Y given E=e
– By definition of Y ╨ X | E=e, we know P(Y | E=e) = P(Y | X=x, E=e)
– We can thus drop X
Step 2: Run the variable elimination algorithm (next lecture)
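To make the query P(Y | E=e) concrete, here is a brute-force enumeration sketch in Python (an illustration only: this is not the variable elimination algorithm of the next lecture, and the fire/alarm/leaving numbers are invented). It sums the joint over the unobserved variables and normalizes.

```python
# Illustrative CPTs for the chain Fire -> Alarm -> Leaving (numbers made up).
P_F = {True: 0.01, False: 0.99}
P_A_given_F = {True: {True: 0.9, False: 0.1}, False: {True: 0.05, False: 0.95}}
P_L_given_A = {True: {True: 0.7, False: 0.3}, False: {True: 0.01, False: 0.99}}

def joint(f, a, l):
    # Chain rule for this network: P(F) * P(A|F) * P(L|A).
    return P_F[f] * P_A_given_F[f][a] * P_L_given_A[a][l]

def query_fire_given_leaving(l_obs=True):
    # P(F | L = l_obs): sum the joint over the unobserved variable A, then normalize.
    unnorm = {f: sum(joint(f, a, l_obs) for a in (True, False))
              for f in (True, False)}
    z = sum(unnorm.values())
    return {f: p / z for f, p in unnorm.items()}

print(query_fire_given_leaving(True))   # diagnostic query P(F | L=t)
```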

Information flow in Bayesian Networks
A Bayesian network structure implies a number of conditional independencies and dependencies
– We can determine these directly from the graph structure, i.e., we don't need to look at the conditional probability tables
– Conditional independencies we derive for a graph structure will hold for all possible conditional probability tables
Information we acquire about one variable flows through the network and changes our beliefs about other variables
– This is also called probability propagation
– But information does not flow everywhere: some nodes block information

Information flow through chain structure
[Chain: Fire → Alarm → Leaving]
Unobserved node in a chain lets information pass
– E.g. learning the value of Fire will change your belief about Alarm
– Which will in turn change your belief about Leaving
Observed node in a chain blocks information
– If you know the value of Alarm to start with: learning the value of Fire yields no extra information about Leaving

Information flow through chain structure
[Chain: Fire → Alarm → Leaving]
Information flow is symmetric (X ╨ Y | Z and Y ╨ X | Z are identical)
– Unobserved node in a chain lets information pass (both ways): e.g. learning the value of Leaving will change your belief about Alarm, which will in turn change your belief about Fire
– Observed node in a chain blocks information (both ways): if you know the value of Alarm to start with, learning the value of Leaving yields no extra information about Fire

Information flow through common parent
[Common parent: AssignmentGrade ← UnderstoodMaterial → ExamGrade]
Unobserved common parent lets information pass
– E.g. learning the value of AssignmentGrade changes your belief about UnderstoodMaterial
– Which will in turn change your belief about ExamGrade
Observed common parent blocks information
– If you know the value of UnderstoodMaterial to start with, learning AssignmentGrade yields no extra information about ExamGrade

Information flow through common child
[Common child: Fire → Alarm ← SmokingAtSensor]
Unobserved common child blocks information
– E.g. learning the value of Fire will not change your belief about Smoking: the two are marginally independent
Observed common child lets information pass: explaining away
– E.g., when you know the alarm is going: then learning that a person smoked next to the fire sensor "explains away the evidence" and thereby changes your beliefs about fire.

Information flow through common child
[Structure: Fire → Alarm ← SmokingAtSensor, with Alarm → Leaving]
Exception: an unobserved common child lets information pass if one of its descendants is observed
– This is just as if the child itself was observed
– E.g., Leaving could be a deterministic function of Alarm, so observing Leaving means you know Alarm as well
– Thus, a person smoking next to the fire alarm can still "explain away" the evidence of people leaving the building

Summary: (Conditional) Dependencies
[Three configurations connecting X and Y through Z, with evidence E: 1. chain X → Z → Y with Z unobserved, 2. common parent X ← Z → Y with Z unobserved, 3. common child X → Z ← Y]
In these cases, X and Y are (conditionally) dependent.
In 3, X and Y become dependent as soon as there is evidence on Z or on any of its descendants.

Summary: (Conditional) Independencies
Blocking paths for probability propagation: three ways in which a path between X and Y (in either direction) can be blocked, given evidence E:
[1. chain X → Z → Y with Z observed, 2. common parent X ← Z → Y with Z observed, 3. common child X → Z ← Y with neither Z nor any of its descendants observed]
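The three blocking rules can be encoded directly as a small predicate. The Python sketch below (my own encoding, not from the slides) decides, for a single X–Z–Y connection, whether information can flow through Z:

```python
def information_flows(structure, z_observed, z_descendant_observed=False):
    """Can information pass from X to Y through Z in a single triple?

    structure: "chain" (X -> Z -> Y), "common_parent" (X <- Z -> Y),
               or "common_child" (X -> Z <- Y).
    z_observed: whether Z itself is part of the evidence.
    z_descendant_observed: whether some descendant of Z is observed
                           (only matters for the common-child case).
    """
    if structure in ("chain", "common_parent"):
        # Chain and common parent: an observed Z blocks the path.
        return not z_observed
    if structure == "common_child":
        # Common child (v-structure): blocked unless Z or a descendant is observed.
        return z_observed or z_descendant_observed
    raise ValueError(f"unknown structure: {structure}")

# Examples matching the lecture's Fire/Alarm/Leaving and Fire/Alarm/Smoking stories:
print(information_flows("chain", z_observed=False))          # True: Alarm unobserved lets information pass
print(information_flows("chain", z_observed=True))           # False: observing Alarm blocks the chain
print(information_flows("common_child", z_observed=False))   # False: Fire and Smoking marginally independent
print(information_flows("common_child", z_observed=False,
                        z_descendant_observed=True))         # True: observing Leaving re-opens the path
```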

Training your understanding of conditional independencies in AIspace
These concepts take practice to get used to. Use the AIspace applet for Belief and Decision networks:
– Load the "conditional independence quiz" network (or any other one)
– Go into "Solve" mode and select "Independence Quiz"
You can take an unbounded number of quizzes:
– It generates questions, you answer, and then get the right answer
– It also allows you to ask arbitrary queries

Conditional Independencies in a BN
Is H conditionally independent of E given I? I.e., H ╨ E | I?  Yes / No

Conditional Independencies in a BN
Is H conditionally independent of E given I? I.e., H ╨ E | I?
Yes! Information flow is blocked by the unobserved common child F.

Conditional Independencies in a BN
Is A conditionally independent of I given F? I.e., A ╨ I | F?  Yes / No

Conditional Independencies in a BN
Is A conditionally independent of I given F? I.e., A ╨ I | F?
No. Information can pass through (all it takes is one possible path).

Learning Goals For Today's Class
– Build a Bayesian network for a given domain
– Classify the types of inference: Diagnostic, Predictive, Mixed, Intercausal
– Identify implied (in)dependencies in the network
Assignment 4 available on WebCT
– Due Monday, April 4
– Can only use 2 late days, so we can give out solutions to study for the final exam
– You should now be able to solve questions 1, 2, and 5
– Questions 3 & 4: mechanical once you know the method (method for Question 3: next Monday; method for Question 4: next Wednesday/Friday)
Final exam: Monday, April 11
– Less than 3 weeks from now