Computer Science CPSC 322 Lecture 26 Uncertainty and Probability (Ch. 6.1, 6.1.1, 6.1.3)


Where are we? [Course-map figure: environments (Deterministic vs. Stochastic) and problem types (Static vs. Sequential), with the matching representations and reasoning techniques: CSPs (variables + constraints; search, arc consistency), logics (search, proofs), STRIPS planning, belief nets and variable elimination, decision nets and value iteration, Markov processes.] We are done with deterministic environments.

Where are we? [Same course-map figure.] The stochastic environments are the second part of the course.

Where are we? [Same course-map figure.] We'll focus on belief nets.

Lecture Overview
- Intro to Reasoning Under Uncertainty
- Introduction to Probability
- Random Variables and Possible World Semantics
- Probability Distributions
- Marginalization

Two main sources of uncertainty (from Lecture 2)
- Sensing uncertainty: the agent cannot fully observe a state of interest. For example: Right now, how many people are in this building? What disease does this patient have? Where is the soccer player behind me?
- Effect uncertainty: the agent cannot be certain about the effects of its actions. For example: If I work hard, will I get an A? Will this drug work for this patient? Where will the ball go when I kick it?

Motivation for uncertainty
- To act in the real world, we almost always have to handle uncertainty (both effect and sensing uncertainty).
- Deterministic domains are an abstraction; sometimes this abstraction enables more powerful inference. We now drop this abstraction.
- AI's main focus shifted from logic to probability in the 1980s. The language of probability is very expressive and general, and new representations enable efficient reasoning. We will see some of these, in particular Bayesian networks.
- Reasoning under uncertainty is part of the 'new' AI. This is not a dichotomy: the framework for probability is built on logic. A new frontier is combining logic and probability.

Interesting article about AI and uncertainty: "The machine age", by Peter Norvig (head of research at Google), New York Post, 12 February.
- "The things we thought were hard turned out to be easier." Playing grandmaster-level chess, or proving theorems in integral calculus.
- "Tasks that we at first thought were easy turned out to be hard." A toddler (or a dog) can distinguish hundreds of objects (ball, bottle, blanket, mother, ...) just by glancing at them; it is very difficult for computer vision to perform at this level.
- "Dealing with uncertainty turned out to be more important than thinking with logical precision." Reasoning under uncertainty (and lots of data) are key to progress.

Probability as a measure of uncertainty/ignorance
- Probability measures an agent's degree of belief in the truth of propositions about states of the world.
- It does not measure how true a proposition is. Propositions are true or false; we simply may not know exactly which.
- Example: I roll a fair die. What is 'the' (my) probability that the result is a '6'?


Probability as a measure of uncertainty/ignorance
- Probability measures an agent's degree of belief in the truth of propositions about states of the world.
- It does not measure how true a proposition is. Propositions are true or false; we simply may not know exactly which.
- Example: I roll a fair die. What is 'the' (my) probability that the result is a '6'? It is 1/6 ≈ 16.7%.
- I now look at the die. What is 'the' (my) probability now? My probability is now... Your probability (you have not looked at the die) is...
- What if I tell some of you the result is even? Their probability becomes...

Probability as a measure of uncertainty/ignorance
- Probability measures an agent's degree of belief in the truth of propositions about states of the world.
- It does not measure how true a proposition is. Propositions are true or false; we simply may not know exactly which.
- Example: I roll a fair die. What is 'the' (my) probability that the result is a '6'? It is 1/6 ≈ 16.7%.
- I now look at the die. What is 'the' (my) probability now? My probability is now either 1 or 0, depending on what I observed. Your probability has not changed: 1/6 ≈ 16.7%.
- What if I tell some of you the result is even? Their probability increases to 1/3 ≈ 33.3%, if they believe me.
- Different agents can have different degrees of belief in (probabilities for) a proposition, based on the evidence they have.

Probability as a measure of uncertainty/ignorance
- Probability measures an agent's degree of belief in the truth of propositions about states of the world.
- Belief in a proposition f can be measured as a number between 0 and 1: this is the probability of f. E.g., P("roll of fair die came out as a 6") = 1/6 ≈ 16.7% ≈ 0.167.
- Using probabilities between 0 and 1 is purely a convention.
- P(f) = 0 means that f is believed to be:  A. Probably true   B. Probably false   C. Definitely false   D. Definitely true

Probability as a measure of uncertainty/ignorance
- Probability measures an agent's degree of belief in the truth of propositions about states of the world.
- Belief in a proposition f can be measured as a number between 0 and 1: this is the probability of f. E.g., P("roll of fair die came out as a 6") = 1/6 ≈ 16.7% ≈ 0.167.
- Using probabilities between 0 and 1 is purely a convention.
- P(f) = 0 means that f is believed to be definitely false: the probability of f being true is zero. Likewise, P(f) = 1 means f is believed to be definitely true.

Lecture Overview
- Intro to Reasoning Under Uncertainty
- Introduction to Probability
- Random Variables and Possible World Semantics
- Probability Distributions
- Marginalization

Probability Theory and Random Variables
- Probability theory: a system of logical axioms and formal operations for sound reasoning under uncertainty.
- Basic element: random variable X. X is a variable like the ones we have seen in CSPs/Planning/Logic, but the agent can be uncertain about the value of X.
- As usual, the domain of a random variable X, written dom(X), is the set of values X can take.
- Types of variables:
  - Boolean: e.g., Cancer (does the patient have cancer or not?)
  - Categorical: e.g., CancerType could be one of {breastCancer, lungCancer, skinMelanomas}
  - Numeric: e.g., Temperature (integer or real)
- We will focus on Boolean and categorical variables.

Random Variables (cont’) A tuple of random variables is a complex random variable with domain.. Dom(X 1 ) × Dom(X 2 )… × Dom(X n )… Cavity = T; Weather = sunny 17

Possible Worlds
- E.g., if we model only two Boolean variables, Cavity and Toothache, then there are 4 distinct possible worlds:
  w1: Cavity = T ∧ Toothache = T
  w2: Cavity = T ∧ Toothache = F
  w3: Cavity = F ∧ Toothache = T
  w4: Cavity = F ∧ Toothache = F
- A possible world specifies an assignment to each random variable.
- Possible worlds are mutually exclusive and exhaustive.
- w ⊨ f means that proposition f is true in world w.
- A probability measure µ(w) over possible worlds w is a nonnegative real number such that µ(w) sums to 1 over all possible worlds w.
- Why does this make sense?

Possible Worlds
- E.g., if we model only two Boolean variables, Cavity and Toothache, then there are 4 distinct possible worlds:
  w1: Cavity = T ∧ Toothache = T
  w2: Cavity = T ∧ Toothache = F
  w3: Cavity = F ∧ Toothache = T
  w4: Cavity = F ∧ Toothache = F
- A possible world specifies an assignment to each random variable.
- Possible worlds are mutually exclusive and exhaustive.
- w ⊨ f means that proposition f is true in world w.
- A probability measure µ(w) over possible worlds w is a nonnegative real number such that µ(w) sums to 1 over all possible worlds w.
- The probability of proposition f is defined by P(f) = Σ_{w ⊨ f} µ(w), i.e., the sum of the probabilities of the worlds w in which f is true.
- Why does this make sense? Because for sure we are in one of these worlds!
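A small sketch of this semantics, assuming Python: worlds are dictionaries of variable assignments, µ is a nonnegative weight per world summing to 1, and P(f) sums µ(w) over the worlds where f holds. The µ values below are made up for illustration; the slides do not specify them.

```python
# Possible worlds for Cavity and Toothache, with an illustrative measure µ
worlds = [
    ({"Cavity": True,  "Toothache": True},  0.12),
    ({"Cavity": True,  "Toothache": False}, 0.08),
    ({"Cavity": False, "Toothache": True},  0.08),
    ({"Cavity": False, "Toothache": False}, 0.72),
]

def prob(proposition):
    """P(f) = sum of µ(w) over the worlds w in which f is true."""
    return sum(mu for w, mu in worlds if proposition(w))

print(prob(lambda w: w["Cavity"]))                    # P(Cavity = T)            -> 0.20
print(prob(lambda w: w["Cavity"] or w["Toothache"]))  # P(Cavity v Toothache)    -> 0.28
```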

Possible Worlds Semantics
- Example: weather in Vancouver. One binary variable, Weather, with domain {sunny, cloudy}.
- Possible worlds: w1: Weather = sunny; w2: Weather = cloudy.
- If P(Weather = sunny) = 0.4, what is P(Weather = cloudy)?
- Recall: w ⊨ f means that proposition f is true in world w. A probability measure µ(w) over possible worlds w is a nonnegative real number such that µ(w) sums to 1 over all possible worlds w, and the probability of proposition f is defined by P(f) = Σ_{w ⊨ f} µ(w).

Possible Worlds Semantics
- Example: weather in Vancouver. One binary variable, Weather, with domain {sunny, cloudy}.
- Possible worlds: w1: Weather = sunny; w2: Weather = cloudy.
- If P(Weather = sunny) = 0.4, what is P(Weather = cloudy)?
- P(Weather = sunny) = 0.4 means that µ(w1) is 0.4.
- µ(w1) and µ(w2) have to sum to 1 (those are the only 2 possible worlds), so µ(w2) has to be 0.6, and thus P(Weather = cloudy) = 0.6.

One more example
- Now we have an additional variable: Temperature, with domain {hot, mild, cold}. There are now 6 possible worlds:

  Weather  Temperature  µ(w)
  sunny    hot          0.10
  sunny    mild         0.20
  sunny    cold         0.10
  cloudy   hot          0.05
  cloudy   mild         0.35
  cloudy   cold         ?

- What's the probability of it being cloudy and cold?
  A. 0.1   B. 0.2   C. 0.3   D. 1   E. Not enough info

One more example
- Now we have an additional variable: Temperature, with domain {hot, mild, cold}. There are now 6 possible worlds:

  Weather  Temperature  µ(w)
  sunny    hot          0.10
  sunny    mild         0.20
  sunny    cold         0.10
  cloudy   hot          0.05
  cloudy   mild         0.35
  cloudy   cold         ?

- What's the probability of it being cloudy and cold?
- The other five worlds sum to 0.10 + 0.20 + 0.10 + 0.05 + 0.35 = 0.8, and the probabilities have to sum to 1 over all possible worlds, so it is 0.2.

One more example
- Now we have an additional variable: Temperature, with domain {hot, mild, cold}. There are now 6 possible worlds:

  Weather  Temperature  µ(w)
  sunny    hot          0.10
  sunny    mild         0.20
  sunny    cold         0.10
  cloudy   hot          0.05
  cloudy   mild         0.35
  cloudy   cold         0.20

- What's the probability of it being cloudy or cold?
  A. 1   B. 0.6   C. 0.3   D. 0.7
- Remember: the probability of proposition f is defined by P(f) = Σ_{w ⊨ f} µ(w), the sum of the probabilities of the worlds w in which f is true.

One more example
- Now we have an additional variable: Temperature, with domain {hot, mild, cold}. There are now 6 possible worlds:

       Weather  Temperature  µ(w)
  w1   sunny    hot          0.10
  w2   sunny    mild         0.20
  w3   sunny    cold         0.10
  w4   cloudy   hot          0.05
  w5   cloudy   mild         0.35
  w6   cloudy   cold         0.20

- What's the probability of it being cloudy or cold?
- µ(w3) + µ(w4) + µ(w5) + µ(w6) = 0.10 + 0.05 + 0.35 + 0.20 = 0.7
- Remember: the probability of proposition f is defined by P(f) = Σ_{w ⊨ f} µ(w), the sum of the probabilities of the worlds w in which f is true.

Probability Distributions
- Consider the case where possible worlds are simply assignments to one random variable.
- Definition (probability distribution): a probability distribution P on a random variable X is a function dom(X) → [0, 1] such that x ↦ P(X = x).
- When dom(X) is infinite we need a probability density function.
- We will focus on the finite case.
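A sketch of what this definition requires, assuming Python (the helper name is illustrative): each value of dom(X) maps into [0, 1], and the values sum to 1. The first example is the Weather distribution from the slides; the second is a deliberately invalid one.

```python
import math

def is_distribution(p):
    """Check that p maps each value of dom(X) into [0,1] and that the values sum to 1."""
    in_range = all(0.0 <= v <= 1.0 for v in p.values())
    return in_range and math.isclose(sum(p.values()), 1.0)

print(is_distribution({"sunny": 0.4, "cloudy": 0.6}))   # True
print(is_distribution({"sunny": 0.4, "cloudy": 0.3}))   # False: sums to 0.7
```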

Joint Distribution
- Joint distribution over random variables X1, ..., Xn: a probability distribution over the joint random variable, with domain dom(X1) × ... × dom(Xn) (the Cartesian product).
- Think of a joint distribution over n variables as the n-dimensional table of the corresponding possible worlds.
- Each row corresponds to an assignment X1 = x1, ..., Xn = xn and its probability P(X1 = x1, ..., Xn = xn). We can also write P(X1 = x1 ∧ ... ∧ Xn = xn).
- The sum of probabilities across the whole table is 1.
- E.g., the {Weather, Temperature} example from before:

  Weather  Temperature  µ(w)
  sunny    hot          0.10
  sunny    mild         0.20
  sunny    cold         0.10
  cloudy   hot          0.05
  cloudy   mild         0.35
  cloudy   cold         0.20
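Assuming Python, one way to store the joint distribution in the table above is a dictionary keyed by tuples of values, one entry per possible world; the whole table must again sum to 1.

```python
import math

# Joint distribution P(Weather, Temperature) from the table above
joint = {
    ("sunny",  "hot"):  0.10,
    ("sunny",  "mild"): 0.20,
    ("sunny",  "cold"): 0.10,
    ("cloudy", "hot"):  0.05,
    ("cloudy", "mild"): 0.35,
    ("cloudy", "cold"): 0.20,
}

assert math.isclose(sum(joint.values()), 1.0)  # the whole table sums to 1
```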

Lecture Overview
- Intro to Reasoning Under Uncertainty
- Introduction to Probability
- Random Variables and Possible World Semantics
- Probability Distributions
- Marginalization

Marginalization
- Given the joint distribution, we can compute distributions over subsets of the variables through marginalization:
  P(X = x) = Σ_{z ∈ dom(Z)} P(X = x, Z = z)   (marginalization over Z)
- We also write this as P(X) = Σ_{z ∈ dom(Z)} P(X, Z = z).
- This is simply an application of the definition of probability measure! Remember: the probability of proposition f is defined by P(f) = Σ_{w ⊨ f} µ(w), the sum of the probabilities of the worlds w in which f is true.
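A sketch of marginalization over one variable, assuming Python and the joint-distribution dictionary shown earlier (the helper name and the index-based convention are illustrative): to marginalize, sum the joint probabilities over every value of the variable being summed out.

```python
from collections import defaultdict

joint = {  # P(Weather, Temperature), as in the tables that follow
    ("sunny", "hot"): 0.10, ("sunny", "mild"): 0.20, ("sunny", "cold"): 0.10,
    ("cloudy", "hot"): 0.05, ("cloudy", "mild"): 0.35, ("cloudy", "cold"): 0.20,
}

def marginalize(joint, keep_index):
    """P(X = x) = sum over z in dom(Z) of P(X = x, Z = z): sum out the other variable."""
    marginal = defaultdict(float)
    for assignment, p in joint.items():
        marginal[assignment[keep_index]] += p
    return dict(marginal)

print(marginalize(joint, keep_index=1))  # ~ {'hot': 0.15, 'mild': 0.55, 'cold': 0.30}
print(marginalize(joint, keep_index=0))  # ~ {'sunny': 0.40, 'cloudy': 0.60}
```

These are exactly the marginal tables worked out on the following slides (up to floating-point rounding).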

Marginalization
- Given the joint distribution, we can compute distributions over subsets of the variables through marginalization:
  P(X = x) = Σ_{z ∈ dom(Z)} P(X = x, Z = z)   (marginalization over Z)
- We also write this as P(X) = Σ_{z ∈ dom(Z)} P(X, Z = z).
- This corresponds to summing out a dimension in the table. E.g., marginalizing the joint P(Weather, Temperature) from before over Weather gives a table over Temperature alone:

  Temperature  µ(w)
  hot          ?
  mild         ?
  cold         ?

- Do the probabilities in the new table still sum to 1?  A. False   B. True   C. It depends

Marginalization
- Given the joint distribution, we can compute distributions over subsets of the variables through marginalization: P(X = x) = Σ_{z ∈ dom(Z)} P(X = x, Z = z). We also write this as P(X) = Σ_{z ∈ dom(Z)} P(X, Z = z).
- This corresponds to summing out a dimension in the table (here, marginalization over Weather).
- Do the probabilities in the new table still sum to 1? True, since it's a probability distribution!

Marginalization
- Given the joint distribution, we can compute distributions over subsets of the variables through marginalization: P(X = x) = Σ_{z ∈ dom(Z)} P(X = x, Z = z). We also write this as P(X) = Σ_{z ∈ dom(Z)} P(X, Z = z).
- This corresponds to summing out a dimension in the table. Using the joint P(Weather, Temperature) from above, how do we compute P(Temperature = hot)?

  Temperature  µ(w)
  hot          ?
  mild         ?
  cold         ?

Marginalization
- Given the joint distribution, we can compute distributions over subsets of the variables through marginalization: P(X = x) = Σ_{z ∈ dom(Z)} P(X = x, Z = z). We also write this as P(X) = Σ_{z ∈ dom(Z)} P(X, Z = z).
- This corresponds to summing out a dimension in the table.
- P(Temperature = hot) = P(Weather = sunny, Temperature = hot) + P(Weather = cloudy, Temperature = hot) = 0.10 + 0.05

Marginalization
- Given the joint distribution, we can compute distributions over subsets of the variables through marginalization: P(X = x) = Σ_{z ∈ dom(Z)} P(X = x, Z = z). We also write this as P(X) = Σ_{z ∈ dom(Z)} P(X, Z = z).
- This corresponds to summing out a dimension in the table.
- P(Temperature = hot) = P(Weather = sunny, Temperature = hot) + P(Weather = cloudy, Temperature = hot) = 0.10 + 0.05 = 0.15

  Temperature  µ(w)
  hot          0.15
  mild         ?
  cold         ?

Marginalization
- Given the joint distribution, we can compute distributions over subsets of the variables through marginalization: P(X = x) = Σ_{z ∈ dom(Z)} P(X = x, Z = z). We also write this as P(X) = Σ_{z ∈ dom(Z)} P(X, Z = z).
- This corresponds to summing out a dimension in the table. Similarly, what is P(Temperature = mild)?

  Temperature  µ(w)
  hot          0.15
  mild         ?
  cold         ?

Marginalization
- Given the joint distribution, we can compute distributions over subsets of the variables through marginalization: P(X = x) = Σ_{z ∈ dom(Z)} P(X = x, Z = z). We also write this as P(X) = Σ_{z ∈ dom(Z)} P(X, Z = z).
- This corresponds to summing out a dimension in the table.
- P(Temperature = mild) = P(Weather = sunny, Temperature = mild) + P(Weather = cloudy, Temperature = mild) = 0.20 + 0.35 = 0.55

Marginalization
- Given the joint distribution, we can compute distributions over subsets of the variables through marginalization: P(X = x) = Σ_{z ∈ dom(Z)} P(X = x, Z = z). We also write this as P(X) = Σ_{z ∈ dom(Z)} P(X, Z = z).
- This corresponds to summing out a dimension in the table. Finally, what is P(Temperature = cold)?

  Temperature  µ(w)
  hot          0.15
  mild         0.55
  cold         ?

Marginalization
- Given the joint distribution, we can compute distributions over subsets of the variables through marginalization: P(X = x) = Σ_{z ∈ dom(Z)} P(X = x, Z = z). We also write this as P(X) = Σ_{z ∈ dom(Z)} P(X, Z = z).
- This corresponds to summing out a dimension in the table.

  Temperature  µ(w)
  hot          0.15
  mild         0.55
  cold         0.30

- The new table still sums to 1. It must, since it's a probability distribution!
- Alternative way to compute the last entry: the probabilities have to sum to 1, so P(Temperature = cold) = 1 − 0.15 − 0.55 = 0.30.

Marginalization
- Given the joint distribution, we can compute distributions over smaller sets of variables through marginalization: P(X = x) = Σ_{z ∈ dom(Z)} P(X = x, Z = z). We also write this as P(X) = Σ_{z ∈ dom(Z)} P(X, Z = z).
- You can marginalize over any of the variables, e.g., marginalization over Temperature. What is P(Weather = sunny)?

  Weather  µ(w)
  sunny    ?
  cloudy   ?

Marginalization
- Given the joint distribution, we can compute distributions over smaller sets of variables through marginalization: P(X = x) = Σ_{z ∈ dom(Z)} P(X = x, Z = z). We also write this as P(X) = Σ_{z ∈ dom(Z)} P(X, Z = z).
- You can marginalize over any of the variables, e.g., marginalization over Temperature:
- P(Weather = sunny) = P(Weather = sunny, Temperature = hot) + P(Weather = sunny, Temperature = mild) + P(Weather = sunny, Temperature = cold) = 0.10 + 0.20 + 0.10 = 0.40

  Weather  µ(w)
  sunny    0.40
  cloudy   ?

Marginalization
- Given the joint distribution, we can compute distributions over smaller sets of variables through marginalization: P(X = x) = Σ_{z ∈ dom(Z)} P(X = x, Z = z). We also write this as P(X) = Σ_{z ∈ dom(Z)} P(X, Z = z).
- You can marginalize over any of the variables, e.g., marginalization over Temperature:

  Weather  µ(w)
  sunny    0.40
  cloudy   0.60

Marginalization
- We can also marginalize over more than one variable at once:
  P(X = x) = Σ_{z1 ∈ dom(Z1), ..., zn ∈ dom(Zn)} P(X = x, Z1 = z1, ..., Zn = zn)
- E.g., go from P(Wind, Weather, Temperature) to P(Weather), i.e., marginalization over Temperature and Wind:

  Wind  Weather  Temperature  µ(w)
  yes   sunny    hot          0.04
  yes   sunny    mild         0.09
  yes   sunny    cold         0.07
  yes   cloudy   hot          0.01
  yes   cloudy   mild         0.10
  yes   cloudy   cold         0.12
  no    sunny    hot          0.06
  no    sunny    mild         0.11
  no    sunny    cold         0.03
  no    cloudy   hot          0.04
  no    cloudy   mild         0.25
  no    cloudy   cold         0.08

  Weather  µ(w)
  sunny    ?
  cloudy   ?

Marginalization
- We can also marginalize over more than one variable at once: P(X = x) = Σ_{z1 ∈ dom(Z1), ..., zn ∈ dom(Zn)} P(X = x, Z1 = z1, ..., Zn = zn).
- Using the joint P(Wind, Weather, Temperature) above, what is P(Weather = sunny)? (marginalization over Temperature and Wind)

Marginalization
- We can also marginalize over more than one variable at once: P(X = x) = Σ_{z1 ∈ dom(Z1), ..., zn ∈ dom(Zn)} P(X = x, Z1 = z1, ..., Zn = zn).
- Marginalizing the joint P(Wind, Weather, Temperature) above over Temperature and Wind:

  Weather  µ(w)
  sunny    0.40
  cloudy   0.60
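The same idea extends directly to summing out several variables at once. Below is a sketch assuming Python; the variable-name list, the helper name, and the keep-set convention are illustrative, while the probabilities come from the three-variable table above.

```python
from collections import defaultdict

variables = ["Wind", "Weather", "Temperature"]  # order of the tuple keys
joint3 = {  # P(Wind, Weather, Temperature) from the table above
    ("yes", "sunny", "hot"): 0.04, ("yes", "sunny", "mild"): 0.09, ("yes", "sunny", "cold"): 0.07,
    ("yes", "cloudy", "hot"): 0.01, ("yes", "cloudy", "mild"): 0.10, ("yes", "cloudy", "cold"): 0.12,
    ("no", "sunny", "hot"): 0.06, ("no", "sunny", "mild"): 0.11, ("no", "sunny", "cold"): 0.03,
    ("no", "cloudy", "hot"): 0.04, ("no", "cloudy", "mild"): 0.25, ("no", "cloudy", "cold"): 0.08,
}

def marginal(joint, variables, keep):
    """Sum out every variable not in `keep`, leaving a distribution over `keep`."""
    keep_idx = [variables.index(v) for v in keep]
    result = defaultdict(float)
    for assignment, p in joint.items():
        result[tuple(assignment[i] for i in keep_idx)] += p
    return dict(result)

print(marginal(joint3, variables, keep=["Weather"]))
# ~ {('sunny',): 0.40, ('cloudy',): 0.60}, up to floating-point rounding
print(marginal(joint3, variables, keep=["Weather", "Temperature"]))
# recovers the two-variable marginal shown on the next slide
```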

Marginalization
- We can also get marginals for more than one variable, e.g., P(Weather, Temperature) from the joint P(Wind, Weather, Temperature) above:
  P(X = x, Y = y) = Σ_{z1 ∈ dom(Z1), ..., zn ∈ dom(Zn)} P(X = x, Y = y, Z1 = z1, ..., Zn = zn)

  Weather  Temperature  µ(w)
  sunny    hot          0.10
  sunny    mild         0.20
  sunny    cold         0.10
  cloudy   hot          0.05
  cloudy   mild         0.35
  cloudy   cold         0.20

- Recall that the probability of proposition f is P(f) = Σ_{w ⊨ f} µ(w), the sum of the probabilities of the worlds w in which f is true. This is still simply an application of the definition of probability measure.

Learning Goals for Probability so far
- Define and give examples of random variables, their domains, and probability distributions.
- Calculate the probability of a proposition f given µ(w) for the set of possible worlds.
- Define a joint probability distribution (JPD).
- Marginalize over specific variables to compute distributions over any subset of the variables.
Next time: conditional probabilities.