# Probabilistic Reasoning Course 8

Artificial Intelligence Probabilistic Reasoning Course 8

Logic

- Propositional logic: the study of statements and their connectivity structure
- Predicate logic: the study of individuals and their properties

Uncertainty

- Agents almost never have access to the whole truth about their environment
- Agents have to act under uncertainty
- Rational decisions depend on:
  - the relative importance of several goals
  - the likelihood that, and degree to which, they will be achieved
- Example: toothache

Uncertainty

Problems with a complete description of the environment:

- Laziness: too much work to describe all facts
- Theoretical ignorance: no complete theory of the domain (e.g. medicine)
- Practical ignorance: not all situations are analyzed

Degree of belief in statements:

- Probability theory
- Dempster-Shafer theory
- Fuzzy logic
- Truth maintenance systems
- Nonmonotonic reasoning

Probabilities provide a way of summarizing the uncertainty that comes from laziness and ignorance.

Uncertainty

- Abduction is a reasoning process that tries to form plausible explanations for abnormal observations
  - Abduction is distinctly different from deduction and induction
  - Abduction is inherently uncertain
- Uncertainty is an important issue in abductive reasoning
- Definition (Encyclopedia Britannica): reasoning that derives an explanatory hypothesis from a given set of facts
  - The inference result is a hypothesis that, if true, could explain the occurrence of the given facts

Comparing abduction, deduction, and induction

- Deduction (A => B, A, therefore B):
  - major premise: All balls in the box are black
  - minor premise: These balls are from the box
  - conclusion: These balls are black
- Abduction (A => B, B, therefore possibly A):
  - rule: All balls in the box are black
  - observation: These balls are black
  - explanation: These balls are from the box
- Induction (whenever A then B, therefore possibly A => B):
  - case: These balls are from the box
  - observation: These balls are black
  - hypothesized rule: All balls in the box are black

Deduction reasons from causes to effects. Abduction reasons from effects to causes. Induction reasons from specific cases to general rules.

Uncertainty

- Uncertain inputs
  - Missing data
  - Noisy data
- Uncertain knowledge
  - Multiple causes lead to multiple effects
  - Incomplete enumeration of conditions or effects
  - Incomplete knowledge of causality in the domain
  - Probabilistic/stochastic effects
- Uncertain outputs
  - Abduction and induction are inherently uncertain
  - Default reasoning, even in deductive fashion, is uncertain
  - Incomplete deductive inference may be uncertain
- Probabilistic reasoning only gives probabilistic results (it summarizes uncertainty from various sources)

Probabilities

- Kolmogorov showed that three simple axioms lead to the rules of probability theory
- De Finetti, Cox, and Carnap have also provided compelling arguments for these axioms

The axioms:

1. All probabilities are between 0 and 1: 0 ≤ P(a) ≤ 1
2. Valid propositions (tautologies) have probability 1, and unsatisfiable propositions have probability 0: P(true) = 1; P(false) = 0
3. The probability of a disjunction is given by: P(a ∨ b) = P(a) + P(b) − P(a ∧ b)

Conditional Probabilities

- P(A|B): the part of the environment in which B is true where A is also true, i.e. the probability of A conditioned on B
- D = headache, P(D) = 1/10
- G = flu, P(G) = 1/40
- P(D|G) = 1/2: if someone has the flu, the probability of also having a headache is 50%
- P(D|G) = P(D^G) / P(G)
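As a quick check of the definition, the sketch below plugs the slide's headache and flu numbers into the product rule and then derives the reverse conditional P(G|D), which the slide does not state. The variable names are illustrative only.

```python
from fractions import Fraction

# Values from the slide: D = headache, G = flu.
p_d = Fraction(1, 10)         # P(D)
p_g = Fraction(1, 40)         # P(G)
p_d_given_g = Fraction(1, 2)  # P(D|G)

# Product rule, rearranged from P(D|G) = P(D^G) / P(G).
p_d_and_g = p_d_given_g * p_g

# The same joint probability gives the reverse conditional P(G|D).
p_g_given_d = p_d_and_g / p_d

print(p_d_and_g)    # 1/80
print(p_g_given_d)  # 1/8
```

So a headache raises the probability of flu from 1/40 to 1/8, even though most headaches are still not caused by flu.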

Bayes' Theorem

- P(A|B) = P(A^B) / P(B)
- P(A^B) = P(A|B) * P(B)
- P(A^B) = P(B|A) * P(A)
- => P(B|A) = P(A|B) * P(B) / P(A)

Example: Diagnosis

Known probabilities:

- Meningitis: P(M) = 0.002%
- Stiff neck: P(N) = 5%
- Meningitis causes a stiff neck in half of the cases: P(N|M) = 50%

If a patient has a stiff neck, what is the probability that the patient has meningitis?

P(M|N) = P(N|M) * P(M) / P(N) = 0.5 * 0.00002 / 0.05 = 0.0002 = 0.02%
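The diagnosis calculation can be reproduced with a few lines of code. This is a minimal sketch: the helper name `bayes` and its parameter names are made up for illustration, and the inputs are the slide's percentages written as fractions.

```python
def bayes(p_e_given_h, p_h, p_e):
    """Return P(H|E) = P(E|H) * P(H) / P(E)."""
    if p_e <= 0:
        raise ValueError("P(E) must be positive")
    return p_e_given_h * p_h / p_e


p_m = 0.00002      # P(M) = 0.002%, prior probability of meningitis
p_n = 0.05         # P(N) = 5%, probability of a stiff neck
p_n_given_m = 0.5  # P(N|M) = 50%

p_m_given_n = bayes(p_n_given_m, p_m, p_n)
print(f"P(M|N) = {p_m_given_n:.4%}")  # P(M|N) = 0.0200%
```

Observing the symptom multiplies the prior by P(N|M) / P(N) = 10, but the posterior stays small because the disease itself is rare.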

Independence

Variables A and B are independent if any of the following hold:

- P(A, B) = P(A) P(B)
- P(A | B) = P(A)
- P(B | A) = P(B)

This says that knowing the outcome of A does not tell me anything new about the outcome of B.

Independence

How is independence useful?

- Suppose you have n coin flips and you want to calculate the joint distribution P(C1, …, Cn)
- If the coin flips are not independent, you need 2^n values in the table
- If the coin flips are independent, then $P(C_1, \dots, C_n) = \prod_{i=1}^{n} P(C_i)$, so n values are enough
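To illustrate the saving, the sketch below rebuilds a full joint table over n coins from only n numbers, assuming the flips are independent. The function name and the coin biases are hypothetical values chosen for the example.

```python
from itertools import product


def joint_independent(p_heads):
    """Build the joint table P(C1, ..., Cn) from per-coin P(heads).

    Assumes the flips are independent, so each entry is a product
    of per-coin probabilities.
    """
    table = {}
    for outcome in product((True, False), repeat=len(p_heads)):
        prob = 1.0
        for is_heads, p in zip(outcome, p_heads):
            prob *= p if is_heads else 1.0 - p
        table[outcome] = prob
    return table


p_heads = [0.5, 0.6, 0.9]  # hypothetical biases for n = 3 coins
table = joint_independent(p_heads)

# 3 stored parameters are enough to recover all 2^3 = 8 joint entries.
print(len(p_heads), "parameters ->", len(table), "joint entries")
```

Without the independence assumption, each of the 8 entries would have to be stored separately.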

Conditional Independence

Variables A and B are conditionally independent given C if any of the following hold:

- P(A, B | C) = P(A | C) P(B | C)
- P(A | B, C) = P(A | C)
- P(B | A, C) = P(B | C)

Knowing C tells me everything about B. I don't gain anything by knowing A (either because A doesn't influence B or because knowing C provides all the information knowing A would give).
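A small numeric check can make the definition concrete. The sketch below builds a joint distribution in which A and B both depend only on C, then verifies that A and B are independent given C but not independent on their own. All probability values are hypothetical.

```python
from itertools import product

# Hypothetical numbers: C is a common cause of A and B.
p_c = {True: 0.3, False: 0.7}          # P(C)
p_a_given_c = {True: 0.9, False: 0.2}  # P(A = true | C)
p_b_given_c = {True: 0.8, False: 0.1}  # P(B = true | C)


def bern(p_true, value):
    """Probability of a boolean value when P(true) = p_true."""
    return p_true if value else 1.0 - p_true


# Joint built as P(C) * P(A|C) * P(B|C); keys are (a, b, c).
joint = {
    (a, b, c): p_c[c] * bern(p_a_given_c[c], a) * bern(p_b_given_c[c], b)
    for a, b, c in product((True, False), repeat=3)
}


def prob(pred):
    """Sum the joint over all (a, b, c) triples that satisfy pred."""
    return sum(p for event, p in joint.items() if pred(*event))


def cond_indep_holds(tol=1e-12):
    """Check P(A, B | C) = P(A | C) * P(B | C) for every value combination."""
    for a, b, c in product((True, False), repeat=3):
        pc = prob(lambda x, y, z: z == c)
        lhs = joint[(a, b, c)] / pc
        p_a_c = prob(lambda x, y, z: x == a and z == c) / pc
        p_b_c = prob(lambda x, y, z: y == b and z == c) / pc
        if abs(lhs - p_a_c * p_b_c) > tol:
            return False
    return True


p_a = prob(lambda a, b, c: a)
p_b = prob(lambda a, b, c: b)
p_ab = prob(lambda a, b, c: a and b)

print("conditionally independent given C:", cond_indep_holds())
print("P(A, B) =", round(p_ab, 4), " P(A) P(B) =", round(p_a * p_b, 4))
```

The two printed products differ, which shows that conditional independence given C does not imply plain independence of A and B.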

Bayesian Network

- Directed acyclic graphs (DAG) where the nodes represent random variables and directed edges capture their dependence
- Each node in the graph is a random variable
- A node X is a parent of another node Y if there is an arrow from node X to node Y, e.g. A is a parent of B
- Informally, an arrow from node X to node Y means X has a direct influence on Y

(Figure: example graph with nodes A, B, C, D.)

Bayesian Networks

Two important properties:

- Encodes the conditional independence relationships between the variables in the graph structure
- Is a compact representation of the joint probability distribution over the variables

Conditional Independence

The Markov condition: given its parents (P1, P2), a node (X) is conditionally independent of its non-descendants (ND1, ND2).

(Figure: node X with parents P1 and P2, non-descendants ND1 and ND2, and children C1 and C2.)

The Joint Probability Distribution

Due to the Markov condition, we can compute the joint probability distribution over all the variables X1, …, Xn in the Bayesian net using the formula:

$P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{Parents}(X_i))$

where Parents(Xi) means the values of the parents of the node Xi with respect to the graph.

Example

Variables:

- weather can have three states: sunny, cloudy, or rainy
- grass can be wet or dry
- sprinkler can be on or off

Causal links in this world:

- If it is rainy, then it will make the grass wet directly
- But if it is sunny for a long time, that too can make the grass wet, indirectly, by causing us to turn on the sprinkler
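The weather, sprinkler, and grass example can be sketched as a three-node network with edges weather → sprinkler, weather → grass, and sprinkler → grass, so the joint factorizes as P(W) P(S|W) P(G|W,S). The slides give no numbers, so every probability below is a hypothetical value chosen only to make the sketch runnable.

```python
WEATHER = ("sunny", "cloudy", "rainy")

# Hypothetical conditional probability tables (not from the slides).
p_weather = {"sunny": 0.5, "cloudy": 0.3, "rainy": 0.2}       # P(W)
p_sprinkler_on = {"sunny": 0.6, "cloudy": 0.2, "rainy": 0.0}  # P(S = on | W)


def p_wet(weather, sprinkler_on):
    """P(G = wet | W, S): rain wets the grass directly, and so does the sprinkler."""
    if weather == "rainy":
        return 0.95
    return 0.9 if sprinkler_on else 0.05


def joint(weather, sprinkler_on, wet):
    """P(W, S, G) as the product of each node given its parents."""
    p_s = p_sprinkler_on[weather] if sprinkler_on else 1.0 - p_sprinkler_on[weather]
    p_g = p_wet(weather, sprinkler_on) if wet else 1.0 - p_wet(weather, sprinkler_on)
    return p_weather[weather] * p_s * p_g


# Marginal P(G = wet): sum the joint over weather and sprinkler states.
p_grass_wet = sum(joint(w, s, True) for w in WEATHER for s in (True, False))

# Diagnostic query P(W = rainy | G = wet) by conditioning on the evidence.
p_rainy_given_wet = sum(joint("rainy", s, True) for s in (True, False)) / p_grass_wet

print(round(p_grass_wet, 3), round(p_rainy_given_wet, 3))
```

Summing the full joint is only practical for tiny networks like this one. It is used here because it follows the factorization directly.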

The links may form loops (when edge directions are ignored), but they may not form directed cycles.

Dempster-Shafer Theory

- It is based on the work of Dempster, who attempted to model uncertainty by a range of probabilities rather than a single probabilistic number
- Belief interval: [bel, pl]
  - bel = belief, bel(A)
  - pl = plausibility, pl(A) = 1 - bel(¬A)
- If no information about A and ¬A is present, the belief interval is [0, 1] (not 0.5)
- In the knowledge acquisition process the interval becomes smaller
- bel(A) <= P(A) <= pl(A)

Example

- Sue tells the truth 90% of the time: P(M) = 0.9, P(¬M) = 0.1
- Bill tells the truth 80% of the time: P(B) = 0.8, P(¬B) = 0.2

Case 1: Sue and Bill both tell George that his car has been stolen

- Probability that neither of them is trustworthy: 0.1 x 0.2 = 0.02
- Probability that at least one of them is trustworthy: 1 - 0.02 = 0.98
- Belief interval that the car was stolen: [0.98, 1]

Example 2

Case 2: Sue says that the car was stolen and Bill says that it was not

- Both statements cannot be trustworthy at the same time (they contradict each other)
- Only Sue is trustworthy (the car was stolen): 0.9 x (1 - 0.8) = 0.18
- Only Bill is trustworthy (the car was not stolen): 0.8 x (1 - 0.9) = 0.08
- Neither of them is trustworthy (no concrete information): (1 - 0.8) x (1 - 0.9) = 0.02
- Sum of all non-null probabilities: 0.18 + 0.08 + 0.02 = 0.28
- Belief that the car was stolen: 0.18 / 0.28 = 0.64
- Belief that the car was not stolen: 0.08 / 0.28 = 0.29
- Belief interval that the car was stolen: [0.64, 1 - 0.29] = [0.64, 0.71]
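Both witness cases can be reproduced with a short function. This is a sketch of the arithmetic on the slides, not a general implementation of Dempster's rule of combination. The function name and parameters are made up for illustration.

```python
def belief_interval_stolen(p_sue, p_bill, bill_agrees):
    """Return (bel, pl) for "the car was stolen", given that Sue says it was.

    p_sue and p_bill are the probabilities that each witness is trustworthy.
    bill_agrees is True if Bill also says the car was stolen.
    """
    neither = (1 - p_sue) * (1 - p_bill)
    if bill_agrees:
        # Case 1: the claim is supported unless both witnesses are unreliable,
        # and nothing speaks against it, so the plausibility stays at 1.
        return 1 - neither, 1.0

    # Case 2: contradictory statements, so "both trustworthy" is impossible.
    only_sue = p_sue * (1 - p_bill)   # supports "stolen"
    only_bill = p_bill * (1 - p_sue)  # supports "not stolen"
    total = only_sue + only_bill + neither  # renormalize over possible cases
    bel_stolen = only_sue / total
    bel_not_stolen = only_bill / total
    return bel_stolen, 1 - bel_not_stolen


case1 = belief_interval_stolen(0.9, 0.8, bill_agrees=True)
case2 = belief_interval_stolen(0.9, 0.8, bill_agrees=False)
print([round(x, 2) for x in case1])  # [0.98, 1.0]
print([round(x, 2) for x in case2])  # [0.64, 0.71]
```

The renormalization step in case 2 is what removes the impossible "both trustworthy" outcome before the beliefs are read off.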

Next Course: Planning and Reasoning in the real world

Laboratory: Start CLISP