Reasoning Under Uncertainty


1 Reasoning Under Uncertainty
Artificial Intelligence CMSC 25000 February 15, 2007

2 Agenda
Motivation
  Reasoning with uncertainty
  Medical Informatics
Probability and Bayes’ Rule
  Bayesian Networks
  Noisy-Or
Decision Trees and Rationality
Conclusions

3 Uncertainty
Search and Planning Agents
  Assume fully observable, deterministic, static
Real World:
  Partially observable, stochastic, extremely complex
  Can't be sure of success, agent will maximize
Probabilities capture “Ignorance & Laziness”
  Lack relevant facts, conditions
  Failure to enumerate all conditions, exceptions
Bayesian (subjective) probabilities relate to knowledge

4 Motivation
Uncertainty in medical diagnosis
  Diseases produce symptoms
  In diagnosis, observed symptoms => disease ID
Uncertainties
  Symptoms may not occur
  Symptoms may not be reported
  Diagnostic tests not perfect: false positive, false negative
How do we estimate confidence?

5 Motivation II
Uncertainty in medical decision-making
  Physicians, patients must decide on treatments
  Treatments may not be successful
  Treatments may have unpleasant side effects
Choosing treatments
  Weigh risks of adverse outcomes
People are BAD at reasoning intuitively about probabilities
  Provide systematic analysis

6 Probability Basics
The sample space: a set Ω = {ω1, ω2, ω3, …, ωn}
  E.g. 6 possible rolls of a die; ωi is a sample point/atomic event
Probability space/model: a sample space with an assignment P(ω) for every ω in Ω s.t. 0 <= P(ω) <= 1 and Σω P(ω) = 1
  E.g. P(die roll < 4) = 1/6 + 1/6 + 1/6 = 1/2

7 Random Variables
A random variable is a function from sample points to a range (e.g. reals, bools)
  E.g. Odd(1) = true
P induces a probability distribution for any r.v. X: P(X=xi) = Σ{ω: X(ω)=xi} P(ω)
  E.g. P(Odd=true) = 1/6 + 1/6 + 1/6 = 1/2
A proposition is the event (set of sample pts) where the proposition is true: e.g. event a = {ω : A(ω) = true}
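
The die example can be checked with a few lines of Python. This is only a minimal sketch: the names `P`, `prob`, `distribution`, and `odd` are illustrative and not from the slides, and a fair die is assumed.

```python
from fractions import Fraction

# Sample space for one fair die: each sample point gets probability 1/6.
P = {w: Fraction(1, 6) for w in range(1, 7)}
assert sum(P.values()) == 1  # the assignment must sum to 1


def prob(event):
    """Probability of an event: sum of P(w) over sample points where it holds."""
    return sum(p for w, p in P.items() if event(w))


def distribution(X):
    """Distribution induced by a random variable X: P(X=x) = sum of P(w) with X(w)=x."""
    dist = {}
    for w, p in P.items():
        dist[X(w)] = dist.get(X(w), 0) + p
    return dist


def odd(w):
    """Random variable Odd: maps a sample point to a boolean."""
    return w % 2 == 1


p_less_than_4 = prob(lambda w: w < 4)  # P(die roll < 4)
odd_dist = distribution(odd)           # distribution of Odd
```

Exact fractions are used so that both results come out as exactly 1/2, matching the slide arithmetic.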

8 Why probabilities?
Definitions imply that logically related events have related probabilities
In AI applications, sample points are defined by a set of random variables
Random vars: boolean, discrete, continuous

9 Prior Probabilities
Prior probabilities: belief prior to evidence
  E.g. P(cavity=t) = 0.2; P(weather=sunny) = 0.6
Distribution gives values for all assignments
Joint distribution on a set of r.v.s gives the probability of every atomic event of those r.v.s
  E.g. P(weather,cavity) = 4x2 matrix of values
Every question about a domain can be answered with the joint, b/c every event is a sum of sample pts

10 Conditional Probabilities
Conditional (posterior) probabilities
  E.g. P(cavity|toothache) = 0.8, given only that toothache is known
  P(Cavity|Toothache) = 2 elt vector of 2 elt vectors
Can add new evidence, possibly irrelevant
P(a|b) = P(a^b)/P(b) where P(b) ≠ 0
Also, P(a^b) = P(a|b)P(b) = P(b|a)P(a)
Product rule generalizes to chaining
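
A small sketch of these definitions, reading conditional probabilities off a joint distribution. The slides give P(cavity) = 0.2 and P(cavity|toothache) = 0.8 but no joint table, so P(toothache) = 0.1 is a hypothetical value chosen here, and the four joint entries are derived from it.

```python
# Joint distribution over (cavity, toothache).
# ASSUMPTION: P(toothache) = 0.1 is hypothetical. The entries are chosen so that
# P(cavity) = 0.2 and P(cavity | toothache) = 0.8, as stated on the slides.
joint = {
    (True, True): 0.08,
    (True, False): 0.12,
    (False, True): 0.02,
    (False, False): 0.78,
}


def prob(event):
    """Sum the joint over all atomic events where `event` holds."""
    return sum(p for (cavity, toothache), p in joint.items()
               if event(cavity, toothache))


p_cavity = prob(lambda c, t: c)
p_toothache = prob(lambda c, t: t)
p_both = prob(lambda c, t: c and t)

# Definition: P(a|b) = P(a^b) / P(b), valid because P(b) != 0 here.
p_cavity_given_toothache = p_both / p_toothache
p_toothache_given_cavity = p_both / p_cavity

# Product rule, both ways: P(a^b) = P(a|b)P(b) = P(b|a)P(a).
product_1 = p_cavity_given_toothache * p_toothache
product_2 = p_toothache_given_cavity * p_cavity
```

Every quantity is a sum over entries of the joint, which is the sense in which the joint can answer any question about the domain.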

11 Inference By Enumeration

12 Inference by Enumeration

13 Inference by Enumeration

14 Independence

15 Conditional Independence

16 Conditional Independence II

17 Probabilities Model Uncertainty
The World - Features
  Random variables
  Feature values
States of the world
  Assignments of values to variables
  Exponential in # of variables possible states

18 Probabilities of World States
Joint probability of assignments
  States are distinct and exhaustive
Typically care about SUBSET of assignments
  aka “Circumstance”
  Exponential in # of don’t cares

19 A Simpler World
2^n world states = Maximum entropy
  Know nothing about the world
Many variables independent
  P(strep,ebola) = P(strep)P(ebola)
Conditionally independent
  Depend on same factors but not on each other
  P(fever,cough|flu) = P(fever|flu)P(cough|flu)

20 Probabilistic Diagnosis
Question: How likely is a patient to have a disease if they have the symptoms?
Probabilistic Model: Bayes’ Rule
  P(D|S) = P(S|D)P(D)/P(S)
Where
  P(S|D): Probability of symptom given disease
  P(D): Prior probability of having disease
  P(S): Prior probability of having symptom

21 Diagnosis
Consider Meningitis:
  Disease: Meningitis: m
  Symptom: Stiff neck: s
  P(s|m) = 0.5
  P(m) = 0.0001
  P(s) = 0.1
How likely is it that someone with a stiff neck actually has meningitis?
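
Plugging the slide's numbers into Bayes' Rule answers the question. The helper name `bayes` is illustrative only.

```python
def bayes(p_s_given_d, p_d, p_s):
    """Bayes' Rule: P(D|S) = P(S|D) * P(D) / P(S)."""
    return p_s_given_d * p_d / p_s


# Numbers from the slide: P(s|m) = 0.5, P(m) = 0.0001, P(s) = 0.1.
p_m_given_s = bayes(p_s_given_d=0.5, p_d=0.0001, p_s=0.1)
```

With these values P(m|s) = 0.5 * 0.0001 / 0.1 = 0.0005, so even with a stiff neck meningitis remains very unlikely, because the prior P(m) is so small.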

22 Modeling (In)dependence
Simple, graphical notation for conditional independence; compact spec of joint
Bayesian network
  Nodes = Variables
  Directed acyclic graph: link ~ directly influences
  Arcs = Child depends on parent(s)
  No arcs = independent (0 incoming: only a priori)
  For each X need P(X | parents of X)

23 Example I

24 Simple Bayesian Network
MCBN1: nodes A, B, C, D, E
Need        Truth table   Dependence
P(A)        2             A = only a priori
P(B|A)      2*2           B depends on A
P(C|A)      2*2           C depends on A
P(D|B,C)    2*2*2         D depends on B,C
P(E|C)      2*2           E depends on C
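
To see why the network is a compact specification of the joint, the table sizes can be counted directly. This is a sketch assuming all five variables are binary (as the truth-table sizes suggest); the `parents` dictionary just encodes the dependencies listed above.

```python
# Parent sets read off the slide: A is a root, D has two parents.
parents = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"], "E": ["C"]}


def cpt_rows(node):
    """Number of parent configurations for a binary node: 2^(#parents)."""
    return 2 ** len(parents[node])


# Full truth-table entries per node (both values of the child are listed).
table_entries = {node: 2 * cpt_rows(node) for node in parents}

# Independent numbers actually needed: one P(x = true | row) per row.
network_params = sum(cpt_rows(node) for node in parents)

# A full joint over n binary variables needs 2^n - 1 independent numbers.
full_joint_params = 2 ** len(parents) - 1
```

Under these assumptions the network needs 11 independent numbers against 31 for the full joint over the same five variables.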

25 Simplifying with Noisy-OR
How many computations?
  p = # parents; k = # values for variable
  (k-1)k^p
  Very expensive! 10 binary parents = 2^10 = 1024
Reduce computation by simplifying model
  Treat each parent as possible independent cause
  Only 11 computations: 10 causal probabilities + “leak” probability (“Some other cause”)
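
The two counts on this slide can be reproduced as follows. The function names are illustrative, not from the slides.

```python
def full_cpt_params(p, k=2):
    """Entries needed for a full conditional table: (k-1) * k^p."""
    return (k - 1) * k ** p


def noisy_or_params(p):
    """Noisy-OR needs one causal probability per parent plus one leak term."""
    return p + 1


full = full_cpt_params(p=10, k=2)   # 10 binary parents, binary child
simplified = noisy_or_params(p=10)
```

The full table grows exponentially in the number of parents, while the Noisy-OR count grows linearly.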

26 Noisy-OR Example
Network: A → B
Pn(b|a) = 1-(1-ca)(1-L)
Pn(¬b|a) = (1-ca)(1-L)
Pn(b|¬a) = 1-(1-L) = L = 0.5
Pn(¬b|¬a) = (1-L)
Solving for ca from the table P(B|A):
  Pn(b|a) = 1-(1-ca)(1-L) = 0.6
  (1-ca)(1-L) = 0.4
  (1-ca) = 0.4/(1-L) = 0.4/0.5 = 0.8
  ca = 0.2
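
The same single-parent calculation as a sketch in Python. It assumes the flattened slide gives L = 0.5 for the case where a is absent and Pn(b|a) = 0.6 for the case where a is present; `noisy_or` is a hypothetical helper name.

```python
def noisy_or(active_causes, leak):
    """P(effect) = 1 - (1 - leak) * product of (1 - c_i) over the causes present."""
    p_not = 1.0 - leak
    for c in active_causes:
        p_not *= 1.0 - c
    return 1.0 - p_not


L = 0.5             # leak: P(b | not a)
p_b_given_a = 0.6   # observed P(b | a)

# Invert Pn(b|a) = 1 - (1 - ca)(1 - L) for the causal strength ca.
c_a = 1.0 - (1.0 - p_b_given_a) / (1.0 - L)

# Forward checks: plugging ca back in should recover both table entries.
check_with_a = noisy_or([c_a], L)
check_without_a = noisy_or([], L)
```

This gives ca = 0.2, and the forward model reproduces 0.6 and 0.5.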

27 Noisy-OR Example II
Network: A → C ← B
Full model: P(c|ab), P(c|a¬b), P(c|¬ab), P(c|¬a¬b) & neg
Noisy-Or: ca, cb, L
Assume: P(a) = 0.1, P(b) = 0.05, Pn(c|¬a¬b) = 0.3, ca = 0.5, Pn(c|b) = 0.7
Pn(c|ab) = 1-(1-ca)(1-cb)(1-L)
Pn(c|¬ab) = 1-(1-cb)(1-L)
Pn(c|a¬b) = 1-(1-ca)(1-L)
Pn(c|¬a¬b) = 1-(1-L) = L = 0.3
Pn(c|b) = Pn(c|ab)P(a) + Pn(c|¬ab)P(¬a)
  1-0.7 = (1-ca)(1-cb)(1-L)0.1 + (1-cb)(1-L)0.9
  0.3 = 0.5(1-cb)0.07 + (1-cb)0.7*0.9
      = 0.035(1-cb) + 0.63(1-cb) = 0.665(1-cb)
  1-cb = 0.3/0.665 ≈ 0.45
  cb ≈ 0.55
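
A sketch of the same derivation in Python, solving for cb numerically. It assumes the negation bars lost in extraction are placed so that L = Pn(c|¬a¬b) = 0.3, and that A and B are independent roots so P(a|b) = P(a). Variable names are illustrative.

```python
p_a = 0.1            # P(a)
c_a = 0.5            # causal strength of A
L = 0.3              # leak: Pn(c | not a, not b)
p_c_given_b = 0.7    # Pn(c | b), marginalised over A

# P(not c | b) = (1 - cb)(1 - L) * [(1 - ca) P(a) + P(not a)]
factor = (1 - L) * ((1 - c_a) * p_a + (1 - p_a))

# Solve 1 - Pn(c|b) = factor * (1 - cb) for cb.
c_b = 1 - (1 - p_c_given_b) / factor

# Forward check: recompute Pn(c|b) from the recovered cb.
p_c_given_ab = 1 - (1 - c_a) * (1 - c_b) * (1 - L)
p_c_given_not_a_b = 1 - (1 - c_b) * (1 - L)
recomputed = p_c_given_ab * p_a + p_c_given_not_a_b * (1 - p_a)
```

The factor evaluates to 0.665, matching the slide, and cb comes out at roughly 0.549, which rounds to the slide's 0.55.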

28 Graph Models
Bipartite graphs
  E.g. medical reasoning
  Generally, diseases cause symptom (not reverse)
[Figure: bipartite graph linking diseases d1-d4 to symptoms s1-s6]

29 Topologies
Generally more complex
  Polytree: One path between any two nodes
General Bayes Nets
  Graphs with undirected cycles
  No directed cycles - can’t be own cause
Issue: Automatic net acquisition
  Update probabilities by observing data
  Learn topology: use statistical evidence of indep, heuristic search to find most probable structure

30 Holmes Example (Pearl)
Holmes is worried that his house will be burgled. For the time period of interest, there is a 10^-4 a priori chance of this happening, and Holmes has installed a burglar alarm to try to forestall this event. The alarm is 95% reliable in sounding when a burglary happens, but also has a false positive rate of 1%. Holmes’ neighbor, Watson, is 90% sure to call Holmes at his office if the alarm sounds, but he is also a bit of a practical joker and, knowing Holmes’ concern, might (30%) call even if the alarm is silent. Holmes’ other neighbor Mrs. Gibbons is a well-known lush and often befuddled, but Holmes believes that she is four times more likely to call him if there is an alarm than not.

31 Holmes Example: Model
There are four binary random variables:
  B: whether Holmes’ house has been burgled
  A: whether his alarm sounded
  W: whether Watson called
  G: whether Gibbons called
[Figure: network over B, A, W, G]

32 Holmes Example: Tables
P(B): columns B=#t, B=#f
P(A|B): rows B=#t, B=#f; columns A=#t, A=#f
P(W|A): rows A=#t, A=#f; columns W=#t, W=#f
P(G|A): rows A=#t, A=#f; columns G=#t, G=#f
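
The numbers in the Holmes story are enough to fill in the tables for B, A, and W and to ask a diagnostic question by enumeration. The sketch below assumes the structure B → A → W implied by the story (burglary affects the alarm, the alarm affects Watson). Gibbons is left out because the story gives only a ratio for her, not absolute probabilities.

```python
# Values taken from the story text.
p_b = 1e-4                                  # prior chance of burglary
p_a_given_b = {True: 0.95, False: 0.01}     # P(alarm | burglary), P(alarm | no burglary)
p_w_given_a = {True: 0.90, False: 0.30}     # P(Watson calls | alarm), P(... | no alarm)


def joint(b, a, w):
    """P(B=b, A=a, W=w) = P(b) * P(a|b) * P(w|a), the network factorisation."""
    pb = p_b if b else 1 - p_b
    pa = p_a_given_b[b] if a else 1 - p_a_given_b[b]
    pw = p_w_given_a[a] if w else 1 - p_w_given_a[a]
    return pb * pa * pw


bools = (True, False)

# Marginal probability that Watson calls: sum out B and A.
p_w = sum(joint(b, a, True) for b in bools for a in bools)

# Posterior probability of burglary given that Watson called.
p_b_and_w = sum(joint(True, a, True) for a in bools)
p_b_given_w = p_b_and_w / p_w
```

Under these assumptions a call from Watson raises the probability of burglary only from 0.0001 to about 0.00028, since he calls fairly often even when the alarm is silent.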

33 Decision Making
Design model of rational decision making
  Maximize expected value among alternatives
Uncertainty from
  Outcomes of actions
  Choices taken
To maximize outcome
  Select maximum over choices
  Weighted average value of chance outcomes

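34 Gangrene Example
First decision: Medicine or Amputate foot
  Medicine: Full Recovery 0.7 (1000); Worse 0.25; Die 0.05
  Amputate foot: Live 0.99 (850); Die 0.01
If worse, second decision: Medicine or Amputate leg
  Medicine: Live 0.6 (995); Die 0.4
  Amputate leg: Live 0.98 (700); Die 0.02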
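
The tree can be evaluated bottom-up: average over chance outcomes, take the maximum over choices. The sketch below rests on two assumptions that the flattened slide does not settle: the values pair up as Full Recovery = 1000, Live after amputating the foot = 850, Live after a second round of medicine = 995, Live after amputating the leg = 700, and every Die outcome has value 0.

```python
def expected_value(outcomes):
    """Weighted average over (probability, value) pairs of a chance node."""
    return sum(p * v for p, v in outcomes)


DIE = 0  # ASSUMPTION: the slide does not state a value for dying

# Second decision, reached only if the first round of medicine makes things worse.
second_medicine = expected_value([(0.6, 995), (0.4, DIE)])
amputate_leg = expected_value([(0.98, 700), (0.02, DIE)])
worse = max(second_medicine, amputate_leg)   # decision node: pick the best choice

# First decision.
medicine = expected_value([(0.7, 1000), (0.25, worse), (0.05, DIE)])
amputate_foot = expected_value([(0.99, 850), (0.01, DIE)])

best_choice, best_value = max(
    [("medicine", medicine), ("amputate foot", amputate_foot)],
    key=lambda choice: choice[1],
)
```

With this reading, medicine first (about 871.5) edges out amputating the foot (about 841.5), and if the patient gets worse, amputating the leg (686) beats more medicine (597).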

35 Decision Tree Issues
Problem 1: Tree size
  k activities : 2^k orders
Solution 1: Hill-climbing
  Choose best apparent choice after one step
  Use entropy reduction
Problem 2: Utility values
  Difficult to estimate, Sensitivity, Duration
  Change value depending on phrasing of question
Solution 2c: Model effect of outcome over lifetime

36 Conclusion
Reasoning with uncertainty
  Many real systems uncertain - e.g. medical diagnosis
Bayes’ Nets
  Model (in)dependence relations in reasoning
  Noisy-OR simplifies model/computation
    Assumes causes independent
Decision Trees
  Model rational decision making
  Maximize outcome: Max choice, average outcomes


