Artificial Intelligence CS 165A
Tuesday, November 20, 2007
–Knowledge Representation (Ch 10)
–Uncertainty (Ch 13)

2 Notes
HW #4 due by noon tomorrow
Reminder: Final exam December 14, 4-7pm
–Review in class on Dec. 6th

3 Situation Calculus – actions, events (Review)
"Situation Calculus" is a way of describing change over time in first-order logic.
–Fluents: functions or predicates that can vary over time take an extra situation argument S_i: Predicate(args, S_i)
 Examples: location of an agent, aliveness, changing properties, ...
–The Result function represents the change from one situation to another resulting from an action (or action sequence):
 Result(GoForward, S_i) = S_j
 "S_j is the situation that results from applying the action GoForward in situation S_i"
 Result() indicates the relationship between situations.

4-5 Situation Calculus (Review)
Represents the world in different "situations" and the relationship between situations [two figure slides].

6 Examples (Review)
How would you interpret the following sentences in First-Order Logic using situation calculus?
∀x, s Studying(x, s) ⇒ Failed(x, Result(TakeTest, s))
–"If you're studying and then you take the test, you will fail." (Or: studying a subject implies that you will fail the test for that subject.)
∀x, s TurnedOn(x, s) ∧ LightSwitch(x) ⇒ TurnedOff(x, Result(FlipSwitch, s))
–"If you flip the light switch when it is turned on, it will then be turned off."

7 There are other ways to deal with time
Event calculus
–Based on points in time rather than situations
–Designed to allow reasoning over periods of time: can represent actions with duration, overlapping actions, etc.
Generalized events
–Parts of a general "space-time chunk"
Processes
–Not just discrete events
Intervals
–Moments and durations of time
Objects with state fluents
–Not just events; objects can also have time properties

8 Event calculus relations
Initiates(e, f, t)
–Event e at time t causes fluent f to become true
Terminates(e, f, t)
–Event e at time t causes fluent f to no longer be true
Happens(e, t)
–Event e happens at time t
Clipped(f, t_1, t_2)
–f is terminated by some event sometime between t_1 and t_2
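A minimal Python sketch of these relations, assuming events are stored as (event, time) pairs and the Initiates/Terminates tables are written by hand; all event and fluent names here are illustrative, not from the slides:

    happenings = [("TurnOn", 1), ("TurnOff", 5)]    # Happens(e, t)
    initiates  = {("TurnOn",  "LightOn")}           # Initiates(e, f, t)
    terminates = {("TurnOff", "LightOn")}           # Terminates(e, f, t)

    def clipped(f, t1, t2):
        # Clipped(f, t1, t2): some event in (t1, t2) terminates f
        return any((e, f) in terminates and t1 < t < t2
                   for (e, t) in happenings)

    def holds_at(f, t):
        # f holds at t if an earlier event initiated it and it was not clipped since
        return any((e, f) in initiates and te < t and not clipped(f, te, t)
                   for (e, te) in happenings)

    print(holds_at("LightOn", 3))   # True
    print(holds_at("LightOn", 6))   # False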

9 Generalized events
An ontology of time that allows for reasoning about various temporal events, subevents, durations, processes, intervals, etc.
[Figure: a space-time chunk, with Australia plotted over a time axis]

10 Time interval predicates
Examples:
After(ReignOf(ElizabethII), ReignOf(GeorgeVI))
Overlap(Fifties, ReignOf(Elvis))
Start(Fifties) = Start(AD1950)
Meet(Fifties, Sixties)

11 Objects with state fluents
Example: President(USA)

12 Knowledge representation
Chapter 10 covers many topics in knowledge representation, many of which are important to real, sophisticated AI reasoning systems.
–We're only scratching the surface of this topic
–Best covered in depth in an advanced AI course and in the context of particular AI problems
–Read through the Internet shopping world example in 10.5
Now we move on to probabilistic reasoning, a different way of representing and manipulating knowledge (Chapters 13 and 14).

13 Quick Review of Probability
From here on we will assume that you know this…

14 Probability notation and notes
Probabilities of propositions
–P(A), P(the sun is shining)
Probabilities of random variables
–P(X = x_1), P(Y = y_1), P(x_1 < X < x_2)
P(A) usually means P(A = True), where A is a proposition, not a variable
–This is a probability value
–Technically, P(A) is a probability function
P(X = x_1)
–This is a probability value (P(X) is a probability function)
P(X)
–This is a probability function or a probability density function
Technically, if X is a variable, we should not write P(X) = 0.5
–But rather P(X = x_1) = 0.5

15 Discrete and continuous probabilities
Discrete: the probability function P(X, Y) is described by an MxN matrix of probabilities
–Possible values: P(X=x_1, Y=y_1) = p_1
–Σ_ij P(X=x_i, Y=y_j) = 1
–P(X, Y, Z) is an MxNxP matrix
Continuous: the probability density function (pdf) P(X, Y) is described by a 2D function
–P(x_1 < X < x_2, y_1 < Y < y_2) = p_1
–∫∫ P(X, Y) dX dY = 1
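A quick numeric illustration of the discrete case (the 2x3 matrix values are hypothetical):

    import numpy as np
    P_XY = np.array([[0.10, 0.20, 0.05],
                     [0.30, 0.25, 0.10]])   # rows: x_1, x_2; cols: y_1, y_2, y_3
    assert np.all(P_XY >= 0)                # every entry is a valid probability
    assert np.isclose(P_XY.sum(), 1.0)      # entries sum to 1 over all (x_i, y_j)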

16 Discrete probability distribution
[Figure: bar plot of p(X) against X]

17 Continuous probability distribution
[Figure: density function p(X) against X]

18 Continuous probability distribution
[Figure: the same density p(X), with a single point highlighted]
P(X=5) = ??? For a continuous random variable, the probability of any single exact value is zero: P(X=5) = 0, and in general P(X=x_1) = 0. Only intervals, P(x_1 < X < x_2), have nonzero probability.

19 Three Axioms of Probability
1. The probability of every event must be nonnegative
–For any event A, P(A) ≥ 0
2. Valid propositions have probability 1
–P(True) = 1
–P(A ∨ ¬A) = 1
3. For disjoint events A_1, A_2, …
–P(A_1 ∨ A_2 ∨ …) = P(A_1) + P(A_2) + …
From these axioms, all other properties of probabilities can be derived.
–E.g., derive P(A) + P(¬A) = 1 (worked out below)
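For example, that derivation uses only the axioms:
–A ∨ ¬A is valid, so by Axiom 2, P(A ∨ ¬A) = 1.
–A and ¬A are disjoint, so by Axiom 3, P(A ∨ ¬A) = P(A) + P(¬A).
–Therefore P(A) + P(¬A) = 1.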

20 Some consequences of the axioms
Unsatisfiable propositions have probability 0
–P(False) = 0
–P(A ∧ ¬A) = 0
For any two events A and B
–P(A ∨ B) = P(A) + P(B) – P(A ∧ B)
For the complement A^c of event A
–P(A^c) = 1 – P(A)
For any event A
–0 ≤ P(A) ≤ 1
For independent events A and B
–P(A ∧ B) = P(A) P(B)

21 Venn Diagram
[Figure: Venn diagram of events A and B inside the universe True; the overlap region is A ∧ B]
Visualize: P(True), P(False), P(A), P(B), P(¬A), P(¬B), P(A ∧ B), P(A ∨ B), P(A ∧ ¬B), …

22 Joint Probabilities
A complete probability model is a single joint probability distribution over all propositions/variables in the domain
–P(X_1, X_2, …, X_i, …)
A particular instance of the world has the probability
–P(X_1=x_1 ∧ X_2=x_2 ∧ … ∧ X_i=x_i ∧ …) = p
Rather than stating knowledge as
–Raining ⇒ WetGrass
We can state it as a joint distribution (the last entry must be 0.80, since the four entries sum to 1):
–P(Raining, WetGrass) = 0.15
–P(Raining, ¬WetGrass) = 0.01
–P(¬Raining, WetGrass) = 0.04
–P(¬Raining, ¬WetGrass) = 0.80
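The same knowledge as a 2x2 array, a minimal sketch (the array layout is my choice):

    import numpy as np
    # Rows: Raining, ¬Raining; columns: WetGrass, ¬WetGrass.
    P = np.array([[0.15, 0.01],
                  [0.04, 0.80]])
    assert np.isclose(P.sum(), 1.0)       # a joint distribution sums to 1
    p_raining  = P[0].sum()               # P(Raining)  = 0.16, marginalizing out WetGrass
    p_wetgrass = P[:, 0].sum()            # P(WetGrass) = 0.19, marginalizing out Raining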

23 Conditional Probability
Unconditional, or prior, probability
–Probabilities associated with a proposition or variable, prior to any evidence
–E.g., P(WetGrass), P(¬Raining)
Conditional, or posterior, probability
–Probabilities after evidence is gathered
–P(A | B): "the probability of A given that we know B"
–After (posterior to) procuring evidence
–E.g., P(WetGrass | Raining)
Definition: P(X | Y) = P(X, Y) / P(Y), which assumes P(Y) is nonzero.
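With the 2x2 table from the previous sketch, the definition becomes one line:

    import numpy as np
    P = np.array([[0.15, 0.01],           # rows: Raining, ¬Raining
                  [0.04, 0.80]])          # cols: WetGrass, ¬WetGrass
    # P(WetGrass | Raining) = P(WetGrass, Raining) / P(Raining)
    p_wet_given_rain = P[0, 0] / P[0].sum()    # 0.15 / 0.16 ≈ 0.94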

24 The chain rule
By the chain rule,
–P(X, Y) = P(X | Y) P(Y)
–More generally, P(X_1, …, X_n) = P(X_1) P(X_2 | X_1) … P(X_n | X_1, …, X_{n-1})
Notes:
–Precedence: '|' is lowest. E.g., P(X | Y, Z) means P(X | (Y, Z)), not P((X | Y), Z).
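A numeric sanity check of the chain rule, as a sketch: any normalized 3-D array works as the joint, and the random one here is only for illustration.

    import numpy as np
    rng = np.random.default_rng(0)
    P = rng.random((2, 3, 4))
    P /= P.sum()                              # normalize into a joint P(X, Y, Z)
    P_Z  = P.sum(axis=(0, 1))                 # P(Z)
    P_YZ = P.sum(axis=0)                      # P(Y, Z)
    P_Y_given_Z  = P_YZ / P_Z                 # P(Y | Z)
    P_X_given_YZ = P / P_YZ                   # P(X | Y, Z)
    # Chain rule: P(X, Y, Z) = P(X | Y, Z) P(Y | Z) P(Z)
    assert np.allclose(P_X_given_YZ * P_Y_given_Z * P_Z, P)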

25 Joint probability distribution
From P(X, Y) (a table over X ∈ {x_1, x_2, x_3} and Y ∈ {y_1, y_2}), we can always calculate:
–P(X), P(Y)
–P(X|Y), P(Y|X)
–P(X=x_1), P(Y=y_2), P(X|Y=y_1), P(Y|X=x_1), P(X=x_1|Y), etc.

26 Example
[Figure: a filled-in 3x2 table of P(X,Y) values, plus the derived tables P(X), P(Y), P(X|Y), P(Y|X)]
Exercise: P(X=x_1, Y=y_2) = ? P(X=x_1) = ? P(Y=y_2) = ? P(X|Y=y_1) = ? P(X=x_1|Y) = ?
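The slide's numbers live in the image, so here is the same exercise on a hypothetical joint (all values made up):

    import numpy as np
    # Rows: y_1, y_2; columns: x_1, x_2, x_3 (hypothetical values).
    P_XY = np.array([[0.10, 0.20, 0.10],
                     [0.20, 0.25, 0.15]])
    assert np.isclose(P_XY.sum(), 1.0)
    P_X = P_XY.sum(axis=0)            # P(X): marginalize out Y
    P_Y = P_XY.sum(axis=1)            # P(Y): marginalize out X
    P_X_given_y1 = P_XY[0] / P_Y[0]   # P(X | Y=y_1): the y_1 row, renormalized
    P_x1_given_Y = P_XY[:, 0] / P_Y   # P(X=x_1 | Y): one value per y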

27 Probability Distributions
                Discrete vars       Continuous vars
P(X)            M vector            function of one variable
P(X=x)          scalar              scalar*
P(X,Y)          MxN matrix          function of two variables
P(X|Y)          MxN matrix          function of two variables
P(X|Y=y)        M vector            function of one variable
P(X=x|Y)        N vector            function of one variable
P(X=x|Y=y)      scalar              scalar*
* actually zero; should be P(x_1 < X < x_2)

28 Bayes’ Rule Since and Then Bayes’ Rule

29 Bayes’ Rule Similarly, P(X) conditioned on two variables: Or N variables:

30 Bayes’ Rule Posterior probability (diagnostic knowledge) Likelihood (causal knowledge) Prior probability This simple equation is very useful in practice –Usually framed in terms of hypotheses (H) and data (D)  Which of the hypotheses is best supported by the data? Normalizing constant

31 Bayes’ rule example: Medical diagnosis Meningitis causes a stiff neck 50% of the time A patient comes in with a stiff neck – what is the probability that he has meningitis? Need to know two things: –The prior probability of a patient having meningitis (1/50,000) –The prior probability of a patient having a stiff neck (1/20) ? P(M | S) = (0.5)( )/(0.05) =

32 Example (cont.)
Suppose that we also know about whiplash:
–P(W) = 1/1000
–P(S | W) = 0.8
What is the relative likelihood of whiplash and meningitis?
–P(W | S) = P(S | W) P(W) / P(S) = (0.8)(0.001)/(0.05) = 0.016
–P(W | S) / P(M | S) = 0.016 / 0.0002 = 80
So the relative likelihood of whiplash vs. meningitis is 80.
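In code; note that P(S) cancels in the ratio, so it would not even need to be known exactly:

    p_w_given_s = 0.8 * (1 / 1000) / (1 / 20)      # P(W | S) = 0.016
    p_m_given_s = 0.5 * (1 / 50_000) / (1 / 20)    # P(M | S) = 0.0002
    print(p_w_given_s / p_m_given_s)               # 80.0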

33 A useful Bayes rule example
A test for a new, deadly strain of anthrax (that has no symptoms) is known to be 99.9% accurate. Should you get tested? The chances of having this strain are one in a million.
What are the random variables?
–A: you have anthrax (Boolean)
–T: you test positive for anthrax (Boolean)
Notation: instead of P(A=True) and P(A=False), we will write P(A) and P(¬A)
What do we want to compute? P(A | T)
What else do we need to know or assume?
–Priors: P(A), P(¬A)
–Given: P(T|A), P(T|¬A), P(¬T|A), P(¬T|¬A)
Possibilities: A∧T, A∧¬T, ¬A∧T, ¬A∧¬T

34 Example (cont.)
We know:
–Given: P(T|A) = 0.999, P(T|¬A) = 0.001, P(¬T|A) = 0.001, P(¬T|¬A) = 0.999
–Prior knowledge: P(A) = 10^-6, P(¬A) = 1 – 10^-6
Want to know P(A|T):
–P(A|T) = P(T|A) P(A) / P(T)
Calculate P(T) by marginalization:
–P(T) = P(T|A) P(A) + P(T|¬A) P(¬A) = (0.999)(10^-6) + (0.001)(1 – 10^-6) ≈ 0.001
So P(A|T) = (0.999)(10^-6) / 0.001 ≈ 0.001
Therefore P(¬A|T) ≈ 0.999
What if you work at a Post Office?
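The calculation in code:

    p_t_given_a, p_t_given_not_a = 0.999, 0.001
    p_a = 1e-6
    # P(T) by marginalization over A
    p_t = p_t_given_a * p_a + p_t_given_not_a * (1 - p_a)   # ≈ 0.001
    p_a_given_t = p_t_given_a * p_a / p_t                   # ≈ 0.000999
    print(p_a_given_t)    # even after a positive test, P(A | T) ≈ 0.1%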

35
[Figure: Venn diagram of all people, split into people with anthrax and people without; "good" positive tests vs. "bad" (false) positive tests (0.1%)]