Probability Review. CS479/679 Pattern Recognition, Dr. George Bebis

Why Bother About Probabilities?
Accounting for uncertainty is a crucial component of decision making (e.g., classification) because our measurements are ambiguous. Probability theory is the proper mechanism for accounting for uncertainty. It also lets us take a-priori knowledge into account, for example: “If the fish was caught in the Atlantic Ocean, then it is more likely to be salmon than sea bass.”

Definitions
– Random experiment: an experiment whose result is not certain in advance (e.g., throwing a die).
– Outcome: the result of a random experiment.
– Sample space: the set of all possible outcomes (e.g., {1,2,3,4,5,6}).
– Event: a subset of the sample space (e.g., “obtain an odd number when throwing a die” = {1,3,5}).

Intuitive Formulation of Probability
Intuitively, the probability of an event α can be defined as:
P(α) = lim_{n→∞} N(α)/n
where N(α) is the number of times that event α happens in n trials.
The Laplacian definition, P(α) = |α|/|S|, assumes that all outcomes are equally likely.
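The relative-frequency definition above can be illustrated with a short simulation; this is a sketch, not from the slides, using the event “odd number” on a fair die:

```python
import random

random.seed(0)
n = 100_000
# Count how often the event "odd number" occurs in n throws of a fair die
count = sum(1 for _ in range(n) if random.randint(1, 6) % 2 == 1)
p_odd = count / n  # relative-frequency estimate of P({1,3,5})
```

As n grows, p_odd converges to the Laplacian value 3/6 = 0.5.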

Axioms of Probability
– P(A) ≥ 0 for any event A
– P(S) = 1
– If A1, A2, … are mutually exclusive events, then P(A1 ∪ A2 ∪ …) = P(A1) + P(A2) + …
(Figure: mutually exclusive events A1, A2, A3, A4 inside the sample space S.)

Prior (Unconditional) Probability
This is the probability of an event in the absence of any evidence. P(Cavity) = 0.1 means that “in the absence of any other information, there is a 10% chance that the patient has a cavity”.

Posterior (Conditional) Probability
This is the probability of an event given some evidence. P(Cavity/Toothache) = 0.8 means that “there is an 80% chance that the patient has a cavity given that he has a toothache”.

Posterior (Conditional) Probability (cont’d)
Conditional probabilities can be defined in terms of unconditional probabilities:
P(A/B) = P(A,B) / P(B)
Conditional probabilities lead to the chain rule:
P(A,B) = P(A/B)P(B) = P(B/A)P(A)

Law of Total Probability
If A1, A2, …, An is a partition of the sample space S into mutually exclusive events and B is any event, then:
P(B) = Σi P(B/Ai)P(Ai)
Special case (partition {A, ¬A}):
P(B) = P(B/A)P(A) + P(B/¬A)P(¬A)
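A quick numeric check of the law; the partition and the probabilities below are hypothetical, not from the slides:

```python
# Hypothetical partition A1, A2, A3 of S and conditional probabilities of B
p_A = [0.5, 0.3, 0.2]           # P(Ai); a partition, so these sum to 1
p_B_given_A = [0.1, 0.4, 0.9]   # P(B/Ai)

# Law of total probability: P(B) = sum_i P(B/Ai) P(Ai)
p_B = sum(pb * pa for pb, pa in zip(p_B_given_A, p_A))
```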

Bayes’ Theorem
Conditional probabilities lead to Bayes’ rule:
P(A/B) = P(B/A)P(A) / P(B)
where P(B) = P(B/A)P(A) + P(B/¬A)P(¬A)

Example
Consider the probability of Disease given Symptom:
P(Disease/Symptom) = P(Symptom/Disease)P(Disease) / P(Symptom)
where: P(Symptom) = P(Symptom/Disease)P(Disease) + P(Symptom/¬Disease)P(¬Disease)

Example (cont’d)
Meningitis causes a stiff neck 50% of the time, i.e., P(S/M) = 0.5. A patient comes in with a stiff neck; what is the probability that he has meningitis? We need to know the following:
– The prior probability of a patient having meningitis: P(M) = 1/50,000
– The prior probability of a patient having a stiff neck: P(S) = 1/20
P(M/S) = P(S/M)P(M) / P(S) = 0.5 × (1/50,000) / (1/20) = 0.0002
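The numbers given in the example can be plugged directly into Bayes’ rule:

```python
# Given quantities from the example
p_s_given_m = 0.5      # P(S/M): meningitis causes a stiff neck 50% of the time
p_m = 1 / 50_000       # P(M): prior probability of meningitis
p_s = 1 / 20           # P(S): prior probability of a stiff neck

# Bayes' rule: P(M/S) = P(S/M) P(M) / P(S)
p_m_given_s = p_s_given_m * p_m / p_s
```

The stiff neck raises the probability of meningitis from 1/50,000 to 1/5,000, but it remains small.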

General Form of Bayes’ Rule
If A1, A2, …, An is a partition of mutually exclusive events and B is any event, then Bayes’ rule is given by:
P(Ai/B) = P(B/Ai)P(Ai) / P(B)
where P(B) = Σj P(B/Aj)P(Aj)

Independence
Two events A and B are independent iff:
P(A,B) = P(A)P(B)
Using the formula above, we can show:
P(A/B) = P(A) and P(B/A) = P(B)
A and B are conditionally independent given C iff:
P(A/B,C) = P(A/C)
e.g., P(WetGrass/Season,Rain) = P(WetGrass/Rain)
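Independence can be checked by enumeration; a sketch using two fair dice, where the events A and B below are illustrative choices:

```python
from itertools import product

# Sample space: all ordered pairs from two fair dice (36 equally likely outcomes)
space = list(product(range(1, 7), repeat=2))
A = {s for s in space if s[0] % 2 == 1}   # event A: first die is odd
B = {s for s in space if s[1] > 4}        # event B: second die shows 5 or 6
p_A = len(A) / len(space)
p_B = len(B) / len(space)
p_AB = len(A & B) / len(space)            # P(A,B)
```

Here P(A,B) = P(A)P(B), so A and B are independent.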

Random Variables
In many experiments, it is easier to deal with a summary variable than with the original probability structure.
Example: in an opinion poll, we ask 50 people whether they agree or disagree with a certain issue.
– Suppose we record a “1” for agree and a “0” for disagree.
– The sample space for this experiment has 2^50 elements.
– Suppose we are only interested in the number of people who agree.
– Define the variable X = number of “1”s recorded out of 50.
– The new sample space has only 51 elements.
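A sketch of the sample-space reduction described above; the p = 0.5 response model is an added assumption, used only to evaluate one pmf value:

```python
from math import comb

n = 50
n_outcomes = 2 ** n       # original sample space: 2^50 binary response vectors
n_values_of_x = n + 1     # induced sample space of X = number of "1"s: 0..50

# P(X = 25) if each person independently agrees with probability 0.5 (assumption)
p_25 = comb(n, 25) * 0.5 ** n
```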

Random Variables (cont’d)
A random variable (r.v.) is a function X(s) that assigns a value to each outcome s of a random experiment.

Random Variables (cont’d)
How do we compute probabilities using random variables?
– Suppose the sample space is S = {s1, s2, …, sn}.
– Suppose the range of the random variable X is {x1, x2, …, xm}.
– We observe X = xj iff the outcome of the random experiment is an s ∈ S such that X(s) = xj.

Example
Consider the experiment of throwing a pair of dice (e.g., let X be the sum of the two faces).
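Taking X to be the sum of the two faces, its pmf can be tabulated by enumerating the 36 equally likely outcomes:

```python
from collections import Counter
from itertools import product

# X(s) = sum of the two faces, over all 36 equally likely outcomes
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {x: c / 36 for x, c in counts.items()}   # P(X = x)
```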

Discrete/Continuous Random Variables
A discrete r.v. can assume only discrete values.
A continuous r.v. can assume a continuous range of values (e.g., sensor readings).

Probability mass function (pmf)
The pmf of a discrete r.v. X assigns a probability to each possible value x of X:
p(x) = P(X = x)
Notation: given two r.v.’s X and Y, their pmfs are denoted pX(x) and pY(y); for convenience, we will drop the subscripts and write p(x) and p(y). However, keep in mind that these are different functions!

Probability density function (pdf)
The pdf of a continuous r.v. X gives the probability of X being “close” to a number x:
P(x ≤ X ≤ x + Δx) ≈ p(x)Δx for small Δx
Notation: given two r.v.’s X and Y, their pdfs are denoted pX(x) and pY(y); for convenience, we will drop the subscripts and write p(x) and p(y). However, keep in mind that these are different functions!

Probability Distribution Function (PDF)
Definition: F(x) = P(X ≤ x)
Some properties of the PDF:
– (1) 0 ≤ F(x) ≤ 1, with F(−∞) = 0 and F(+∞) = 1
– (2) F(x) is a non-decreasing function of x
If X is discrete, its PDF can be computed as follows:
F(x) = Σ_{xi ≤ x} p(xi)
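For a discrete r.v., F(x) is a staircase function; a minimal sketch with a hypothetical pmf:

```python
# Hypothetical pmf of a discrete r.v. X
pmf = {1: 0.2, 2: 0.5, 3: 0.3}

def F(x):
    """PDF (cumulative distribution) of X: F(x) = sum of p(xi) for xi <= x."""
    return sum(p for xi, p in pmf.items() if xi <= x)
```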

Probability Distribution Function (PDF) (cont’d)
(Figure: a discrete pmf and the corresponding staircase PDF.)

Probability Distribution Function (PDF) (cont’d)
If X is continuous, its PDF can be computed as follows:
F(x) = ∫_{−∞}^{x} p(t) dt
The pdf can be obtained from the PDF using:
p(x) = dF(x)/dx

Example
Gaussian pdf and PDF. (Figure: the bell-shaped pdf and the corresponding S-shaped PDF.)
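The 1-D Gaussian pdf has a closed form, and its PDF can be written with the error function; a sketch using only the standard library:

```python
import math

def gauss_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian density: exp(-(x-mu)^2 / (2 sigma^2)) / (sqrt(2 pi) sigma)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def gauss_cdf(x, mu=0.0, sigma=1.0):
    """Gaussian PDF F(x) = P(X <= x), expressed via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))
```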

Joint pmf (discrete case)
For n random variables, the joint pmf assigns a probability to each possible combination of values:
p(x1, x2, …, xn) = P(X1 = x1, X2 = x2, …, Xn = xn)
Notation: the joint pmf/pdf of the r.v.’s X1, …, Xn and Y1, …, Yn are denoted pX1…Xn(x1, …, xn) and pY1…Yn(y1, …, yn); for convenience, we will drop the subscripts and write p(x1, …, xn) and p(y1, …, yn). Keep in mind, however, that these are two different functions.

Joint pmf (discrete r.v.) (cont’d)
Specifying the joint pmf requires a large number of values:
– k^n values, assuming n random variables where each one can assume one of k discrete values.
– This can be simplified if we assume independence or conditional independence.
Example: P(Cavity, Toothache) is a 2 × 2 table with rows {Cavity, Not Cavity} and columns {Toothache, Not Toothache}; the four joint probabilities sum to 1.0.
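A sketch of this table with hypothetical numbers, chosen to be consistent with the earlier slides (P(Cavity) = 0.1 and P(Cavity/Toothache) = 0.8):

```python
import numpy as np

# Hypothetical joint pmf P(Cavity, Toothache) as a 2 x 2 table
#                 Toothache  Not Toothache
joint = np.array([[0.04,     0.06],    # Cavity
                  [0.01,     0.89]])   # Not Cavity

p_cavity = joint[0].sum()                                   # marginal P(Cavity)
p_cavity_given_toothache = joint[0, 0] / joint[:, 0].sum()  # P(Cavity/Toothache)
```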

Joint pdf (continuous r.v.)
For n random variables, the joint pdf gives the probability of being “close” to a given combination of values:
P(x1 ≤ X1 ≤ x1 + Δx1, …, xn ≤ Xn ≤ xn + Δxn) ≈ p(x1, …, xn) Δx1 ⋯ Δxn

Conditional pdf
The conditional pdf can be derived from the joint pdf:
p(x/y) = p(x,y) / p(y)
Conditional pdfs lead to the chain rule:
p(x,y) = p(x/y)p(y) = p(y/x)p(x)
General case:
p(x1, x2, …, xn) = p(x1/x2, …, xn) p(x2/x3, …, xn) ⋯ p(xn)

Independence
Knowledge about independence between r.v.’s is very useful because it simplifies things. E.g., if X and Y are independent, then:
p(x,y) = p(x)p(y)
(the 2D joint pmf/pdf factorizes into a product of 1D ones)

Law of Total Probability
The law of total probability:
p(x) = Σy p(x/y)p(y) (discrete case), or p(x) = ∫ p(x/y)p(y) dy (continuous case)

Marginalization
Given the joint pmf/pdf, we can compute the pmf/pdf of any subset of the variables using marginalization. Examples:
p(x) = Σy p(x,y) (discrete), p(x) = ∫ p(x,y) dy (continuous)
p(x1) = Σ_{x2} ⋯ Σ_{xn} p(x1, x2, …, xn)

The joint pmf/pdf is very useful!

Normal (Gaussian) Distribution
1-dimensional:
p(x) = (1 / (√(2π) σ)) exp(−(x − μ)² / (2σ²))
d-dimensional:
p(x) = (1 / ((2π)^{d/2} |Σ|^{1/2})) exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ))
where μ is a d × 1 mean vector and Σ is a d × d symmetric covariance matrix.
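A direct implementation of the d-dimensional density; a sketch, with illustrative names:

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """d-dimensional normal density:
    exp(-0.5 (x-mu)^T Sigma^-1 (x-mu)) / ((2 pi)^(d/2) |Sigma|^(1/2))."""
    d = len(mu)
    diff = x - mu
    norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(Sigma))
    return float(np.exp(-0.5 * diff @ np.linalg.inv(Sigma) @ diff) / norm)
```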

Normal (Gaussian) Distribution (cont’d)
Parameters and shape of the Gaussian distribution:
– The number of parameters is d + d(d+1)/2 (the mean vector plus the symmetric covariance matrix).
– The shape is determined by Σ.

Normal (Gaussian) Distribution (cont’d)
Mahalanobis distance:
r² = (x − μ)ᵀ Σ⁻¹ (x − μ)
If the variables are independent (Σ is diagonal), the multivariate normal distribution becomes a product of 1-dimensional normal densities:
p(x) = Π_{i=1}^{d} p(xi)
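The Mahalanobis distance weights each direction by its variance; a small numeric sketch with illustrative values:

```python
import numpy as np

mu = np.array([1.0, 2.0])
Sigma = np.array([[4.0, 0.0],
                  [0.0, 1.0]])   # diagonal Sigma: independent variables
x = np.array([3.0, 2.0])

# Squared Mahalanobis distance r^2 = (x - mu)^T Sigma^-1 (x - mu)
r2 = float((x - mu) @ np.linalg.inv(Sigma) @ (x - mu))
```

Here x is 2 units from mu along the first axis, but that axis has variance 4, so r² = 2²/4 = 1.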

Expected Value
The expected value of a discrete r.v. X is given by:
E[X] = Σx x p(x)

Expected Value (cont’d)
The sample mean and the expected value are related by:
E[X] ≈ (1/n) Σ_{i=1}^{n} xi (for a large number of samples n)
The expected value of a continuous r.v. is given by:
E[X] = ∫ x p(x) dx
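The relation between the sample mean and the expected value can be checked by simulation; a sketch using a fair die, for which E[X] = 3.5:

```python
import random

random.seed(1)
n = 200_000
# Sample mean of n throws of a fair die; E[X] = (1+2+...+6)/6 = 3.5
sample_mean = sum(random.randint(1, 6) for _ in range(n)) / n
```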

Variance and Standard Deviation
Var(X) = E[(X − μ)²] = E[X²] − μ², where μ = E[X]
The standard deviation is σ = √Var(X).
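Both forms of the variance can be verified for a fair die, where E[X] = 3.5 and Var(X) = 35/12:

```python
# X = face of a fair die, each value with probability 1/6
vals = range(1, 7)
mu = sum(vals) / 6                                # E[X]
var = sum((v - mu) ** 2 for v in vals) / 6        # E[(X - mu)^2]
var_alt = sum(v * v for v in vals) / 6 - mu ** 2  # E[X^2] - mu^2
```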

Covariance
Cov(X,Y) = E[(X − μX)(Y − μY)] = E[XY] − μX μY

Covariance Matrix – 2 variables
Σ = [ Var(X)    Cov(X,Y)
      Cov(Y,X)  Var(Y)  ]
and Cov(X,Y) = Cov(Y,X), i.e., Σ is a symmetric matrix. It can be computed using:
Σ = E[(z − μ)(z − μ)ᵀ], where z = (X, Y)ᵀ

Covariance Matrix – n variables
Σij = Cov(Xi, Xj), with Σii = Var(Xi); since Cov(Xi,Xj) = Cov(Xj,Xi), Σ is a symmetric matrix.
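Σ can be estimated from samples; a sketch using NumPy, where the true covariance below is an illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(0)
true_Sigma = np.array([[2.0, 0.8],
                       [0.8, 1.0]])
# Draw 5000 samples of a 2-D normal r.v. and estimate Sigma from them
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_Sigma, size=5000)
Sigma_hat = np.cov(X, rowvar=False)   # rows are samples, columns are variables
```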

Uncorrelated r.v.’s
If the r.v.’s are uncorrelated, then Cov(Xi, Xj) = 0 for i ≠ j, and Σ is a diagonal matrix.

Properties of covariance matrix
Σ is symmetric and positive semi-definite; its eigenvalues are real and non-negative, and its eigenvectors can be chosen orthonormal. (Note: we will review these concepts later in case you do not remember them.)

Covariance Matrix Decomposition
Σ = Φ Λ Φᵀ
where Φ is the matrix whose columns are the eigenvectors of Σ, Λ is the diagonal matrix of its eigenvalues, and Φ⁻¹ = Φᵀ.
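This decomposition is an eigendecomposition; since Σ is symmetric, np.linalg.eigh returns real eigenvalues and orthonormal eigenvectors:

```python
import numpy as np

Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])
lam, Phi = np.linalg.eigh(Sigma)       # eigenvalues and eigenvectors (columns of Phi)
Lambda = np.diag(lam)
reconstructed = Phi @ Lambda @ Phi.T   # Sigma = Phi Lambda Phi^T
```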

Linear Transformations
A linear transformation Y = AᵀX of a normally distributed r.v. X is itself normally distributed, with mean AᵀμX and covariance AᵀΣX A.

Linear Transformations (cont’d)
Whitening transform: taking Aw = Φ Λ^{−1/2} yields a transformed r.v. Y = Awᵀ X whose covariance is the identity, ΣY = Awᵀ Σ Aw = I.
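A sketch of the whitening transform built from the decomposition Σ = ΦΛΦᵀ; the covariance values are illustrative:

```python
import numpy as np

Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])
lam, Phi = np.linalg.eigh(Sigma)

# Whitening transform Aw = Phi Lambda^(-1/2); then Y = Aw^T X has covariance I
Aw = Phi @ np.diag(lam ** -0.5)
Sigma_Y = Aw.T @ Sigma @ Aw
```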