Basic Probability (most slides borrowed, with permission, from Andrew Moore of CMU and Google)


[Venn diagram slides, figures lost in transcription: first an event A as a region of the sample space, then two events A and B and their overlap.]

Notation Digression: P(A) is shorthand for P(A=true), and P(~A) is shorthand for P(A=false). The same notation applies to other binary RVs: P(Gender=M), P(Gender=F). It also applies to multivalued RVs: P(Major=history), P(Age=19), P(Q=c). Note: upper-case letters/names denote variables, lower-case letters/names denote values. For multivalued RVs, P(Q) is shorthand for P(Q=q) for some unknown q.

[Equation slides, garbled in transcription; they appear to show sums over the values v_k of a multivalued RV, e.g. the marginalization identity P(B) = Σ_k P(B ∧ A=v_k). Slide title: Integration over nuisance variables.]

[Venn diagram with regions Q, R, S, where R is the overlap of events F and H, Q the part of F outside H, and S the part of H outside F.] P(H|F) = R/(Q+R); P(F|H) = R/(S+R).
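For instance (illustrative numbers, not from the slides): if the regions have areas R = 0.1, Q = 0.3, and S = 0.2, then P(H|F) = 0.1/(0.3+0.1) = 0.25 and P(F|H) = 0.1/(0.2+0.1) ≈ 0.33.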

Let's-Make-A-Deal Problem: A game show with 3 doors; behind one door is a prize. The player chooses a door, say door 1. The host opens door 2 to reveal no prize, then offers the player the opportunity to switch from door 1 to door 3. Should the player stay with door 1, switch to door 3, or does it not matter?

Let's-Make-A-Deal Problem: Let D = the door containing the prize, C = the contestant's choice, and H = the door revealed by the host. We want P(D=d | C, H). By Bayes' rule, P(D=d | C, H) ∝ P(H | D=d, C) P(D=d | C) = P(H | D=d, C) P(D=d), since the prize location is independent of the contestant's choice. E.g., for C=1, H=2: P(H=2 | C=1, D=1) = 1/2, P(H=2 | C=1, D=2) = 0, P(H=2 | C=1, D=3) = 1. With the uniform prior P(D=d) = 1/3, normalizing gives P(D=1 | C=1, H=2) = 1/3 and P(D=3 | C=1, H=2) = 2/3, so the player should switch.
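A quick way to sanity-check this posterior is simulation. Here is a minimal sketch (not from the slides; function names are illustrative) that plays the game many times under both strategies:

```python
import random

def monty_hall_trial(switch: bool) -> bool:
    """Play one round; return True if the player wins the prize."""
    doors = [1, 2, 3]
    prize = random.choice(doors)
    choice = random.choice(doors)
    # The host opens a door that is neither the player's pick nor the prize.
    opened = random.choice([d for d in doors if d != choice and d != prize])
    if switch:
        # Switch to the one remaining unopened door.
        choice = next(d for d in doors if d != choice and d != opened)
    return choice == prize

n = 100_000
stay = sum(monty_hall_trial(False) for _ in range(n)) / n
switch = sum(monty_hall_trial(True) for _ in range(n)) / n
print(f"P(win | stay) ~ {stay:.3f}, P(win | switch) ~ {switch:.3f}")
```

Running this prints roughly 0.333 for staying and 0.667 for switching, matching the Bayesian calculation above.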

Let's-Make-A-Deal Problem: Lessons. The likelihood function is based on a generative model of the environment (the host's behavior). Griffiths and Tenenbaum made their own generative assumptions: about how the experimenter would pick an observed age t given the total t_total. Solving a problem involves modeling the generative environment, which is equivalent to learning the joint probability distribution.

If you have the joint distribution, you can perform any inference in the domain.
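As a concrete illustration (the joint table and its numbers are invented for this example, not from the slides), the sketch below recovers a marginal and a conditional from a joint distribution over two binary RVs:

```python
# Joint distribution P(Flu, Headache) over two binary variables.
joint = {
    (True,  True):  0.08,   # flu and headache
    (True,  False): 0.02,   # flu, no headache
    (False, True):  0.10,   # headache, no flu
    (False, False): 0.80,   # neither
}

# Marginal: P(Headache) = sum over the nuisance variable Flu.
p_headache = sum(p for (flu, ha), p in joint.items() if ha)

# Conditional: P(Flu | Headache) = P(Flu, Headache) / P(Headache).
p_flu_given_headache = joint[(True, True)] / p_headache

print(f"P(Headache) = {p_headache:.2f}")                 # 0.18
print(f"P(Flu | Headache) = {p_flu_given_headache:.3f}") # ~0.444
```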

e.g., model of the environment

Naïve Bayes Model

Bayes Classifier: Assume you want to predict some output Y which has arity n and values y_1, y_2, ..., y_n. Assume there are m input attributes called X_1, X_2, ..., X_m. For each of the n values y_i, build a density estimator D_i that estimates P(X_1, X_2, ..., X_m | Y=y_i). Then Y_predict = argmax_y P(Y=y | X_1, X_2, ..., X_m) = argmax_y P(X_1, X_2, ..., X_m | Y=y) P(Y=y).
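Here is a minimal sketch of this recipe, specialized to the naïve Bayes model of the previous slide, where the density estimator factorizes as P(X_1, ..., X_m | Y=y_i) = Π_j P(X_j | Y=y_i). It assumes discrete attribute values and applies add-one (Laplace) smoothing; all function names and data are illustrative, not from the slides:

```python
from collections import Counter, defaultdict

def train_naive_bayes(X, y):
    """Fit class priors and per-class, per-attribute value counts."""
    priors = Counter(y)
    # counts[c][j][v] = number of class-c examples whose attribute j equals v
    counts = defaultdict(lambda: defaultdict(Counter))
    for xs, c in zip(X, y):
        for j, v in enumerate(xs):
            counts[c][j][v] += 1
    return priors, counts

def predict(xs, priors, counts, n_values=2):
    """Return argmax_y P(Y=y) * prod_j P(X_j = x_j | Y=y)."""
    total = sum(priors.values())
    best_c, best_p = None, -1.0
    for c, n_c in priors.items():
        p = n_c / total  # prior P(Y=c)
        for j, v in enumerate(xs):
            # Add-one smoothed estimate of P(X_j = v | Y=c).
            p *= (counts[c][j][v] + 1) / (n_c + n_values)
        if p > best_p:
            best_c, best_p = c, p
    return best_c

X = [(1, 0), (1, 1), (0, 0), (0, 1)]
y = ["spam", "spam", "ham", "ham"]
priors, counts = train_naive_bayes(X, y)
print(predict((1, 0), priors, counts))  # "spam"
```

The argmax over the product P(X | Y=y) P(Y=y) is exactly the second line of the classification rule above; the naïve part is only in how the class-conditional density is estimated.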

Machine Learning vs. Cognitive Science: For machine learning, density estimation is required to do classification, prediction, etc. For cognitive science, density estimation under a particular model is a theory about what is going on in the brain when an individual learns.