INTRODUCTION TO MACHINE LEARNING: Bayesian Estimation


Based on E. Alpaydın, Introduction to Machine Learning, The MIT Press, 2004 (slides V1.1).

Overview
 Estimating the parameters of a model from data
   Regression
   Classification
 We may have prior knowledge about the plausible range of a parameter
   Available before looking at the data
   Expressed as a distribution over the parameter

Generative Model
 Bayesian approach: treat the parameter θ as a random variable with a prior p(θ); given θ, the instances x^t are drawn independently from p(x | θ)
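
In symbols, a minimal statement of the generative story assumed here:

\[
\theta \sim p(\theta), \qquad
x^t \sim p(x \mid \theta),\; t = 1, \dots, N, \qquad
p(x) = \int p(x \mid \theta)\, p(\theta)\, d\theta
\]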

Bayes Rule
 The posterior distribution of the parameter combines the prior with the likelihood of the observed sample
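
The rule itself, in the notation used throughout (sample X, parameter θ):

\[
p(\theta \mid X) = \frac{p(X \mid \theta)\, p(\theta)}{p(X)}
= \frac{p(X \mid \theta)\, p(\theta)}{\int p(X \mid \theta')\, p(\theta')\, d\theta'}
\]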

Multinomial Variable
 A sample of multinomial data: each instance takes one of K states
 Instance t is coded as x^t = (x_1^t, …, x_K^t), with x_k^t = 1 if the instance is in state k and 0 otherwise; the state probabilities are q = (q_1, …, q_K), with Σ_k q_k = 1
 The sample likelihood factorizes over instances and states (see below)
 We need a good way to specify a prior distribution on the state probabilities q
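
The likelihood written out, where N_k is the number of instances observed in state k:

\[
p(X \mid q) = \prod_{t=1}^{N} \prod_{k=1}^{K} q_k^{x_k^t}
= \prod_{k=1}^{K} q_k^{N_k},
\qquad N_k = \sum_{t} x_k^t
\]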

Dirichlet Distribution
 A distribution over the vector of state probabilities q itself, i.e. over each admissible combination (q_1, …, q_K)
 Its parameters α_k can be read as pseudocounts: they encode the approximate prior proportion of data expected in state k, and their sum measures how strongly that belief is held
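
The density, with α_0 = Σ_k α_k:

\[
\mathrm{Dirichlet}(q \mid \alpha)
= \frac{\Gamma(\alpha_0)}{\prod_{k} \Gamma(\alpha_k)}
\prod_{k=1}^{K} q_k^{\alpha_k - 1},
\qquad \alpha_0 = \sum_{k=1}^{K} \alpha_k
\]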

Posterior
 Likelihood: p(X | q) = ∏_k q_k^{N_k}
 Posterior: multiplying the Dirichlet prior by this likelihood yields another Dirichlet, with the prior pseudocounts incremented by the observed counts
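
In standard form:

\[
p(q \mid X) \propto p(X \mid q)\, p(q)
\propto \prod_{k=1}^{K} q_k^{\alpha_k + N_k - 1},
\quad\text{i.e.}\quad
q \mid X \sim \mathrm{Dirichlet}(\alpha_1 + N_1, \dots, \alpha_K + N_K)
\]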

Conjugate Prior
 A prior is conjugate to a likelihood when the posterior has the same functional form as the prior (as with the Dirichlet and the multinomial above)
 This enables sequential learning, instance by instance; a sketch follows this list
   Calculate the posterior for the current instance
   Make it the prior for the next instance
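
A minimal Python sketch of this sequential update for the Dirichlet–multinomial pair (not from the slides; the state encoding and the uniform prior are illustrative assumptions):

import numpy as np

def sequential_dirichlet_update(alpha, instances):
    """Update a Dirichlet prior over K state probabilities one
    instance at a time. Each instance is a state index in 0..K-1.
    Conjugacy means every intermediate posterior is again Dirichlet,
    so each step is just an increment of the matching pseudocount."""
    alpha = np.asarray(alpha, dtype=float).copy()
    for k in instances:
        alpha[k] += 1.0   # posterior becomes the prior for the next item
    return alpha

# Hypothetical example: K = 3 states, uniform Dirichlet(1,1,1) prior,
# and a small sample of observed states.
alpha0 = [1.0, 1.0, 1.0]
sample = [0, 2, 2, 1, 2]
alpha_n = sequential_dirichlet_update(alpha0, sample)
posterior_mean = alpha_n / alpha_n.sum()   # E[q_k | X] = alpha_k / sum(alpha)
print(alpha_n, posterior_mean)

Processing the whole sample at once gives the same result as the loop; that order-independence is exactly what conjugacy buys.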

Continuous Variable
 Instances are Gaussian distributed, x^t ~ N(μ, σ²), with unknown parameters
 For the mean μ (taking σ² as known), the conjugate prior is itself Gaussian: μ ~ N(μ_0, σ_0²)

Continuous Variable
 The posterior of μ is again Gaussian; its mean is a weighted combination of the sample mean m and the prior mean μ_0 (formula below)
 With more samples N, the estimate moves closer to the sample mean m
 With little prior uncertainty (small σ_0²), it stays closer to the prior mean μ_0
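
The standard conjugate result, with sample mean m = (1/N) Σ_t x^t:

\[
\mu \mid X \sim \mathcal{N}(\mu_N, \sigma_N^2), \qquad
\mu_N = \frac{N/\sigma^2}{N/\sigma^2 + 1/\sigma_0^2}\, m
+ \frac{1/\sigma_0^2}{N/\sigma^2 + 1/\sigma_0^2}\, \mu_0, \qquad
\frac{1}{\sigma_N^2} = \frac{N}{\sigma^2} + \frac{1}{\sigma_0^2}
\]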

Precision/Variance Prior
 For the spread of the Gaussian it is more convenient to work with the precision λ = 1/σ²
 With μ known, the conjugate prior for λ is a Gamma distribution, λ ~ Gamma(a_0, b_0)
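
The resulting posterior, assuming the shape–rate parameterization of the Gamma:

\[
\lambda \mid X \sim \mathrm{Gamma}(a_N, b_N), \qquad
a_N = a_0 + \frac{N}{2}, \qquad
b_N = b_0 + \frac{1}{2} \sum_{t=1}^{N} (x^t - \mu)^2
\]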

Precision
 The posterior is a weighted sum of prior and sample statistics: the prior's pseudo-observations (a_0, b_0) are augmented by the observed count N/2 and the sum of squared deviations
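
A minimal Python sketch of this update (not from the slides; prior values and data are illustrative assumptions):

import numpy as np

def gamma_precision_posterior(x, mu, a0=1.0, b0=1.0):
    """Posterior over the precision lambda = 1/sigma^2 of a Gaussian
    with known mean mu, under a conjugate Gamma(a0, b0) prior
    (shape-rate parameterization). Returns the posterior shape and
    rate, plus the posterior mean E[lambda | X] = a_n / b_n."""
    x = np.asarray(x, dtype=float)
    n = x.size
    a_n = a0 + n / 2.0
    b_n = b0 + 0.5 * np.sum((x - mu) ** 2)
    return a_n, b_n, a_n / b_n

# Hypothetical example: data from N(0, sigma=2), so true precision 0.25.
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=2.0, size=500)
a_n, b_n, lam_mean = gamma_precision_posterior(x, mu=0.0)
print(lam_mean)   # close to 1/4 for a sample this large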

Parameter Estimation
 So far we have used the prior to refine estimates of distribution parameters
 We can also use a prior to refine the parameters of some function of the input
   Regression
   Classification discriminant

Regression
 Linear model: the response is r^t = w^T x^t + ε with Gaussian noise ε ~ N(0, σ²), so p(r^t | x^t, w) = N(w^T x^t, σ²)

Regression
 Maximum likelihood: with Gaussian noise, maximizing the likelihood of the responses is equivalent to least squares
 Prediction: rather than committing to a point estimate of w, average the predictions over the posterior of w
 Gaussian prior: a Gaussian prior on w is conjugate here, so the posterior over w is also Gaussian
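
A minimal Python sketch of the Gaussian-posterior computation (not from the slides; the prior precision alpha, noise precision beta, and the toy data are illustrative assumptions):

import numpy as np

def bayes_linreg_posterior(X, r, alpha=1.0, beta=1.0):
    """Posterior over weights w for the linear model r = X w + noise,
    with noise precision beta and Gaussian prior w ~ N(0, (1/alpha) I).
    Standard conjugate result: the posterior is N(mean, cov) with
        cov  = (alpha I + beta X^T X)^{-1}
        mean = beta cov X^T r
    alpha and beta are taken as known here."""
    d = X.shape[1]
    cov = np.linalg.inv(alpha * np.eye(d) + beta * X.T @ X)
    mean = beta * cov @ X.T @ r
    return mean, cov

# Hypothetical example: 1-D inputs with a bias column.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=50)
X = np.column_stack([np.ones_like(x), x])          # features [1, x]
r = 0.5 + 2.0 * x + rng.normal(scale=0.3, size=50)
mean, cov = bayes_linreg_posterior(X, r, alpha=2.0, beta=1 / 0.3**2)
print(mean)   # close to the true weights [0.5, 2.0]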

Prior on Weights
 A zero-mean Gaussian prior on w shrinks the weights toward zero; the MAP estimate under this prior coincides with ridge (L2-regularized) regression
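
The equivalence, stated with the noise precision β and prior precision α assumed above:

\[
w_{\text{MAP}} = \arg\min_{w}\;
\frac{\beta}{2} \sum_{t=1}^{N} \left(r^t - w^{\top} x^t\right)^2
+ \frac{\alpha}{2}\, \|w\|^2
\]

This is ridge regression with regularization coefficient α/β.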

Examples