INTRODUCTION TO MACHINE LEARNING: Bayesian Estimation


Based on E. Alpaydın, Introduction to Machine Learning, The MIT Press, 2004 (slides V1.1).

Overview
 Estimating the parameters of a model from data
   Regression
   Classification
 We may have prior knowledge about the plausible range of a parameter
   Available before looking at the data
   Expressed as a distribution over the parameter

Generative Model
 Bayesian approach: treat the parameter θ as a random variable with a prior p(θ); given θ, the instances x^t are drawn independently from p(x | θ)
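
In symbols, a minimal statement of the generative story assumed here:

\[
\theta \sim p(\theta), \qquad
x^t \sim p(x \mid \theta),\; t = 1, \dots, N, \qquad
p(x) = \int p(x \mid \theta)\, p(\theta)\, d\theta
\]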

Bayes Rule
 The posterior distribution of the parameter combines the prior with the likelihood of the observed sample
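
The rule itself, in the notation used throughout (sample X, parameter θ):

\[
p(\theta \mid X) = \frac{p(X \mid \theta)\, p(\theta)}{p(X)}
= \frac{p(X \mid \theta)\, p(\theta)}{\int p(X \mid \theta')\, p(\theta')\, d\theta'}
\]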

Multinomial Variable
 A sample of multinomial data: each instance takes one of K states
 Instance t is coded as x^t = (x_1^t, …, x_K^t), with x_k^t = 1 if the instance is in state k and 0 otherwise; the state probabilities are q = (q_1, …, q_K), with Σ_k q_k = 1
 The sample likelihood factorizes over instances and states (see below)
 We need a good way to specify a prior distribution on the state probabilities q
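
The likelihood written out, where N_k is the number of instances observed in state k:

\[
p(X \mid q) = \prod_{t=1}^{N} \prod_{k=1}^{K} q_k^{x_k^t}
= \prod_{k=1}^{K} q_k^{N_k},
\qquad N_k = \sum_{t} x_k^t
\]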

Dirichlet Distribution
 A distribution over the vector of state probabilities q itself, i.e. over each admissible combination (q_1, …, q_K)
 Its parameters α_k can be read as pseudocounts: they encode the approximate prior proportion of data expected in state k, and their sum measures how strongly that belief is held
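
The density, with α_0 = Σ_k α_k:

\[
\mathrm{Dirichlet}(q \mid \alpha)
= \frac{\Gamma(\alpha_0)}{\prod_{k} \Gamma(\alpha_k)}
\prod_{k=1}^{K} q_k^{\alpha_k - 1},
\qquad \alpha_0 = \sum_{k=1}^{K} \alpha_k
\]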

Posterior
 Likelihood: p(X | q) = ∏_k q_k^{N_k}
 Posterior: multiplying the Dirichlet prior by this likelihood yields another Dirichlet, with the prior pseudocounts incremented by the observed counts
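
In standard form:

\[
p(q \mid X) \propto p(X \mid q)\, p(q)
\propto \prod_{k=1}^{K} q_k^{\alpha_k + N_k - 1},
\quad\text{i.e.}\quad
q \mid X \sim \mathrm{Dirichlet}(\alpha_1 + N_1, \dots, \alpha_K + N_K)
\]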

Conjugate Prior
 A prior is conjugate to a likelihood when the posterior has the same functional form as the prior (as with the Dirichlet and the multinomial above)
 This enables sequential learning, instance by instance; a sketch follows this list
   Calculate the posterior for the current instance
   Make it the prior for the next instance
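
A minimal Python sketch of this sequential update for the Dirichlet–multinomial pair (not from the slides; the state encoding and the uniform prior are illustrative assumptions):

import numpy as np

def sequential_dirichlet_update(alpha, instances):
    """Update a Dirichlet prior over K state probabilities one
    instance at a time. Each instance is a state index in 0..K-1.
    Conjugacy means every intermediate posterior is again Dirichlet,
    so each step is just an increment of the matching pseudocount."""
    alpha = np.asarray(alpha, dtype=float).copy()
    for k in instances:
        alpha[k] += 1.0   # posterior becomes the prior for the next item
    return alpha

# Hypothetical example: K = 3 states, uniform Dirichlet(1,1,1) prior,
# and a small sample of observed states.
alpha0 = [1.0, 1.0, 1.0]
sample = [0, 2, 2, 1, 2]
alpha_n = sequential_dirichlet_update(alpha0, sample)
posterior_mean = alpha_n / alpha_n.sum()   # E[q_k | X] = alpha_k / sum(alpha)
print(alpha_n, posterior_mean)

Processing the whole sample at once gives the same result as the loop; that order-independence is exactly what conjugacy buys.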

Continuous Variable
 Instances are Gaussian distributed, x^t ~ N(μ, σ²), with unknown parameters
 For the mean μ (taking σ² as known), the conjugate prior is itself Gaussian: μ ~ N(μ_0, σ_0²)

Continuous Variable
 The posterior of μ is again Gaussian; its mean is a weighted combination of the sample mean m and the prior mean μ_0 (formula below)
 With more samples N, the estimate moves closer to the sample mean m
 With little prior uncertainty (small σ_0²), it stays closer to the prior mean μ_0
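
The standard conjugate result, with sample mean m = (1/N) Σ_t x^t:

\[
\mu \mid X \sim \mathcal{N}(\mu_N, \sigma_N^2), \qquad
\mu_N = \frac{N/\sigma^2}{N/\sigma^2 + 1/\sigma_0^2}\, m
+ \frac{1/\sigma_0^2}{N/\sigma^2 + 1/\sigma_0^2}\, \mu_0, \qquad
\frac{1}{\sigma_N^2} = \frac{N}{\sigma^2} + \frac{1}{\sigma_0^2}
\]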

Precision/Variance Prior
 For the spread of the Gaussian it is more convenient to work with the precision λ = 1/σ²
 With μ known, the conjugate prior for λ is a Gamma distribution, λ ~ Gamma(a_0, b_0)
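
The resulting posterior, assuming the shape–rate parameterization of the Gamma:

\[
\lambda \mid X \sim \mathrm{Gamma}(a_N, b_N), \qquad
a_N = a_0 + \frac{N}{2}, \qquad
b_N = b_0 + \frac{1}{2} \sum_{t=1}^{N} (x^t - \mu)^2
\]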

Precision
 The posterior is a weighted sum of prior and sample statistics: the prior's pseudo-observations (a_0, b_0) are augmented by the observed count N/2 and the sum of squared deviations
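
A minimal Python sketch of this update (not from the slides; prior values and data are illustrative assumptions):

import numpy as np

def gamma_precision_posterior(x, mu, a0=1.0, b0=1.0):
    """Posterior over the precision lambda = 1/sigma^2 of a Gaussian
    with known mean mu, under a conjugate Gamma(a0, b0) prior
    (shape-rate parameterization). Returns the posterior shape and
    rate, plus the posterior mean E[lambda | X] = a_n / b_n."""
    x = np.asarray(x, dtype=float)
    n = x.size
    a_n = a0 + n / 2.0
    b_n = b0 + 0.5 * np.sum((x - mu) ** 2)
    return a_n, b_n, a_n / b_n

# Hypothetical example: data from N(0, sigma=2), so true precision 0.25.
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=2.0, size=500)
a_n, b_n, lam_mean = gamma_precision_posterior(x, mu=0.0)
print(lam_mean)   # close to 1/4 for a sample this large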

Parameter Estimation
 So far we have used the prior to refine estimates of distribution parameters
 We can also use a prior to refine the parameters of some function of the input
   Regression
   Classification discriminant

Regression
 Linear model: the response is r^t = w^T x^t + ε with Gaussian noise ε ~ N(0, σ²), so p(r^t | x^t, w) = N(w^T x^t, σ²)

Regression
 Maximum likelihood: with Gaussian noise, maximizing the likelihood of the responses is equivalent to least squares
 Prediction: rather than committing to a point estimate of w, average the predictions over the posterior of w
 Gaussian prior: a Gaussian prior on w is conjugate here, so the posterior over w is also Gaussian
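
A minimal Python sketch of the Gaussian-posterior computation (not from the slides; the prior precision alpha, noise precision beta, and the toy data are illustrative assumptions):

import numpy as np

def bayes_linreg_posterior(X, r, alpha=1.0, beta=1.0):
    """Posterior over weights w for the linear model r = X w + noise,
    with noise precision beta and Gaussian prior w ~ N(0, (1/alpha) I).
    Standard conjugate result: the posterior is N(mean, cov) with
        cov  = (alpha I + beta X^T X)^{-1}
        mean = beta cov X^T r
    alpha and beta are taken as known here."""
    d = X.shape[1]
    cov = np.linalg.inv(alpha * np.eye(d) + beta * X.T @ X)
    mean = beta * cov @ X.T @ r
    return mean, cov

# Hypothetical example: 1-D inputs with a bias column.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=50)
X = np.column_stack([np.ones_like(x), x])          # features [1, x]
r = 0.5 + 2.0 * x + rng.normal(scale=0.3, size=50)
mean, cov = bayes_linreg_posterior(X, r, alpha=2.0, beta=1 / 0.3**2)
print(mean)   # close to the true weights [0.5, 2.0]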

Prior on Weights
 A zero-mean Gaussian prior on w shrinks the weights toward zero; the MAP estimate under this prior coincides with ridge (L2-regularized) regression
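
The equivalence, stated with the noise precision β and prior precision α assumed above:

\[
w_{\text{MAP}} = \arg\min_{w}\;
\frac{\beta}{2} \sum_{t=1}^{N} \left(r^t - w^{\top} x^t\right)^2
+ \frac{\alpha}{2}\, \|w\|^2
\]

This is ridge regression with regularization coefficient α/β.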

Examples