INTRODUCTION TO Machine Learning ETHEM ALPAYDIN © The MIT Press, 2004 Lecture Slides for.

Slides:

Advertisements

Similar presentations

Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0) ETHEM ALPAYDIN © The MIT Press, 2010

Advertisements

INTRODUCTION TO Machine Learning 2nd Edition

Notes Sample vs distribution “m” vs “µ” and “s” vs “σ” Bias/Variance Bias: Measures how much the learnt model is wrong disregarding noise Variance: Measures.

ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.

INTRODUCTION TO MACHINE LEARNING Bayesian Estimation.

ECE 8443 – Pattern Recognition LECTURE 05: MAXIMUM LIKELIHOOD ESTIMATION Objectives: Discrete Features Maximum Likelihood Resources: D.H.S: Chapter 3 (Part.

CHAPTER 10: Linear Discrimination

CS Statistical Machine learning Lecture 13 Yuan (Alan) Qi Purdue CS Oct

INTRODUCTION TO Machine Learning ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.

INTRODUCTION TO Machine Learning 3rd Edition

INTRODUCTION TO Machine Learning 3rd Edition

ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.

ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.

ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.

ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.

ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.

ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.

INTRODUCTION TO Machine Learning ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.

MACHINE LEARNING 9. Nonparametric Methods. Introduction Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 2 

Classification and risk prediction

Lecture Notes for CMPUT 466/551 Nilanjan Ray

Generative Models Rong Jin. Statistical Inference Training ExamplesLearning a Statistical Model  Prediction p(x;  ) Female: Gaussian distribution N(

INTRODUCTION TO Machine Learning ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.

ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.

CHAPTER 4: Parametric Methods. Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 2 Parametric Estimation X = {

CHAPTER 4: Parametric Methods. Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 2 Parametric Estimation Given.

MACHINE LEARNING 6. Multivariate Methods 1. Based on E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 2 Motivating Example  Loan.

Bayes Classifier, Linear Regression 10701/15781 Recitation January 29, 2008 Parts of the slides are from previous years’ recitation and lecture notes,

INTRODUCTION TO Machine Learning ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.

INTRODUCTION TO Machine Learning ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.

Review Rong Jin. Comparison of Different Classification Models  The goal of all classifiers Predicating class label y for an input x Estimate p(y|x)

Multivariate Methods Slides from Machine Learning by Ethem Alpaydin Expanded by some slides from Gutierrez-Osuna.

CHAPTER 4: Parametric Methods. Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 2 Parametric Estimation Given.

ECE 8443 – Pattern Recognition LECTURE 03: GAUSSIAN CLASSIFIERS Objectives: Normal Distributions Whitening Transformations Linear Discriminants Resources.

INTRODUCTION TO MACHINE LEARNING 3RD EDITION ETHEM ALPAYDIN © The MIT Press, Lecture.

11/12/2012ISC471 / HCI571 Isabelle Bichindaritz 1 Prediction.

INTRODUCTION TO MACHINE LEARNING 3RD EDITION ETHEM ALPAYDIN Modified by Prof. Carolina Ruiz © The MIT Press, 2014 for CS539 Machine Learning at WPI

INTRODUCTION TO Machine Learning 3rd Edition

INTRODUCTION TO MACHINE LEARNING 3RD EDITION ETHEM ALPAYDIN © The MIT Press, Lecture.

INTRODUCTION TO MACHINE LEARNING 3RD EDITION ETHEM ALPAYDIN © The MIT Press, Lecture.

Radial Basis Function ANN, an alternative to back propagation, uses clustering of examples in the training set.

ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.

INTRODUCTION TO MACHINE LEARNING 3RD EDITION ETHEM ALPAYDIN © The MIT Press, Lecture.

CHAPTER 10: Logistic Regression. Binary classification Two classes Y = {0,1} Goal is to learn how to correctly classify the input into one of these two.

Over-fitting and Regularization Chapter 4 textbook Lectures 11 and 12 on amlbook.com.

Introduction to Machine Learning Multivariate Methods 姓名 : 李政軒.

Bayesian decision theory: A framework for making decisions when uncertainty exit 1 Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e.

Machine Learning 5. Parametric Methods.

Multivariate Methods Slides from Machine Learning by Ethem Alpaydin Expanded by some slides from Gutierrez-Osuna.

MACHINE LEARNING 3. Supervised Learning. Learning a Class from Examples Based on E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1)

INTRODUCTION TO MACHINE LEARNING 3RD EDITION ETHEM ALPAYDIN © The MIT Press, Lecture.

CHAPTER 8: Nonparametric Methods Alpaydin transparencies significantly modified, extended and changed by Ch. Eick Last updated: March 4, 2011.

Multivariate Methods Slides from Machine Learning by Ethem Alpaydin Expanded by some slides from Gutierrez-Osuna.

Background for Machine Learning (I) Usman Roshan.

ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.

Lecture 7 Spring 2010 Dr. Jianjun Hu CSCE883 Machine Learning.

Multivariate Methods Slides from Machine Learning by Ethem Alpaydin

INTRODUCTION TO Machine Learning 3rd Edition

Ch8: Nonparametric Methods

CH 5: Multivariate Methods

INTRODUCTION TO Machine Learning

INTRODUCTION TO Machine Learning

CSCE833 Machine Learning Lecture 9 Linear Discriminant Analysis

Multivariate Methods Berlin Chen

Multivariate Methods Berlin Chen, 2005 References:

Radial Basis Functions: Alternative to Back Propagation

Linear Discrimination

Test #1 Thursday September 20th

Presentation transcript:

INTRODUCTION TO Machine Learning ETHEM ALPAYDIN © The MIT Press, Lecture Slides for

CHAPTER 5: Multivariate Methods

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 3 Multivariate Data Multiple measurements (sensors) d inputs/features/attributes: d-variate N instances/observations/examples

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 4 Multivariate Parameters

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 5 Parameter Estimation

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 6 Estimation of Missing Values What to do if certain instances have missing attributes? Ignore those instances: not a good idea if the sample is small Use ‘missing’ as an attribute: may give information Imputation: Fill in the missing value  Mean imputation: Use the most likely value (e.g., mean)  Imputation by regression: Predict based on other attributes

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 7 Multivariate Normal Distribution

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 8 Multivariate Normal Distribution Mahalanobis distance: ( x – μ ) T ∑ –1 ( x – μ ) measures the distance from x to μ in terms of ∑ (normalizes for difference in variances and correlations) Bivariate: d = 2

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 9 Bivariate Normal

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 10

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 11 Independent Inputs: Naive Bayes If x i are independent, offdiagonals of ∑ are 0, Mahalanobis distance reduces to weighted (by 1/ σ i ) Euclidean distance: If variances are also equal, reduces to Euclidean distance

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 12 Parametric Classification If p (x | C i ) ~ N ( μ i, ∑ i ) Discriminant functions are

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 13 Estimation of Parameters

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 14 Different S i Quadratic discriminant

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 15 likelihoods posterior for C 1 discriminant: P (C 1 |x ) = 0.5

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 16 Common Covariance Matrix S Shared common sample covariance S Discriminant reduces to which is a linear discriminant

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 17 Common Covariance Matrix S

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 18 Diagonal S When x j j = 1,..d, are independent, ∑ is diagonal p (x|C i ) = ∏ j p (x j |C i )(Naive Bayes’ assumption) Classify based on weighted Euclidean distance (in s j units) to the nearest mean

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 19 Diagonal S variances may be different

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 20 Diagonal S, equal variances Nearest mean classifier: Classify based on Euclidean distance to the nearest mean Each mean can be considered a prototype or template and this is template matching

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 21 Diagonal S, equal variances * ?

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 22 Model Selection As we increase complexity (less restricted S), bias decreases and variance increases Assume simple models (allow some bias) to control variance (regularization) AssumptionCovariance matrixNo of parameters Shared, HypersphericSi=S=s2ISi=S=s2I1 Shared, Axis-alignedS i =S, with s ij =0d Shared, HyperellipsoidalSi=SSi=Sd(d+1)/2 Different, Hyperellipsoidal SiSi K d(d+1)/2

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 23 Discrete Features Binary features: if x j are independent (Naive Bayes’) the discriminant is linear Estimated parameters

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 24 Discrete Features Multinomial (1-of-n j ) features: x j  {v 1, v 2,..., v n j } if x j are independent

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 25 Multivariate Regression Multivariate linear model Multivariate polynomial model: Define new higher-order variables z 1 =x 1, z 2 =x 2, z 3 =x 1 2, z 4 =x 2 2, z 5 =x 1 x 2 and use the linear model in this new z space (basis functions, kernel trick, SVM: Chapter 10)