INTRODUCTION TO Machine Learning
Lecture Slides by ETHEM ALPAYDIN © The MIT Press, 2004
alpaydin@boun.edu.tr
http://www.cmpe.boun.edu.tr/~ethem/i2ml

CHAPTER 2: Supervised Learning

Learning a Class from Examples
Class C of a "family car"
Prediction: Is car x a family car?
Knowledge extraction: What do people expect from a family car?
Output: Positive (+) and negative (–) examples
Input representation: x1: price, x2: engine power

Training set X
X = {x^t, r^t}, t = 1, ..., N, where r^t = 1 if x^t is a positive example and r^t = 0 if it is negative.
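
A minimal sketch (not part of the original slides) of how such a training set might be stored; the feature values are made up:

    import numpy as np

    # Each row is one example x^t = (price, engine power); r^t = 1 marks a
    # positive (family car) example, r^t = 0 a negative one.
    X = np.array([[16000, 110],    # positive
                  [18500, 130],    # positive
                  [ 9000,  70],    # negative
                  [32000, 220]])   # negative
    r = np.array([1, 1, 0, 0])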

Class C
C is an axis-aligned rectangle in the (price, engine power) space:
(p1 ≤ price ≤ p2) AND (e1 ≤ engine power ≤ e2)
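
As a sketch, class membership can be written as a simple predicate; the boundary values p1, p2, e1, e2 below are hypothetical:

    # Class C: (p1 <= price <= p2) AND (e1 <= engine_power <= e2).
    def in_class_C(x, p1=15000, p2=25000, e1=100, e2=180):
        price, power = x
        return (p1 <= price <= p2) and (e1 <= power <= e2)

    print(in_class_C((16000, 110)))  # True
    print(in_class_C((9000, 70)))    # False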

Hypothesis class H
Error of h on the training set X:
E(h|X) = Σ_{t=1..N} 1(h(x^t) ≠ r^t)
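
A sketch of this empirical error count, reusing X, r, and in_class_C from the snippets above:

    def empirical_error(h, X, r):
        # E(h|X): number of training examples on which h disagrees with the label.
        return sum(int(h(x) != y) for x, y in zip(X, r))

    print(empirical_error(in_class_C, X, r))  # 0 if h is consistent with X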

S, G, and the Version Space
most specific hypothesis, S
most general hypothesis, G
Any h ∈ H between S and G is consistent with the training set, and together these hypotheses make up the version space (Mitchell, 1997).
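
For axis-aligned rectangles, S is the tightest rectangle around the positive examples; a minimal sketch (G, the largest rectangle that still excludes all negatives, is analogous but not computed here):

    def most_specific_S(X, r):
        # S: bounding box of the positive examples (lower and upper corners).
        pos = X[r == 1]
        return pos.min(axis=0), pos.max(axis=0)

    lo, hi = most_specific_S(X, r)  # reuses X, r from the earlier sketch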

VC Dimension
N points can be labeled in 2^N ways as +/–.
H shatters N points if, for every such labeling, there exists an h ∈ H consistent with it; VC(H) is the largest such N.
An axis-aligned rectangle can shatter at most 4 points, so VC(H) = 4 for this hypothesis class.
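
A brute-force sketch of the shattering test for axis-aligned rectangles: a labeling is realizable exactly when the bounding box of its positive points contains no negative point.

    from itertools import product
    import numpy as np

    def rect_shatters(points):
        pts = np.asarray(points, dtype=float)
        for labels in product([0, 1], repeat=len(pts)):
            lab = np.array(labels)
            pos, neg = pts[lab == 1], pts[lab == 0]
            if len(pos) == 0:
                continue  # the empty rectangle realizes the all-negative labeling
            lo, hi = pos.min(axis=0), pos.max(axis=0)
            if any(np.all((p >= lo) & (p <= hi)) for p in neg):
                return False  # this labeling cannot be realized
        return True

    print(rect_shatters([(0, 1), (1, 0), (2, 1), (1, 2)]))          # True: 4 points
    print(rect_shatters([(0, 0), (1, 0), (2, 0), (0, 1), (2, 1)]))  # False: 5 points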

Probably Approximately Correct (PAC) Learning
How many training examples N should we have, such that with probability at least 1 − δ, h has error at most ε? (Blumer et al., 1989)
Cover the region of error with 4 strips, each of probability mass at most ε/4.
Pr that one random instance misses a given strip: 1 − ε/4
Pr that all N instances miss that strip: (1 − ε/4)^N
Pr that all N instances miss any of the 4 strips: at most 4(1 − ε/4)^N
Requiring 4(1 − ε/4)^N ≤ δ and using (1 − x) ≤ exp(−x):
4 exp(−εN/4) ≤ δ, hence N ≥ (4/ε) log(4/δ)
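
A one-function sketch of the resulting sample-size bound:

    import math

    def pac_sample_size(eps, delta):
        # N >= (4/eps) * ln(4/delta): with probability >= 1 - delta,
        # the tightest rectangle has error at most eps.
        return math.ceil((4.0 / eps) * math.log(4.0 / delta))

    print(pac_sample_size(0.1, 0.05))  # 176 examples suffice for eps=0.1, delta=0.05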

Noise and Model Complexity
When a simpler and a more complex model fit the data comparably well, use the simpler one because it is:
simpler to use (lower computational complexity),
easier to train (lower space complexity),
easier to explain (more interpretable),
and it generalizes better (lower variance; Occam's razor).

Multiple Classes, C_i, i = 1, ..., K
X = {x^t, r^t}, t = 1, ..., N, where r_i^t = 1 if x^t ∈ C_i and 0 otherwise.
Train K hypotheses h_i(x), i = 1, ..., K, where h_i(x) = 1 if x is predicted to be in C_i and 0 otherwise.
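
A sketch of this one-per-class (one-vs-all) training loop; `fit` stands for any binary learner supplied by the caller and is hypothetical here:

    import numpy as np

    def train_one_vs_all(X, labels, K, fit):
        # h_i is trained with targets r_i^t = 1 if x^t is in class i, else 0.
        return [fit(X, (labels == i).astype(int)) for i in range(K)]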

Regression
X = {x^t, r^t}, t = 1, ..., N, with r^t ∈ ℝ; we assume r^t = f(x^t) + ε, where f is the unknown function and ε is random noise.
For a linear model g(x) = w1 x + w0, the empirical error is E(g|X) = (1/N) Σ_t [r^t − g(x^t)]².
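
A minimal least-squares sketch with made-up data:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0])
    r = np.array([1.2, 1.9, 3.2, 3.8])

    w1, w0 = np.polyfit(x, r, deg=1)        # g(x) = w1*x + w0 by least squares
    mse = np.mean((r - (w1 * x + w0))**2)   # empirical error E(g|X)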

Model Selection & Generalization
Learning is an ill-posed problem; the data alone are not sufficient to find a unique solution.
Hence the need for an inductive bias: assumptions about H.
Generalization: how well a model performs on new data.
Overfitting: H is more complex than C or f.
Underfitting: H is less complex than C or f.

Triple Trade-Off
There is a trade-off between three factors (Dietterich, 2003):
complexity of the hypothesis class, c(H);
training set size, N;
generalization error, E, on new data.
As N increases, E decreases.
As c(H) increases, E first decreases and then increases (see the sketch below).
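
A toy demonstration (not from the slides) of the second trade-off, fitting polynomials of growing degree to noisy data and measuring error on fresh points:

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(0, 1, 30)
    r = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, x.size)  # noisy training data
    x_new = np.linspace(0, 1, 200)
    r_new = np.sin(2 * np.pi * x_new)                       # new data from f

    for d in (1, 3, 9):                                     # growing c(H)
        w = np.polyfit(x, r, deg=d)
        e_trn = np.mean((r - np.polyval(w, x))**2)
        e_new = np.mean((r_new - np.polyval(w, x_new))**2)
        print(f"degree {d}: train {e_trn:.3f}, new data {e_new:.3f}")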

Cross-Validation
To estimate generalization error, we need data unseen during training. We split the data into:
training set (50%),
validation set (25%),
test (publication) set (25%).
Use resampling (e.g., k-fold cross-validation) when data are scarce.
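
A sketch of such a split; the proportions follow the slide, the shuffling seed is arbitrary:

    import numpy as np

    def split_50_25_25(X, r, seed=0):
        # Shuffle, then cut into 50% training, 25% validation, 25% test.
        idx = np.random.default_rng(seed).permutation(len(X))
        n1, n2 = len(X) // 2, (3 * len(X)) // 4
        trn, val, tst = idx[:n1], idx[n1:n2], idx[n2:]
        return (X[trn], r[trn]), (X[val], r[val]), (X[tst], r[tst])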

Dimensions of a Supervised Learner
1. Model: g(x|θ)
2. Loss function: E(θ|X) = Σ_t L(r^t, g(x^t|θ))
3. Optimization procedure: θ* = arg min_θ E(θ|X)
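
A sketch mapping the three dimensions to code, with a line as the model, squared loss, and a crude grid search standing in for the optimization procedure (all data and grid ranges are made up):

    import numpy as np

    def g(x, theta):                       # model g(x|theta), a line
        w1, w0 = theta
        return w1 * x + w0

    def E(theta, x, r):                    # loss E(theta|X), squared error
        return np.sum((r - g(x, theta))**2)

    x = np.array([0.0, 1.0, 2.0])
    r = np.array([0.1, 1.1, 1.9])
    grid = [(w1, w0) for w1 in np.linspace(-2, 2, 41)
                     for w0 in np.linspace(-2, 2, 41)]
    theta_star = min(grid, key=lambda th: E(th, x, r))  # theta* = argmin E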