Ch 1. Introduction
Pattern Recognition and Machine Learning, C. M. Bishop, 2006
Summarized by J.W. Ha, Biointelligence Laboratory, Seoul National University

Contents
1.4 The Curse of Dimensionality
1.5 Decision Theory
1.6 Information Theory

1.4 The Curse of Dimensionality
The High-Dimensionality Problem
Example: classifying a mixture of oil, water, and gas in a pipeline
- 3 classes (homogeneous, annular, laminar)
- 12 input variables
- Scatter plot of x6 and x7
- Predict the class of a new point x
- A simple and naïve approach: divide the space into cells and assign x the majority class of the cell it falls in

1.4 The Curse of Dimensionality (Cont'd)
The Shortcomings of the Naïve Approach (illustrated in the sketch below)
- The number of cells grows exponentially with the input dimension D.
- A very large training set is needed so that the cells are not empty.
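A minimal Python sketch of this naïve cell-based classifier (my illustration, not from the slides; the grid size, random data, and function names are assumptions):

```python
# A naive cell-based classifier: with K bins per axis, a D-dimensional grid
# has K**D cells, so the training data needed grows exponentially with D.
import numpy as np

def grid_classify(X_train, y_train, x_query, bins=3):
    """Assign x_query the majority class of the training points in its cell."""
    lo, hi = X_train.min(axis=0), X_train.max(axis=0)
    to_cell = lambda X: np.floor((X - lo) / (hi - lo + 1e-12) * bins).astype(int)
    in_cell = np.all(to_cell(X_train) == to_cell(x_query), axis=1)
    if not in_cell.any():
        return None  # empty cell -- the curse of dimensionality in action
    labels, counts = np.unique(y_train[in_cell], return_counts=True)
    return labels[np.argmax(counts)]

rng = np.random.default_rng(0)
for D in (2, 7, 12):  # 12 inputs, as in the oil-flow example
    X, y = rng.random((500, D)), rng.integers(0, 3, size=500)
    print(D, 3 ** D, grid_classify(X, y, rng.random(D)))
```

With 500 training points the D = 12 query almost always lands in an empty cell (3^12 = 531,441 cells), which is exactly the failure the slide describes.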

1.4 The Curse of Dimensionality (Cont'd)
Polynomial Curve Fitting (order M)
- As the input dimension D increases, the number of coefficients grows proportionally to D^M, i.e. polynomially rather than exponentially.
The Volume of a High-Dimensional Sphere
- As D grows, the volume is concentrated in a thin shell near the surface.
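The supporting result from the text (PRML §1.4): the volume of a D-dimensional sphere of radius r scales as V_D(r) = K_D r^D, so the fraction of the volume lying between radius 1 - ε and 1 is
\[
\frac{V_D(1) - V_D(1-\epsilon)}{V_D(1)} = 1 - (1-\epsilon)^D \longrightarrow 1 \quad (D \to \infty)
\]
for any fixed ε > 0, which is why the volume concentrates in a thin surface shell.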

1.4 The Curse of Dimensionality (Cont'd)
Gaussian Distribution
- [Figure: probability density of a Gaussian as a function of radius r for increasing D; the mass concentrates in a thin shell whose radius grows with D.]
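Reconstructing the point behind this slide (PRML §1.4): writing an isotropic Gaussian in terms of the radius r = ||x|| gives a density of probability mass
\[
p(r) \propto r^{D-1} \exp\!\left(-\frac{r^{2}}{2\sigma^{2}}\right),
\]
which for large D peaks sharply near r ≈ σ√D, so almost all of the probability mass sits in a thin shell at that radius.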

1.5 Decision Theory
Making Optimal Decisions
- Inference step & decision step
- Select the class with the higher posterior probability
Minimizing the Misclassification Rate
- MAP → minimizing the colored (class-overlap) area in the figure
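For two classes, the misclassification probability being minimized is (PRML §1.5.1)
\[
p(\text{mistake}) = \int_{\mathcal{R}_1} p(\mathbf{x}, \mathcal{C}_2)\,d\mathbf{x} + \int_{\mathcal{R}_2} p(\mathbf{x}, \mathcal{C}_1)\,d\mathbf{x},
\]
which is smallest when each x is assigned to the class with the larger posterior p(C_k|x), i.e. the MAP rule.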

1.5 Decision Theory (Cont'd)
Minimizing the Expected Loss
- The damage done by a misclassification differs from class to class.
- Introduce a loss function (cost function).
- MAP → minimizing the expected loss
The Reject Option
- Threshold θ
- Reject if the largest posterior probability falls below θ (see the sketch below).
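A minimal sketch of both ideas (my own illustration; the loss matrix and the threshold θ = 0.8 are invented values):

```python
# Choose the decision j that minimizes the expected loss
# sum_k L[k, j] * p(C_k | x), and reject when the most probable class is
# still too uncertain (max posterior below theta).
import numpy as np

# L[k, j] = loss incurred for deciding class j when the truth is class k
L = np.array([[0.0, 1000.0],    # e.g. missing class 0 (disease) is costly
              [1.0,    0.0]])

def decide(posterior, theta=0.8):
    if posterior.max() < theta:
        return "reject"              # defer the decision (e.g. to an expert)
    expected_loss = L.T @ posterior  # expected loss of each decision j
    return int(np.argmin(expected_loss))

print(decide(np.array([0.30, 0.70])))  # 'reject': max posterior 0.7 < 0.8
print(decide(np.array([0.05, 0.95])))  # 0: asymmetric loss overrides MAP
```

The second call shows why the posterior alone is not enough: MAP would pick class 1, but the asymmetric loss makes deciding class 0 the lower-risk choice.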

1.5 Decision Theory (Cont'd)
Inference and Decision
- Three distinct approaches:
1. Generative models: obtain the posterior probability from class-conditional densities and priors
2. Discriminative models: obtain the posterior probability by modeling it directly
3. Find a discriminant function that maps x straight to a class label
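In the first (generative) approach the posterior is obtained via Bayes' theorem (PRML §1.5.4):
\[
p(\mathcal{C}_k \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid \mathcal{C}_k)\,p(\mathcal{C}_k)}{p(\mathbf{x})}, \qquad p(\mathbf{x}) = \sum_k p(\mathbf{x} \mid \mathcal{C}_k)\,p(\mathcal{C}_k).
\]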

1.5 Decision Theory (Cont'd)
Reasons to Compute the Posterior
1. Minimizing risk
2. The reject option
3. Compensating for class priors
4. Combining models
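To make item 3 concrete (PRML §1.5.4): a posterior p̃(C_k|x) estimated from an artificially class-balanced training set can be corrected for the true deployment priors p(C_k) by
\[
p(\mathcal{C}_k \mid \mathbf{x}) \;\propto\; \tilde{p}(\mathcal{C}_k \mid \mathbf{x})\,\frac{p(\mathcal{C}_k)}{\tilde{p}(\mathcal{C}_k)},
\]
followed by renormalization over k, where tildes denote quantities under the balanced training distribution.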

1.5 Decision Theory (Cont'd)
Loss Function for Regression
- Multiple target variable vector t
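The expected squared loss being minimized here is (PRML §1.5.5)
\[
\mathbb{E}[L] = \iint \{y(\mathbf{x}) - t\}^2\, p(\mathbf{x}, t)\, d\mathbf{x}\, dt,
\]
whose minimizer is the conditional mean y(x) = E_t[t | x]; the same result holds componentwise for a vector of target variables t.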

1.5 Decision Theory (Cont'd)
Minkowski Loss
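The Minkowski loss generalizes the squared loss (PRML §1.5.5):
\[
\mathbb{E}[L_q] = \iint |y(\mathbf{x}) - t|^{q}\, p(\mathbf{x}, t)\, d\mathbf{x}\, dt.
\]
Its minimizer is the conditional mean for q = 2, the conditional median for q = 1, and the conditional mode as q → 0.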

1.6 Information Theory
Entropy
- The noiseless coding theorem states that the entropy is a lower bound on the number of bits needed to transmit the state of a random variable.
- Higher entropy means larger uncertainty.
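The entropy of a discrete random variable is (PRML §1.6)
\[
H[x] = -\sum_{x} p(x) \log_2 p(x),
\]
which is maximized by the uniform distribution; for M equiprobable states H[x] = log_2 M bits.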

1.6 Information Theory (Cont'd)
Maximum Entropy Configuration for a Continuous Variable
- Constraints: normalization, a fixed mean, and a fixed variance
- Result: the distribution that maximizes the differential entropy is the Gaussian
Conditional Entropy
- H[x, y] = H[y|x] + H[x]
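Maximizing the differential entropy H[x] = -∫ p(x) ln p(x) dx under those three constraints gives (PRML §1.6)
\[
p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}, \qquad H[x] = \frac{1}{2}\left(1 + \ln(2\pi\sigma^2)\right).
\]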

1.6 Information Theory (Cont'd)
Relative Entropy (Kullback-Leibler Divergence)
Convex Functions and Jensen's Inequality
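Reconstructing the two definitions this slide refers to (PRML §1.6.1): the relative entropy between a true distribution p and an approximation q is
\[
\mathrm{KL}(p \,\|\, q) = -\int p(\mathbf{x}) \ln\!\left\{\frac{q(\mathbf{x})}{p(\mathbf{x})}\right\} d\mathbf{x},
\]
and Jensen's inequality, f(E[x]) ≤ E[f(x)] for convex f, yields KL(p‖q) ≥ 0 with equality iff p = q.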

1.6 Information Theory (Cont'd)
Mutual Information
- I[x, y] = H[x] - H[x|y] = H[y] - H[y|x]
- If x and y are independent, I[x, y] = 0.
- The reduction in uncertainty about x by virtue of being told the value of y (see the numerical check below).
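A minimal numerical check (my example; the 2×2 joint table is invented) that the two standard forms of the mutual information agree:

```python
# Mutual information of a discrete joint distribution, computed both as
# KL(p(x,y) || p(x)p(y)) and as H[x] - H[x|y]; the two forms must agree.
import numpy as np

p_xy = np.array([[0.30, 0.10],   # invented joint p(x, y); rows: x, cols: y
                 [0.05, 0.55]])
p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)

# I[x, y] as the KL divergence between the joint and the product of marginals
mi_kl = np.sum(p_xy * np.log2(p_xy / np.outer(p_x, p_y)))

# I[x, y] = H[x] - H[x|y]
H_x = -np.sum(p_x * np.log2(p_x))
H_x_given_y = -np.sum(p_xy * np.log2(p_xy / p_y))  # p(x|y) by broadcasting
print(mi_kl, H_x - H_x_given_y)  # equal up to floating-point rounding
```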