Ch 1. Introduction
Pattern Recognition and Machine Learning, C. M. Bishop, 2006
Summarized by J.W. Ha, Biointelligence Laboratory, Seoul National University

Contents
1.4 The Curse of Dimensionality
1.5 Decision Theory
1.6 Information Theory

1.4 The Curse of Dimensionality
The High-Dimensionality Problem
Example: classifying a mixture of oil, water, and gas in a pipeline
- 3 classes (homogeneous, annular, laminar)
- 12 input variables
- Scatter plot of x6 and x7
- Predict the class of a new point x
- A simple and naïve approach: divide the space into cells and assign x the majority class of the cell it falls in

1.4 The Curse of Dimensionality (Cont'd)
The Shortcomings of the Naïve Approach (illustrated in the sketch below)
- The number of cells grows exponentially with the input dimension D.
- A very large training set is needed so that the cells are not empty.
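A minimal Python sketch of this naïve cell-based classifier (my illustration, not from the slides; the grid size, random data, and function names are assumptions):

```python
# A naive cell-based classifier: with K bins per axis, a D-dimensional grid
# has K**D cells, so the training data needed grows exponentially with D.
import numpy as np

def grid_classify(X_train, y_train, x_query, bins=3):
    """Assign x_query the majority class of the training points in its cell."""
    lo, hi = X_train.min(axis=0), X_train.max(axis=0)
    to_cell = lambda X: np.floor((X - lo) / (hi - lo + 1e-12) * bins).astype(int)
    in_cell = np.all(to_cell(X_train) == to_cell(x_query), axis=1)
    if not in_cell.any():
        return None  # empty cell -- the curse of dimensionality in action
    labels, counts = np.unique(y_train[in_cell], return_counts=True)
    return labels[np.argmax(counts)]

rng = np.random.default_rng(0)
for D in (2, 7, 12):  # 12 inputs, as in the oil-flow example
    X, y = rng.random((500, D)), rng.integers(0, 3, size=500)
    print(D, 3 ** D, grid_classify(X, y, rng.random(D)))
```

With 500 training points the D = 12 query almost always lands in an empty cell (3^12 = 531,441 cells), which is exactly the failure the slide describes.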

1.4 The Curse of Dimensionality (Cont'd)
Polynomial Curve Fitting (order M)
- As the input dimension D increases, the number of coefficients grows proportionally to D^M, i.e. polynomially rather than exponentially.
The Volume of a High-Dimensional Sphere
- As D grows, the volume is concentrated in a thin shell near the surface.
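The supporting result from the text (PRML §1.4): the volume of a D-dimensional sphere of radius r scales as V_D(r) = K_D r^D, so the fraction of the volume lying between radius 1 - ε and 1 is
\[
\frac{V_D(1) - V_D(1-\epsilon)}{V_D(1)} = 1 - (1-\epsilon)^D \longrightarrow 1 \quad (D \to \infty)
\]
for any fixed ε > 0, which is why the volume concentrates in a thin surface shell.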

1.4 The Curse of Dimensionality (Cont'd)
Gaussian Distribution
- [Figure: probability density of a Gaussian as a function of radius r for increasing D; the mass concentrates in a thin shell whose radius grows with D.]
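Reconstructing the point behind this slide (PRML §1.4): writing an isotropic Gaussian in terms of the radius r = ||x|| gives a density of probability mass
\[
p(r) \propto r^{D-1} \exp\!\left(-\frac{r^{2}}{2\sigma^{2}}\right),
\]
which for large D peaks sharply near r ≈ σ√D, so almost all of the probability mass sits in a thin shell at that radius.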

1.5 Decision Theory
Making Optimal Decisions
- Inference step & decision step
- Select the class with the higher posterior probability
Minimizing the Misclassification Rate
- MAP → minimizing the colored (class-overlap) area in the figure
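For two classes, the misclassification probability being minimized is (PRML §1.5.1)
\[
p(\text{mistake}) = \int_{\mathcal{R}_1} p(\mathbf{x}, \mathcal{C}_2)\,d\mathbf{x} + \int_{\mathcal{R}_2} p(\mathbf{x}, \mathcal{C}_1)\,d\mathbf{x},
\]
which is smallest when each x is assigned to the class with the larger posterior p(C_k|x), i.e. the MAP rule.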

1.5 Decision Theory (Cont'd)
Minimizing the Expected Loss
- The damage done by a misclassification differs from class to class.
- Introduce a loss function (cost function).
- MAP → minimizing the expected loss
The Reject Option
- Threshold θ
- Reject if the largest posterior probability falls below θ (see the sketch below).
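A minimal sketch of both ideas (my own illustration; the loss matrix and the threshold θ = 0.8 are invented values):

```python
# Choose the decision j that minimizes the expected loss
# sum_k L[k, j] * p(C_k | x), and reject when the most probable class is
# still too uncertain (max posterior below theta).
import numpy as np

# L[k, j] = loss incurred for deciding class j when the truth is class k
L = np.array([[0.0, 1000.0],    # e.g. missing class 0 (disease) is costly
              [1.0,    0.0]])

def decide(posterior, theta=0.8):
    if posterior.max() < theta:
        return "reject"              # defer the decision (e.g. to an expert)
    expected_loss = L.T @ posterior  # expected loss of each decision j
    return int(np.argmin(expected_loss))

print(decide(np.array([0.30, 0.70])))  # 'reject': max posterior 0.7 < 0.8
print(decide(np.array([0.05, 0.95])))  # 0: asymmetric loss overrides MAP
```

The second call shows why the posterior alone is not enough: MAP would pick class 1, but the asymmetric loss makes deciding class 0 the lower-risk choice.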

1.5 Decision Theory (Cont'd)
Inference and Decision
- Three distinct approaches:
1. Generative models: obtain the posterior probability from class-conditional densities and priors
2. Discriminative models: obtain the posterior probability by modeling it directly
3. Find a discriminant function that maps x straight to a class label
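In the first (generative) approach the posterior is obtained via Bayes' theorem (PRML §1.5.4):
\[
p(\mathcal{C}_k \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid \mathcal{C}_k)\,p(\mathcal{C}_k)}{p(\mathbf{x})}, \qquad p(\mathbf{x}) = \sum_k p(\mathbf{x} \mid \mathcal{C}_k)\,p(\mathcal{C}_k).
\]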

1.5 Decision Theory (Cont'd)
Reasons to Compute the Posterior
1. Minimizing risk
2. The reject option
3. Compensating for class priors
4. Combining models
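To make item 3 concrete (PRML §1.5.4): a posterior p̃(C_k|x) estimated from an artificially class-balanced training set can be corrected for the true deployment priors p(C_k) by
\[
p(\mathcal{C}_k \mid \mathbf{x}) \;\propto\; \tilde{p}(\mathcal{C}_k \mid \mathbf{x})\,\frac{p(\mathcal{C}_k)}{\tilde{p}(\mathcal{C}_k)},
\]
followed by renormalization over k, where tildes denote quantities under the balanced training distribution.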

1.5 Decision Theory (Cont'd)
Loss Function for Regression
- Multiple target variable vector t
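The expected squared loss being minimized here is (PRML §1.5.5)
\[
\mathbb{E}[L] = \iint \{y(\mathbf{x}) - t\}^2\, p(\mathbf{x}, t)\, d\mathbf{x}\, dt,
\]
whose minimizer is the conditional mean y(x) = E_t[t | x]; the same result holds componentwise for a vector of target variables t.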

1.5 Decision Theory (Cont'd)
Minkowski Loss
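The Minkowski loss generalizes the squared loss (PRML §1.5.5):
\[
\mathbb{E}[L_q] = \iint |y(\mathbf{x}) - t|^{q}\, p(\mathbf{x}, t)\, d\mathbf{x}\, dt.
\]
Its minimizer is the conditional mean for q = 2, the conditional median for q = 1, and the conditional mode as q → 0.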

1.6 Information Theory
Entropy
- The noiseless coding theorem states that the entropy is a lower bound on the number of bits needed to transmit the state of a random variable.
- Higher entropy means larger uncertainty.
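The entropy of a discrete random variable is (PRML §1.6)
\[
H[x] = -\sum_{x} p(x) \log_2 p(x),
\]
which is maximized by the uniform distribution; for M equiprobable states H[x] = log_2 M bits.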

1.6 Information Theory (Cont'd)
Maximum Entropy Configuration for a Continuous Variable
- Constraints: normalization, a fixed mean, and a fixed variance
- Result: the distribution that maximizes the differential entropy is the Gaussian
Conditional Entropy
- H[x, y] = H[y|x] + H[x]
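Maximizing the differential entropy H[x] = -∫ p(x) ln p(x) dx under those three constraints gives (PRML §1.6)
\[
p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}, \qquad H[x] = \frac{1}{2}\left(1 + \ln(2\pi\sigma^2)\right).
\]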

1.6 Information Theory (Cont'd)
Relative Entropy (Kullback-Leibler Divergence)
Convex Functions and Jensen's Inequality
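Reconstructing the two definitions this slide refers to (PRML §1.6.1): the relative entropy between a true distribution p and an approximation q is
\[
\mathrm{KL}(p \,\|\, q) = -\int p(\mathbf{x}) \ln\!\left\{\frac{q(\mathbf{x})}{p(\mathbf{x})}\right\} d\mathbf{x},
\]
and Jensen's inequality, f(E[x]) ≤ E[f(x)] for convex f, yields KL(p‖q) ≥ 0 with equality iff p = q.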

1.6 Information Theory (Cont'd)
Mutual Information
- I[x, y] = H[x] - H[x|y] = H[y] - H[y|x]
- If x and y are independent, I[x, y] = 0.
- The reduction in uncertainty about x by virtue of being told the value of y (see the numerical check below).
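A minimal numerical check (my example; the 2×2 joint table is invented) that the two standard forms of the mutual information agree:

```python
# Mutual information of a discrete joint distribution, computed both as
# KL(p(x,y) || p(x)p(y)) and as H[x] - H[x|y]; the two forms must agree.
import numpy as np

p_xy = np.array([[0.30, 0.10],   # invented joint p(x, y); rows: x, cols: y
                 [0.05, 0.55]])
p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)

# I[x, y] as the KL divergence between the joint and the product of marginals
mi_kl = np.sum(p_xy * np.log2(p_xy / np.outer(p_x, p_y)))

# I[x, y] = H[x] - H[x|y]
H_x = -np.sum(p_x * np.log2(p_x))
H_x_given_y = -np.sum(p_xy * np.log2(p_xy / p_y))  # p(x|y) by broadcasting
print(mi_kl, H_x - H_x_given_y)  # equal up to floating-point rounding
```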