
ECSE 6610 Pattern Recognition Professor Qiang Ji Spring, 2011

Pattern Recognition Overview

- Feature extraction: extract the most discriminative features to concisely represent the original data, typically involving dimensionality reduction.
- Training/learning: learn a mapping function that maps input features to output values.
- Classification/regression: map the input to a discrete output value for classification, or to a continuous output value for regression.

[Diagram: two pipelines. Training: raw data → feature extraction → features → training → learned classifier/regressor. Testing: raw data → feature extraction → features → learned classifier/regressor → output values.]

Pattern Recognition Overview (cont'd)

- Supervised learning: both the input (features) and the output (class labels) are provided.
- Unsupervised learning: only the input is given. Examples: clustering, dimensionality reduction, density estimation.
- Semi-supervised learning: some inputs have output labels and the others do not.

Examples of Pattern Recognition Applications

- Computer/machine vision: object recognition, activity recognition, image segmentation, inspection
- Medical imaging: cell classification
- Optical character recognition: machine- or hand-written character/digit recognition
- Brain-computer interfaces: classifying human brain states from EEG signals
- Speech recognition: speaker recognition, speech understanding, language translation
- Robotics: obstacle detection, scene understanding, navigation

Computer Vision Example: Facial Expression Recognition

Machine Vision Example

Example: Handwritten Digit Recognition

Probability Calculus

P(X ∨ Y) = P(X) + P(Y) − P(X ∧ Y)

U is the sample space; an event X is a subset of the outcomes, i.e., X ⊆ U. If X and Y are mutually exclusive, then P(X ∧ Y) = 0 and the rule reduces to P(X ∨ Y) = P(X) + P(Y).
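A minimal sketch checking the inclusion-exclusion rule on a finite, equally likely sample space (a die roll); the particular events are illustrative choices, not from the slides:

```python
# Check P(X or Y) = P(X) + P(Y) - P(X and Y) on a uniform sample space.
U = {1, 2, 3, 4, 5, 6}            # sample space: one die roll
X = {1, 2, 3}                      # event: roll <= 3
Y = {2, 4, 6}                      # event: roll is even

P = lambda A: len(A) / len(U)      # uniform probability measure

lhs = P(X | Y)                     # P(X or Y); "|" is set union here
rhs = P(X) + P(Y) - P(X & Y)       # inclusion-exclusion; "&" is intersection
assert abs(lhs - rhs) < 1e-12
print(lhs, rhs)                    # 0.8333... 0.8333...

# For mutually exclusive events the intersection term vanishes:
Z = {5, 6}                         # disjoint from X
assert P(X | Z) == P(X) + P(Z)
```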

Probability Calculus (cont'd)

Conditional independence: X and Y are conditionally independent given Z if P(X, Y | Z) = P(X | Z) P(Y | Z).

The chain rule: given three events A, B, C,
P(A, B, C) = P(A) P(B | A) P(C | A, B)

The Rules of Probability

Sum rule: p(X) = Σ_Y p(X, Y)
Product rule: p(X, Y) = p(Y | X) p(X)
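A quick numerical sketch of both rules on a small discrete joint table; the probability values are made up, chosen only to sum to one:

```python
import numpy as np

# Joint distribution p(X, Y) over X in {0,1,2}, Y in {0,1} (illustrative).
p_xy = np.array([[0.10, 0.20],
                 [0.25, 0.15],
                 [0.05, 0.25]])
assert np.isclose(p_xy.sum(), 1.0)

# Sum rule: marginalize Y out to get p(X).
p_x = p_xy.sum(axis=1)

# Product rule: p(X, Y) = p(Y | X) p(X); rebuilding the joint recovers it.
p_y_given_x = p_xy / p_x[:, None]
reconstructed = p_y_given_x * p_x[:, None]
assert np.allclose(reconstructed, p_xy)
print(p_x)  # [0.3 0.4 0.3]
```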

Bayes' Theorem

p(Y | X) = p(X | Y) p(Y) / p(X), where p(X) = Σ_Y p(X | Y) p(Y)

posterior ∝ likelihood × prior


Bayes Rule

Based on the definition of conditional probability:

p(A | B) = p(A, B) / p(B) = p(B | A) p(A) / p(B)

For a partition {A_1, …, A_n} of the sample space and evidence E:

p(A_i | E) = p(E | A_i) p(A_i) / p(E) = p(E | A_i) p(A_i) / Σ_j p(E | A_j) p(A_j)

- p(A_i | E) is the posterior probability of A_i given evidence E
- p(A_i) is the prior probability
- p(E | A_i) is the likelihood of the evidence given A_i
- p(E) is the probability of the evidence

[Diagram: sample space partitioned into regions A1–A6, with the evidence event E overlapping several of them.]

Bayes Rule (cont'd)

For a hypothesis H and two pieces of evidence E_1 and E_2, assume E_1 and E_2 are independent given H. The above equation may then be written as

p(H | E_1, E_2) = p(E_2 | H) p(H | E_1) / p(E_2 | E_1)

where p(H | E_1) plays the role of the prior and p(E_2 | H) is the likelihood of H given E_2.

A Simple Example

Consider two related variables:
1. Drug (D) with values y or n
2. Test (T) with values +ve or −ve

And suppose we have the following probabilities:
P(D = y) =
P(T = +ve | D = y) = 0.8
P(T = +ve | D = n) = 0.01

These probabilities are sufficient to define a joint probability distribution. Suppose an athlete tests positive. What is the probability that he has taken the drug? By Bayes rule,

P(D = y | T = +ve) = P(T = +ve | D = y) P(D = y) / [P(T = +ve | D = y) P(D = y) + P(T = +ve | D = n) P(D = n)]
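A numerical version of this example. Note the prior P(D = y) is missing from the transcript, so the value 0.02 below is a made-up stand-in; the two test probabilities are the ones given on the slide:

```python
# Posterior P(D = y | T = +ve) via Bayes rule.
p_d = 0.02                 # ASSUMED prior P(D = y); not from the slide
p_pos_given_d = 0.8        # P(T = +ve | D = y), from the slide
p_pos_given_nd = 0.01      # P(T = +ve | D = n), from the slide

evidence = p_pos_given_d * p_d + p_pos_given_nd * (1 - p_d)
posterior = p_pos_given_d * p_d / evidence
print(posterior)           # ~0.62 with the assumed prior
```

Even with an 80%-sensitive test, the posterior stays well below 1 because the assumed prior is small; the answer is quite sensitive to whichever prior the original slide used.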

Expectation (or Mean)

For a discrete RV X: E[X] = Σ_x x p(x)
For a continuous RV X: E[X] = ∫ x p(x) dx
Conditional expectation: E[X | Y = y] = Σ_x x p(x | y)

Expectations

E[f] = Σ_x p(x) f(x) (discrete), E[f] = ∫ p(x) f(x) dx (continuous)
Conditional expectation (discrete): E_x[f | y] = Σ_x p(x | y) f(x)
Approximate expectation (discrete and continuous): E[f] ≈ (1/N) Σ_{n=1}^N f(x_n), with the x_n drawn from p(x)
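A short sketch of the approximate-expectation formula in action; the choice of p(x) as a standard normal and f(x) = x² is arbitrary, picked because the exact answer E[x²] = 1 is known:

```python
import numpy as np

# Monte Carlo estimate: E[f] ~ (1/N) * sum_n f(x_n), x_n ~ p(x).
rng = np.random.default_rng(0)
N = 100_000
x = rng.standard_normal(N)         # samples from p(x) = N(0, 1)
estimate = np.mean(x ** 2)         # f(x) = x^2
print(estimate)                    # close to the true value E[x^2] = 1
```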

Variance

The variance of a RV X: Var(X) = E[(X − E[X])²] = E[X²] − (E[X])²
Standard deviation: σ = √Var(X)
Covariance of RVs X and Y: Cov(X, Y) = E[(X − E[X])(Y − E[Y])]
Chebyshev inequality: P(|X − E[X]| ≥ kσ) ≤ 1/k²
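An empirical check of the Chebyshev inequality; the exponential distribution below is an arbitrary choice (any distribution with finite variance works):

```python
import numpy as np

# Compare the empirical tail P(|X - E[X]| >= k*sigma) to the 1/k^2 bound.
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=200_000)   # mean 1, std 1
mu, sigma = x.mean(), x.std()

for k in (2, 3):
    tail = np.mean(np.abs(x - mu) >= k * sigma)
    print(k, tail, 1 / k**2)   # empirical tail sits below the bound
```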

Variances and Covariances

var[f] = E[(f(x) − E[f(x)])²] = E[f(x)²] − E[f(x)]²
For random vectors x and y: cov[x, y] = E_{x,y}[x yᵀ] − E[x] E[y]ᵀ, with cov[x] ≡ cov[x, x]

Independence

If X and Y are independent, then:
p(X, Y) = p(X) p(Y)
E[XY] = E[X] E[Y]
Cov(X, Y) = 0

Probability Densities

p(x) is the density function, while P(x) is the cumulative distribution:
P(x) = ∫_{−∞}^{x} p(t) dt, so p(x) ≥ 0 and P(x) is a non-decreasing function.
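A small numerical sketch of the pdf/cdf relationship, using the standard normal as a concrete example (the distribution is an illustrative choice):

```python
import numpy as np
from scipy.stats import norm

# Numerically integrating the density p(x) should reproduce the
# cumulative distribution P(x).
xs = np.linspace(-5, 5, 2001)
pdf = norm.pdf(xs)
dx = xs[1] - xs[0]
cdf_numeric = np.cumsum(pdf) * dx              # crude Riemann sum
print(np.max(np.abs(cdf_numeric - norm.cdf(xs))))  # small discretization error
```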

Transformed Densities

Under a change of variables x = g(y), a density does not simply compose; it picks up a Jacobian factor:
p_y(y) = p_x(g(y)) |g′(y)|
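A sampling check of the change-of-variables formula, assuming the concrete (illustrative) transformation y = exp(x) with x standard normal, which makes y log-normal:

```python
import numpy as np
from scipy.stats import norm

# Histogram of transformed samples vs. the analytic transformed density
# p_y(y) = p_x(log y) * |d(log y)/dy| = norm.pdf(log y) / y.
rng = np.random.default_rng(0)
y = np.exp(rng.standard_normal(500_000))

hist, edges = np.histogram(y, bins=100, range=(0.1, 5.0), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
analytic = norm.pdf(np.log(centers)) / centers
print(np.max(np.abs(hist - analytic)))   # small sampling/binning error
```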

The Gaussian Distribution

N(x | μ, σ²) = (1 / √(2πσ²)) exp(−(x − μ)² / (2σ²))

Gaussian Mean and Variance

E[x] = μ, E[x²] = μ² + σ², var[x] = E[x²] − E[x]² = σ²
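A one-look sanity check that μ and σ² really are the mean and variance; the parameter values are arbitrary:

```python
import numpy as np

# Sample moments of N(mu, sigma^2) converge to mu and sigma^2.
rng = np.random.default_rng(0)
mu, sigma = 2.0, 0.5
x = rng.normal(mu, sigma, size=1_000_000)
print(x.mean(), x.var())   # ~2.0 and ~0.25
```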

The Multivariate Gaussian

N(x | μ, Σ) = (1 / ((2π)^{D/2} |Σ|^{1/2})) exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ))

where μ is the mean vector and Σ is the covariance matrix.
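A direct evaluation of this density, checked against scipy; μ, Σ, and the query point are arbitrary illustrative values:

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
x = np.array([0.5, 0.0])

D = len(mu)
diff = x - mu
mahal = diff @ np.linalg.solve(Sigma, diff)     # (x-mu)^T Sigma^-1 (x-mu)
norm_const = (2 * np.pi) ** (-D / 2) * np.linalg.det(Sigma) ** (-0.5)
density = norm_const * np.exp(-0.5 * mahal)

assert np.isclose(density, multivariate_normal(mu, Sigma).pdf(x))
print(density)
```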

Minimum Misclassification Rate

Two types of mistakes:
- False positive (type 1)
- False negative (type 2)

The total probability of these mistakes is called the Bayes error. The minimum Bayes error is achieved by placing the decision boundary at x_0, the point where the two class densities p(x, C_1) and p(x, C_2) cross.
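A sketch of finding x_0 for two 1-D Gaussian classes; the means, variances, and priors below are illustrative, not from the slides:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

# Minimum-error boundary: solve p(x, C1) = p(x, C2).
prior1, prior2 = 0.5, 0.5
p1 = lambda x: prior1 * norm.pdf(x, loc=0.0, scale=1.0)   # p(x, C1)
p2 = lambda x: prior2 * norm.pdf(x, loc=2.0, scale=1.0)   # p(x, C2)

x0 = brentq(lambda x: p1(x) - p2(x), -5, 5)
print(x0)   # 1.0: with equal priors and variances, the midpoint of the means
```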

Generative vs Discriminative

Generative approach: model the class-conditional densities p(x | C_k) and priors p(C_k), then use Bayes' theorem to obtain the posterior p(C_k | x).

Discriminative approach: model the posterior p(C_k | x) directly.
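A minimal side-by-side sketch of the two approaches on the same synthetic data, using scikit-learn's Gaussian naive Bayes (generative) and logistic regression (discriminative) as stand-ins; the data and models are illustrative, not the course's:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

# Two Gaussian blobs as a toy two-class problem.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

# GaussianNB fits p(x | C_k) and p(C_k), then applies Bayes' theorem;
# LogisticRegression fits the posterior p(C_k | x) directly.
for model in (GaussianNB(), LogisticRegression()):
    model.fit(X, y)
    print(type(model).__name__, model.score(X, y))
```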