Intro. ANN & Fuzzy Systems
Lecture 23: Clustering (4)
(C) 2001 by Yu Hen Hu

Outline
– Clustering as Density Estimation
– Non-parametric Density Estimate: Parzen Windows
– Mixture Density Estimate

Density Estimate
The probability density function p(x) describes the distribution of samples in the feature space. p(x) can be estimated using
– a non-parametric method: the Parzen window (kernel) method, or
– a parametric method: a mixture Gaussian density model.
Density estimation is related to clustering when each cluster is modeled with a single Gaussian density.

Probability Density
Let f(x) denote the probability density function of x over a feature space X, and let {x_1, ..., x_N} be i.i.d. samples drawn from X. Consider a region R around x. The probability P that a sample falls within R is P = ∫_R f(x') dx'. The number K of samples that fall within R has a binomial distribution with E(K) = N*P, so P can be estimated by K/N.
Let V be the volume enclosed by R; the probability density f(x) can then be estimated as
f(x) ≈ K / (N*V).
– V should be small enough that f(x) is approximately constant over R.
– But with N finite, K will approach 0 when V does.
Hence K should increase as N does, and V should decrease as N does. Three conditions guarantee convergence as N → ∞:
(1) V_N → 0, (2) K_N → ∞, (3) K_N / N → 0.
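To make the K/(N*V) estimate concrete, here is a minimal numeric sketch in Python (the data set and query point are made up for illustration, not taken from the lecture): count how many samples fall inside a small hypercube around a query point and divide by N times its volume.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((1000, 2))    # N = 1000 samples, d = 2 (assumed data)
x0 = np.zeros(2)                      # query point
h = 0.5                               # side length of the hypercube around x0
V = h ** x.shape[1]                   # volume V = h^d
K = np.sum(np.all(np.abs(x - x0) <= h / 2, axis=1))   # samples inside the cube
p_hat = K / (len(x) * V)              # density estimate K / (N * V)
print(p_hat)                          # close to the true N(0, I) density at 0, about 0.16
```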

Parzen Windows
A non-parametric method using interpolation (window) functions. Let φ(u) be a window function, and let {x(k); 1 ≤ k ≤ N} be drawn from the unknown density p(x). Set V_N = h_N^d, where h_N is a smoothing parameter and d is the dimension of x. The estimate of p(x) is then
p_N(x) = (1/N) Σ_{k=1}^N (1/V_N) φ( (x - x(k)) / h_N ).
Conditions on the window function: φ(u) ≥ 0 and ∫ φ(u) du = 1.
Example of a window function: see kerneldemo.m.
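Below is a minimal sketch of the Parzen-window estimate using a Gaussian window (the contents of kerneldemo.m are not reproduced here; the function name and data are illustrative only):

```python
import numpy as np

def parzen_estimate(x_query, samples, h):
    """Parzen-window estimate of p(x_query) using a Gaussian window of width h."""
    N, d = samples.shape
    u = (x_query - samples) / h                          # scaled offsets, shape (N, d)
    phi = np.exp(-0.5 * np.sum(u ** 2, axis=1)) / (2 * np.pi) ** (d / 2)  # Gaussian window, integrates to 1
    return np.sum(phi) / (N * h ** d)                    # average of phi / V_N, with V_N = h^d

rng = np.random.default_rng(1)
samples = rng.standard_normal((500, 1))                  # assumed 1-D data
print(parzen_estimate(np.array([0.0]), samples, h=0.3))  # roughly 0.4, the N(0,1) density at 0
```

The width h plays exactly the role of the smoothing parameter h_N above: smaller h gives higher resolution but a noisier estimate.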

Mixture Density Estimation
Mixture density model:
p(x) = Σ_{i=1}^c p(i) p(x | θ(i), i), where c is the number of components,
– p(i): prior (mixing) probability,
– p(x | θ(i), i): component density.
In a Gaussian mixture model (GMM), each component density is Gaussian with parameters θ(i) = {μ_i, Σ_i}.
Let {x_k; 1 ≤ k ≤ N} be drawn from the mixture density. The goal is to estimate θ(i) and p(i) by minimizing the (negative) log-likelihood
L = -Σ_{k=1}^N log p(x_k) = -Σ_{k=1}^N log [ Σ_{i=1}^c p(i) p(x_k | θ(i), i) ].
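As a small concrete illustration, the sketch below evaluates a one-dimensional, two-component Gaussian mixture and its negative log-likelihood; all parameter values and data are made up, not taken from the lecture:

```python
import numpy as np

def gauss(x, mu, sigma):
    """Univariate Gaussian density N(x; mu, sigma^2)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

priors = np.array([0.3, 0.7])            # mixing probabilities p(i), sum to 1
mus    = np.array([-2.0, 1.0])           # component means
sigmas = np.array([0.5, 1.5])            # component standard deviations

rng = np.random.default_rng(2)
x = rng.normal(1.0, 1.5, size=200)       # observed samples (assumed data)

# p(x_k) = sum_i p(i) * p(x_k | theta(i), i), evaluated for all samples at once
mix = np.sum(priors * gauss(x[:, None], mus, sigmas), axis=1)
neg_log_likelihood = -np.sum(np.log(mix))
print(neg_log_likelihood)
```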

EM Algorithm for GMM
We want to minimize the negative log-likelihood, but the p(i) are unknown (missing data). EM algorithm:
– Expectation: given the current estimates of p(x | θ(i), i) (i.e., {μ_i} and {Σ_i}), find {p(i)}.
– Maximization: with the current estimates of {p(i)}, update the estimates of p(x | θ(i), i) (i.e., {μ_i} and {Σ_i}) by minimizing the negative log-likelihood.
Expectation step: minimize L subject to Σ_i p(i) = 1 and p(i) ≥ 0. Using a Lagrange multiplier, one may find
h_k(i) = p(i | x_k) = p(i) p(x_k | θ(i), i) / Σ_{j=1}^c p(j) p(x_k | θ(j), j)   (posterior membership of x_k in component i),
p(i) = (1/N) Σ_{k=1}^N h_k(i).
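A minimal sketch of the expectation step for the 1-D mixture above: compute the posterior memberships h_k(i) and re-estimate the mixing probabilities p(i). Variable and function names are illustrative, continuing the previous sketch:

```python
import numpy as np

def e_step(x, priors, mus, sigmas):
    """Return memberships h[k, i] = p(i | x_k) and updated mixing probabilities."""
    # joint[k, i] = p(i) * p(x_k | theta(i), i) for 1-D Gaussian components
    joint = priors * np.exp(-0.5 * ((x[:, None] - mus) / sigmas) ** 2) / (sigmas * np.sqrt(2 * np.pi))
    h = joint / joint.sum(axis=1, keepdims=True)   # normalize over components: posterior memberships
    new_priors = h.mean(axis=0)                    # p(i) = (1/N) * sum_k h[k, i]
    return h, new_priors
```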

EM: Maximization Step
The parameters θ(i) (μ_i and Σ_i) can be estimated as membership-weighted averages:
μ_i = Σ_{k=1}^N h_k(i) x_k / Σ_{k=1}^N h_k(i),
Σ_i = Σ_{k=1}^N h_k(i) (x_k - μ_i)(x_k - μ_i)^T / Σ_{k=1}^N h_k(i).
The formulas for updating {μ_i} and {Σ_i} are similar to those of the so-called fuzzy c-means clustering algorithm.
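A matching sketch of the maximization step for the same illustrative 1-D mixture (not the lecture's own code): the means and variances become membership-weighted averages, which is what makes the update resemble fuzzy c-means.

```python
import numpy as np

def m_step(x, h):
    """Update mu_i and sigma_i from the membership weights h[k, i]."""
    weights = h.sum(axis=0)                                    # sum_k h[k, i] for each component
    mus = (h * x[:, None]).sum(axis=0) / weights               # membership-weighted means
    variances = (h * (x[:, None] - mus) ** 2).sum(axis=0) / weights
    return mus, np.sqrt(variances)                             # return standard deviations
```

Alternating e_step and m_step until the negative log-likelihood stops decreasing gives the usual EM fit of the mixture.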