Slide 1: EE3J2 Data Mining, Lecture 10: Statistical Modelling. Martin Russell

Slide 2: Objectives
- To review basic statistical modelling
- To review the notion of a probability distribution
- To review the notion of a probability density function
- To introduce mixture densities
- To introduce the multivariate Gaussian density

Slide 3: Discrete variables
Suppose that Y is a random variable which can take any value in a discrete set X = {x_1, x_2, …, x_M}. Suppose that y_1, y_2, …, y_N are samples of the random variable Y. If c_m is the number of times that y_n = x_m, then an estimate of the probability that y_n takes the value x_m is given by:

P(x_m) ≈ c_m / N

Slide 4: Discrete Probability Mass Function

Symbol:            1    2    3    4    5    6    7    8    9   Total
Num. Occurrences: 120  231   90   87   63   57  156  203   91   1098
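The counting estimate c_m / N can be sketched in a few lines of Python, using the counts from the table above (the dict layout is illustrative):

```python
# Estimate a discrete probability mass function from observed counts
# (the symbol counts from Slide 4): P(x_m) ~ c_m / N.
counts = {1: 120, 2: 231, 3: 90, 4: 87, 5: 63, 6: 57, 7: 156, 8: 203, 9: 91}
total = sum(counts.values())  # N = 1098
pmf = {symbol: c / total for symbol, c in counts.items()}
```

The estimated probabilities sum to 1 by construction, since each count is divided by the same total.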

Slide 5: Continuous Random Variables
In most practical applications the data are not restricted to a finite set of values; they can take any value in N-dimensional space. Simply counting the number of occurrences of each value is no longer a viable way of estimating probabilities, but there are generalisations of this approach which are applicable to continuous variables. These are referred to as non-parametric methods.

Slide 6: Continuous Random Variables
An alternative is to use a parametric model. In a parametric model, probabilities are defined by a small set of parameters. The simplest example is a normal, or Gaussian, model. A Gaussian probability density function (PDF) is defined by two parameters: its mean and variance.

Slide 7: Gaussian PDF
The 'standard' 1-dimensional Gaussian PDF has mean μ = 0 and variance σ² = 1.

Slide 8: Gaussian PDF
[Figure: the shaded area under the Gaussian curve between a and b gives P(a ≤ x ≤ b).]

Slide 9: Gaussian PDF
For a 1-dimensional Gaussian PDF p with mean μ and variance σ²:

p(x) = (1 / √(2πσ²)) exp(−(x − μ)² / (2σ²))

The constant 1/√(2πσ²) ensures the area under the curve is 1; the exponential term defines the 'bell' shape.
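This density translates directly into code; a minimal sketch using the mean/variance parametrisation above:

```python
import math

def gaussian_pdf(x, mu=0.0, var=1.0):
    """1-D Gaussian density with mean mu and variance var (Slide 9)."""
    coeff = 1.0 / math.sqrt(2.0 * math.pi * var)       # normalising constant
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * var))
```

With the defaults this is the standard Gaussian of Slide 7; the curve is symmetric about μ and its peak height grows as the variance shrinks.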

Slide 10: More examples
[Figure: Gaussian PDFs with variances σ² = 0.1, 1.0, 10.0 and 5.0; smaller variance gives a taller, narrower bell.]

Slide 11: Fitting a Gaussian PDF to Data
Suppose y = y_1, …, y_n, …, y_N is a set of N data values. Given a Gaussian PDF p with mean μ and variance σ², define:

p(y | μ, σ²) = ∏_{n=1}^{N} p(y_n | μ, σ²)

How do we choose μ and σ² to maximise this probability?

Slide 12: Fitting a Gaussian PDF to Data
[Figure: two Gaussian curves fitted to the same data, illustrating a poor fit and a good fit.]

Slide 13: Maximum Likelihood Estimation
Define the best-fitting Gaussian to be the one such that p(y | μ, σ²) is maximised. Terminology:
- p(y | μ, σ²), thought of as a function of y, is the probability (density) of y
- p(y | μ, σ²), thought of as a function of μ, σ², is the likelihood of μ, σ²
Maximising p(y | μ, σ²) with respect to μ, σ² is called Maximum Likelihood (ML) estimation of μ, σ².

Slide 14: ML estimation of μ, σ²
Intuitively:
- The maximum likelihood estimate of μ should be the average value of y_1, …, y_N (the sample mean)
- The maximum likelihood estimate of σ² should be the variance of y_1, …, y_N (the sample variance)
This turns out to be true: p(y | μ, σ²) is maximised by setting:

μ̂ = (1/N) ∑_{n=1}^{N} y_n        σ̂² = (1/N) ∑_{n=1}^{N} (y_n − μ̂)²
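The two ML estimates above can be computed directly; this sketch uses the biased 1/N variance, matching the ML formula rather than the unbiased 1/(N−1) version:

```python
def ml_gaussian_fit(samples):
    """Maximum likelihood estimates for a 1-D Gaussian (Slide 14):
    the sample mean and the (biased, 1/N) sample variance."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((y - mu) ** 2 for y in samples) / n
    return mu, var
```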

Slide 15: Multi-modal distributions
In practice the distributions of many naturally occurring phenomena do not follow the simple bell-shaped Gaussian curve. For example, if the data arise from several different sources, there may be several distinct peaks (e.g. the distribution of heights of adults). These peaks are the modes of the distribution, and the distribution is called multi-modal.

Slide 16: Gaussian Mixture PDFs
Gaussian Mixture PDFs, or Gaussian Mixture Models (GMMs), are commonly used to model multi-modal, or other non-Gaussian, distributions. A GMM is just a weighted average of several Gaussian PDFs, called the component PDFs. For example, if p_1 and p_2 are Gaussian PDFs, then
p(y) = w_1 p_1(y) + w_2 p_2(y)
defines a 2-component Gaussian mixture PDF.

Slide 17: Gaussian Mixture - Example
2-component mixture model:
- Component 1: μ = 0, σ² = 0.1
- Component 2: μ = 2, σ² = 1
- w_1 = w_2 = 0.5

Slide 18: Example 2
2-component mixture model:
- Component 1: μ = 0, σ² = 0.1
- Component 2: μ = 2, σ² = 1
- w_1 = 0.2, w_2 = 0.8

Slide 19: Example 3
2-component mixture model:
- Component 1: μ = 0, σ² = 0.1
- Component 2: μ = 2, σ² = 1
- w_1 = 0.2, w_2 = 0.8

Slide 20: Example 4
A 5-component Gaussian mixture PDF.

Slide 21: Gaussian Mixture Model
In general, an M-component Gaussian mixture PDF is defined by:

p(y) = ∑_{m=1}^{M} w_m p_m(y)

where each p_m is a Gaussian PDF and

∑_{m=1}^{M} w_m = 1,  w_m ≥ 0
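The weighted-average definition translates directly into code; this sketch assumes the components are supplied as parallel lists of weights, means and variances:

```python
import math

def gmm_pdf(y, weights, means, variances):
    """Density of an M-component 1-D Gaussian mixture (Slide 21):
    p(y) = sum_m w_m * p_m(y)."""
    total = 0.0
    for w, mu, var in zip(weights, means, variances):
        coeff = 1.0 / math.sqrt(2.0 * math.pi * var)
        total += w * coeff * math.exp(-((y - mu) ** 2) / (2.0 * var))
    return total
```

With a single component of weight 1 this reduces to the plain Gaussian PDF, and by linearity the mixture density is just the weighted sum of the component densities.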

Slide 22: Estimating the parameters of a Gaussian mixture model
A Gaussian Mixture Model with M components has:
- M means: μ_1, …, μ_M
- M variances: σ²_1, …, σ²_M
- M mixture weights: w_1, …, w_M
Given a set of data y = y_1, …, y_N, how can we estimate these parameters? That is, how do we find a maximum likelihood estimate of μ_1, …, μ_M, σ²_1, …, σ²_M, w_1, …, w_M?

Slide 23: Parameter Estimation
If we knew which component each sample y_n came from, then parameter estimation would be easy:
- Set μ_m to be the average value of the samples which belong to the m-th component
- Set σ²_m to be the variance of the samples which belong to the m-th component
- Set w_m to be the proportion of samples which belong to the m-th component
But we don't know which component each sample belongs to.

Slide 24: Solution: the E-M algorithm
- Guess initial values for the parameters
- For each n, calculate the probabilities

  P(m | y_n) = w_m p_m(y_n) / p(y_n)

  This is a measure of how much y_n 'belongs to' the m-th component
- Use these probabilities to re-estimate the means, variances and mixture weights
- REPEAT
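The loop above can be sketched for the 1-dimensional case as follows. The updates are the standard E-M re-estimation formulas (responsibility-weighted means, variances and weights); the small variance floor is an added numerical safeguard, not something stated on the slide:

```python
import math

def em_gmm_1d(data, means, variances, weights, iters=50):
    """1-D E-M for a Gaussian mixture: alternate between computing
    responsibilities P(m | y_n) (E-step) and re-estimating the
    component parameters from them (M-step)."""
    def comp_pdf(y, mu, var):
        return math.exp(-((y - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

    means, variances, weights = list(means), list(variances), list(weights)
    n, M = len(data), len(means)
    for _ in range(iters):
        # E-step: responsibility of component m for sample y_n
        resp = []
        for y in data:
            joint = [weights[m] * comp_pdf(y, means[m], variances[m]) for m in range(M)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: responsibility-weighted mean, variance and weight updates
        for m in range(M):
            r_m = sum(resp[i][m] for i in range(n))
            means[m] = sum(resp[i][m] * data[i] for i in range(n)) / r_m
            variances[m] = max(
                sum(resp[i][m] * (data[i] - means[m]) ** 2 for i in range(n)) / r_m,
                1e-6,  # floor stops a component collapsing onto one point
            )
            weights[m] = r_m / n
    return means, variances, weights
```

On well-separated data the means converge to the per-cluster sample means, and the weights to the cluster proportions; like any E-M run, it finds a local optimum that depends on the initial guess.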

Slide 25: The E-M algorithm
[Figure: the likelihood p(y | θ) plotted against the parameter set θ; the E-M iterates θ⁽⁰⁾, …, θ⁽ⁱ⁾ climb towards a local optimum.]

Slide 26: Multivariate Gaussian PDFs
All PDFs so far have been 1-dimensional: they take scalar values. But most real data will be represented as D-dimensional vectors. The vector equivalent of a Gaussian PDF is called a multivariate Gaussian PDF.

Slide 27: Multivariate Gaussian PDFs
[Figure: a multivariate Gaussian PDF shown as contours of equal probability, together with its 1-dimensional Gaussian PDFs along each axis.]

Slide 28: Multivariate Gaussian PDFs
[Figure: a further multivariate Gaussian example with its 1-dimensional Gaussian PDFs along each axis.]

Slide 29: Multivariate Gaussian PDF
The parameters of a multivariate Gaussian PDF are:
- the (vector) mean μ
- the (vector of) variances, one per dimension
- the covariances between dimensions
Together, the variances and covariances form the covariance matrix Σ.
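A sketch of evaluating this density with NumPy, assuming the covariance matrix Σ is invertible (the slide does not give the formula explicitly; this is the standard D-dimensional Gaussian density):

```python
import numpy as np

def mvn_pdf(x, mean, cov):
    """Density of a D-dimensional Gaussian with mean vector `mean`
    and covariance matrix `cov` (Slide 29)."""
    x, mean, cov = np.asarray(x), np.asarray(mean), np.asarray(cov)
    d = mean.shape[0]
    diff = x - mean
    # normalising constant generalises 1/sqrt(2*pi*var) to D dimensions
    norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return float(norm * np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff))
```

With Σ equal to the identity matrix this factorises into a product of D standard 1-dimensional Gaussians.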

Slide 30: Multivariate Gaussian PDFs
- Multivariate Gaussian PDFs are commonly used in pattern processing and data mining
- Vector data is often not unimodal, so we use mixtures of multivariate Gaussian PDFs
- The E-M algorithm works for multivariate Gaussian mixture PDFs

Slide 31: Summary
- Basic statistical modelling
- Probability distributions
- Probability density functions
- Gaussian PDFs
- Gaussian mixture PDFs and the E-M algorithm
- Multivariate Gaussian PDFs

