Presentation on theme: "Discrimination and Classification"— Presentation transcript:

1 Discrimination and Classification

2 The Optimal Classification Rule
Suppose that the data x1, … , xp have joint density function f(x1, … , xp ; θ), where θ is either θ1 or θ2. Let g(x1, … , xp) = f(x1, … , xp ; θ1) and h(x1, … , xp) = f(x1, … , xp ; θ2). We want to decide between D1: θ = θ1 (g is the correct distribution) and D2: θ = θ2 (h is the correct distribution).

3 Then the optimal regions (minimizing ECM, the expected cost of misclassification) for making the decisions D1 and D2 respectively are
C1 = {(x1, … , xp) : g(x1, … , xp) / h(x1, … , xp) ≥ k} and
C2 = {(x1, … , xp) : g(x1, … , xp) / h(x1, … , xp) < k},
where k = [c(1|2) p2] / [c(2|1) p1], c(1|2) is the cost of making decision D1 when D2 is true, c(2|1) is the cost of making decision D2 when D1 is true, and p1, p2 are the prior probabilities of the two populations.
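A minimal sketch of this rule in Python (for illustration only: g and h are taken here to be multivariate normal densities, and the function and parameter names are not from the slides):

```python
import numpy as np
from scipy.stats import multivariate_normal

def ecm_classify(x, g, h, c_1_given_2=1.0, c_2_given_1=1.0, p1=0.5, p2=0.5):
    """Decide D1 if g(x)/h(x) >= k, with k = [c(1|2) p2] / [c(2|1) p1]."""
    k = (c_1_given_2 * p2) / (c_2_given_1 * p1)
    return "D1" if g.pdf(x) / h.pdf(x) >= k else "D2"

# Illustrative densities (assumed, not from the slides)
g = multivariate_normal(mean=[0.0, 0.0], cov=np.eye(2))
h = multivariate_normal(mean=[2.0, 2.0], cov=np.eye(2))
print(ecm_classify([0.5, 0.3], g, h))   # point lies nearer g's mean, so D1
```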

4 Fisher's Linear Discriminant Function
Suppose that x1, … , xp is data from a p-variate Normal distribution with mean vector either μ1 (population π1) or μ2 (population π2). The covariance matrix Σ is the same for both populations π1 and π2.

5 We make the decision D1 : population is π1 if a'x ≥ k, where a = Σ⁻¹(μ1 - μ2) and k = (1/2) a'(μ1 + μ2) = (1/2)(μ1 - μ2)'Σ⁻¹(μ1 + μ2).

6 In the case where the population parameters are unknown but are estimated from data, we replace μ1, μ2 and Σ by the sample mean vectors x̄1, x̄2 and the pooled sample covariance matrix S_pooled, giving the sample version of Fisher's linear discriminant function.

7 The function l(x) = (x̄1 - x̄2)' S_pooled⁻¹ x is called Fisher's linear discriminant function.
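A short numpy sketch of this sample rule (a sketch under the assumption that two training samples X1 and X2 are available; the variable names are illustrative, not from the slides):

```python
import numpy as np

def fisher_discriminant(X1, X2):
    """Estimate Fisher's rule from two training samples (rows = observations).

    Returns (a_hat, k_hat): classify a new x into population 1 when a_hat @ x >= k_hat.
    """
    n1, n2 = len(X1), len(X2)
    xbar1, xbar2 = X1.mean(axis=0), X2.mean(axis=0)
    # Pooled sample covariance matrix
    S_pooled = ((n1 - 1) * np.cov(X1, rowvar=False) +
                (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    a_hat = np.linalg.solve(S_pooled, xbar1 - xbar2)   # S_pooled^-1 (xbar1 - xbar2)
    k_hat = 0.5 * a_hat @ (xbar1 + xbar2)              # midpoint cut-off
    return a_hat, k_hat

# Illustrative data (assumed)
rng = np.random.default_rng(0)
X1 = rng.normal([0.0, 0.0], 1.0, size=(50, 2))
X2 = rng.normal([3.0, 1.0], 1.0, size=(50, 2))
a_hat, k_hat = fisher_discriminant(X1, X2)
x_new = np.array([2.5, 0.8])
print("population 1" if a_hat @ x_new >= k_hat else "population 2")
```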

8 Discrimination of p-variate Normal distributions (unequal Covariance matrices)
Suppose that x1, … , xp is data from a p-variate Normal distribution with mean vector either μ1 (population π1) or μ2 (population π2), and with covariance matrices Σ1 and Σ2 respectively.

9 The optimal rule states that we should classify into populations π1 and π2 using the likelihood ratio
λ = f1(x1, … , xp) / f2(x1, … , xp),
the ratio of the two Normal densities.
That is, make the decision D1 : population is π1 if λ ≥ k.

10 or, equivalently, taking logarithms, make the decision D1 : population is π1 if
Q(x) = -(1/2) x'(Σ1⁻¹ - Σ2⁻¹)x + (μ1'Σ1⁻¹ - μ2'Σ2⁻¹)x - c ≥ ln k,
where
c = (1/2) ln(|Σ1| / |Σ2|) + (1/2)(μ1'Σ1⁻¹μ1 - μ2'Σ2⁻¹μ2)
and k is the cut-off value of the optimal rule (k = 1, so ln k = 0, when the misclassification costs and prior probabilities are equal).

11 Summarizing, we make the decision to classify into population π1 if
-(1/2) x'(Σ1⁻¹ - Σ2⁻¹)x + (μ1'Σ1⁻¹ - μ2'Σ2⁻¹)x - c ≥ ln k,
where
c = (1/2) ln(|Σ1| / |Σ2|) + (1/2)(μ1'Σ1⁻¹μ1 - μ2'Σ2⁻¹μ2)
and
k = [c(1|2) p2] / [c(2|1) p1].
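A numpy sketch of this quadratic rule, assuming the parameters μ1, μ2, Σ1, Σ2 are known (the helper name quadratic_score and the numerical values are illustrative):

```python
import numpy as np

def quadratic_score(x, mu1, mu2, S1, S2):
    """Q(x) = -1/2 x'(S1^-1 - S2^-1)x + (mu1'S1^-1 - mu2'S2^-1)x - c."""
    S1_inv, S2_inv = np.linalg.inv(S1), np.linalg.inv(S2)
    c = 0.5 * (np.log(np.linalg.det(S1) / np.linalg.det(S2))
               + mu1 @ S1_inv @ mu1 - mu2 @ S2_inv @ mu2)
    return (-0.5 * x @ (S1_inv - S2_inv) @ x
            + (mu1 @ S1_inv - mu2 @ S2_inv) @ x - c)

# Illustrative parameters (assumed)
mu1, mu2 = np.array([0.0, 0.0]), np.array([2.0, 2.0])
S1 = np.array([[1.0, 0.3], [0.3, 1.0]])
S2 = np.array([[2.0, 0.0], [0.0, 0.5]])
x = np.array([0.5, 0.2])
k = 1.0   # equal costs and priors, so ln k = 0
print("population 1" if quadratic_score(x, mu1, mu2, S1, S2) >= np.log(k)
      else "population 2")
```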

12 Discrimination of p-variate Normal distributions (unequal Covariance matrices)

13 Classification or Cluster Analysis
Have data from one or several populations

14 Situation
We have multivariate (or univariate) data from one or several populations (the number of populations is unknown).
We want to determine the number of populations and to identify which population each observation belongs to.

15 Example

16

17 Hierarchical Clustering Methods
The following are the steps in the agglomerative hierarchical clustering algorithm for grouping N objects (items or variables).
1. Start with N clusters, each consisting of a single entity, and an N × N symmetric matrix (table) of distances (or similarities) D = (dij).
2. Search the distance matrix for the nearest (most similar) pair of clusters. Let the distance between the "most similar" clusters U and V be dUV.
3. Merge clusters U and V. Label the newly formed cluster (UV). Update the entries in the distance matrix by deleting the rows and columns corresponding to clusters U and V and adding a row and column giving the distances between cluster (UV) and the remaining clusters.

18 Repeat steps 2 and 3 a total of N-1 times
4. Repeat steps 2 and 3 a total of N-1 times. (All objects will be in a single cluster at the termination of this algorithm.) Record the identity of the clusters that are merged and the levels (distances or similarities) at which the mergers take place.
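A direct Python sketch of steps 1-4 for the single-linkage case (a sketch; the function and the small example matrix below are illustrative, not from the slides):

```python
import numpy as np

def single_linkage_agglomerative(D, labels):
    """Agglomerative clustering following steps 1-4 above, with single linkage.

    D: N x N symmetric distance matrix; labels: list of N object names.
    Returns the mergers as (cluster_U, cluster_V, merge distance).
    """
    D = D.astype(float)
    np.fill_diagonal(D, np.inf)            # ignore self-distances
    clusters = list(labels)                # step 1: N singleton clusters
    mergers = []
    while len(clusters) > 1:               # step 4: repeat N-1 times
        i, j = np.unravel_index(np.argmin(D), D.shape)   # step 2: nearest pair
        i, j = min(i, j), max(i, j)
        mergers.append((clusters[i], clusters[j], D[i, j]))
        # Step 3: merge U and V, then update row/column i with the new distances
        new_dist = np.minimum(D[i], D[j])  # single linkage: nearest-neighbour distance
        D[i, :], D[:, i] = new_dist, new_dist
        D[i, i] = np.inf
        D = np.delete(np.delete(D, j, axis=0), j, axis=1)  # drop old cluster V
        clusters[i] = "(" + clusters[i] + clusters[j] + ")"
        del clusters[j]
    return mergers

# Illustrative 4-object example (assumed data)
D = np.array([[0.0, 2.0, 6.0, 10.0],
              [2.0, 0.0, 5.0,  9.0],
              [6.0, 5.0, 0.0,  4.0],
              [10.0, 9.0, 4.0,  0.0]])
for u, v, d in single_linkage_agglomerative(D, ["1", "2", "3", "4"]):
    print(u, v, d)
```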

19 Different methods of computing inter-cluster distance
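Three commonly used definitions (single, complete and average linkage) can be written as distance updates applied when clusters U and V merge into (UV). A minimal sketch (W is any other cluster, n_U and n_V the cluster sizes; whether these are exactly the methods pictured on the slide is an assumption):

```python
def updated_distance(d_UW, d_VW, n_U, n_V, method="single"):
    """Distance from the merged cluster (UV) to another cluster W."""
    if method == "single":      # nearest neighbour
        return min(d_UW, d_VW)
    if method == "complete":    # farthest neighbour
        return max(d_UW, d_VW)
    if method == "average":     # mean of all pairwise item distances
        return (n_U * d_UW + n_V * d_VW) / (n_U + n_V)
    raise ValueError("unknown method: " + method)
```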

20 Example
To illustrate the single linkage algorithm, we consider the hypothetical distance matrix between pairs of five objects given below:

21 Treating each object as a cluster, the clustering begins by merging the two closest items (3 & 5).
To implement the next level of clustering we need to compute the distances between cluster (35) and the remaining objects:
d(35)1 = min{3, 11} = 3
d(35)2 = min{7, 10} = 7
d(35)4 = min{9, 8} = 8
The new distance matrix becomes:

22 The new distance matrix becomes:
The next two closest clusters ((35) & 1) are merged to form cluster (135). Distances between this cluster and the remaining clusters become:

23 Distances between this cluster and the remaining clusters become:
d(135)2 = min{7, 9} = 7
d(135)4 = min{8, 6} = 6
The distance matrix now becomes:
Continuing, the next two closest clusters (2 & 4) are merged to form cluster (24).

24 Distances between this cluster and the remaining clusters become:
d(135)(24) = min{d(135)2, d(135)4} = min{7, 6} = 6
The final distance matrix now becomes:
At the final step clusters (135) and (24) are merged to form the single cluster (12345) of all five items.

25 The results of this algorithm can be summarized graphically in the following dendrogram.
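The same merge sequence can be reproduced with scipy. This is a sketch: the entries d35 and d24 never appear directly in the computations above, so they are assumed here to be 2 and 5 (any values with d35 < 3 and 3 < d24 < 6 give the same merge order).

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import squareform

# Distance matrix for objects 1..5, rebuilt from the computations above;
# d35 = 2 and d24 = 5 are assumed values consistent with the merge order.
D = np.array([[ 0.0,  9.0,  3.0,  6.0, 11.0],
              [ 9.0,  0.0,  7.0,  5.0, 10.0],
              [ 3.0,  7.0,  0.0,  9.0,  2.0],
              [ 6.0,  5.0,  9.0,  0.0,  8.0],
              [11.0, 10.0,  2.0,  8.0,  0.0]])

Z = linkage(squareform(D), method="single")
print(Z)   # mergers (3,5), then ((35),1), then (2,4), then (135)-(24), as above

# dendrogram(Z, labels=["1", "2", "3", "4", "5"])   # draws the tree (needs matplotlib)
```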

26 Dendrograms for clustering the 11 languages on the basis of the ten numerals

27

28

29

30

31

32 Dendrogram: Cluster Analysis of N = 22 Utility Companies (Euclidean distance, average linkage)

33 Dendrogram: Cluster Analysis of N = 22 Utility Companies (Euclidean distance, single linkage)

