Presentation on theme: "Chapter 12: Cluster analysis and segmentation of customers"— Presentation transcript:
1 Chapter 12: Cluster analysis and segmentation of customers
2 Commercial applications A chain of radio-stores uses cluster analysis for identifying three different customer types with varying needs. An insurance company is using cluster analysis for classifying customers into segments like the “self confident customer”, “the price conscious customer” etc. A producer of copying machines succeeds in classifying industrial customers into “satisfied” and “non-satisfied or quarrelling” customers.
4 Dependence and Independence methods Dependence Methods: We assume that a variable (e.g. Y) depends on (is caused or determined by) other variables (X1, X2 etc.). Examples: Regression, ANOVA, Discriminant Analysis. Independence Methods: We do not assume that any variable(s) is (are) caused by or determined by others. Basically, we only have X1, X2 … Xn (but no Y). Examples: Cluster Analysis, Factor Analysis etc.
5 Dependence and Independence methods Dependence Methods: The model is defined a priori (prior to survey and/or estimation). Examples: Regression, ANOVA, Discriminant Analysis. Independence Methods: The model is defined a posteriori (after the survey and/or estimation has been carried out). Examples: Cluster Analysis, Factor Analysis etc. When using independence methods we let the data speak for themselves!
6 Dependence method: Multiple regression [Table: rows Obs1 to Obs10; columns Y (Sales), X1 (Price), X2 (Price Competitor), X3 (Advertising); cell values not recoverable] The primary focus is on the variables!
7 Independence method: Cluster analysis [Table: rows Obs1 to Obs10; columns X1, X2, X3; observations grouped into Cluster 1, Cluster 2 and Cluster 3; cell values not recoverable] The primary focus is on the observations!
8 Cluster analysis output: A new cluster-variable with a cluster-number on each respondent [Table: rows Obs1 to Obs10; columns X1, X2, X3, Cluster; cell values not recoverable]
9 Cluster analysis: A cross-tab between the cluster-variable and background + opinions is established [Table: columns Age, %-Females, Household size, Opinion 1, Opinion 2, Opinion 3; one row per cluster, labelled “Younger male nerds”, “Core-families with traditional values” and “Senior-relaxers”; cell values not recoverable]
10 Cluster profiling: (hypothetical) [Figure: profiles of Cluster 1: “Ecological shopper” and Cluster 2: “Traditional shopper” on the statements “Buy ecological food”, “Advertisements funny” and “Low price important”; scale: 1 = Totally Agree] Note: Finally the clusters’ respective media-behaviour needs to be uncovered
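11 A small example of cluster analysis [Table: John, Bob and Cathy rated on Friendly (X02) and Stagnant (X08), with the pairwise distances John-Bob, John-Cathy and Bob-Cathy and the resulting clusters A and B; cell values not recoverable]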
12 Governing principle Maximization of homogeneity within clusters and, simultaneously, maximization of heterogeneity across clusters
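One common way to make this principle concrete is to compare the within-cluster and between-cluster sums of squares of a partition. The sketch below is only an illustration under assumptions: the two-dimensional points, the two candidate partitions and all function names are hypothetical and do not come from the slides.

```python
from statistics import mean


def centroid(points):
    """Return the mean point of a list of (x, y) tuples."""
    return (mean(x for x, _ in points), mean(y for _, y in points))


def sum_of_squares(points, centre):
    """Sum of squared Euclidean distances from each point to `centre`."""
    return sum((x - centre[0]) ** 2 + (y - centre[1]) ** 2 for x, y in points)


def within_between(clusters):
    """Return (within-cluster SS, between-cluster SS) for a partition."""
    all_points = [p for cluster in clusters for p in cluster]
    grand_centre = centroid(all_points)
    within = sum(sum_of_squares(c, centroid(c)) for c in clusters)
    between = sum(
        len(c) * sum_of_squares([centroid(c)], grand_centre) for c in clusters
    )
    return within, between


# Hypothetical data: the same six points partitioned in two different ways.
good = [[(1, 1), (1, 2), (2, 1)], [(8, 8), (9, 8), (8, 9)]]
bad = [[(1, 1), (8, 8), (2, 1)], [(1, 2), (9, 8), (8, 9)]]

for name, partition in (("good", good), ("bad", bad)):
    within, between = within_between(partition)
    print(name, "within:", round(within, 2), "between:", round(between, 2))
```

Because the total sum of squares is fixed for a given data set, lowering the within-cluster part (more homogeneity) raises the between-cluster part (more heterogeneity) by the same amount, which is why the two goals can be pursued simultaneously.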
13 Overview of clustering methods
Non-overlapping (Exclusive) Methods
- Hierarchical
  - Agglomerative
    - Linkage methods
      - Average: Between (1), Within (2), Weighted
      - Single: Ordinary (3), Density, Two stage Density
      - Complete (4)
    - Centroid methods: Centroid (5), Median (6)
    - Variance methods: Ward (7)
  - Divisive
- Non-hierarchical/Partitioning/k-means
  - Sequential threshold
  - Parallel threshold
  - Neural Networks
  - Optimized partitioning (8)
Overlapping Methods
- Overlapping k-centroids
- Overlapping k-means
- Latent class techniques
- Fuzzy clustering
- Q-type Factor analysis (9)
Name in SPSS: 1 Between-groups linkage, 2 Within-groups linkage, 3 Nearest neighbour, 4 Furthest neighbour, 5 Centroid clustering, 6 Median clustering, 7 Ward’s method, 8 K-means cluster, 9 (Factor)
Note: Methods in italics are available in SPSS. Neural networks necessitate SPSS’ data mining tool Clementine.
Figure: Overview of clustering methods
14 Figure: Illustration of important clustering issues in Figure 12.1 (panels labelled 1a, 1b, 1c, 1b1, 1b2 and 2)
- Non-overlapping vs. overlapping
- Hierarchical vs. non-hierarchical
- Agglomerative vs. divisive
- Single linkage: minimum distance
- Complete linkage: maximum distance
- Average linkage: average distance
- Centroid method: distance between centres
- Ward’s method: minimization of within-cluster variance
15 Euclidean distance (Default in SPSS): $d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$ [Figure: points A and B with coordinates (x1, y1) and (x2, y2) in the X-Y plane, with horizontal leg x2-x1 and vertical leg y2-y1] Other distances available in SPSS: City-block distance uses absolute differences instead of squared differences of coordinates. Moreover: Minkowski distance, Cosine distance, Chebychev distance, Pearson Correlation.
16 Euclidean distance [Figure: points A (1, 2) and B (3, 5) in the X-Y plane; horizontal leg 3-1, vertical leg 5-2]
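For the two points on this slide, A (1, 2) and B (3, 5), the distance can be worked out directly. The following is a minimal sketch in plain Python, not SPSS output; the function names are made up for illustration, and the city-block and Chebychev measures are included only for comparison with the Euclidean one.

```python
import math


def euclidean(a, b):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def city_block(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(p - q) for p, q in zip(a, b))


def chebychev(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(p - q) for p, q in zip(a, b))


A = (1, 2)
B = (3, 5)

print(euclidean(A, B))   # sqrt(2**2 + 3**2) = sqrt(13), about 3.61
print(city_block(A, B))  # 2 + 3 = 5
print(chebychev(A, B))   # max(2, 3) = 3
```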
17 Which two pairs of points are to be clustered first? [Figure: scatter plot of points A to H]
18 Maybe A/B and D/E (depending on algorithm!) [Figure: scatter plot of the points with A/B and D/E marked as pairs]
21 How does one decide which cluster a “newcoming” point is to join? Measuring distances from point to clusters or points: “Farthest neighbour” (complete linkage); “Nearest neighbour” (single linkage); “Neighbourhood centre” (average linkage)
22 Quo vadis, C? (Continued) [Figure: scatter plot of points G, A, B, C, D, E, H with the distance labels 10,5; 8,5; 7,0; 11,0; 8,5; 9,0; 12,0; 9,5]
23 Complete linkage Minimize longest distance from cluster to point [Figure: points G, A, B, C, D, E, H; longest distance from C is 10,5 towards A/B and 9,5 towards D/E]
24 Average linkage Minimize average distance from cluster to point [Figure: points G, A, B, C, D, E, H; average distance from C is 8,5 towards A/B and 9,0 towards D/E]
25 Single linkage Minimize shortest distance from cluster to point [Figure: points G, A, B, C, D, E, H; shortest distance from C is 7,0 towards A/B and 8,5 towards D/E]
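The three rules differ only in how the distances from the new point to the members of a cluster are summarized. The sketch below assumes hypothetical member-level distances from C to A, B, D and E, chosen to resemble the numbers on the slides. Which member carries which distance is an assumption, and with these inputs the average for A/B comes out as 8.75 rather than the 8,5 shown on the slide.

```python
# Hypothetical distances from the new point C to each member of two
# existing clusters (assumed values, loosely based on the slide labels).
distances_from_c = {
    "A/B": [7.0, 10.5],
    "D/E": [8.5, 9.5],
}

# Each linkage rule reduces a list of member distances to one number.
linkage_rules = {
    "single": min,                               # nearest neighbour
    "complete": max,                             # farthest neighbour
    "average": lambda d: sum(d) / len(d),        # mean distance
}


def choose_cluster(distances, rule):
    """Return the cluster whose linkage distance to the point is smallest."""
    measure = linkage_rules[rule]
    return min(distances, key=lambda name: measure(distances[name]))


for rule in linkage_rules:
    print(rule, "->", choose_cluster(distances_from_c, rule))
```

With these assumed inputs, single and average linkage send C to A/B while complete linkage sends it to D/E, which illustrates why the answer to "Quo vadis, C?" depends on the algorithm.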
26 Single linkage: Pitfall Chaining or snake-like clusters. Cluster formation begins; all the time the closest observation is put into the existing cluster(s). A and C merge into the same cluster omitting B! [Figure: chain of points with A, C and B labelled]
27 Single linkage: Advantage Good outlier detection and removal procedure in cases with “noisy” data sets [Figure: scatter plot with points labelled “Outliers” and an “Entropy group”]
28 Cluster analysis: More potential pitfalls & problems: Do our data at all permit the use of means? Some methods (e.g. Ward’s) are biased toward producing clusters with approximately the same number of observations. Other methods (e.g. Centroid) require metric-scaled data as input. So, strictly speaking, it is not allowable to use this algorithm when clustering data measured on ordinal scales (Likert or semantic differential scales).
29 Cluster analysis: Small artificial example [Figure: six points numbered 1 to 6 with the distance labels 0,68; 0,92; 0,42; 0,58] Note: 6 points yield 15 possible pairwise distances: [n*(n-1)]/2
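The count in the note can be checked by enumerating the pairs. This is a small sketch using only the point numbers 1 to 6 from the slide; no coordinates are assumed.

```python
from itertools import combinations

points = [1, 2, 3, 4, 5, 6]

# Every unordered pair of distinct points needs one distance.
pairs = list(combinations(points, 2))

n = len(points)
print(len(pairs))          # number of pairs found by enumeration
print(n * (n - 1) // 2)    # the formula [n*(n-1)]/2 from the note
```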
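30 Cluster analysis: Small artificial example [Figure: the six points numbered 1 to 6 with the distance labels 0,68; 0,42; 0,92; 0,58]
31 Cluster analysis: Small artificial example [Figure: the six points numbered 1 to 6 with the distance labels 0,68; 0,42; 0,92; 0,58]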
36 Dendrogram (Continued) Step 4: Clusters 1 and 2 from Step 3 are joined into a “Supercluster”. A single observation remains unclustered (outlier). [Figure: dendrogram of OBS 1 to OBS 6 with the “Supercluster” marked; distance axis labels not recoverable]
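The merge sequence that a dendrogram records can be sketched with a small agglomerative procedure. Everything specific below is an assumption made for illustration: the coordinates of OBS 1 to OBS 6, the use of single linkage, and the `stop_distance` threshold that leaves the far-away observation unclustered. The slides do not state how the example was actually computed.

```python
import math
from itertools import combinations


def single_linkage(c1, c2, coords):
    """Shortest distance between any member of c1 and any member of c2."""
    return min(math.dist(coords[a], coords[b]) for a in c1 for b in c2)


def agglomerate(coords, stop_distance):
    """Merge the two closest clusters until the gap exceeds stop_distance."""
    clusters = [(name,) for name in coords]
    merges = []  # (cluster, cluster, merge distance), i.e. dendrogram heights
    while len(clusters) > 1:
        (c1, c2), d = min(
            (((a, b), single_linkage(a, b, coords))
             for a, b in combinations(clusters, 2)),
            key=lambda item: item[1],
        )
        if d > stop_distance:
            break
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
        merges.append((c1, c2, d))
    return clusters, merges


# Hypothetical coordinates; OBS 6 is placed far away to act as the outlier.
coords = {
    "OBS 1": (1.0, 1.0), "OBS 2": (1.5, 1.0), "OBS 3": (4.0, 1.0),
    "OBS 4": (4.0, 1.8), "OBS 5": (1.0, 2.2), "OBS 6": (10.0, 10.0),
}

clusters, merges = agglomerate(coords, stop_distance=3.0)
for step, (c1, c2, d) in enumerate(merges, start=1):
    print(f"Step {step}: {c1} + {c2} at distance {d:.2f}")
print("Final clusters:", clusters)
```

With these assumed inputs the fourth merge joins the two existing clusters into one larger cluster, and OBS 6 stays on its own because its nearest neighbour is farther away than the threshold.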
37 Textbooks in Cluster Analysis Brian S. Everitt, Cluster Analysis; Maurice Lorr, Cluster Analysis for Social Scientists, 1983; Charles Romesburg, Cluster Analysis for Researchers, 1984; Aldenderfer and Blashfield, Cluster Analysis, 1984
38 Case: Clustering of beer brands Brand profiles based on the 17 semantic differential scales. Purpose: to determine the market structure in terms of similar/different brands. Hypothesis: the market structure reflects the competitive structure among brands due to consumers’ behaviour