Chapter 12: Cluster analysis and segmentation of customers


Commercial applications
A chain of radio stores uses cluster analysis to identify three different customer types with varying needs.
An insurance company uses cluster analysis to classify customers into segments such as the “self-confident customer”, the “price-conscious customer”, etc.
A producer of copying machines succeeds in classifying industrial customers into “satisfied” and “non-satisfied or quarrelling” customers.

Figure 11.1 Relatedness of multivariate methods: cluster analysis and factor analysis
[Figure: the input data are a matrix of observations (Obs. 1 to Obs. m, rows) by variables (X1 to Xn, columns). Cluster analysis classifies the rows into Cluster 1, Cluster 2, etc.; factor analysis classifies the columns into Factor 1, Factor 2, etc.]

Dependence and Independence methods
Dependence methods: We assume that a variable (e.g. Y) depends on (is caused or determined by) other variables (X1, X2, etc.). Examples: Regression, ANOVA, Discriminant Analysis.
Independence methods: We do not assume that any variable is caused or determined by others. Basically, we only have X1, X2, …, Xn (but no Y). Examples: Cluster Analysis, Factor Analysis, etc.

Dependence and Independence methods
Dependence methods: The model is defined a priori (prior to the survey and/or estimation). Examples: Regression, ANOVA, Discriminant Analysis.
Independence methods: The model is defined a posteriori (after the survey and/or estimation has been carried out). Examples: Cluster Analysis, Factor Analysis, etc.
When using independence methods we let the data speak for themselves!

Dependence method: Multiple regression
[Table: observations Obs1 to Obs10 in rows; columns Y (Sales), X1 (Price), X2 (Price Competitor) and X3 (Advertising).]
The primary focus is on the variables!

Independence method: Cluster analysis
[Table: observations Obs1 to Obs10 in rows; columns X1, X2 and X3; the observations are grouped into Cluster 1, Cluster 2 and Cluster 3.]
The primary focus is on the observations!

Cluster analysis output: A new cluster variable with a cluster number for each respondent
[Table: observations Obs1 to Obs10 in rows; columns X1, X2, X3 and the new variable Cluster.]

Cluster analysis: A cross-tab between the cluster variable and background + opinions is established
Columns: Age, %-Females, Household size, Opinion 1, Opinion 2, Opinion 3
“Younger male nerds”: 32, 31, 1.4, 3.2, 2.1, 2.2
“Core families with traditional values”: 44, 54, 2.9, 4.0, 3.4, 3.3
“Senior relaxers”: 56, 46, 2.6, 3.0, …

Cluster profiling (hypothetical)
Cluster 1: “Ecological shopper”
Cluster 2: “Traditional shopper”
[Figure: profiles of the two clusters on the statements “Buy ecological food”, “Advertisements funny” and “Low price important”, rated on a scale from 1 to 5 (1 = Totally Agree).]
Note: Finally, the clusters’ respective media behaviour needs to be uncovered

A small example of cluster analysis
[Table: ratings of John, Bob and Cathy on Friendly (X02) and Stagnant (X08); the sum (Σ) of distances for the pairs John-Bob, John-Cathy and Bob-Cathy; and the resulting clusters A and B.]

Governing principle
Maximization of homogeneity within clusters and, simultaneously, maximization of heterogeneity across clusters
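One common way to put numbers on this principle is to compare the within-cluster and between-cluster sums of squares. The sketch below is only an illustration: the slides do not prescribe a particular measure, the data are hypothetical one-dimensional values, and the function names are made up for this example.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

def within_ss(clusters):
    """Sum of squared deviations of each value from its own cluster mean."""
    total = 0.0
    for members in clusters:
        centre = mean(members)
        total += sum((x - centre) ** 2 for x in members)
    return total

def between_ss(clusters):
    """Size-weighted squared deviations of the cluster means from the grand mean."""
    grand = mean([x for members in clusters for x in members])
    return sum(len(members) * (mean(members) - grand) ** 2 for members in clusters)

# Hypothetical data: the same six values partitioned in two different ways.
good = [[1.0, 2.0, 3.0], [11.0, 12.0, 13.0]]   # homogeneous clusters
poor = [[1.0, 2.0, 11.0], [3.0, 12.0, 13.0]]   # mixed clusters

for name, partition in (("good", good), ("poor", poor)):
    print(name, round(within_ss(partition), 2), round(between_ss(partition), 2))
```

Because the total sum of squares is fixed for a given data set, a partition that lowers the within-cluster part raises the between-cluster part by the same amount, which is why the two goals can be pursued simultaneously.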

Figure 12.1 Overview of clustering methods
Non-overlapping (exclusive) methods
  Hierarchical
    Agglomerative
      Linkage methods
        - Average: Between (1), Within (2), Weighted
        - Single: Ordinary (3), Density, Two-stage density
        - Complete (4)
      Centroid methods
        - Centroid (5)
        - Median (6)
      Variance methods
        - Ward (7)
    Divisive
  Non-hierarchical / Partitioning / k-means
    - Sequential threshold
    - Parallel threshold
    - Neural networks
    - Optimized partitioning (8)
Overlapping methods
  Non-hierarchical
    - Overlapping k-centroids
    - Overlapping k-means
    - Latent class techniques
    - Fuzzy clustering
    - Q-type factor analysis (9)
Name in SPSS: 1 Between-groups linkage; 2 Within-groups linkage; 3 Nearest neighbour; 4 Furthest neighbour; 5 Centroid clustering; 6 Median clustering; 7 Ward’s method; 8 K-means cluster; 9 (Factor)
Note: Methods set in italics in the original figure are available in SPSS. Neural networks necessitate SPSS’ data mining tool Clementine.

Figure 12.2 Illustration of important clustering issues in Figure 12.1
[Figure panels: non-overlapping versus overlapping; hierarchical versus non-hierarchical; agglomerative versus divisive.]
Single linkage: minimum distance
Complete linkage: maximum distance
Average linkage: average distance
Centroid method: distance between centres
Ward’s method: minimization of within-cluster variance

Euclidean distance (default in SPSS)
[Figure: points A (x1, y1) and B (x2, y2) plotted against the X and Y axes, with horizontal difference x2 - x1 and vertical difference y2 - y1.]
$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$
Other distances available in SPSS: City-block uses the sum of absolute differences of the coordinates instead of squared differences. Moreover: Minkowski distance, cosine distance, Chebychev distance, Pearson correlation.
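The distance measures named above can be sketched in a few lines. This is a minimal illustration of the textbook definitions, not of SPSS's implementation; the function names and the two points are hypothetical.

```python
import math

def euclidean(p, q):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def city_block(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def chebychev(p, q):
    """Largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(p, q))

def minkowski(p, q, r):
    """General form: r = 1 gives city-block, r = 2 gives Euclidean."""
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1.0 / r)

# Hypothetical points, chosen so that the results are easy to check by hand.
p = (0.0, 0.0)
q = (3.0, 4.0)

print("Euclidean: ", euclidean(p, q))
print("City-block:", city_block(p, q))
print("Chebychev: ", chebychev(p, q))
print("Minkowski (r = 3):", minkowski(p, q, 3))
```

For any pair of points the Chebychev distance is at most the Euclidean distance, which in turn is at most the city-block distance, so the choice of measure changes how strongly a difference on a single variable counts.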

Euclidean distance
[Figure: A = (1, 2) and B = (3, 5), with horizontal difference 3 - 1 and vertical difference 5 - 2.]
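Worked through for the two points on this slide, A = (1, 2) and B = (3, 5), the Euclidean formula gives the first line below. The city-block value in the second line is added for comparison and does not appear on the slide.

```latex
\begin{align*}
d_{\text{Euclidean}}(A, B) &= \sqrt{(3 - 1)^2 + (5 - 2)^2} = \sqrt{4 + 9} = \sqrt{13} \approx 3.61 \\
d_{\text{city-block}}(A, B) &= |3 - 1| + |5 - 2| = 2 + 3 = 5
\end{align*}
```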

Which two pairs of points are to be clustered first?
[Figure: scatter plot of the points A, B, C, D, E, F, G and H.]

Maybe A/B and D/E (depending on algorithm!)

Quo vadis, C?
[Figure: scatter plot of the points A, B, C, D, E, G and H.]

Quo vadis, C? (Continued)

How does one decide which cluster a new point is to join?
Measuring distances from the point to clusters or points:
“Farthest neighbour” (complete linkage)
“Nearest neighbour” (single linkage)
“Neighbourhood centre” (average linkage)

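Quo vadis, C? (Continued)
[Figure: distances from C to the surrounding points. Values shown near the group G, A, B: 10,5; 8,5; 7,0; 11,0. Values shown near the group D, E, H: 8,5; 9,0; 12,0; 9,5.]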

Complete linkage
Minimize longest distance from cluster to point
[Figure: the longest distance from C is 10,5 to the group G, A, B and 9,5 to the group D, E, H.]

Average linkage
Minimize average distance from cluster to point
[Figure: the average distance from C is 8,5 to the group G, A, B and 9,0 to the group D, E, H.]

Single linkage
Minimize shortest distance from cluster to point
[Figure: the shortest distance from C is 7,0 to the group G, A, B and 8,5 to the group D, E, H.]
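The three rules can be compared side by side in a short sketch. The coordinates below are hypothetical: they are chosen only so that the shortest, average and longest distances from C echo the values 7,0 / 8,5, 8,5 / 9,0 and 10,5 / 9,5 shown on the slides, and the names `point_to_cluster` and `choose_cluster` are made up for this illustration.

```python
import math

def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def point_to_cluster(point, members, rule):
    """Distance from a point to a cluster under a given linkage rule."""
    distances = [dist(point, m) for m in members]
    if rule == "single":      # nearest neighbour
        return min(distances)
    if rule == "complete":    # farthest neighbour
        return max(distances)
    if rule == "average":     # mean distance to all members
        return sum(distances) / len(distances)
    raise ValueError(f"unknown rule: {rule}")

def choose_cluster(point, clusters, rule):
    """Name of the cluster with the smallest distance under the rule."""
    return min(clusters, key=lambda name: point_to_cluster(point, clusters[name], rule))

# Hypothetical coordinates (assumption, not taken from the slides).
C = (0.0, 0.0)
clusters = {
    "group_1": [(0.0, 7.0), (0.0, 8.0), (0.0, 10.5)],   # spread out
    "group_2": [(8.5, 0.0), (9.0, 0.0), (9.5, 0.0)],    # compact
}

choices = {rule: choose_cluster(C, clusters, rule)
           for rule in ("single", "average", "complete")}
print(choices)
```

With these numbers single and average linkage send C to the spread-out group, while complete linkage sends it to the compact group, so the same data can yield different clusters depending on the rule.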

Single linkage: Pitfall
Chaining or snake-like clusters
Cluster formation begins: all the time the closest observation is put into the existing cluster(s)
A and C merge into the same cluster, omitting B!

Single linkage: Advantage
Good outlier detection and removal procedure in cases with “noisy” data sets
[Figure: scatter plot with the labels “Outliers” and “Entropy group”.]

Cluster analysis
More potential pitfalls & problems:
Do our data at all permit the use of means?
Some methods (e.g. Ward’s) are biased toward producing clusters with approximately the same number of observations.
Other methods (e.g. Centroid) require metric-scaled input data. So, strictly speaking, this algorithm should not be used when clustering data measured on Likert or semantic differential scales, which are ordinal rather than metric.

Cluster analysis: Small artificial example
[Figure: six points numbered 1 to 6, with the distances 0,42, 0,58, 0,68 and 0,92 marked between them.]
Note: 6 points yield 15 possible pairwise distances: [n*(n-1)]/2 = (6*5)/2 = 15
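The count of pairwise distances can be checked by enumerating the pairs directly. This is a small sketch; the observation labels 1 to 6 stand in for the six points, and `pairwise_count` is a name made up for this example.

```python
from itertools import combinations

def pairwise_count(n):
    """Number of distinct pairs among n observations: n * (n - 1) / 2."""
    return n * (n - 1) // 2

observations = [1, 2, 3, 4, 5, 6]
pairs = list(combinations(observations, 2))  # (1, 2), (1, 3), ..., (5, 6)

print(len(pairs), pairwise_count(len(observations)))
```

The count grows quadratically, so 100 observations already require 4950 distances, which is one reason hierarchical methods become costly on large samples.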

Cluster analysis: Small artificial example (Continued)
[Figure: the same six points with the distances 0,42, 0,58, 0,68 and 0,92.]

Dendrogram
Step 0: Each observation is treated as a separate cluster
[Figure: OBS 1 to OBS 6 listed on the vertical axis; horizontal axis “Distance Measure” from 0,2 to 1,0.]

Dendrogram (Continued)
Step 1: The two observations with the smallest pairwise distance are clustered (OBS 1 and OBS 2 form Cluster 1)

Dendrogram (Continued)
Step 2: Two other observations with the smallest distance amongst the remaining points/clusters are clustered (Cluster 2)

Dendrogram (Continued)
Step 3: Observation 3 joins Cluster 1

Dendrogram (Continued)
Step 4: Clusters 1 and 2 from Step 3 join into a “Supercluster”
A single observation remains unclustered (outlier)
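The stepwise merging behind a dendrogram can be sketched as a naive agglomerative loop. Several things here are assumptions rather than the slides' method: single linkage is used as the merging rule, the six coordinates are hypothetical and are not the data of the artificial example, and `stop_distance` is a made-up parameter that halts merging so that a distant observation stays unclustered.

```python
import math
from itertools import combinations

def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def single_link(c1, c2, points):
    """Shortest distance between any member of c1 and any member of c2."""
    return min(dist(points[i], points[j]) for i in c1 for j in c2)

def agglomerate(points, stop_distance=None):
    """Merge the two closest clusters repeatedly; return the history and final clusters."""
    # Step 0: each observation is its own cluster.
    clusters = [frozenset([name]) for name in points]
    history = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: single_link(pair[0], pair[1], points))
        d = single_link(a, b, points)
        if stop_distance is not None and d > stop_distance:
            break  # remaining clusters are too far apart to merge
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        history.append((sorted(a), sorted(b), round(d, 2)))
    return history, clusters

# Hypothetical coordinates for six observations (assumption).
points = {
    1: (0.0, 0.0), 2: (0.4, 0.0), 3: (0.0, 0.7),
    4: (5.0, 5.0), 5: (1.3, 0.0), 6: (1.3, 0.55),
}

history, clusters = agglomerate(points, stop_distance=1.0)
for left, right, d in history:
    print(left, "+", right, "at distance", d)
print("final clusters:", [sorted(c) for c in clusters])
```

Each entry in `history` corresponds to one join in a dendrogram, with the merge distance giving the height of the join. A library routine would normally be used in practice; the explicit loop is only meant to make the steps visible.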

Textbooks in Cluster Analysis
Brian S. Everitt
Maurice Lorr: Cluster Analysis for Social Scientists, 1983
Charles Romesburg: Cluster Analysis for Researchers, 1984
Aldenderfer and Blashfield: Cluster Analysis, 1984

Case: Clustering of beer brands
Brand profiles based on the 17 semantic differential scales
Purpose: to determine the market structure in terms of similar/different brands
Hypothesis: the market structure reflects the competitive structure among brands due to consumers’ behaviour
