1 Cluster Analysis Objectives ADDRESS HETEROGENEITY Combine observations into groups or clusters such that groups formed are homogeneous (similar) within.

Presentation transcript:

1 Cluster Analysis Objectives
ADDRESS HETEROGENEITY
Combine observations into groups or clusters such that the groups formed are homogeneous (similar) within the group and heterogeneous (different) from other groups on some variables (?).
When we don’t have “some variables”, we can still form groups using Multidimensional Scaling (MDS) techniques.
  MDS - continuous space
  Cluster - discrete groups
Main application in marketing: market segmentation
Data requirement ~ generally interval or ratio (ordinal and nominal ??)
Steps
  Decide on measures of distance (similarity or dissimilarity)
  Hierarchical cluster ~ decide on how to combine observations
  Non-hierarchical cluster (K-means or quick cluster)
  Interpretation of clusters
  How many clusters
  Cluster validation

2 Cluster Analysis: Measures of Distance ~ Similarity or Dissimilarity
Two types of measures of distance (or proximity, similarity)
  Direct ~ we shall use in MDS
  Indirect » derived from original variables or factor scores
Indirect measures of distance
  Non-metric ~ we shall use in MDS
  Metric data
    Euclidean distance
    Minkowski distance
    Mahalanobis distance
Distance between BMW and Ford: Euclidean, Minkowski, Mahalanobis (i = BMW, j = Ford, k = number of variables)
[Figure: axes v1 and v2; ED = Euclidean distance]
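The three metric distances named on this slide can be sketched as follows. This is a minimal illustration, not the presenter's own computation: the two-variable scores for BMW and Ford and the covariance matrix are hypothetical values chosen for the example, and the Mahalanobis helper is written only for the two-variable (v1, v2) case.

```python
import math

def euclidean(x, y):
    """Square root of the summed squared differences over the k variables."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def minkowski(x, y, p):
    """Generalised distance; p=1 gives city-block, p=2 gives Euclidean."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def mahalanobis_2d(x, y, cov):
    """Distance that accounts for variances and covariance (two variables only)."""
    (s11, s12), (_, s22) = cov          # cov is assumed symmetric
    det = s11 * s22 - s12 * s12
    d1, d2 = x[0] - y[0], x[1] - y[1]
    quad = (d1 * d1 * s22 - 2 * d1 * d2 * s12 + d2 * d2 * s11) / det
    return math.sqrt(quad)

# Hypothetical scores on v1 and v2 (not taken from the slides).
bmw = (9.0, 3.0)
ford = (5.0, 6.0)
# Hypothetical covariance matrix of v1 and v2.
cov = ((4.0, 1.0), (1.0, 1.0))

print(euclidean(bmw, ford))        # 5.0
print(minkowski(bmw, ford, 1))     # 7.0
print(mahalanobis_2d(bmw, ford, cov))
```

With p = 2 the Minkowski distance reduces to the Euclidean distance, and with an identity covariance matrix the Mahalanobis distance does as well.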

3 Cluster Analysis: Hierarchical Clustering
Methods to combine observations
  Centroid
  Nearest neighbor or single linkage
  Farthest neighbor or complete linkage
  Average linkage
  Ward’s
[Figure: centroid method and nearest-neighbor dendrogram for observations s1 to s6; axis: distance]
Data should be scaled?
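A rough sketch of how three of the combination rules above (single, complete and average linkage) differ in practice. The five two-dimensional observations are made up for illustration, and the function names are hypothetical; the centroid and Ward's methods are not covered here.

```python
import math
from itertools import combinations

def linkage_distance(a, b, points, method):
    """Distance between clusters a and b (tuples of row indices)."""
    dists = [math.dist(points[i], points[j]) for i in a for j in b]
    if method == "single":      # nearest neighbor
        return min(dists)
    if method == "complete":    # farthest neighbor
        return max(dists)
    if method == "average":     # average linkage
        return sum(dists) / len(dists)
    raise ValueError(f"unknown method: {method}")

def agglomerate(points, method="single"):
    """Merge the two closest clusters repeatedly; return the merge history."""
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: linkage_distance(pair[0], pair[1], points, method),
        )
        height = linkage_distance(a, b, points, method)
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, height))
    return merges

# Hypothetical observations (not from the slides).
observations = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.5), (10.0, 0.0)]

for a, b, height in agglomerate(observations, "single"):
    print(a, b, round(height, 3))
```

The merge heights are what a dendrogram plots on its distance axis. Once two observations are merged they stay together, which is the "no reassignment" property of hierarchical methods.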

4 Cluster Analysis: Non-Hierarchical Clustering
K-Means Cluster / Quick Cluster
The data are divided into k groups, each group representing a cluster.
STEPS
  Select k initial cluster centroids, where k is the number of clusters desired
  Assign each observation to the cluster to which it is closest
  Reassign or relocate each observation to one of the k clusters until a predetermined stopping rule is met
Say we want 3 clusters and the first 3 observations are the centroids.
Change criterion: continue if > 2%
Which Clustering Method is Best?
1. Hierarchical ~ Which one to use?
  ~ Advantage: no prior knowledge of the number of clusters
  ~ Disadvantage: once assigned, no reassignment
2. K-Means / Quick Cluster
  ~ Requires prior knowledge: how many clusters?
Complementary: run hierarchical, decide on the number of clusters, run K-Means
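The steps on this slide might look roughly like the sketch below. It uses the first k observations as initial centroids, as in the slide's example. The reading of the 2% change criterion is an assumption: iteration continues while the largest centroid shift exceeds 2% of the smallest distance between the initial centroids. The data are hypothetical.

```python
import math
from itertools import combinations

def k_means(points, k, change_criterion=0.02, max_iter=100):
    """K-means with the first k observations as starting centroids (k >= 2)."""
    centroids = [tuple(p) for p in points[:k]]
    # Assumed stopping rule: compare centroid shifts with the smallest
    # distance between the initial centroids.
    min_initial_gap = min(math.dist(a, b) for a, b in combinations(centroids, 2))
    labels = []
    for _ in range(max_iter):
        # Assign each observation to the closest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        largest_shift = max(
            math.dist(old, new) for old, new in zip(centroids, new_centroids)
        )
        centroids = new_centroids
        if largest_shift <= change_criterion * min_initial_gap:
            break  # change is within the 2% criterion, so stop
    return labels, centroids

# Hypothetical observations; the first three serve as initial centroids.
data = [(1.0, 1.0), (5.0, 5.0), (9.0, 1.0), (1.2, 0.8),
        (5.2, 4.8), (8.8, 1.2), (0.8, 1.2)]
labels, centroids = k_means(data, k=3)
print(labels)  # [0, 1, 2, 0, 1, 2, 0]
```

Unlike the hierarchical methods, observations can change cluster between iterations, but k has to be supplied up front, which is why running a hierarchical method first to pick k is a natural complement.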

5 Interpretation of Clusters: Pseudo F
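The slide names the pseudo F statistic without giving its formula. A common definition, assumed here, is the ratio of between-cluster to within-cluster sums of squares, each divided by its degrees of freedom (k - 1 and n - k). The sketch below uses made-up points; larger values suggest tighter, better separated clusters.

```python
def pseudo_f(points, labels):
    """(between SS / (k - 1)) / (within SS / (n - k)) for a given assignment."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    n, k = len(points), len(groups)

    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    grand_mean = mean(points)
    between = sum(
        len(rows) * squared_distance(mean(rows), grand_mean)
        for rows in groups.values()
    )
    within = sum(
        squared_distance(p, mean(rows))
        for rows in groups.values()
        for p in rows
    )
    return (between / (k - 1)) / (within / (n - k))

# Hypothetical points with a well separated and a poorly separated assignment.
pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
print(pseudo_f(pts, [0, 0, 1, 1]))  # 50.0
print(pseudo_f(pts, [0, 1, 0, 1]))  # 0.08
```

Comparing the statistic across candidate values of k is one way to approach the "how many clusters" question.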

6 Cluster Analysis: Validation
S1 = assignment based on cluster on 1-14 cases
S2 = assignment based on separate cluster
Hit rate = 112/151 = 74%
Cross-validation example from text
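The hit rate above is the share of cases on which the two assignments agree. A minimal sketch, assuming the cluster labels in S1 and S2 have already been aligned (cluster numbering is arbitrary, so this has to be checked first); the two label lists are synthetic and only reproduce the 112 out of 151 ratio.

```python
def hit_rate(s1, s2):
    """Proportion of cases given the same cluster label in both assignments."""
    if len(s1) != len(s2):
        raise ValueError("assignments must cover the same cases")
    hits = sum(1 for a, b in zip(s1, s2) if a == b)
    return hits / len(s1)

# Synthetic labels: 151 cases, 112 of which agree (not the textbook data).
s1 = [1] * 151
s2 = [1] * 112 + [2] * 39

print(f"{hit_rate(s1, s2):.0%}")  # 74%
```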

7 Latent Segments Model to Incorporate Heterogeneity

8 Introduction
Customer segmentation - partition consumers into homogeneous groups that differ in purchasing behavior
It provides information about consumer preferences and market structure at the segment level
Consumers with similar socio-demographics have different purchasing behavior
Brand choice probabilities can be used to define both market segments and market structure
Theoretical model: multinomial logit
  Conceptual appeal, being grounded in economic theory
  Analytical tractability and ease of econometric estimation
  Excellent empirical performance

9 Kamakura and Russell (1989) propose and test latent segmentation.
Numerous applications and citations (200+).
Discrete interpretation of a continuous distribution.
A number of useful applications in marketing and other areas.
In our own work, used to determine the size of the price-sensitive segment (25% to 35%).
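To illustrate the idea of latent segments built on a multinomial logit model, the sketch below computes brand choice probabilities within each segment and mixes them by segment size. It is a simplified illustration rather than the estimation procedure from Kamakura and Russell (1989): the segment sizes and utilities are hypothetical, and in practice they would be estimated from purchase data.

```python
import math

def logit_probabilities(utilities):
    """Multinomial logit: exp(u_j) / sum of exp(u_k) over the brands."""
    top = max(utilities)                      # subtract the max for numerical stability
    exp_u = [math.exp(u - top) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

def latent_segment_probabilities(segment_sizes, segment_utilities):
    """Aggregate choice probabilities as a size-weighted mix of segment logits."""
    n_brands = len(segment_utilities[0])
    mix = [0.0] * n_brands
    for size, utilities in zip(segment_sizes, segment_utilities):
        for j, p in enumerate(logit_probabilities(utilities)):
            mix[j] += size * p
    return mix

# Hypothetical two-segment, two-brand example. The 0.3 share stands in for a
# price-sensitive segment within the 25% to 35% range mentioned above.
sizes = [0.3, 0.7]
utilities = [
    [0.0, 0.0],            # price-sensitive segment: indifferent between brands
    [0.0, math.log(3.0)],  # other segment: favors brand 2 three to one
]

print(latent_segment_probabilities(sizes, utilities))
```

A finite number of segments, each with its own logit parameters, is what gives the "discrete interpretation of a continuous distribution" of consumer preferences.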