Chapter 20: Cluster Analysis. Naresh K. Malhotra


Chapter 20: Cluster Analysis. Naresh K. Malhotra, Marketing Research: An Applied Orientation, 4th ed.

Cluster Analysis
Cluster analysis is a class of techniques used to classify objects or cases into relatively homogeneous groups called clusters. Objects in each cluster tend to be similar to each other and dissimilar to objects in the other clusters.
Areas of application of cluster analysis: segmenting the market; understanding buyer behavior; identifying new product opportunities; selecting test markets; reducing data.

Conducting Cluster Analysis
It is a six-step process:
1. Formulate the problem
2. Select a distance measure
3. Select a clustering procedure
4. Decide on the number of clusters
5. Interpret and profile clusters
6. Assess the validity of clustering

Cluster Analysis Process: Formulate the Problem
Formulating the problem means selecting the variables on which the clustering is based. Inclusion of even one or two irrelevant variables may distort the clustering solution. The distance measure determines how similar or dissimilar the objects being clustered are.

Cluster Analysis Process: Select a Distance or Similarity Measure
The objective of clustering is to group similar objects together, so a measure of how similar or different two objects are is needed. Commonly used measures are:
- Euclidean distance or its square
- City-block or Manhattan distance
- Chebychev distance
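The measures listed above can be sketched in a few lines of Python. This is only an illustration, not part of the slides: the function names and the two respondent profiles are hypothetical, and the variables are assumed to be on comparable scales.

```python
import math

def squared_euclidean(a, b):
    # Sum of squared differences across all variables.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean(a, b):
    # Square root of the sum of squared differences.
    return math.sqrt(squared_euclidean(a, b))

def city_block(a, b):
    # City-block (Manhattan) distance: sum of absolute differences.
    return sum(abs(x - y) for x, y in zip(a, b))

def chebychev(a, b):
    # Chebychev distance: largest absolute difference on any one variable.
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical ratings of two respondents on three attitude variables.
respondent_1 = [6, 4, 7]
respondent_2 = [2, 3, 1]

print(squared_euclidean(respondent_1, respondent_2))  # 53
print(city_block(respondent_1, respondent_2))         # 11
print(chebychev(respondent_1, respondent_2))          # 6
```

The three measures can rank pairs of objects differently, which is one reason the choice of measure is treated as a separate step.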

Cluster Analysis Process: Select a Clustering Procedure
Two approaches:
1. Hierarchical
   a. Agglomerative
      i. Linkage methods
         - Single linkage
         - Complete linkage
         - Average linkage
      ii. Variance methods (Ward’s method)
      iii. Centroid methods
   b. Divisive
2. Nonhierarchical
   a. Sequential threshold
   b. Parallel threshold
   c. Optimizing partitioning
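As a rough illustration of the agglomerative branch of this outline, the sketch below implements single linkage only, in plain Python. The function name, the data points, and the shape of the returned schedule are assumptions made for the example; they are not taken from the slides or from any statistics package.

```python
import math

def single_linkage(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    Single linkage defines the distance between two clusters as the
    smallest distance between any member of one and any member of the other.
    Returns the final clusters (lists of point indices) and the merge schedule.
    """
    clusters = [[i] for i in range(len(points))]  # start with one object per cluster
    schedule = []  # one (members_a, members_b, distance) entry per merge
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(points[p], points[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        schedule.append((sorted(clusters[i]), sorted(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, schedule

# Hypothetical objects measured on two variables.
points = [(1, 1), (1.5, 1), (5, 5), (5, 5.5), (9, 1)]
clusters, schedule = single_linkage(points, n_clusters=3)
print(clusters)  # [[0, 1], [2, 3], [4]]
```

Replacing `min` with `max` in the inner distance would give complete linkage, and replacing it with the mean of the pairwise distances would give average linkage. Ward's method and the nonhierarchical procedures follow different logic and are not covered by this sketch.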

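Cluster Analysis Process: Decide on the Number of Clusters
Guidelines:
- Theoretical, conceptual, or practical considerations may suggest a certain number of clusters.
- In hierarchical clustering, the distances at which clusters are combined can be used as a criterion.
- In nonhierarchical clustering, the ratio of total within-group variance to between-group variance can be plotted against the number of clusters.
- The relative sizes of the clusters should be meaningful.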
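One way to apply the hierarchical guideline is to look for the largest jump in the distances at which clusters are merged and to stop just before it. The sketch below is a heuristic under that assumption; the function name and the merge distances are hypothetical, and the result should still be weighed against the other considerations on this slide.

```python
def suggest_n_clusters(merge_distances):
    """Suggest a cluster count from the distances of successive merges.

    merge_distances[k] is the distance at which merge k + 1 took place,
    so n objects produce n - 1 entries. After s merges, n - s clusters remain.
    """
    if len(merge_distances) < 2:
        raise ValueError("need at least two merges to compare distances")
    n_objects = len(merge_distances) + 1
    jumps = [later - earlier
             for earlier, later in zip(merge_distances, merge_distances[1:])]
    merges_before_jump = jumps.index(max(jumps)) + 1
    return n_objects - merges_before_jump

# Hypothetical merge distances for six objects (five merges).
print(suggest_n_clusters([0.5, 0.5, 0.7, 5.3, 5.7]))  # 3
```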

Cluster Analysis Process: Interpret and Profile the Clusters
Interpreting and profiling clusters involves examining the cluster centroids. The centroids represent the mean values of the objects contained in the cluster on each of the variables. The variables that significantly differentiate between clusters can be identified via discriminant analysis and one-way analysis of variance.
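The centroid calculation described here is a per-cluster mean on each variable. A minimal sketch, assuming the data are rows of numeric variables and each row already carries a cluster label (the function name, ratings, and labels are all hypothetical):

```python
def cluster_centroids(rows, labels):
    """Return {cluster label: list of variable means for that cluster}."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        if label not in sums:
            sums[label] = [0.0] * len(row)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], row)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

# Hypothetical ratings on two variables, with cluster membership per row.
rows = [[6, 4], [7, 3], [2, 6], [1, 7], [2, 5]]
labels = [1, 1, 2, 2, 2]

centroids = cluster_centroids(rows, labels)
print(centroids[1])  # [6.5, 3.5]
```

Comparing centroids variable by variable shows which variables separate the clusters most; the significance tests mentioned above are not part of this sketch.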

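Cluster Analysis Process: Assess Reliability and Validity
- Perform cluster analysis on the same data using different distance measures, and compare the results across measures.
- Use different methods of clustering and compare the results.
- Split the data randomly into halves, cluster each half separately, and compare the cluster centroids across the two subsamples.
- Delete variables randomly, cluster on the reduced set of variables, and compare the results with those based on the full set.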
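The slides say to compare the results of different runs but do not name a comparison statistic. One common choice, used here purely as an assumption, is the Rand index: the share of object pairs that two solutions treat the same way (together in both, or apart in both). The function name and the two label vectors are hypothetical.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of object pairs on which two clusterings agree (0 to 1)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both solutions must label the same objects")
    agree = total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_in_a = labels_a[i] == labels_a[j]
        same_in_b = labels_b[i] == labels_b[j]
        agree += same_in_a == same_in_b
        total += 1
    return agree / total

# Hypothetical memberships of five objects under two distance measures.
solution_euclidean = [1, 1, 2, 2, 3]
solution_city_block = [2, 2, 1, 1, 1]

print(rand_index(solution_euclidean, solution_city_block))  # 0.8
```

Because the index only looks at pairs, it does not matter that the two solutions number their clusters differently. A value near 1 suggests the solution is stable across the two runs.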

Cluster Analysis Process: SPSS Windows
Finding the Agglomeration Schedule, Cluster Membership, and Icicle Plot:
1. Analyze > Classify > Hierarchical Cluster
2. Statistics > check Agglomeration schedule, and check Range of solutions > 2 to 4 (for 2 to 4 clusters)
3. Plots > check Dendrogram
4. Method > Cluster Method > Ward’s method
5. OK