Published by Itzel Hand
Modified about 1 year ago
Marketing Research Aaker, Kumar, Day Seventh Edition Instructor’s Presentation Slides
Chapter Twenty-One Factor and Cluster Analysis
© Marketing Research 7th Edition, Aaker, Kumar, Day
Factor Analysis
  A technique that combines questions or variables to create new factors.
  Purpose:
  - To identify underlying constructs in the data
  - To reduce the number of variables to a more manageable set
Factor Analysis (Contd.)
  Methodology: two commonly employed factor analytic procedures
  - Principal Component Analysis: used when the need is to summarize the information in a larger set of variables in a smaller set of factors
  - Common Factor Analysis: used to uncover the underlying dimensions surrounding the original variables
Factor Analysis (Contd.)
  Principal Component Analysis
  The objective is to represent each of the original variables as a linear combination of a smaller set of factors. With two factors, this can be written as
  $$X_1 = I_{11} F_1 + I_{12} F_2 + e_1$$
  $$X_2 = I_{21} F_1 + I_{22} F_2 + e_2$$
  $$\vdots$$
  $$X_n = I_{n1} F_1 + I_{n2} F_2 + e_n$$
  where
  - $X_1, \ldots, X_n$ are the standardized scores on the original variables
  - $F_1, F_2$ are the two standardized factor scores
  - $I_{11}, I_{12}, \ldots, I_{n2}$ are the factor loadings
  - $e_1, \ldots, e_n$ are the error terms (the part of each variable not accounted for by the factors)
Factor Analysis (Contd.)
  Factor: a variable or construct that is not directly observable but must be inferred from the input variables
  Eigenvalue criterion: an eigenvalue represents the amount of variance in the original variables that is associated with a factor
  Scree plot criterion: a plot of the eigenvalues against the number of factors, in order of extraction
Factor Analysis (Contd.)
  Percentage of variance criterion: the number of factors extracted is chosen so that the cumulative percentage of variance extracted by the factors reaches a satisfactory level
  Significance test criterion: the statistical significance of the separate eigenvalues is determined, and only the factors that are statistically significant are retained
Factor Analysis (Contd.)
  Factor scores: the values of each factor underlying the variables
  Factor loadings: the correlations between the factors and the original variables
Factor Analysis (Contd.)
  Communality: the amount of a variable's variance that is explained by the factors
  Factor rotation: factor analysis can generate several solutions for any data set. Each solution is termed a particular factor rotation and is generated by a particular factor rotation scheme
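
As a rough illustration of communality, the sketch below sums the squared loadings of each variable across the retained factors. The loadings are made-up numbers, and the calculation assumes standardized variables and uncorrelated (orthogonal) factors.

```python
# Hypothetical loadings of four variables on two retained factors.
# Rows are variables, columns are factors; the values are illustrative only.
loadings = [
    [0.9, 0.1],
    [0.8, 0.3],
    [0.2, 0.7],
    [0.1, 0.6],
]


def communality(row):
    """Variance of one standardized variable explained by the factors.

    Assumes orthogonal factors, so the squared loadings simply add up.
    """
    return sum(loading ** 2 for loading in row)


communalities = [round(communality(row), 2) for row in loadings]
print(communalities)
```

A communality near 1 means the factors reproduce most of that variable's variance; a low value suggests the variable is poorly represented by the solution.
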
Factor Analysis (Contd.)
  How Many Factors?
  Rule of thumb: every included factor (prior to rotation) must explain at least as much variance as an "average variable"
  Eigenvalue criterion:
  - An eigenvalue represents the amount of variance in the original variables associated with a factor
  - The eigenvalue of a factor equals the sum of the squared loadings of all the variables on that factor
  - Only factors with eigenvalues greater than 1.0 are retained
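
The eigenvalue criterion can be sketched with NumPy as follows. The correlation matrix is invented for illustration, and the loadings are computed the principal-component way (eigenvector scaled by the square root of its eigenvalue), which is an assumption about the extraction method rather than something the slides spell out.

```python
import numpy as np

# Illustrative correlation matrix for four standardized variables (made-up
# values): variables 1-2 move together, and so do variables 3-4.
R = np.array([
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
])

# eigh is meant for symmetric matrices and returns eigenvalues in ascending
# order, so reorder to list factors in order of extraction (largest first).
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Principal component loadings: each eigenvector scaled by sqrt(eigenvalue).
loadings = eigenvectors * np.sqrt(eigenvalues)

# Summing squared loadings down each column recovers that factor's eigenvalue.
sum_sq_loadings = (loadings ** 2).sum(axis=0)

# Eigenvalue criterion: keep only factors with an eigenvalue above 1.0.
n_retained = int((eigenvalues > 1.0).sum())

print(np.round(eigenvalues, 2))
print(np.round(sum_sq_loadings, 2))
print("factors retained:", n_retained)
```

Because each standardized variable has a variance of 1, a factor with an eigenvalue above 1.0 explains more than one "average variable", which is why the rule of thumb and the eigenvalue criterion agree.
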
Factor Analysis (Contd.)
  Scree plot criterion: the eigenvalues are plotted against the number of factors in order of extraction, and the shape of the plot determines the number of factors
  Percentage of variance criterion: the number of factors extracted is chosen so that the cumulative percentage of variance extracted by the factors reaches a satisfactory level
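
A minimal sketch of the percentage of variance criterion, assuming a hypothetical set of eigenvalues and an 80% cutoff. The cutoff is the analyst's choice, not a value taken from the slides.

```python
from itertools import accumulate

# Hypothetical eigenvalues for four standardized variables, in order of
# extraction. They sum to 4, the total variance of four standardized variables.
eigenvalues = [1.8, 1.6, 0.4, 0.2]
threshold = 80.0  # assumed "satisfactory" level, in percent

total = sum(eigenvalues)
cumulative_pct = [100.0 * s / total for s in accumulate(eigenvalues)]

# Smallest number of factors whose cumulative share reaches the threshold.
n_factors = next(
    k for k, pct in enumerate(cumulative_pct, start=1) if pct >= threshold
)

for k, pct in enumerate(cumulative_pct, start=1):
    print(f"{k} factor(s): {pct:.1f}% of variance")
print("factors retained:", n_factors)
```

The same list of eigenvalues, plotted against the factor number, is what a scree plot shows; the drop after the second value is the kind of bend the scree plot criterion looks for.
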
Factor Analysis (Contd.)
  Common Factor Analysis
  - The factor extraction procedure is similar to that of principal component analysis, except for the input correlation matrix
  - Communalities (shared variance) are inserted in the diagonal of the original variable correlation matrix in place of unities
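
The diagonal substitution can be sketched as below. The slides do not say how the communalities are estimated; this example assumes the common choice of squared multiple correlations as initial estimates, and the correlation matrix is invented.

```python
import numpy as np

# Illustrative correlation matrix for four standardized variables (made-up).
R = np.array([
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
])

# Assumed initial communality estimate: the squared multiple correlation of
# each variable with all the others, computed as 1 - 1 / diag(R^-1).
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))

# Reduced correlation matrix: communalities replace the unities on the diagonal.
R_reduced = R.copy()
np.fill_diagonal(R_reduced, smc)

# Extraction then proceeds on R_reduced as it would on R in the principal
# component case. eigvalsh returns ascending order, so reverse it.
eigenvalues = np.linalg.eigvalsh(R_reduced)[::-1]

print(np.round(np.diag(R_reduced), 2))
print(np.round(eigenvalues, 2))
```

Some eigenvalues of the reduced matrix come out negative here. That is expected once the diagonal holds values below 1, and it is one reason the "greater than 1.0" cutoff is usually discussed for the unreduced matrix.
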
Cluster Analysis
  A technique that combines objects to create new groups
  - Used to group variables, objects, or people
  - The input is any valid measure of similarity between objects, such as correlations, distance measures (e.g., Euclidean distance), or association coefficients
  - The number of clusters or the level of clustering can also be input
Cluster Analysis (Contd.)
  Hierarchical clustering: can start with all objects in one cluster and divide and subdivide them until every object is in its own single-object cluster
  Non-hierarchical approach: permits objects to leave one cluster and join another as the clusters are being formed
Hierarchical Clustering
  Single linkage: clustering criterion based on the shortest distance
  Complete linkage: clustering criterion based on the longest distance
  Average linkage: clustering criterion based on the average distance
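
The three linkage criteria differ only in how the pairwise distances between two clusters are summarized. A small sketch with two invented clusters of 2-D points, using Euclidean distance as the (assumed) distance measure:

```python
from itertools import product
from math import dist  # Euclidean distance, available from Python 3.8

# Two illustrative clusters of points in the plane (made-up coordinates).
cluster_a = [(0.0, 0.0), (0.0, 1.0)]
cluster_b = [(3.0, 0.0), (4.0, 0.0)]

# Distances between every object in one cluster and every object in the other.
pairwise = [dist(p, q) for p, q in product(cluster_a, cluster_b)]

single_linkage = min(pairwise)                     # shortest distance
complete_linkage = max(pairwise)                   # longest distance
average_linkage = sum(pairwise) / len(pairwise)    # average distance

print(f"single:   {single_linkage:.3f}")
print(f"complete: {complete_linkage:.3f}")
print(f"average:  {average_linkage:.3f}")
```

At each step of an agglomerative procedure, the pair of clusters with the smallest value of the chosen criterion would be merged.
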
Hierarchical Clustering (Contd.)
  Ward's method: based on the loss of information that results from grouping the objects into clusters (minimizes within-cluster variation)
  Centroid method: based on the distance between the group centroids (the centroid is the point whose coordinates are the means of all the observations in the cluster)
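
Both criteria can be sketched for a single candidate merge. The example below measures Ward's "loss of information" as the increase in within-cluster sum of squares caused by the merge, which is the usual reading but an assumption here; the points are invented.

```python
from math import dist


def centroid(points):
    """Point whose coordinates are the means of the observations."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def within_ss(points):
    """Sum of squared Euclidean distances from each point to the centroid."""
    c = centroid(points)
    return sum(dist(p, c) ** 2 for p in points)


# Two illustrative clusters (made-up coordinates).
cluster_a = [(0.0, 0.0), (0.0, 2.0)]
cluster_b = [(4.0, 0.0), (4.0, 2.0)]

# Centroid method: distance between the two group centroids.
centroid_distance = dist(centroid(cluster_a), centroid(cluster_b))

# Ward's method: how much within-cluster variation the merge would add.
ward_cost = (
    within_ss(cluster_a + cluster_b)
    - within_ss(cluster_a)
    - within_ss(cluster_b)
)

print("centroid distance:", centroid_distance)
print("Ward merge cost:", ward_cost)
```

Under either method the pair of clusters with the smallest value is the one merged next.
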
Non-hierarchical Clustering
  Sequential threshold: a cluster center is selected and all objects within a prespecified threshold are grouped with it
  Parallel threshold: several cluster centers are selected, and objects within the threshold level are assigned to the nearest center
  Optimizing: modifies the other two methods in that objects can later be reassigned to clusters on the basis of optimizing some overall criterion measure
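
A minimal sketch of the sequential threshold idea. The slides do not say how a cluster center is chosen, so this version assumes the center is simply the first object not yet clustered; the function name, data, and threshold are hypothetical.

```python
from math import dist


def sequential_threshold(points, threshold):
    """Form one cluster at a time around a center object.

    Assumption: the center is the first object that is still unassigned.
    """
    unassigned = list(points)
    clusters = []
    while unassigned:
        center = unassigned[0]
        # Group every remaining object that lies within the threshold.
        members = [p for p in unassigned if dist(p, center) <= threshold]
        clusters.append(members)
        # Objects outside the threshold wait for a later center.
        unassigned = [p for p in unassigned if dist(p, center) > threshold]
    return clusters


# Illustrative objects in the plane (made-up coordinates).
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (20, 0)]
clusters = sequential_threshold(points, threshold=2.0)

for i, members in enumerate(clusters, start=1):
    print(f"cluster {i}: {members}")
```

The parallel threshold variant would pick all the centers up front and assign each object to the nearest one, and an optimizing procedure would add a reassignment pass afterwards.
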
Number of Clusters
  The appropriate number of clusters can be determined in one of four ways:
  - The number of clusters can be specified by the analyst in advance
  - The levels of clustering can be specified by the analyst in advance
  - The number of clusters can be determined from the pattern of clusters generated by the program
  - The ratio of the within-group variance to the between-group variance can be plotted against the number of clusters; the point at which a sharp bend occurs indicates the number of clusters
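
The fourth approach can be illustrated with toy data. In this sketch the candidate partitions are specified by hand rather than produced by a clustering program, and sums of squares stand in for the variances; both are simplifying assumptions.

```python
def mean(values):
    return sum(values) / len(values)


def variance_ratio(clusters):
    """Within-group sum of squares divided by between-group sum of squares."""
    grand_mean = mean([x for c in clusters for x in c])
    within = sum((x - mean(c)) ** 2 for c in clusters for x in c)
    between = sum(len(c) * (mean(c) - grand_mean) ** 2 for c in clusters)
    return within / between


# One-dimensional toy data with three obvious groups, and hand-made candidate
# partitions for each number of clusters.
partitions = {
    2: [[1, 2, 3, 11, 12, 13], [21, 22, 23]],
    3: [[1, 2, 3], [11, 12, 13], [21, 22, 23]],
    4: [[1, 2, 3], [11, 12, 13], [21, 22], [23]],
    5: [[1, 2], [3], [11, 12, 13], [21, 22], [23]],
}

ratios = {k: variance_ratio(p) for k, p in partitions.items()}

for k, ratio in ratios.items():
    print(f"{k} clusters: within/between = {ratio:.4f}")
```

The ratio falls steeply from two to three clusters and then barely changes, so the bend at three clusters is the point this criterion would pick.
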