Variable Cluster Analysis: A useful approach to identify underlying dimensions of a questionnaire

Usree Kirtania, MS; Cynthia Davis, MS
Institute for Community Health Promotion, Brown University, Nov 2006

OBJECTIVE: To identify underlying dimensions of a questionnaire using the variable cluster analysis (VARCLUS) approach.

Introduction
____________
Variable Cluster Analysis (implemented in SAS through PROC VARCLUS) is another variable reduction method that often has distinct advantages over the traditional Factor Analysis (FA) approach. This method borrows some ideas from Factor Analysis and some from Hierarchical Clustering, and produces either disjoint or hierarchical clusters. These distinct clusters from VARCLUS help to identify the underlying dimensions of a questionnaire, which are essential for developing a well-constructed scale score.

Data
____
We applied the VARCLUS method to a 94-item food habit questionnaire (STFHQ) from the SISTERTALK study, which is a weight control intervention program for African American women (N=461).
Each introductory item is followed by several behavioral items. We used 57 behavioral items in the analysis.
Example:
Introductory item: How often did you eat bacon or sausage?
Behavioral item: How often was it low fat or turkey bacon?
Multi level responses: (Almost always/often, sometimes, rarely, never).
For all behavioral items, a higher score indicates higher fat intake behavior.
Missing values generated in behavioral items due to the response 'never' in an introductory item were imputed with zero.

Fat related eating behaviors using VARCLUS procedure
____________________________________________________
Preliminary VARCLUS suggested 7 distinct clusters. 59% of total variation is explained by these 7 clusters.
Cluster items and their 1-R² ratios are presented below, as Cluster (α): Item (1-R² ratio).

Restaurant (α=0.72): Sitdn1 (0.41), Sitdn3 (0.41), Sitdn5 (0.52), Sitdn7 (0.50)
Lean fat food (α=0.48): Grdmt1 (0.52), Grdmt2 (0.59), Redmt1 (0.62), Redmt3 (0.52)
Higher fat food (α=0.74): Chick1 (0.39), Ffish1 (0.56), Sitdn6 (0.58), Ffood6 (0.60)
Chinese food (α=0.65): Chins1 (0.32), Chins2 (0.46), Chins4 (0.50)
Milk fat (α=0.71): Milk1 (0.19), Milk21 (0.16)
Fruit as snack/dessert (α=0.76): Otdes4 (0.23), Snack3 (0.23)
Adding fat (α=0.59): Hotcr1 (0.49), Sandw1 (0.51), Potat3 (0.57)

Fat related eating behaviors using Factor Analysis
__________________________________________________
FA produced 7 factors. 62% of total variation is explained by these 7 factors.
Higher fat food factor (8 items, α=0.74)
Chinese food factor (3 items, α=0.65)
Restaurant factor (4 items, α=0.72)
Milk fat factor (3 items, α=0.70)
Low/lean fat food factor (4 items, α=0.44)
Fruit as snack/dessert factor (2 items, α=0.76)
Adding fat factor (2 items, α=0.44)

Conclusions
___________
VARCLUS does not require estimating communalities, which makes the procedure simple.
Because its clusters are distinct, VARCLUS makes it easier to detect and explain underlying dimensions than the Factor Analysis approach, which produces overlapping factors.
So, we should consider the VARCLUS approach and use it more often along with FA because of its simplicity and interpretability.

Contact: Usree Kirtania, MS, Statistical Data Analyst

VARCLUS procedure vs. Factor Analysis (FA)
__________________________________________
VARCLUS: MAXEIGEN= option using the correlation matrix, or PERCENT= option using the covariance matrix.
FA: Estimate communalities using Squared Multiple Correlation (SMC).

VARCLUS, number of clusters: clusters that meet 2nd eigenvalue < specified threshold.
FA, number of factors: scree plot, Kaiser-Guttman rule.

VARCLUS, rotation: ortho-oblique.
FA, rotation: varimax, promax.

VARCLUS, cluster representative: (1-R²) ratio (lower is better).
FA, factor representative: factor loading (higher is better).

VARCLUS Procedure
_________________
[Figure: divisive clustering example. The variables X1, X2, X3, X4, X5 start in one cluster, which is split into (X1, X3, X4) and (X2, X5); (X1, X3, X4) is then split into (X1, X3) and (X4). The figure marks 2nd eigenvalues of 1.7 and 0.7 against the default threshold of 1.]

1) All variables start in one cluster.
2) 2nd eigenvalue > specified threshold => additional dimensions.
3) The initial cluster is divided into two clusters.
4) The procedure stops when each cluster satisfies the 2nd eigenvalue < specified threshold criterion.
5) Variables have relatively high correlation with their own cluster and low correlation with other clusters.
6) VARCLUS uses the first principal component based on the correlation matrix or the first centroid component based on the covariance matrix.
7) VARCLUS generates the 1-R² ratio:
   1-R² ratio = (1 - R² own cluster) / (1 - R² next closest cluster)
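The second-eigenvalue criterion and the 1-R² ratio from the VARCLUS procedure can be illustrated with a small sketch. This is not PROC VARCLUS and not the authors' code. It is a minimal Python/NumPy illustration on synthetic data, assuming the correlation matrix and the first principal component as the cluster component. The function names (`second_eigenvalue`, `needs_split`, `cluster_component`, `one_minus_r2_ratio`) and the simulated variables are hypothetical, and the ortho-oblique rotation and reassignment steps that SAS performs are omitted.

```python
import numpy as np


def second_eigenvalue(X):
    """Second-largest eigenvalue of the correlation matrix of X's columns."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)  # eigenvalues in ascending order
    return eigvals[-2]


def needs_split(X, threshold=1.0):
    """A 2nd eigenvalue above the threshold suggests an additional dimension."""
    return second_eigenvalue(X) > threshold


def cluster_component(X):
    """First principal component scores of the standardized columns of X."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    return Z @ eigvecs[:, -1]  # eigenvector belonging to the largest eigenvalue


def one_minus_r2_ratio(x, own_component, next_component):
    """(1 - R^2 with own cluster) / (1 - R^2 with next closest cluster)."""
    r2_own = np.corrcoef(x, own_component)[0, 1] ** 2
    r2_next = np.corrcoef(x, next_component)[0, 1] ** 2
    return (1.0 - r2_own) / (1.0 - r2_next)


# Synthetic data (an assumption for illustration): two independent latent
# dimensions, with three items driven by the first and two by the second.
rng = np.random.default_rng(0)
n = 500
f1, f2 = rng.normal(size=n), rng.normal(size=n)
group_a = np.column_stack([f1 + 0.5 * rng.normal(size=n) for _ in range(3)])
group_b = np.column_stack([f2 + 0.5 * rng.normal(size=n) for _ in range(2)])
X = np.hstack([group_a, group_b])

print("split all five items?", needs_split(X))
print("split group A further?", needs_split(group_a))
print("1-R2 ratio of first item:",
      one_minus_r2_ratio(group_a[:, 0],
                         cluster_component(group_a),
                         cluster_component(group_b)))
```

In this sketch the five items together carry two dimensions, so the pooled cluster fails the threshold test, while the three-item group on its own passes it. An item that correlates strongly with its own cluster component and weakly with the other one gets a small ratio, which matches the "lower is better" reading in the comparison with FA.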