
Slide 1: Unit 6a: Motivating Principal Components Analysis
© Andrew Ho, Harvard Graduate School of Education
http://xkcd.com/388/

Slide 2: Course Roadmap: Unit 6a, Today's Topic Area

Today:
- Interitem correlations
- Reliability… and multilevel modeling, revisited (AHHH!)
- Transition to PCA, by VVV: Visualizing Variables as Vectors

[Course roadmap flowchart, branching from Multiple Regression Analysis (MRA):
- Do your residuals meet the required assumptions? Test for residual normality; use influence statistics to detect atypical datapoints.
- If your residuals are not independent, replace OLS with GLS regression analysis, use individual growth modeling, or specify a multilevel model; if time is a predictor, you need discrete-time survival analysis.
- If your outcome is categorical, use binomial logistic regression analysis (dichotomous outcome) or multinomial logistic regression analysis (polytomous outcome).
- If your outcome-vs.-predictor relationship is non-linear, use non-linear regression analysis, or transform the outcome or predictor.
- Today's topic area: if you have more predictors than you can deal with, create taxonomies of fitted models and compare them, or form composites of the indicators of any common construct: conduct a Principal Components Analysis, use Factor Analysis (EFA or CFA?), or use Cluster Analysis.]

Slide 3: Multiple Indicators of a Common Construct

Here is a dataset in which teachers responded to what the investigators believed were multiple indicators of a single underlying construct, Teacher Job Satisfaction. The data are described in TSUCCESS_info.pdf.

Dataset: TSUCCESS.txt
Overview: Responses of a national sample of teachers to six questions about job satisfaction.
Source: Administrator and Teacher Survey of the High School and Beyond (HS&B) dataset, 1984 administration, National Center for Education Statistics (NCES). All NCES datasets are also available free from the EdPubs on-line supermarket.
Sample size: 5,269 teachers (4,955 with complete data).
More info: HS&B was established to study the educational, vocational, and personal development of young people, beginning in their elementary or high school years and following them over time as they began to take on adult responsibilities. The HS&B survey included two cohorts: (a) the 1980 senior class and (b) the 1980 sophomore class. Both cohorts were surveyed every two years through 1986, and the 1980 sophomore class was also surveyed again in 1992.

Slide 4: The Indicators

Col  Var  Variable description                                   Labels
1    X1   You have high standards of teacher performance.        1 = strongly disagree, 2 = disagree, 3 = slightly disagree, 4 = slightly agree, 5 = agree, 6 = strongly agree
2    X2   You are continually learning on the job.               (same 6-point agree scale as X1)
3    X3   You are successful in educating your students.         1 = not successful, 2 = a little successful, 3 = successful, 4 = very successful
4    X4   It's a waste of time to do your best as a teacher.     1 = strongly agree, 2 = agree, 3 = slightly agree, 4 = slightly disagree, 5 = disagree, 6 = strongly disagree
5    X5   You look forward to working at your school.            (same 6-point agree scale as X1)
6    X6   How much of the time are you satisfied with your job?  1 = never, 2 = almost never, 3 = sometimes, 4 = always

As is typical of many datasets, TSUCCESS contains multiple variables, or "indicators," that record teachers' responses to the survey items. These multiple indicators are intended to provide teachers with replicate opportunities to report their job satisfaction ("teacher job satisfaction" being the focal "construct" in the research).

To incorporate these multiple indicators successfully into subsequent analysis, whether as outcome or predictor, you must deal with several issues:
1. You must decide whether each of the indicators should be treated as a separate variable in subsequent analyses, or whether they should be combined to form a "composite" measure of the underlying construct of teacher job satisfaction.
2. To form such a composite, you must be able to confirm that the multiple indicators actually "belong together" in a single composite.
3. If you can confirm that the multiple indicators do indeed belong together in a composite, you must decide on the "best way" to form that composite.

Always know your items. Read each one. Take the test.

Slide 5: The Scales of the Indicators

(The variable table from Slide 4 is shown again, with the following annotations.)

Different indicators have different metrics:
i. Indicators X1, X2, X4, and X5 are measured on 6-point scales.
ii. Indicators X3 and X6 are measured on 4-point scales.
iii. Does this matter, and how do we deal with it in the compositing process?
iv. Is there a "preferred" scale length?

Some indicators "point" in a positive direction and some in a negative direction:
i. Notice the coding direction of X4, compared to the directions of the rest of the indicators.
ii. When we composite the indicators, what should we do about this?

Coding indicators on the "same" scale does not necessarily mean that they have the same "value" at the same scale points:
i. Compare scale point "3" for indicators X3 and X6, for instance.
ii. How do we deal with this, in compositing?

Indicators are not created equally:
- Different scales
- Positive or negative wording/direction/"polarity"
- Different variances on similar scales
- Different means on similar scales (difficulty)
- Different associations with the construct (discrimination)

Always know the scale of your items. Score your test.

Slide 6: Reading and Labeling the Data

*-----------------------------------------------------------------------------
* Input the raw dataset, name and label the variables and selected values.
*-----------------------------------------------------------------------------
* Input the target dataset:
infile X1-X6 using "C:\My Documents\ … \Datasets\TSUCCESS.txt"

* Label the variables:
label variable X1 "Have high standards of teaching"
label variable X2 "Continually learning on job"
label variable X3 "Successful in educating students"
label variable X4 "Waste of time to do best as teacher"
label variable X5 "Look forward to working at school"
label variable X6 "Time satisfied with job"

* Label the values of the variables:
label define lbl1 1 "Strongly Disagree" 2 "Disagree" 3 "Slightly Disagree" ///
   4 "Slightly Agree" 5 "Agree" 6 "Strongly Agree"
label values X1 X2 X5 lbl1   // X1, X2, and X5 share the 6-point agree scale
label define lbl2 1 "Strongly Agree" 2 "Agree" 3 "Slightly Agree" ///
   4 "Slightly Disagree" 5 "Disagree" 6 "Strongly Disagree"
label values X4 lbl2         // X4 is reverse-worded, so its labels run backward
label define lbl3 1 "Not Successful" 2 "A Little Successful" ///
   3 "Successful" 4 "Very Successful"
label values X3 lbl3
label define lbl4 1 "Never" 2 "Almost Never" 3 "Sometimes" 4 "Always"
label values X6 lbl4

Standard data-input and indicator-naming statements. Label items descriptively, ideally with item stems/prompts. Make absolutely sure that your item scales are oriented in the same direction: positive should mean something similar.

Look at your data:
- Every row is a person: a person-by-item matrix, a standard data representation in psychometrics.
- Note that we have some missing data.
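If an item's scale pointed the "wrong" way, you would reverse it before compositing, so that positive means something similar across items. A hypothetical one-line sketch (X4 in TSUCCESS is already coded so that higher values are more positive, so this is illustrative only; the variable name X4_REV is invented here):

* Reverse a 6-point item so that high = positive (illustrative only)
gen X4_REV = 7 - X4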

Slide 7: Exploratory Data Analysis for Item Responses

Are these items on the same "scale"?
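The original slide displays item-level distributions. A minimal Stata sketch of this kind of exploratory look, assuming X1-X6 are loaded and labeled as on Slide 6 (the specific graphs on the slide are not reproduced here):

* Means, SDs, and ranges expose the differing item metrics
summarize X1-X6
* One-way frequency tables show each item's response distribution
tab1 X1-X6
* A discrete histogram for a single item, e.g., X1
histogram X1, discrete percent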

Slide 8: Missing Data and Pairwise Correlations: Pairwise Deletion vs. Casewise/Listwise Deletion

Diagonals of correlation matrices are always 1 (or left blank). In this case, the n-count is the number of teachers who responded to Question 1.

Complete data vs. casewise/listwise deletion (keep if NMISSING==0; drop if NMISSING>0). Note the differing n-counts across variables under complete data, but not under casewise/listwise deletion.

We'll proceed with listwise deletion here, but keep in mind the assumption that data are missing at random from the population. If missing data are few, no worries. Otherwise, explicitly state your assumptions and your approach, and consider advanced techniques like "multiple imputation."
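A sketch of the comparison in Stata, assuming NMISSING is constructed with egen's rowmiss(), which the slide's keep if NMISSING==0 implies:

* Count missing item responses per teacher
egen NMISSING = rowmiss(X1-X6)
* Pairwise deletion: each correlation uses all teachers with both items present
pwcorr X1-X6, obs
* Listwise deletion: keep only teachers with complete data, then correlate
keep if NMISSING==0
correlate X1-X6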

Slide 9: Pairwise Correlations and the Argument for a "Construct"

Inter-item correlations (below the diagonal: estimated under pairwise deletion, pairwise n in parentheses; above the diagonal: estimated under listwise deletion, n = 4955):

Indicator                                   X1           X2           X3           X4           X5           X6
X1: Have high standards of teaching         .            0.56         0.16         0.21         0.25         0.19
X2: Continually learning on the job         0.55 (5058)  .            0.17         0.23         0.27         0.22
X3: Successful in educating students        0.16 (5069)  0.16 (5082)  .            0.30         0.36         0.43
X4: Waste of time to do best as teacher     0.21 (5071)  0.23 (5079)  0.30 (5094)  .            0.45         0.40
X5: Look forward to working at school       0.25 (5069)  0.27 (5070)  0.36 (5088)  0.45 (5091)  .            0.55
X6: Time satisfied with job                 0.19 (5060)  0.22 (5069)  0.44 (5094)  0.40 (5082)  0.55 (5081)  .

Sample inter-correlations among the indicators:
- Are all positive (thankfully!),
- But are of small to moderate magnitude and differ widely (unfortunately!).

To justify forming a single composite, you must argue that all indicators measure the same construct:
- Here, generally positive inter-correlations support a "uni-dimensional" view.
- But the small and heterogeneous inter-correlations also suggest either that there is considerable measurement error in each indicator, or that some or all of the indicators also measure other, unrelated constructs.
- This is bad news for the "internal consistency" (reliability) of the ultimate composite.

Slide 10: Three Ways of Looking at "Reliability"

Three definitions of reliability:
1. Reliability is the correlation between two sets of observed scores from a replication of a measurement procedure.
2. Reliability is the proportion of "observed score variance" that is accounted for by "true score variance."
3. Reliability is like an average of pairwise interitem correlations, "scaled up" according to the number of items on the test (because averaging over more items decreases error variance).

Three necessary intuitions:
1. Any observed score is one of many possible replications.
2. Any observed score is the sum of a "true score" (the average of all theoretical replications) and an error term.
3. Averaging over replications gives us better estimates of "true" scores by averaging over error terms.

Slide 11: 1) Correlation Between Two Replications of a Measurement Procedure

Robert Brennan, reliability guru, likes to use this aphorism: a person with one watch knows what time it is; a person with two watches is never quite sure.

Slide 12: 2) Proportion of Observed Score Variance Accounted for by True Score Variance

The reliability of a measure (or composite) is a population parameter that describes how much of the observed variance in the measure (or composite) is actually true variance. With each observed score the sum of a true score and an error term, $X = T + E$, reliability is

$\rho = \frac{\sigma^2_T}{\sigma^2_X} = \frac{\sigma^2_T}{\sigma^2_T + \sigma^2_E}$

Slide 13: Interlude: To Standardize or Not to Standardize

For an additive composite of "raw" indicators, each indicator remains in its original metric, and composite scores are the sum of the scores on the raw indicators for each person in the sample:

$C_i = X_{1i} + X_{2i} + X_{3i} + X_{4i} + X_{5i} + X_{6i}$

where $X_{1i}$ is the raw score of the $i$th teacher on the 1st indicator, and so on.

For an additive composite of "standardized" indicators, each indicator is first standardized to a mean of 0 and a standard deviation of 1,

$Z_{ki} = \frac{X_{ki} - \bar{X}_k}{s_k}$

and then the standardized indicator scores are summed:

$C_i = Z_{1i} + Z_{2i} + Z_{3i} + Z_{4i} + Z_{5i} + Z_{6i}$
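A Stata sketch of both composites, assuming X1-X6 are complete and consistently oriented. The names STD1-STD6 match the standardized variables referenced on Slide 14; the composite names COMP_RAW and COMP_STD are invented here for illustration:

* Standardize each indicator to mean 0, SD 1
forvalues k = 1/6 {
    egen STD`k' = std(X`k')
}
* Raw-metric composite: simple sum of the six raw indicators
gen COMP_RAW = X1 + X2 + X3 + X4 + X5 + X6
* Standardized composite: sum of the six standardized indicators
gen COMP_STD = STD1 + STD2 + STD3 + STD4 + STD5 + STD6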

Slide 14: 3) Average Interitem Covariance, Scaled Up

alpha: our straightforward command to obtain Cronbach's alpha, an "internal consistency" estimate of population reliability.
- Running this on the standardized variables, STD1-STD6 (or running it on the unstandardized variables with the std option), gives us "standardized coefficient alpha."

Recall that covariance is an "unstandardized" correlation:
- A covariance on standardized variables is thus a correlation.
- This is the straight average of our interitem correlations from Slide 9.

The long-run average of errors is zero. The correlation between averages will rise. The proportion of observed score variance will rise as error variance drops.
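In formula form, standardized coefficient alpha is the average interitem correlation "scaled up" by the number of items. With $k$ items and average interitem correlation $\bar{r}$ (a standard identity, not legible in this transcript):

$\alpha_{std} = \frac{k\,\bar{r}}{1 + (k-1)\,\bar{r}}$

And a one-line Stata sketch of the command the slide describes:

* Standardized coefficient alpha from the raw items
alpha X1-X6, std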

Slide 15: The Spearman-Brown "Prophecy" Formula

Provided that each indicator in a composite measures the same underlying construct, the more indicators you include in the composite, the higher the reliability of the composite. This is because:
- Measurement errors in each indicator are random and cancel out in the composite.
- Any true variation in each indicator combines and surfaces through the noise.

The formula itself is reproduced below.
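The formula, shown as an image on the original slide, is the standard Spearman-Brown form: if a composite with reliability $\rho$ is lengthened by a factor of $k$ (that is, built from $k$ times as many parallel indicators), its predicted reliability is

$\rho_k = \frac{k\,\rho}{1 + (k-1)\,\rho}$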

Slide 16: Three Ways of Looking at "Reliability"

(A recap of Slide 10: the same three definitions of reliability and three necessary intuitions, restated before the baseline analysis.)

Slide 17: A Baseline Reliability Analysis

- We use the unstandardized items but include the std option, to standardize.
- Casewise deletion leads to 4955 observations across all items.
- Positive signage throughout, because we already reversed the polarity of X4.
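A sketch of the command behind this slide's output (the std, casewise, and item options are standard options of Stata's alpha):

* Standardized alpha with casewise (listwise) deletion and per-item diagnostics
alpha X1-X6, std casewise item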

Slide 18: Reliability from a Multilevel Modeling Perspective: Reshaping Data

The data in wide format: every participant is a row; every item is a column.
The data in long format: every item score is a row, with a single column for all score replications.

Think it might be possible to consider teachers as grouping variables for item scores? xtset ID?
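A sketch of the wide-to-long reshape in Stata. The slide itself names only xtset ID; the ID and score variable names here are illustrative assumptions:

* Create a teacher identifier, then stack the six item columns into one
gen ID = _n
reshape long X, i(ID) j(item)
rename X score
* Declare teachers as the grouping (panel) variable for item scores
xtset ID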

Slide 19: Reliability from a Multilevel Modeling Perspective: Intraclass Correlation
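The slide's output appears only as an image in this transcript. A hedged sketch of one standard way to obtain the intraclass correlation, continuing from the long-format data above (mixed and estat icc are standard Stata commands; whether the original slide used this exact route is an assumption):

* Random-intercept model: item scores nested within teachers
mixed score || ID:
* Intraclass correlation: between-teacher variance as a share of total variance
estat icc
* Stepping this single-item ICC up via the Spearman-Brown formula with k = 6
* items approximates the reliability of the six-item composite.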

