Operational definition: a description of the ways a researcher will observe and measure a variable, so called because it specifies the operations that will be used to measure the variable. Typical research lingo: “How will you operationalize that variable?”
Identify operational definitions of the following latent constructs: ◦ Intelligence of an individual ◦ Market value of a firm ◦ Employee theft ◦ Organizational performance ◦ Accounting Fraud ◦ Customer retention ◦ Team diversity
Reliability: consistency, dependability, or stability in measurements over observations or time. The degree to which we measure something without error (it is a theory of error). A necessary but not sufficient condition for validity. Reliability forms the upper bound for validity.
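X = T + E, where X = the observed score, T = the score or measurement that would be obtained under perfect conditions, and E = error, because measurement is never perfect. Error is assumed to be random, so its expected value is 0 (not E = 0 on every measurement). X is therefore expected to equal T: E(X) = E(T).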
$\sigma_x^2 = \sigma_t^2 + \sigma_e^2$ In other words, the total variance associated with any measure is equal to true score variance plus error variance. Where does error variance come from?
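A quick simulation can make the variance decomposition concrete. The sketch below assumes normally distributed true scores and independent, zero-mean random error; the mean of 50, the standard deviations, and the function name `simulate_scores` are illustrative choices, not values from the slides.

```python
import random
import statistics


def simulate_scores(n, true_sd, error_sd, seed=0):
    """Draw n true scores T and independent random errors E; return (T, E, X)."""
    rng = random.Random(seed)
    true = [rng.gauss(50.0, true_sd) for _ in range(n)]   # hypothetical true scores
    error = [rng.gauss(0.0, error_sd) for _ in range(n)]  # random error, mean 0
    observed = [t + e for t, e in zip(true, error)]       # X = T + E
    return true, error, observed


true, error, observed = simulate_scores(n=20000, true_sd=10.0, error_sd=5.0)

var_t = statistics.pvariance(true)
var_e = statistics.pvariance(error)
var_x = statistics.pvariance(observed)

# Observed variance should be close to true variance plus error variance.
print(f"var(X) = {var_x:.1f}, var(T) + var(E) = {var_t + var_e:.1f}")
```

The two printed numbers should land close together. A small gap remains because, in any finite sample, T and E are not perfectly uncorrelated.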
$r_{xx} = r_{xt}^2 = \sigma_t^2 / \sigma_x^2$ In other words, the theoretical definition of reliability: reliability is the proportion of the overall variance in a measure that is true score variance.
The correlation between observed and true scores equals the ratio of the standard deviation of true scores to the standard deviation of observed scores: $r_{xt} = \sigma_t / \sigma_x$. The squared correlation indicates the proportion of variance in observed scores due to true differences among people: $r_{xt}^2 = r_{xx} = \sigma_t^2 / \sigma_x^2$
Since $r_{xt}^2 = r_{xx} = \sigma_t^2 / \sigma_x^2$, reliability is defined as the ratio of true score variance to observed score variance. The square root of reliability is the correlation between observed and true scores, called the reliability index.
$\sigma_x^2 = \sigma_t^2 + \sigma_e^2$, so $\sigma_t^2 = \sigma_x^2 - \sigma_e^2$. Reliability $= \sigma_t^2 / \sigma_x^2$, so by substitution, reliability $= (\sigma_x^2 - \sigma_e^2) / \sigma_x^2 = 1 - (\sigma_e^2 / \sigma_x^2)$. Reliability can therefore range from 0 to 1, and if $r_{xx} = .80$, then 80% of the variance is systematic. Reliability does not distinguish between true variance and systematic error. In reliability theory, unsystematic variance = error.
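The substitution above reduces to a one-line computation. This is a minimal sketch, assuming the observed and error variances are already known; the function names and the example variances (125 and 25) are hypothetical.

```python
import math


def reliability_from_variances(var_observed, var_error):
    """Reliability as 1 - (error variance / observed variance)."""
    if var_observed <= 0:
        raise ValueError("observed variance must be positive")
    if not 0 <= var_error <= var_observed:
        raise ValueError("error variance must lie between 0 and the observed variance")
    return 1.0 - var_error / var_observed


def reliability_index(reliability):
    """Correlation between observed and true scores: square root of reliability."""
    return math.sqrt(reliability)


r_xx = reliability_from_variances(var_observed=125.0, var_error=25.0)  # 1 - 25/125
r_xt = reliability_index(r_xx)
print(f"reliability = {r_xx:.2f}, reliability index = {r_xt:.3f}")
```

Because the index is a square root of a value between 0 and 1, it is never smaller than the reliability itself.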
Equivalent forms; test-retest; internal consistency; interrater/interobserver (not considered a true model of reliability). These differ in how they treat error.
Test-retest: observations taken at two different time periods using the same measure are correlated. Coefficient of stability. Error includes anything that changes between one administration of the measure and the next (including real changes). Not good for unstable constructs.
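A test-retest coefficient is simply the correlation between the two administrations. The sketch below computes a Pearson correlation by hand; the scores in `time1` and `time2` are made-up illustration data for six respondents, and `pearson_r` is a helper written for this example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for the same six people at two time points.
time1 = [12, 15, 9, 20, 17, 11]
time2 = [13, 14, 10, 19, 18, 10]

stability = pearson_r(time1, time2)
print(f"test-retest (stability) coefficient = {stability:.3f}")
```

Any real change in the construct between the two time points lowers this coefficient just as measurement error does, which is why the approach suits stable constructs best.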
Equivalent forms: observations using two different forms of measurement are correlated. Coefficient of equivalence (and stability if the forms are administered at different times). Error includes any changes between administrations plus anything that varies between one form and the next (including real changes over time). Not good for unstable constructs (e.g., mood) unless measures are taken within a short period of time.
Internal consistency: items within a given measure are correlated. Coefficient of equivalence; does not assess the stability of a measure over time. ◦ Split-half: even-odd, early-late, random split ◦ Coefficient alpha
Split-half reliability can be viewed as a variation of equivalent forms: the measures within a given form are split into halves and correlated. Because the correlation is based on half of the items, it can be corrected using the Spearman-Brown formula: $r_{xx} = 2 r_{\frac{1}{2}\frac{1}{2}} / (1 + r_{\frac{1}{2}\frac{1}{2}})$, where $r_{\frac{1}{2}\frac{1}{2}}$ is the correlation between halves. Flawed approach (many possible halves).
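The split-and-correct procedure can be sketched in a few lines. The example below uses an odd-even split; the `responses` matrix (five respondents by four items) is made-up data, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to an estimate for the full-length test."""
    return 2 * r_half / (1 + r_half)


def odd_even_split_half(responses):
    """Return (correlation between halves, Spearman-Brown corrected reliability)."""
    odd = [sum(row[0::2]) for row in responses]   # items 1, 3, ...
    even = [sum(row[1::2]) for row in responses]  # items 2, 4, ...
    r_half = pearson_r(odd, even)
    return r_half, spearman_brown(r_half)


# Hypothetical data: each row is one respondent, each column one item.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]

r_half, r_full = odd_even_split_half(responses)
print(f"half-test r = {r_half:.3f}, corrected reliability = {r_full:.3f}")
```

A different split (early-late, random) would generally give a different coefficient from the same data, which is the flaw noted on the slide.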
Coefficient alpha can be derived from the Spearman-Brown formula. Assumes measures are homogeneous. It is the average of all possible split-half reliability coefficients of a given measure. Does NOT tell you if there is a single factor or construct in your scale (does NOT indicate unidimensionality).
k = number of items (or measures). Alternatively, the formula can be written in terms of C, where C is the sum of all elements in the item covariance matrix. With this formula, what effect should k have?
A larger number of items usually increases internal consistency reliability (if items are of equal quality). Higher correlations between items increase internal consistency reliability. With enough items, alpha can be high even when correlations between items are low. Alpha should NOT be taken to indicate that the scale is unidimensional. Benchmark for “acceptable” alpha = .7 (Nunnally and Bernstein, 1994).
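The effect of k and of inter-item correlation can be checked numerically. The alpha formula itself is not reproduced in this transcript, so the sketch below assumes two standard textbook forms: raw alpha, (k / (k - 1)) * (1 - sum of item variances / variance of total scores), and standardized alpha, k * r / (1 + (k - 1) * r), where r is the mean inter-item correlation. The data and function names are hypothetical.

```python
import statistics


def cronbach_alpha(responses):
    """Raw coefficient alpha from a respondents-by-items matrix of scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = [statistics.variance(column) for column in zip(*responses)]
    total_variance = statistics.variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def standardized_alpha(k, mean_r):
    """Alpha implied by k items with mean inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


# Hypothetical data: each row is one respondent, each column one item.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
alpha = cronbach_alpha(responses)
print(f"raw alpha = {alpha:.3f}")

# Same weak inter-item correlation (.10), increasing number of items.
for k in (5, 10, 20, 40):
    print(k, round(standardized_alpha(k, 0.10), 3))
```

With a mean inter-item correlation of only .10, the standardized form already exceeds .8 at 40 items, which illustrates why a high alpha by itself says little about unidimensionality.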
Interobserver or interrater: measures from different raters are correlated. Coefficient of equivalence for raters. Not the same as agreement (agreement is absolute, but reliability is correlational/relational). Error includes anything that leads to differences between observer/judge ratings.
Often we need a measure of agreement: kappa is a measure of agreement between two judges on categories. Researchers often use % agreement, which is not appropriate because it doesn’t take into account chance levels (e.g., two categories, chance = 50%). Kappa deals with this by using a contingency table. Use kappa if you need to measure interrater agreement on categorical coding.
Kappa: a measure of agreement between two observers, taking into account agreement that could occur by chance (expected agreement). $\kappa = \dfrac{\text{Observed agreement} - \text{Expected agreement}}{100\% - \text{Expected agreement}}$
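The kappa computation can be sketched directly from two lists of category codes. This is a minimal version of Cohen's kappa for two raters, with expected agreement taken from each rater's marginal proportions and agreement expressed as proportions rather than percentages; the codes in `rater_a` and `rater_b` are made-up data and `cohens_kappa` is a hypothetical helper name.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same cases into categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: proportion of cases coded identically.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected (chance) agreement from each rater's marginal proportions.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when expected agreement is 100%")
    return (observed - expected) / (1 - expected)


# Hypothetical yes/no codes from two judges on ten cases.
rater_a = ["Y", "Y", "N", "N", "Y", "N", "N", "N", "Y", "N"]
rater_b = ["Y", "N", "N", "N", "Y", "N", "Y", "N", "Y", "N"]

kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.3f}")
```

Here the judges agree on 8 of 10 cases (80%), but chance alone would produce 52% agreement given their marginals, so kappa is noticeably lower than the raw percentage.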
High reliability, low agreement (job applicants rated by interviewers on a 1-5 scale):

          App 1   App 2   App 3
Intv 1      5       4       4
Intv 2      4       3       3
Intv 3      3       2       2
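The contrast in this example can be verified directly: every pair of interviewers correlates perfectly, yet no pair gives any applicant the same rating. The sketch below uses the ratings from the slide; the helper names `pearson_r` and `exact_agreement` are written for this illustration.

```python
import math
from itertools import combinations

# Ratings from the slide: each interviewer rates applicants 1, 2, 3.
ratings = {
    "Intv 1": [5, 4, 4],
    "Intv 2": [4, 3, 3],
    "Intv 3": [3, 2, 2],
}


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def exact_agreement(x, y):
    """Proportion of applicants given identical ratings by both interviewers."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


for first, second in combinations(ratings, 2):
    r = pearson_r(ratings[first], ratings[second])
    agree = exact_agreement(ratings[first], ratings[second])
    print(f"{first} vs {second}: r = {r:.2f}, exact agreement = {agree:.0%}")
```

The interviewers rank the applicants identically but differ by a constant in leniency, so the correlation is high while absolute agreement is zero.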
Nature of the construct being measured. Sources of error relevant to the construct: ◦ Test-retest - all changes due to time ◦ Equivalent forms - all changes due to time and all differences between forms ◦ Internal consistency - differences between items within a measure ◦ Interrater/Interobserver - differences between observers and judges
In the early stages of research in an area, lower reliabilities may be acceptable. Higher reliabilities are required when measures are used to differentiate among groups. Extremely high reliabilities are required when making important decisions. Rule of thumb: .7 minimum reliability, preferably .8+ (Nunnally & Bernstein, 1994), but .95+ may be TOO high.
Attenuation of correlations: $r_{xy} = r^*_{xy} \sqrt{r_{xx} r_{yy}}$, where $r^*_{xy}$ = correlation between the true scores of X and Y. Correction for attenuation: $r^*_{xy} = r_{xy} / \sqrt{r_{xx} r_{yy}}$
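Both directions of the attenuation formula fit in a short sketch. The function names and the example values (an observed correlation of .30 with reliabilities of .80 and .70) are hypothetical.

```python
import math


def attenuate(r_true, r_xx, r_yy):
    """Observed correlation implied by a true-score correlation and two reliabilities."""
    return r_true * math.sqrt(r_xx * r_yy)


def disattenuate(r_observed, r_xx, r_yy):
    """Correction for attenuation: estimated correlation between true scores."""
    if r_xx <= 0 or r_yy <= 0:
        raise ValueError("reliabilities must be positive")
    return r_observed / math.sqrt(r_xx * r_yy)


r_observed = 0.30
r_corrected = disattenuate(r_observed, r_xx=0.80, r_yy=0.70)
print(f"observed r = {r_observed:.2f}, corrected r = {r_corrected:.3f}")
```

Since both reliabilities are at most 1, the corrected value is never smaller in magnitude than the observed one; with low reliabilities and sampling error, the corrected estimate can even exceed 1, so it is best read with caution.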
Look at the hypothesis statements you wrote last week. For each variable in them (independent, dependent, mediator, moderator), discuss how you would operationally define and measure this variable. After you have identified your measure, indicate how you would assess the reliability of your measure, if possible. If it is not possible to assess reliability, why not?