# Identifying Good Measurement

Detailed Learning Objectives
1. Recognize the difference between a conceptual variable and its operationalization.
2. List three ways psychologists typically operationalize variables: self-report, observational, and physiological.
3. Classify variable scales as categorical or quantitative.
4. Describe the difference between the validity and the reliability of a measure.
5. Identify three types of reliability (test-retest, interrater, and internal), and know when each type is relevant.
6. Review scatterplots, focusing on how scatterplots show the direction and strength of a relationship.
7. Apply the correlation coefficient, r, as a way to describe the direction and strength of a relationship. (In this chapter, r is relevant as a common statistic to describe reliability and validity.)
8. Identify face and content validity.
9. Identify predictive, concurrent, convergent, and discriminant validity.
10. Describe how scatterplots, r, and known groups can be used to evaluate predictive, concurrent, convergent, and discriminant validity.

- Conceptual and operational variables
- Three common types of measures
- Scales of measurement

Constructs and Operationalizations
Construct: Well-being

Three types of operationalization:
- Self-report: 5-item scale
- Observational: No. of smiles
- Physiological: Brain scan

Scales of Measurement
- Categorical
- Quantitative
  - Ordinal (meaningful values but unequal intervals between units)
  - Interval (equal intervals between units but no meaningful zero)
  - Ratio (equal intervals and a meaningful zero)

Discussion starter
The claim: “College students are getting more narcissistic.”

NPI Example Items: Forced-Choice Format (Ames, Rose, & Anderson, 2006)

| Narcissistic response | Non-narcissistic response |
| --- | --- |
| I know that I am good because everybody keeps telling me so. | When people compliment me I sometimes get embarrassed. |
| I like to be the center of attention. | I prefer to blend in with the crowd. |
| I think I am a special person. | I am no better nor no worse than most people. |
| I insist upon getting the respect that is due me. | I usually get the respect that I deserve. |
| Everybody likes to hear my stories. | Sometimes I tell good stories. |
| I am going to be a great person. | I hope I am going to be successful. |

Ames, D. R., Rose, P., & Anderson, C. P. (2006). The NPI-16 as a short measure of narcissism. Journal of Research in Personality, 40, 440–450. The article, with all 16 items, is available from the author’s website: http://www.columbia.edu/~da358/

Three types of reliability
- Test-retest
- Interrater
- Internal

Tools for evaluating reliability
- Using a scatterplot to evaluate reliability
- Using the correlation coefficient r to evaluate reliability

Questions to consider
- When is each kind of reliability necessary?
- Why is reliability an empirical question?
- What does reliability tell us?

Test-Retest Reliability: Consistent scores every time we test
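
As a rough illustration of how test-retest reliability can be checked with r, the sketch below correlates scores from two testing sessions. The scores and the `pearson_r` helper are made up for this example; they do not come from any study cited in these slides.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for the same six people, tested on two occasions
time1 = [10, 12, 15, 18, 20, 25]
time2 = [11, 13, 14, 19, 21, 24]

r_test_retest = pearson_r(time1, time2)
print(f"test-retest r = {r_test_retest:.2f}")
```

A strong positive r here would suggest that people keep roughly the same rank order across sessions, which is what test-retest reliability asks for.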

Interrater Reliability: Consistent scores no matter who is rating

Interrater Reliability Example
Demo of interrater reliability
- Pair up. The video shows three kids (0:38 to 2:02): the girl in pink, the girl in yellow, and the girl in blue.
- Two people in each group count, for their assigned girl:
  - How many times does she look away from the teacher?
  - How many times does she clap?
  - How many times does she put her hands in her lap?
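
One way to summarize a coding exercise like this is to correlate the two coders' counts. The sketch below assumes two hypothetical coders (A and B) who each recorded nine counts (three behaviors for each of three children); every number is invented for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical counts: one entry per (child, behavior) pair, same order for both coders
coder_a = [4, 7, 2, 1, 9, 3, 6, 5, 0]
coder_b = [5, 7, 2, 2, 8, 3, 6, 4, 1]

r_interrater = pearson_r(coder_a, coder_b)
# Average absolute gap between the two coders' counts
mean_abs_diff = sum(abs(a - b) for a, b in zip(coder_a, coder_b)) / len(coder_a)

print(f"interrater r = {r_interrater:.2f}")
print(f"average absolute difference = {mean_abs_diff:.2f}")
```

The average absolute difference is included because r only reflects whether the coders rank the observations similarly; two coders could correlate highly even if one consistently counts more than the other.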

Internal Reliability: Consistent scores no matter how you ask
Ames, D. R., Rose, P., & Anderson, C. P. (2006). The NPI-16 as a short measure of narcissism. Journal of Research in Personality, 40, 440–450.

Internal Reliability
(not to be confused with internal validity!)
Internal reliability: the extent to which the same people answer multiple items, all intended to measure the same construct, in a consistent way.
Cronbach’s alpha: a statistic computed from the average of all the inter-item correlations and the number of items in the scale.
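
For readers who want to see the arithmetic, here is a minimal sketch of Cronbach's alpha using the common variance-based formula, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores). The response matrix is hypothetical (six respondents, four forced-choice items scored 0 or 1) and is not NPI data.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (respondents) by columns (items)."""
    k = len(responses[0])                      # number of items
    item_columns = list(zip(*responses))       # transpose: one tuple per item
    sum_item_vars = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_vars / total_var)

# Hypothetical data: 1 = narcissistic response, 0 = non-narcissistic response
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Alpha rises when items correlate more strongly with one another and when a scale has more items, which is consistent with a longer version of a scale often showing a higher alpha than a short form.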

Some items from the NPI (16-item version) are listed above under “NPI Example Items: Forced-Choice Format.”
“The NPI-16 had an alpha of .72, while the full 40-item measure revealed an alpha of .84” (Ames et al., 2006, p. 442).
Ames, D. R., Rose, P., & Anderson, C. P. (2006). The NPI-16 as a short measure of narcissism. Journal of Research in Personality, 40, 440–450.

- Measurement validity of abstract constructs
- Face validity and content validity
- Predictive validity and concurrent validity
- Convergent validity and discriminant validity
- Relationship between reliability and validity

Two kinds of validity evidence
- Subjective forms
- Empirically derived forms

Face and Content Validity
- Face validity: Does it look like a good measure? (often assessed by asking experts)
- Content validity: Does it include all the important components of the construct?

Predictive and Concurrent Validity
Correlation method

Predictive and Concurrent Validity
Known groups method
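
A known-groups check can be sketched as a simple comparison of group averages. The example below assumes a hypothetical panic-risk questionnaire and two groups whose status is already known (diagnosed versus not diagnosed); all scores are invented.

```python
from statistics import mean

# Hypothetical questionnaire scores (higher = more risk), not real data
diagnosed = [24, 27, 22, 29, 26]        # people already diagnosed with panic disorder
not_diagnosed = [12, 15, 10, 14, 13]    # people with no diagnosis

mean_diagnosed = mean(diagnosed)
mean_not_diagnosed = mean(not_diagnosed)
difference = mean_diagnosed - mean_not_diagnosed

print(f"diagnosed mean = {mean_diagnosed:.1f}")
print(f"not diagnosed mean = {mean_not_diagnosed:.1f}")
print(f"difference = {difference:.1f}")
```

If the measure is valid, the group that is known to have more of the construct should score clearly higher. In practice a researcher would also test whether the difference is statistically reliable, which this sketch omits.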

Convergent and Discriminant Validity
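
The pattern of correlations that supports convergent and discriminant validity can be illustrated with a small sketch: the new measure should correlate strongly with a measure of a related construct and weakly with a measure of an unrelated one. The three score lists and the `pearson_r` helper are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for the same six people on three measures
new_measure = [1, 2, 3, 4, 5, 6]
related_measure = [2, 1, 4, 3, 6, 5]      # theoretically similar construct
unrelated_measure = [3, 5, 1, 6, 2, 4]    # theoretically different construct

r_convergent = pearson_r(new_measure, related_measure)
r_discriminant = pearson_r(new_measure, unrelated_measure)

print(f"convergent r = {r_convergent:.2f}")
print(f"discriminant r = {r_discriminant:.2f}")
```

Neither correlation is meaningful alone; the evidence comes from the contrast between a strong correlation where theory predicts one and a weak correlation where it does not.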

Homework: Reliability
For each scenario: What kind(s) of reliability would need to be evaluated? Draw a scatterplot or describe a result that would indicate that the measure has good reliability and one that shows it has poor reliability.
1. Researchers place unobtrusive video recording devices in the living rooms of 20 children. Later, coders view tapes of the living areas and code how many minutes each child spends playing video games.
2. Clinical psychologists have developed a 7-item self-report measure to quickly identify people who are at risk for panic disorder.
3. A restaurant owner uses a response card with four items in order to evaluate how satisfied customers are with the food, service, ambience, and overall experience.

Homework: Validity
For each scenario: How might you show that this measure has predictive validity? How might you show that this measure has convergent and discriminant validity?
1. Clinical psychologists have developed a 7-item self-report measure to quickly identify people who are at risk for panic disorder.
2. A restaurant owner uses a response card to evaluate how satisfied customers are with the food. It contains one item, “Please rate the quality of the food:” on a scale from 1 (very dissatisfied) to 4 (very satisfied).

Relationship Between Reliability and Validity
Can a measure be reliable but not valid? Examples:
- Shoe size as an intelligence test (reliable, not valid)
- Number of children you have as a measure of interest in children (reliably measured, but is it correlated with interest?)

Can a measure be valid but not reliable? (No)

Figure panels: Not reliable and not valid; Reliable and valid; Reliable but not valid

Reliability Is Necessary for, but Not Sufficient for, Validity

Interrogating Construct Validity as a Consumer
- Diener’s measure of happiness
- Gallup poll’s measure of happiness

Correlations of the 16-item NPI with (Ames et al., 2006, p. 444):
- 40-item NPI: **
- Extraversion: **
- Agreeableness: **
- Self-esteem: **
- Belief in a just world: .04

What kind of validity are these correlations supporting?
Ames, D. R., Rose, P., & Anderson, C. P. (2006). The NPI-16 as a short measure of narcissism. Journal of Research in Personality, 40, 440–450.

Reliability in articles
“The NPI-16 had an alpha of .72, while the full 40-item measure revealed an alpha of .84” (Ames et al., 2006, p. 442). Ames, D. R., Rose, P., & Anderson, C. P. (2006). The NPI-16 as a short measure of narcissism. Journal of Research in Personality, 40, 440–450.