Validity: Conceptual Issues Furr & Bacharach Chapter 8.

2 Validity: Conceptual Issues Furr & Bacharach Chapter 8

3 Contrasting Reliability & Validity Both fundamental to a sophisticated understanding of psychometrics Must have a clear understanding of the relationship between the two

4 Definitions – notice differences Reliability Degree to which differences in test scores reflect differences among people in their levels of the trait that affects those scores, whatever that trait may be Quantitative property of the test scores Validity Tied to interpretation of test score Tied to theory and implication of scores

5 LINK Validity requires reliability Stable traits (Intelligence & IQ) Measure at two points in time; scores should be stable across time (test-retest reliability) If not, the test cannot be a valid test of IQ States (Depression & BDI) If poor internal consistency, can’t be valid Reliability does not imply validity Stable Trait (Autism & AQ) May have excellent test-retest reliability or good internal consistency, but scores may not be interpreted in a valid manner
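The two reliability checks named on this slide can be sketched in a few lines. This is only an illustration: the scores are made up, the function names are hypothetical, and Cronbach's alpha is used as one common internal-consistency index (the slides do not name a specific coefficient).

```python
import math
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(ssx * ssy)


def cronbach_alpha(responses):
    """Cronbach's alpha; `responses` holds one list of item scores per respondent."""
    k = len(responses[0])
    item_variances = [variance([person[i] for person in responses]) for i in range(k)]
    total_variance = variance([sum(person) for person in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical IQ-style scores for five people tested at two points in time.
time1 = [100, 110, 95, 120, 105]
time2 = [102, 108, 97, 118, 107]
test_retest = pearson_r(time1, time2)

# Hypothetical 3-item questionnaire answered by five people (1-5 scale).
responses = [[4, 5, 4], [2, 2, 3], [5, 4, 5], [1, 2, 1], [3, 3, 3]]
alpha = cronbach_alpha(responses)

print(f"test-retest r = {test_retest:.2f}, alpha = {alpha:.2f}")
```

High values on both indices would say only that the scores are consistent. Whether they can be interpreted as IQ, depression, or autism traits is a separate validity question.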

6 Iowa story Don’t want to hire people who might abuse clients anymore!!! Personality tests… Is there a test that measures the construct? Does it validly measure abusive personality? Is there a test that was designed to predict the likelihood that a particular individual will abuse people?

7 What is validity? Definition Implications of the contemporary definition of validity

8 Validity ----- Definition Basic Definition The degree to which a test measures what it is supposed to measure Contemporary Definition “The degree to which evidence and theory support the interpretations of test scores entailed by the proposed uses” of the test

9 Implications of the contemporary definition

10 Implication 1 Interpretation and use of test scores

11 Validity is about interpretation & use of test scores NEO-PI-R Conscientiousness scale – 48 items High scores reflect an “active process of planning, organizing and carrying out tasks,” and people with high scores on this scale are “purposeful, strong willed, and determined”

12 NEO-PI-R Conscientiousness Scale What is the correct question about the scale’s validity or invalidity? Are the test items valid or invalid? Are the test scores valid or invalid? Is the interpretation of the test scores valid or invalid?

13 Not “are items or scores valid or invalid?” The question is: Are the authors’ interpretations of the scores valid or invalid? Are conscientiousness scores validly interpreted in terms of planfulness, organization, and determination?

14 Proposed use of scores… Employers may use NEO-PI-R Conscientiousness Scale to screen potential employees BELIEF: Differentiates potentially better and worse employees? Predictive power of conscientiousness scale score?

15 Hammer is a useful tool if you need to drive a nail…

16 What if you need to saw a piece of wood? Hammer is not a useful tool irrespective of the need

17 Simplistic & inaccurate to say… “Conscientiousness scale is valid without regard to the way in which it will be interpreted and used” Rather (what is accurate) Scores can be interpreted validly as an indicator of conscientiousness Scale is not valid as a measure of intelligence or extraversion Not a valid predictor of successful employment

18 Compare: “Scores on the Conscientiousness scale of the NEO-PI-R are validly interpreted as a measure of conscientiousness.” vs. “The Conscientiousness scale of the NEO-PI-R is valid.”

19 Implication 2 Validity is a matter of degree Strong vs. weak NOT valid vs. invalid Select test if strong enough evidence supporting intended interpretation and use http://www.wired.com/wired/archive/9.12/aqtest.html

20 Concern about the Autism Spectrum Quotient… Marginal internal consistency, so reliability is already of concern What about validity? Is it valid to interpret a high score on the test as reflecting a high degree of autism traits?

21 Interpretation of AQ

22 Regret vs. Autism? (r =.45)

23 AQ http://www.wired.com/wired/archive/9.12/aqtest.html

24 What is to be measured? What are the relative strengths of the alternatives that are available to measure that construct? Select best measures of specific characteristics to be assessed

25 Implication 3 Validity of a test’s interpretation is based on evidence and theory Human resources: “…in her experience, use of NEO-PI-R was useful in selection”

26 “Personality Color Test” Based on color psychology (Max Lüscher) Color preferences reveal something about your personality Survey of scientific literature finds almost no empirical evidence of validity of color preferences as a measure of personality characteristics

27 Evidence for “color test” Less than clear Site implies validity Web site: “Is the test reliable? We leave that to your opinion. We can only say that there are a number of corporations and colleges that use the Lüscher test as part of their hiring/admissions processes. It can be a useful tool for doctors and psychologists as well and is used to get a quick overview of potential issues patients may have in their lives.” http://colorquiz.com/

28 “Color Quiz” Is the test useful as a measure of personality? Denied employment based on such a test?

29 Empirical evidence & theoretical underpinnings? Data from high quality research must be available. Theory alone is not adequate.

30 Contemporary view of validity Although 3 forms, content, criterion, and construct, contemporary perspective highlights CONSTRUCT VALIDITY

31 Standards Standards for Educational and Psychological Testing - revised (1999) Co-published by American Educational Research Association (AERA) American Psychological Association (APA) National Council on Measurement in Education (NCME)

32 Remember Contemporary perspective highlights CONSTRUCT VALIDITY

33 Standards outline 5 types of evidence relevant for establishing validity of test interpretations (AERA, APA, NCME, 1999) Construct Validity Associations With Other Variables Internal Structure Test Content Response Processes Consequences of Use

34 Construct Validity Test Content

35 Validity Evidence: Test Content Match between the actual content of a test and the content that should be included in the test. Psychological nature of the construct should dictate the appropriate content of the test.

36 Face Validity Face validity – the degree to which a measure appears to be related to a specific construct in the judgment of non-experts such as test takers and representatives of the legal system. LOOKS relevant, and this fact may increase likelihood that the test will be well received by users and takers

37 Threats to content validity Construct-irrelevant content – e.g., test includes questions on content not covered in book, lecture, or discussion Construct under-representation – e.g., test content fails to represent the full scope of the content implied by the construct Related practical issues – e.g., time, respondent fatigue, respondent attention, etc. – Is content a fair representation?

38 Content Validity vs. Face Validity Content validity is the degree to which the content reflects the full domain of the construct & can only be evaluated by experts who have a deep understanding of the construct Face validity is the degree to which non-experts perceive the test to be relevant to what they believe is being measured by it

39 Construct Validity Internal Structure

40 Validity Evidence: Internal Structure of the Test For a test to be validly interpreted as a measure of a particular construct, the actual structure of the test should match the theoretically based structure of the construct Does the theoretical basis suggest a unidimensional or a multi-dimensional structure?

41 Internal Structure Often assess via examination of factor structure (factor analysis) Items that are more strongly correlated with each other than other items form clusters called factors… Factor analysis should clarify the number of factors within a set of test questions Example: Self esteem – is the construct uni- or multi-dimensional?

42 Factor analysis 1. Clarifies number of factors 2. Reveals associations among the factors within a multi-dimensional test 3. Identifies which items are linked to which factors

43 Rosenberg Self-Esteem Inventory (RSEI; Rosenberg 1989) 1. On the whole, I am satisfied with myself 2. At times, I think I am no good at all 3. I feel that I have a number of good qualities 4. I am able to do things as well as most other people 5. I feel I do not have much to be proud of 6. I certainly feel useless at times 7. I feel that I’m a person of worth, at least on an equal plane with others 8. I wish I could have more respect for myself 9. All in all, I am inclined to feel that I am a failure 10. I take a positive attitude toward myself

44 RSEI - Scree Plot Number of factors evident in the plot? Question: This scree plot provides evidence for what type of structure? a. Unidimensional b. Multidimensional
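A scree plot charts the eigenvalues of the inter-item correlation matrix in descending order. The sketch below shows how those eigenvalues separate a one-factor pattern from a two-factor pattern. Everything in it is an assumption for illustration: the two 4-item correlation matrices are invented (they are not RSEI data), the function names are hypothetical, and the "eigenvalue greater than 1" count is one common rule of thumb rather than something taken from the slides.

```python
import math


def symmetric_eigenvalues(matrix, tol=1e-12, max_rotations=1000):
    """Eigenvalues of a small symmetric matrix (classical Jacobi rotations), largest first."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_rotations):
        # Locate the largest off-diagonal entry; stop once the matrix is effectively diagonal.
        p, q, biggest = 0, 1, 0.0
        for i in range(n):
            for j in range(i + 1, n):
                if abs(a[i][j]) > biggest:
                    p, q, biggest = i, j, abs(a[i][j])
        if biggest < tol:
            break
        # Rotation angle that zeroes a[p][q] in J^T A J.
        theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
        c, s = math.cos(theta), math.sin(theta)
        rot = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
        rot[p][p], rot[q][q], rot[p][q], rot[q][p] = c, c, s, -s
        a_rot = [[sum(a[i][k] * rot[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
        a = [[sum(rot[k][i] * a_rot[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    return sorted((a[i][i] for i in range(n)), reverse=True)


def count_large_eigenvalues(eigenvalues, cutoff=1.0):
    """Rule-of-thumb factor count: eigenvalues above the cutoff."""
    return sum(1 for value in eigenvalues if value > cutoff)


# Hypothetical pattern 1: all four items correlate 0.6 with each other.
one_cluster = [[1.0 if i == j else 0.6 for j in range(4)] for i in range(4)]

# Hypothetical pattern 2: items 1-2 correlate, items 3-4 correlate, no cross-correlation.
two_clusters = [
    [1.0, 0.6, 0.0, 0.0],
    [0.6, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
]

eig_one = symmetric_eigenvalues(one_cluster)
eig_two = symmetric_eigenvalues(two_clusters)
print([round(v, 2) for v in eig_one], count_large_eigenvalues(eig_one))
print([round(v, 2) for v in eig_two], count_large_eigenvalues(eig_two))
```

A single dominant eigenvalue followed by a flat tail is the pattern usually read as unidimensional. Two or more comparably large eigenvalues point toward a multidimensional structure.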

45 Construct Validity Response Processes

46 Validity Evidence: Response Processes Match between the psychological processes that respondents actually use when completing a measure and the processes that they should use. When I say start, raise your finger when you feel 10 s have elapsed. Assumption: should use “feel” (feels like time is up) but could use another process such as covert counting, copying others, or looking at a second hand on a watch

47 Response processes If the response process actually used differs from the one assumed to be used, then the scores may not be interpretable as the test developer intended Attention to the internal feel of time passing vs. use of some selected process to intentionally mark passage of time

48 Construct Validity Associations With Other Variables

49 Validity Evidence: Association With Other Variables Match between a measure’s actual associations with other measures and the associations that the test should have with the other measures.

50 Convergent evidence The degree to which test scores are correlated with tests of related constructs

51 Discriminant evidence Degree to which test scores are uncorrelated with tests of unrelated constructs
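Convergent and discriminant evidence both come down to comparing correlations against what theory predicts. A minimal sketch, assuming invented scores for six respondents on a target test, a test of a related construct, and a test of an unrelated construct (all variable names are hypothetical):

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(ssx * ssy)


# Hypothetical scores for the same six respondents on three measures.
target_test = [1, 2, 3, 4, 5, 6]
related_construct = [2, 1, 4, 3, 6, 5]    # theory says: should correlate with the target
unrelated_construct = [3, 5, 1, 6, 2, 4]  # theory says: should not correlate with the target

convergent_r = pearson_r(target_test, related_construct)
discriminant_r = pearson_r(target_test, unrelated_construct)

print(f"convergent r = {convergent_r:.2f}, discriminant r = {discriminant_r:.2f}")
```

The pattern that supports the intended interpretation is a sizeable convergent correlation together with a near-zero discriminant correlation. How large or small each must be is a judgment call that depends on the theory behind the construct.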

52 Example Hypothesis: Schizophrenia and autism are diametrically opposed constructs

54 Measure of autism should be uncorrelated with measures of schizophrenia

55 Support for C & B’s theory? NO: Convergent evidence - autism measure correlated positively with SZ measures Finding: AU & SZ are related constructs? i.e., Crespi & Badcock are wrong Or not necessarily: the strong correlations could instead indicate weak validity of the AQ as a measure of the autism construct

56 Concurrent validity evidence The degree to which test scores are correlated with other relevant variables that are measured at the same time as the primary test of interest SAT is a measure of skills needed for academic success? Compare SAT administered during high school senior year to high school senior-year GPA

57 Predictive validity evidence The degree to which test scores are correlated with relevant variables that are measured at a future point in time. SAT is a measure of skills needed for academic success? Compare SAT administered during senior year of high school to college freshman year GPA

58 Validity Evidence: Consequences of Testing Social consequences of test are a facet of validity… Standards for Educational and Psychological Testing Validity includes “the intended and unintended consequences of test use” E.g., does a construct and its measurement benefit one group?

59 Not all agree… Consequences of a testing program should be considered a facet of the scientific evaluation of the meaning of a test score. Some feel that this is an intrusion of politics into science… Can science be separated from personal and social values?

60 Summary Conceptual basis for validity Construct Validity Associations With Other Variables Internal Structure Test Content Response Processes Consequences of Use

61 Validity Standards for Educational and Psychological Testing (1999) The degree to which evidence and theory support the interpretations of test scores entailed by the proposed uses of a test

62 Validity Are decisions based on valid interpretations of test scores? Educational placement Access to services Hiring Clinical decisions

