The end of construct validity


The end of construct validity
Denny Borsboom, University of Amsterdam

Two kinds of validity
The working researcher's idea: validity concerns the question of whether a test measures what it should measure.
The construct validity idea: validity is an evaluative, integrated judgement of the degree to which test score interpretations are justified in the light of empirical evidence and theoretical rationales (and, possibly, social consequences that follow from test use).

What I will argue
The working researcher's conception is theoretically and practically superior.
The construct validity position has some sophistication, but that is mainly window dressing; in general, it precisely misses the point of what validity is.

The pillars of construct validity
Construct validity is an evaluative judgement about ‘test score interpretations’ in terms of ‘constructs’ that is a function of evidence and a matter of degree.
I will argue that this view:
- does not align with the working researcher's view at all
- has quite unreasonable consequences that one should not be comfortable with

Why construct validity theory is dysfunctional

The social consequences of construct validity theory

A black hole that traps all psychometric problems

Why construct validity has nothing to do with tests (and why this is wrong)

Every interpretation can have construct validity

There are as many ‘construct validities’ as there are judges

Measurement instruments can ‘become valid’

Some measurement instruments ‘were valid’...

...but then ‘ceased to be’ valid...

Reference is unimportant: ‘Aether’, ‘DNA’, ‘Phlogiston’, ‘Black hole’

Validity depends on the presence of ‘interpreters’

How construct validity is sold
Construct validity is an evaluative, integrated judgement of the degree to which test score interpretations are justified in the light of empirical evidence and theoretical rationales (and, possibly, social consequences that follow from test use).

What construct validity really is
Somebody's evaluative, integrated, and fluctuating judgement of the degree to which test score interpretations, which may have nothing to do with measurement, are justified in the light of time-dependent empirical evidence and that person's theoretical rationales (and, possibly, that person's guesses about the social consequences that follow from test use, as well as his or her valuation of these outcomes).

Why all this sophistication misses the point

Construct validity is an evaluative, integrated judgement of the degree to which test score interpretations are justified in the light of empirical evidence and theoretical rationales (and, possibly, social consequences that follow from test use).
However, validity is...
- a property, not a judgement
- a property of instruments, not of inferences
- a function of truth, not of evidence
- the object of validation research, not its result

VALIDITY

A simple alternative: a test is valid for measuring an attribute if and only if variation in the attribute causally produces variation in the measurement outcomes.
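
One way to write this definition down formally (a sketch that is not in the slides; the do-notation is borrowed from the causal inference literature, and the symbols T, A, and X_T are introduced here for illustration):

\[
\mathrm{Valid}(T, A) \;\iff\; \exists\, a \neq a' :\; P\big(X_T \mid \mathrm{do}(A = a)\big) \;\neq\; P\big(X_T \mid \mathrm{do}(A = a')\big)
\]

Here A is the attribute, T is the test, and X_T are the measurement outcomes that T produces: intervening on the attribute must change the distribution of the outcomes.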

[Diagram slides, built up step by step: an attribute structure feeds a response process, which produces a score structure. Instantiated for intelligence testing, the attribute g drives a response process that yields IQ-scores X; a formal model f(X|θ) connects the substantive theory of the attribute and the response process to the observed IQ-score patterns. The final slide marks each element (attribute structure, response process, formal model) with a question mark: for most tests, these are unknown.]
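
As a minimal sketch of the chain depicted above (not part of the slides: the two-parameter logistic response process, the item parameters, and the rescaling to an IQ-like metric are illustrative assumptions), a simulation can make the three layers concrete:

    import numpy as np

    rng = np.random.default_rng(0)
    n_persons, n_items = 1000, 20

    # Attribute structure: a single latent attribute (g), standard normal across persons.
    g = rng.normal(0.0, 1.0, size=n_persons)

    # Response process: a two-parameter logistic item response function f(X_i | theta).
    discrimination = rng.uniform(0.8, 2.0, size=n_items)   # a_i
    difficulty = rng.normal(0.0, 1.0, size=n_items)        # b_i
    p_correct = 1.0 / (1.0 + np.exp(-discrimination * (g[:, None] - difficulty)))

    # Score structure: item responses X and an IQ-like rescaling of the sum score.
    responses = rng.binomial(1, p_correct)
    sum_scores = responses.sum(axis=1)
    iq_scores = 100 + 15 * (sum_scores - sum_scores.mean()) / sum_scores.std()

    # On the causal view, the test is valid for g because variation in g produces
    # variation in iq_scores (here, by construction of the simulation).
    print(round(float(np.corrcoef(g, iq_scores)[0, 1]), 2))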

Where to look for validity
Traditionally, evidence for validity is sought in external relations: relations between test scores and other test scores.
In criterion validity, the evidence comes from correlations with a criterion (or with the criterion).
In construct validity, the evidence comes from correlations with lots of other variables (multitrait-multimethod matrices, MTMMs).

[Diagram: a correlation network connecting IQ-scores and job performance with attractiveness, extraversion, working memory, visual memory, numerical ability, masculinity, race, sex, SES, annual income, physique, length, and genetic differences, with correlations ranging from .09 to .78.]
But even if we knew all correlations between all conceivable tests, the validity problem would remain.

Where to look for validity
Validity is not a matter of external relations between the test scores and other test scores.
It is a matter of which processes take attribute differences into response differences.
For many tests we have no idea of what happens between item administration and item response.
This is the reason that the validity problem has proven hard to crack.

Where to look for validity
Ingredients for validity:
- a theory on the structure of the attribute
- a theory on the processes that take levels of the attribute into observed score patterns
- a formal model to test the theory against data
The question of validity then becomes: is this theory true?
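
A generic way to write the third ingredient, the formal model, is sketched here (not taken verbatim from the slides): a latent structure model in which the attribute theory fixes the latent part and the response-process theory fixes the conditional item distributions,

\[
P(X = x) \;=\; \sum_{c} \pi_c \prod_{i=1}^{n} P(X_i = x_i \mid C = c)
\qquad \text{or} \qquad
P(X = x) \;=\; \int \prod_{i=1}^{n} f_i(x_i \mid \theta)\, dG(\theta),
\]

for a discrete (latent class) or a continuous (latent variable) attribute, respectively. The validity question is then whether this theoretically constrained model is true of the data.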

Example: The balance scale test
[Diagram: example items (a weight item, a distance item, and a conflict-weight item); the child is asked what happens when the supporting blocks are removed.]

Example: The balance scale test
Theory on the structure of the attribute: cognitive development involves an ordered series of discrete transitions between stages.
Theory on the processes that take levels of the attribute into observed score patterns: children in different developmental stages use different cognitive rules to solve balance scale items, which results in different response patterns.
Statistical model to test the theory against data: developmental stages are conceptualized as latent classes with theoretically driven response vectors.

[Diagram: the measurement chain for the balance scale test. Developmental stages (latent classes) determine which rule a child uses (Rule 1, Rule 2, Rule 3); the rule-governed response process produces characteristic test score patterns X such as 001100, 111100, and 110011; the formal model P(X = x | θ) links the stages to the observed patterns.]
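
A minimal sketch of this latent class setup (the item set, the specific rule-implied response vectors, the class proportions, and the error rate are illustrative assumptions, not the model reported in the slides):

    import numpy as np

    rng = np.random.default_rng(1)

    # Column order of the response vectors below: weight (W), distance (D),
    # conflict-weight (CW), and conflict-distance (CD) balance scale items.
    item_types = ["W", "W", "D", "D", "CW", "CD"]

    # Theoretically driven response vectors (1 = correct), one per cognitive rule.
    # The theory fixes these in advance; they are not estimated freely from data.
    rule_patterns = {
        "Rule 1": np.array([1, 1, 0, 0, 1, 0]),
        "Rule 2": np.array([1, 1, 1, 1, 1, 0]),
        "Rule 3": np.array([1, 1, 1, 1, 1, 1]),
    }

    n_children = 600
    error_rate = 0.10   # chance of deviating from the rule-implied answer

    # Attribute structure: each child occupies one developmental stage (latent class).
    stages = rng.choice(list(rule_patterns), size=n_children, p=[0.4, 0.35, 0.25])

    # Response process: the stage's rule generates the response pattern, up to error.
    predicted = np.array([rule_patterns[s] for s in stages])
    flips = rng.random(predicted.shape) < error_rate
    responses = np.where(flips, 1 - predicted, predicted)

    # Confronting theory with data: does each observed pattern sit closest to the
    # response vector implied by one of the rules, as the theory predicts?
    names = list(rule_patterns)
    prototypes = np.array([rule_patterns[n] for n in names])
    assigned = np.array([names[int(np.abs(prototypes - r).sum(axis=1).argmin())] for r in responses])
    print("agreement with generating stage:", float((assigned == stages).mean()))

In real data one would of course fit the latent class model and test whether the theoretically predicted response vectors are recovered; the point here is only that the theory, not the data, dictates the expected patterns.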

The question of validity: Is this theory of response behavior correct?

How does this relate to other issues?
The validity concept is usually applied to many questions simultaneously:
(1) Does the test measure the intended attribute?
(2) How well do the test scores predict other attributes?
(3) Is the use of the test legally defensible?
(4) Will using the test improve the human condition?
These are all put under one umbrella. I only deal with (1); questions (2) and onwards are better left to psychometrics, law, politics, etc.

Does this mean that other issues are unimportant?
No. Interpretations, uses, and consequences matter a great deal.
But they are not thereby issues of validity.
Moreover, they usually belong in the public sphere, not in the domain of validity theory.

Bottom line
To find out what you measure, you have to find out how your instrument works; there is no other way.
If you know how your instrument is supposed to work, and you know how it works, you have a definite answer to the validity problem.
However, if you don't know how your instrument is supposed to work, and you don't know how it works, you are in trouble.

Validity is... ...measuring the right thing