EPSY 546: LECTURE 3 GENERALIZABILITY THEORY AND VALIDITY

EPSY 546: LECTURE 3 GENERALIZABILITY THEORY AND VALIDITY George Karabatsos

GENERALIZABILITY THEORY

TRUE SCORE MODEL Recall the true score model: Xn = Tn + en, where Xn is the observed test score of person n, Tn is the true test score (unknown), and en is random error (unknown).

TRUE SCORE MODEL One may view the true score model as defining error narrowly: it amounts to a one-variable, simple ANOVA, partitioning Between-persons (true score) variance + Within-person (random error) variance.
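
As a numerical sketch of this partition (an illustration added here, not from the lecture; the variance values and sample sizes are assumed), one can simulate observed scores X = T + e and recover the between-persons and within-person components:

```python
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_reps = 500, 20
sigma2_true, sigma2_error = 4.0, 1.0      # assumed (illustrative) variances

# Observed score of each person on each replication: X = T + e
T = rng.normal(50.0, np.sqrt(sigma2_true), size=(n_persons, 1))
e = rng.normal(0.0, np.sqrt(sigma2_error), size=(n_persons, n_reps))
X = T + e

# Within-person variance estimates the random-error component
within_var = X.var(axis=1, ddof=1).mean()
# Variance of person means, minus its error share, estimates true-score variance
between_var = X.mean(axis=1).var(ddof=1) - within_var / n_reps

# Classical reliability: true-score variance over total variance
reliability = between_var / (between_var + within_var)
```

With these assumed components the reliability estimate should land near 4/(4+1) = .80.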

GENERALIZABILITY THEORY Generalizability Theory extends the true score model by acknowledging that multiple factors affect the measurement variance. Multivariable ANOVA: the observed test response is a function of 2 or more variables, their interactions, and random measurement error.

G-THEORY MODEL (example) Xnjt = μ (grand mean) + (μn − μ) (person n's effect) + (μj − μ) (item j's effect) + (μt − μ) (time t's effect) + (μnt − μn − μt + μ) (person × time effect) + (μnj − μn − μj + μ) (person × item effect) + (μtj − μt − μj + μ) (time × item effect) + residual (three-way interaction and error)

G-THEORY VARIANCE PARTITION Systematic: Persons σ²P. Measurement error (facet contributions): Items σ²I, Time σ²T, Person × Time σ²PT, Person × Item σ²PI, Time × Item σ²TI, three-way interaction + error σ²PIT,error
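
This partition can be estimated from data. Here is a hedged sketch, simplified to a single facet (persons × items) so the expected-mean-squares algebra stays short; the sample sizes and variance components below are assumptions invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n_p, n_i = 200, 10                          # persons, items (assumed)
s2_p, s2_i, s2_res = 2.0, 0.5, 1.0          # assumed variance components

# Simulate a fully crossed persons x items design
X = (50.0
     + rng.normal(0, np.sqrt(s2_p), (n_p, 1))       # person effects
     + rng.normal(0, np.sqrt(s2_i), (1, n_i))       # item effects
     + rng.normal(0, np.sqrt(s2_res), (n_p, n_i)))  # residual (pi + error)

grand = X.mean()
ms_p = n_i * ((X.mean(axis=1) - grand) ** 2).sum() / (n_p - 1)
ms_i = n_p * ((X.mean(axis=0) - grand) ** 2).sum() / (n_i - 1)
ms_res = (((X - X.mean(axis=1, keepdims=True)
              - X.mean(axis=0, keepdims=True) + grand) ** 2).sum()
          / ((n_p - 1) * (n_i - 1)))

# Solve the expected mean squares of the random-effects model:
# E[MS_p] = s2_res + n_i*s2_p,  E[MS_i] = s2_res + n_p*s2_i,  E[MS_res] = s2_res
var_res = ms_res
var_p = (ms_p - ms_res) / n_i
var_i = (ms_i - ms_res) / n_p
```

The two-facet (persons × items × time) partition on the slide follows the same logic, with one expected-mean-squares equation per variance component.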

G-THEORY OF DECISIONS Relative decisions: decisions based on the rank ordering of persons (e.g., college admission, pass-fail testing). Variance contributing to measurement error for relative decisions: σ²Rel = σ²PI + σ²PT + σ²PIT,error (all variance components associated with interactions involving persons)

G-THEORY OF DECISIONS Absolute decisions: decisions based on the level of the observed score, without regard to the performance of others (e.g., a driver's license test). Variance contributing to measurement error for absolute decisions: σ²Abs = σ²T + σ²I + σ²PI + σ²PT + σ²IT + σ²PIT,error (all variance components associated with the facets, which introduce "constant" effects to absolute decisions)

GENERALIZABILITY COEFFICIENT Indicates how accurately the observed test scores allow us to generalize about persons' behavior in a defined universe of situations (Cronbach, 1972). For relative decisions it is the ratio of person variance to person variance plus relative error: Eρ² = σ²P / (σ²P + σ²Rel).
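
Putting the two error definitions together with the person variance gives the coefficients. A minimal arithmetic sketch, with invented variance components: the generalizability coefficient uses relative error, while the dependability coefficient Φ (a standard companion index for absolute decisions, not shown on the slide) uses absolute error.

```python
# Hypothetical variance components for a persons x items x time design
comp = {"P": 4.0, "I": 0.6, "T": 0.3,
        "PI": 0.8, "PT": 0.5, "IT": 0.2, "PIT,e": 1.2}

# Relative error: interaction components involving persons
s2_rel = comp["PI"] + comp["PT"] + comp["PIT,e"]          # 2.5
# Absolute error: adds the facet main effects and their interaction
s2_abs = s2_rel + comp["I"] + comp["T"] + comp["IT"]      # 3.6

# Generalizability coefficient (relative decisions): 4 / 6.5
g_coef = comp["P"] / (comp["P"] + s2_rel)
# Dependability coefficient (absolute decisions): 4 / 7.6
phi = comp["P"] / (comp["P"] + s2_abs)
```

Because absolute error includes every facet component, Φ can never exceed the generalizability coefficient for the same design.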

STUDIES G-Study (Generalizability Study): Aims to estimate the variance components underlying a measurement process by defining the universe of admissible observations as broadly as possible.

STUDIES D-Study (Design Study): Uses G-study results to address "what if" questions about variation in measurement design (Thompson & Melancon, 1987). This helps pinpoint sources of error, so that protocol modifications can be specified to obtain the desired level of generalizability.
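
A D-study sketch under illustrative assumptions: each error component is divided by the number of conditions it is averaged over, so the projected coefficients show what adding items or occasions buys. The component values and design sizes below are invented for the example.

```python
def d_study(comp, n_i, n_t):
    """Project error variances and coefficients for mean scores over
    n_i items and n_t occasions (random persons x items x time design)."""
    s2_rel = comp["PI"] / n_i + comp["PT"] / n_t + comp["PIT,e"] / (n_i * n_t)
    s2_abs = (s2_rel + comp["I"] / n_i + comp["T"] / n_t
              + comp["IT"] / (n_i * n_t))
    g = comp["P"] / (comp["P"] + s2_rel)       # generalizability coefficient
    phi = comp["P"] / (comp["P"] + s2_abs)     # dependability coefficient
    return g, phi

# Hypothetical G-study variance components
comp = {"P": 4.0, "I": 0.6, "T": 0.3,
        "PI": 0.8, "PT": 0.5, "IT": 0.2, "PIT,e": 1.2}

# "What if" designs: more items and occasions shrink the error variance
for n_i, n_t in [(5, 1), (10, 2), (20, 2)]:
    g, phi = d_study(comp, n_i, n_t)
    print(f"items={n_i:2d} occasions={n_t}  G={g:.3f}  Phi={phi:.3f}")
```

Scanning such a table is how a D-study pinpoints the cheapest protocol change that reaches a target level of generalizability.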

EXAMPLES OF G- THEORY Nice illustrations are offered in: Webb, Rowley, & Shavelson (1988) and Crowley, Thompson, & Worchel (1994)

VALIDITY

TEST VALIDITY VALIDITY: A test is valid if it measures what it claims to measure. Types: Face, Content, Concurrent, Predictive, Construct.

TEST VALIDITY Face validity: When the test items appear to measure what the test claims to measure. Content validity: When the content of the test items, according to domain experts, adequately represents the latent trait that the test intends to measure.

TEST VALIDITY Concurrent validity: When the test, which intends to measure a particular latent trait, correlates highly with another test that measures that trait. Predictive validity: When the scores of the test predict some meaningful criterion.

TEST VALIDITY Construct validity: A test has construct validity when the results of using the test fit hypotheses concerning the theoretical nature of the latent trait. The higher the fit, the higher the construct validity.

MESSICK’S UNIFIED CONSTRUCT VALIDITY Content: Item content relevance, representativeness, and technical quality (includes face validity). Substantive: Theoretical rationales for the observed consistencies in the test responses. Structural: Fidelity of the scoring structure to the structure of the content domain. Generalizability: The extent to which score properties and interpretations generalize over population groups, settings, and tasks. External: Concurrent/convergent, discriminant, and predictive evidence. Consequential: The (potential and actual) consequences of test use.