
1 Validity and Validation Methods

2 Workshop Flow
The construct of MKT
–Gain familiarity with the construct of MKT
–Examine available MKT instruments in the field
Assessment Design
–Gain familiarity with the Evidence-Centered Design approach
–Begin to design a framework for your own assessment
Assessment Development
–Begin to create your own assessment items in line with your framework
Assessment Validation
–Learn basic tools for how to refine and validate an assessment
Plan next steps for using assessments

3 Assessment Development Process
[Flow diagram; box labels as extracted:] Domain Modeling (Design Pattern) (Define Test Specs), Domain Analysis, Define Item Template, Define Item Specs, Develop Pool of Items, Collect/Analyze Validity Data, Refine Items, Refine Items, Assemble Test, Document Technical Info

4 Validity: The Cardinal Virtue of Assessment
Validity: The degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment. (Mislevy, Steinberg, and Almond, 2003)
Validation: A process of accumulating evidence to provide a scientifically sound validity argument to support the intended interpretation of test scores. (Standards for Educational and Psychological Testing; AERA / APA / NCME, 1999)
Jargon Note: Two kinds of evidence

5 Assessment Reliability
The extent to which an instrument yields consistent, stable, and uniform results over repeated administrations under the same conditions each time.
[Figure obtained from a website; the source URL is missing.]
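One common way to put a number on "consistent results over repeated administrations" is a test-retest correlation. The sketch below is only an illustration: the slides do not prescribe a statistic, and the two score lists are invented.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary in both administrations")
    return cov / sqrt(var_x * var_y)


# Hypothetical total scores for the same five teachers on two occasions.
first_administration = [12, 15, 9, 20, 17]
second_administration = [13, 14, 10, 19, 18]

test_retest_r = pearson_r(first_administration, second_administration)
print(round(test_retest_r, 3))
```

A value close to 1 suggests that teachers keep roughly the same rank order across administrations; how high is "high enough" depends on the intended use of the scores.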

6 Steps of Item Validation
Step: Method
1. Expert Panel Review (Formative): Alignment and ratings of items
2. Feasibility of items: Think-alouds
3. Field testing: Testing with a large sample
4. Expert Panel Review (Summative): Alignment and ratings of items
Iterative Refinement

7 1. Expert Panel Review (Formative)
Are the items aligned with…
–The test specifications?
–Content covered in the curriculum?
–State or national standards?
Is the complexity level aligned with intended use (e.g., target population, grade level)?
Are the item prompts and rubrics aligned?

8 2. Feasibility of Items (Think-Alouds)
Does the item make sense to the teacher?
Does the item elicit the cognitive processes intended?
Can the item be completed in the available time?
Can respondents use the diagrams, charts, and tables as intended?
Is the language clear?
Are there differences in approach between experts and novices (or between teachers who have and have not been exposed to the relevant instruction)?

9 SimCalc Example: Think-Alouds
Proportional Reasoning Problem #3
Expected: proportional reasoning (a proportion between white and dark quantities; extracted values: 3.5 white, x white, 5 dark)
Found: Just draw the bars!!

10 Conducting Think-Alouds
Sample
–N: You learn the most from the first 3-6 participants
–Who
  Experts and novices
  Low, medium, and high achievers
  Varying in proficiency in English
Data capture and analysis
–Data can be extremely rich and can be analyzed at varying levels of detail
  Often sufficient to do real-time note-taking
  Videotaping can be helpful
–Document
  Problems with item clarity (language, graphics)
  Response processes: What strategies are they using?

11 3. Field Testing
Item-level concerns
–Are there ceiling or floor effects?
–What is the range of responses we can expect from a variety of teachers?
–Is the amount of variation in responses sufficient to support statistical analysis?
–What is the distribution of responses across distracters?
–Do the items discriminate among teachers performing at different levels?
Assessment-level concerns
–Are there biases among subgroups?
–Does the assessment have high internal reliability?
–What is the factor structure of the test?
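The question of whether items discriminate among teachers at different levels can be checked with a simple upper-lower discrimination index: the proportion correct in the top-scoring group minus the proportion correct in the bottom-scoring group. The sketch below is one possible approach, not the method used in the workshop. The 27% group size is a common convention in classical item analysis and is an assumption here, as are the function name and the sample data.

```python
def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Upper-lower index: p(correct | top group) - p(correct | bottom group).

    item_scores: 0/1 score on one item, one entry per teacher.
    total_scores: total test score for the same teachers, same order.
    fraction: share of teachers in each extreme group (assumed default).
    """
    if len(item_scores) != len(total_scores) or not item_scores:
        raise ValueError("need two non-empty lists of equal length")
    n = len(total_scores)
    group_size = max(1, round(n * fraction))
    # Teacher indices ordered from lowest to highest total score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    bottom = order[:group_size]
    top = order[-group_size:]
    p_top = sum(item_scores[i] for i in top) / group_size
    p_bottom = sum(item_scores[i] for i in bottom) / group_size
    return p_top - p_bottom


# Hypothetical data: ten teachers, only the higher scorers answer the item correctly.
totals = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
item = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(discrimination_index(item, totals))
```

Values near 0 indicate that the item does not separate stronger from weaker performers; negative values usually signal a flawed item or scoring key.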

12 Key Item Statistic: Percent Correct
What percent of people get it correct?
Gives us a sense of:
–The item difficulty
–The range of responses
Alerts you to potential problems:
–Floor = roughly 0-10%
–Ceiling = roughly %
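A minimal sketch of the percent-correct check follows. The floor threshold of 10% comes from the slide. The ceiling range is missing from the transcript, so the 90% threshold below is purely an assumption for illustration, exposed as a parameter. The function names and sample scores are hypothetical.

```python
def percent_correct(item_scores):
    """Percent of respondents who answered the item correctly (0/1 scores)."""
    if not item_scores:
        raise ValueError("no responses")
    return 100.0 * sum(item_scores) / len(item_scores)


def flag_item(pct, floor_max=10.0, ceiling_min=90.0):
    """Label an item as 'floor', 'ceiling', or 'ok'.

    floor_max follows the slide (roughly 0-10%).
    ceiling_min is an ASSUMED value; the slide's ceiling range was not preserved.
    """
    if pct <= floor_max:
        return "floor"
    if pct >= ceiling_min:
        return "ceiling"
    return "ok"


# Hypothetical 0/1 scores for three items, ten teachers each.
items = {
    "item_A": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
    "item_B": [1, 0, 1, 1, 0, 1, 0, 1, 1, 0],
    "item_C": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
}
flags = {name: flag_item(percent_correct(scores)) for name, scores in items.items()}
print(flags)
```

Flagged items are candidates for revision, not automatic removal: a very easy or very hard item may still be wanted for content coverage.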

13 SimCalc Example: Exploratory Results for item #20
[Figure: responses by quartiles of total test score]

14 SimCalc Example: Exploratory Results for item #43
[Figure: response categories include "Skip"]

15 SimCalc Example: Exploratory Results for item #6
Response              Count
Correct (12)          160 (70%)
Additive error (8)    42 (18%)
Other                 20 (9%)
Skip                  8 (3%)
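A response distribution like the one for item #6 can be tallied directly from coded responses. The sketch below rebuilds a response list from the counts reported on the slide (the individual responses are not in the source) and recomputes the rounded percentages. The function name is hypothetical.

```python
from collections import Counter


def response_distribution(responses):
    """Map each response category to (count, percent rounded to a whole number)."""
    counts = Counter(responses)
    total = len(responses)
    return {
        category: (n, round(100 * n / total))
        for category, n in counts.most_common()
    }


# Response list reconstructed from the counts shown for item #6.
responses = (
    ["Correct"] * 160
    + ["Additive error"] * 42
    + ["Other"] * 20
    + ["Skip"] * 8
)

distribution = response_distribution(responses)
for category, (n, pct) in distribution.items():
    print(f"{category}: {n} ({pct}%)")
```

Tracking a named error category, as the slide does with "Additive error", shows whether a distracter is attracting the misconception it was written to detect.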

16 Conducting a Field Test
Test under conditions as close to real as possible
–Analogous population of teachers
–Administration conditions
–Formatting
–Scoring
Gather and use demographic data
Determine sample size based on
–The number of teachers you can get
–The kinds of statistical analyses you decide to conduct
  e.g., 5-10 respondents per item for fancy statistics
Can use simple and fancy statistics
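The 5-10 respondents-per-item rule of thumb translates into a target sample range with simple arithmetic. In the sketch below, the 20-item test length is a hypothetical example and the function name is invented for illustration.

```python
def respondent_range(n_items, per_item_low=5, per_item_high=10):
    """Return (minimum, maximum) respondents implied by a per-item rule of thumb."""
    if n_items < 1:
        raise ValueError("n_items must be at least 1")
    return n_items * per_item_low, n_items * per_item_high


# Hypothetical 20-item assessment.
low, high = respondent_range(20)
print(f"Aim for roughly {low}-{high} teachers")
```

If the number of teachers you can recruit falls below the lower bound, the simpler statistics (percent correct, response distributions) remain usable.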

17 Field Testing with Teachers by Mail
Purchasing national mailing lists of teachers
–http://www.schooldata.com/
–http://www.qeddata.com
Best practices mailing sequence (Cook et al., 2000)
–An introductory postcard announcing that a survey will be sent
–About a week later, a packet containing the survey
–About two weeks later, a reminder postcard
–About two weeks later, a second packet containing the survey and a reminder letter
–About three weeks later, a third appeal postcard
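The mailing sequence can be turned into calendar dates once a start date is chosen. The sketch below reads each "about N weeks later" as exactly N x 7 days after the previous mailing, which is an assumption, and the start date is hypothetical.

```python
from datetime import date, timedelta

# (mailing, days after the PREVIOUS mailing); gaps taken from the sequence above,
# with "about a week" interpreted as exactly 7 days (assumption).
MAILING_STEPS = [
    ("Introductory postcard", 0),
    ("First survey packet", 7),
    ("Reminder postcard", 14),
    ("Second survey packet with reminder letter", 14),
    ("Third appeal postcard", 21),
]


def mailing_schedule(start):
    """Return a list of (mailing, send_date) pairs starting from `start`."""
    schedule = []
    current = start
    for label, gap_days in MAILING_STEPS:
        current = current + timedelta(days=gap_days)
        schedule.append((label, current))
    return schedule


# Hypothetical start date.
schedule = mailing_schedule(date(2024, 9, 2))
for label, send_date in schedule:
    print(send_date.isoformat(), label)
```

Under these assumptions the full sequence spans eight weeks, which is worth checking against school holidays before the first postcard goes out.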

18 Steps of Item Validation
Step: Method
1. Expert Panel Review (Formative): Alignment and ratings of items
2. Feasibility of items: Think-alouds
3. Field testing for psychometric information: Testing with a large sample
4. Expert Panel Review (Summative): Alignment and ratings of items
Iterative Refinement

19 4. Expert Panel Review (Summative)
Similar questions as in Step 1 (Formative review)
Same or different panel of experts
Ratings and alignment collected after items are fully refined
Results of summative expert panel review provide evidence of alignment of items with standards/curriculum, content validity, and grade-level appropriateness
This could be reported in technical documentation

21 Creating a Validity Argument
Integrates all evidence into a coherent account of the degree to which existing evidence and theory support the intended interpretation of test scores

22 For a Sound Validity Argument, at Minimum, Pay Attention to…
Sources of Evidence: Procedures
1. Test content: Conduct alignment of items to standards/curriculum by content experts
2. Response processes:
–Have at least one or two teachers do think-alouds
–Administer test to at least one group
3. Relationships to other variables: If possible, conduct one or more of the following:
–Conduct instructional sensitivity study
–Correlate with existing measures
–Correlate with construct-irrelevant variables
4. Internal structure:
–Establish internal reliability (alpha)
–Assess inter-scorer reliability, if there is a rubric
5. Consequences of testing: Be aware of the limitations of your test, not going beyond its intended purposes and its intended role on your project
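Internal reliability (alpha) is usually computed as Cronbach's alpha: alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below is a minimal standard-library version. The score matrix is invented, and population variance is used for both item and total variances, which leaves the ratio unchanged as long as the choice is consistent.

```python
from statistics import pvariance


def cronbach_alpha(score_matrix):
    """Cronbach's alpha for a matrix with one row per respondent, one column per item."""
    if len(score_matrix) < 2:
        raise ValueError("need at least two respondents")
    n_items = len(score_matrix[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    if any(len(row) != n_items for row in score_matrix):
        raise ValueError("every respondent needs a score on every item")
    item_variances = [
        pvariance([row[j] for row in score_matrix]) for j in range(n_items)
    ]
    total_variance = pvariance([sum(row) for row in score_matrix])
    if total_variance == 0:
        raise ValueError("total scores do not vary")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 0/1 scores: five teachers (rows) by four items (columns).
scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 2))
```

With only a handful of respondents, as in this toy matrix, alpha is very unstable; the field-test sample is the appropriate place to estimate it.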

23 Activity #5 Conduct Think-Aloud
Break into groups of 3 and select roles
–1 interviewer
–1 interviewee
–1 observer to complete the observation recording sheet
Select a set of 2 items
Conduct think-alouds. The interviewer and observer take notes on the form in the protocol.
Repeat two more times, switching roles, with new items.
Revise your own items.
Afterward, we will have a discussion about
–Insights about development of assessment items
–Questions and challenges
Be the observer for your own items!

24 Activity #5 Think-Aloud Pointers
Find out how long problems take to do
Uncover issues of item clarity and level of difficulty
Derive a model of the knowledge and thinking that the students engage in when solving each problem
In observation notes, describe:
–How problems are solved, focusing on the underlying knowledge, skills, and structures of item performance
–Actions, thought processes, and strategies

25 Activity #5 Think-Aloud Pointers
Interviewers SHOULD
–Prompt the teacher to keep talking
–Ask clarifying questions about what teachers are saying (but not as scaffolding)
Interviewers SHOULD NOT
–Help teachers in any way during the interview (e.g., no hints, tips, or scaffolding). Take care to avoid unintentional hints, such as being more encouraging when answers are correct.

27 Some Useful References
Validation
–AERA, APA, & NCME (1999). Standards for educational and psychological testing. Washington, DC: AERA.
–Baxter, G. P., Shavelson, R. J., Herman, S. J., Brown, K. A., & Valadez, J. R. (1993). Mathematics performance assessment: Technical quality and diverse student impact. Journal for Research in Mathematics Education, 24(3).
–Cronbach, L. J. (1971). Test validation. In R. L. Thorndike (Ed.), Educational measurement (2nd ed.). Washington, DC: American Council on Education.
–Hoag, R. D., Meginbir, L., Khan, Y., & Weatherall, D. (1985). A multitrait-multimethod analysis of the Preschool Behavior Questionnaire. Journal of Abnormal Child Psychology, 13.
–Mehta, P. D., Foorman, B. R., Branum-Martin, L., & Taylor, W. P. (2005). Literacy and a unidimensional multilevel construct: Validation, sources of influence, and implications in a longitudinal study in Grades 1 to 4. Scientific Studies of Reading, 9.

28 Some Useful References
Validation (cont'd)
–Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed.).
–Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educational Researcher, 23(2).
–Pellegrino, J., Chudowsky, N., & Glaser, R. (Eds.). (2001). Knowing what students know: The science and design of educational assessment. Washington, DC: National Academy Press.
–Tremblay, R. E., Vitaro, F., Gagnon, C., Piche, C., & Royer, N. (1992). A prosocial scale for the Preschool Behavior Questionnaire: Concurrent and predictive correlates. International Journal of Behavioral Development, 15.
–Weir, K., & Duveen, G. (1981). Further development and validation of the Prosocial Behavior Questionnaire for use by teachers. Journal of Child Psychology and Psychiatry, 22.

29 Some Useful References
Expert Panel Review
–Webb, N. L. (2002). Alignment: Powerful tool for focusing instruction, curricula, and assessment. Presentation at the CCSSO State Collaborative on Assessment and Student Standards, San Diego, CA.
–Webb, N. L. (2005). Alignment, depth of knowledge, and change. Paper presented at the 50th Annual Meeting of the Florida Educational Research Association, Miami, FL.
Think-Alouds
–Ericsson, K. A., & Simon, H. A. (1993). Protocol analysis: Verbal reports as data. Cambridge, MA: MIT Press.
–Flaherty, E. G. (1974). The thinking aloud technique and problem-solving ability. Journal of Educational Research, 68.
Psychometrics
–Crocker, L., & Algina, J. (1986). Introduction to classical & modern test theory. Orlando, FL: Harcourt Brace Jovanovich, Inc.

