Assessment & Evaluation
Assessment
  Answers questions about individuals: “What did the student learn?”
  Uses tests and other activities to determine student knowledge and ability, for example:
    Identifies adverbs
    Adds two-digit numbers
    Defines democracy
  Def: assessment - to determine the amount of [e.g., knowledge]
Evaluation
  Answers global questions: “How effective is Program X?” and “What difference does it make?”
  Formative
  Summative
  Def: evaluation - to determine the significance or worth by careful appraisal and study
Types of Tests
  Used to evaluate changes in skills and knowledge
Test Types: Norm-Referenced
  Compare an individual's performance to the performance of other people.
  Require items of varying difficulty.
  Assume that not everybody is going to "get it."
  Discern those who "got it" from those who didn't.
Normal Distribution
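To make the normal-distribution idea concrete, here is a minimal Python sketch of how a raw score might be placed relative to a norm group. It assumes the norm group's scores are roughly normally distributed; the function names and the score list are hypothetical and invented purely for illustration.

```python
import math
from statistics import mean, pstdev


def z_score(raw, norm_scores):
    """Standardize a raw score against the norm group's mean and SD."""
    mu = mean(norm_scores)
    sigma = pstdev(norm_scores)  # population SD of the norm group
    return (raw - mu) / sigma


def percentile_from_z(z):
    """Percent of a normal distribution that falls below z (normal CDF)."""
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Hypothetical norm-group scores, not real test data.
norm_group = [40, 45, 50, 50, 55, 60, 50, 45, 55, 50]

z = z_score(60, norm_group)
print(f"z = {z:.2f}, percentile = {percentile_from_z(z):.1f}")
```

The result only says where a person stands relative to the group. It does not say which skills that person has or lacks.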
Test Types: Norm-Referenced
  Norm-referenced tests compare the individual to the group.
  This is accomplished statistically by “norming” the test with large numbers of people.
  Consider: You’re evaluating a GRE preparation class. GRE scores for the group of 100 enrolled students average as follows. Can you make recommendations to the school that will help them improve the instruction? Why or why not?
Test Types: Criterion-Referenced
  Compare an individual's performance to the acceptable standard of performance for the specified tasks.
  Require completely specified objectives.
  Ask: Can this person do what has been specified in the objectives?
  Result in yes-no decisions about competence.
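The yes-no decision described above can be sketched in a few lines of Python. This is an illustrative sketch only: the 80% cutoff, the function name, and the item scores are assumptions made for the example, not standards taken from the slides.

```python
def mastery_decisions(item_results, cutoff=0.8):
    """Return a yes/no (True/False) mastery decision per objective.

    item_results maps an objective to a list of item scores (1 = correct,
    0 = incorrect). cutoff is a hypothetical performance standard.
    """
    return {
        objective: (sum(items) / len(items)) >= cutoff
        for objective, items in item_results.items()
    }


# Hypothetical results for one student, invented for illustration.
student = {
    "Identifies adverbs": [1, 1, 1, 1, 0],
    "Adds two-digit numbers": [1, 0, 1, 0, 0],
}

decisions = mastery_decisions(student)
for objective, mastered in decisions.items():
    print(objective, "->", "yes" if mastered else "no")
```

Each score is compared to a fixed standard per objective, so the other students' results play no part in the decision.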
Test Types: Criterion-Referenced
  Applications:
    Diagnosis of individual skill deficiencies
    Certification of skills
    Evaluation and revision of instruction
Ideas in Testing
  Measurement error
  Validity
  Reliability
Measurement Error
  Many causes:
    mechanical or scoring errors
    poor wording (confusing, ambiguous)
    poor subject matter, content (validity)
    score variation from one time to another (reliability)
    score variation from "equivalent" tests
    test administration procedure
    disagreement between raters (inter-rater reliability)
    mood of the student
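One of these sources, disagreement between raters, is often summarized with Cohen's kappa, which corrects the raw agreement rate for agreement expected by chance. The slides do not name this statistic; it is used here as one common choice. The sketch below assumes two raters assigning categorical ratings to the same work, and the ratings shown are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items categorically."""
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    p_chance = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    # Undefined (division by zero) if both raters use a single category.
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical pass/fail ratings of six essays by two raters.
rater_1 = ["pass", "pass", "fail", "pass", "fail", "fail"]
rater_2 = ["pass", "fail", "fail", "pass", "fail", "pass"]
print(round(cohens_kappa(rater_1, rater_2), 2))
```

A kappa of 1 means perfect agreement. A value near 0 means the raters agree about as often as chance alone would produce.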
Validity
  Does the test assess what's important? Does it really seek out the skill and knowledge linked to the world? (content validity)
  Types:
    Content validity
    Assessed by a panel of experts:
      Face validity
      Construct validity
    Predictive validity (e.g., SAT, GRE)
Reliability
  Are the scores produced by the test trustworthy and stable over time?
  Assessed by:
    parallel (equivalent) forms or test-retest
    internal consistency
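Both approaches can be expressed numerically. The Python sketch below estimates test-retest (or parallel-forms) reliability as the Pearson correlation between two sets of scores, and internal consistency as Cronbach's alpha. Cronbach's alpha is one common internal-consistency statistic, chosen here as an assumption since the slides do not name one. All data are hypothetical.

```python
from statistics import mean, pvariance


def pearson_r(x, y):
    """Pearson correlation between two score lists of equal length."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores has one row of item scores per student."""
    k = len(item_scores[0])  # number of items on the test
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical scores for five students on two administrations of a test.
first_try = [70, 82, 65, 90, 75]
second_try = [72, 80, 68, 88, 79]
print("test-retest r:", round(pearson_r(first_try, second_try), 2))

# Hypothetical right/wrong scores (1/0) for four students on three items.
items = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [1, 0, 0]]
print("alpha:", round(cronbach_alpha(items), 2))
```

Values closer to 1 indicate more stable or more internally consistent scores. What counts as acceptable depends on how the scores will be used.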
Utility of Test Scores
  Selection & screening (before):
    mastery of prerequisites -- for remediation
    mastery of terminal objectives -- for acceleration
  Individual diagnosis and prescription (along the way)
  Practice (along the way)
  Grades & summative scores (at or after the end):
    promotion
    certification and licensure
  Administrative:
    course evaluation
    trainer accountability
Finding Tests
  Evaluator constructed
  Commercially available
  Hybrids - localization, repurposing
  What is the main issue with repurposing a validated test?