Standardized Tests and Teaching
What Is a Standardized Test?
The Nature of Standardized Tests
The Purposes of Standardized Tests
Criteria for Evaluating Standardized Tests
The Nature of Standardized Tests
Standardized tests:
– Have uniform procedures for administration and scoring
– Allow comparison of student scores by age, grade level, and local and national norms
– Attempt to include material common across most classrooms
The Purposes of Standardized Tests
– Contribute to accountability
– Provide information about student progress and program placement
– Diagnose students' strengths and weaknesses
– Provide information for planning and instruction
– Help in program evaluation
Evaluating Standardized Tests
– Reliability: Are test scores stable, dependable, and relatively free from error?
– Validity: Does the test measure what it purports to measure?
Correlation Coefficient
A correlation coefficient (r) is a statistical measure of the relationship between two variables. It:
– Indicates the direction of the relationship (positive or negative)
– Indicates the strength of the relationship (0.00 to 1.00)
Pearson Correlation Coefficient
r = the Pearson coefficient. r measures the amount that the two variables (X and Y) vary together (i.e., covary), taking into account how much they vary apart. Pearson's r is the most common correlation coefficient; there are others.
Computing the Pearson correlation coefficient:
r = (degree to which X and Y vary together) / (degree to which X and Y vary separately)
To put it another way:
r = (covariability of X and Y) / (variability of X and Y separately)
Or:
r = SP / sqrt(SS_X × SS_Y)
Sum of Products of Deviations
Measuring X and Y individually (the denominator): compute the sums of squares for each variable, SS_X = Σ(X − M_X)² and SS_Y = Σ(Y − M_Y)².
Measuring X and Y together: the Sum of Products (SP)
– Definitional formula: SP = Σ(X − M_X)(Y − M_Y)
– Computational formula: SP = ΣXY − (ΣX)(ΣY)/n, where n is the number of (X, Y) pairs
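The two SP formulas above always agree; a minimal sketch with hypothetical (X, Y) data can confirm this:

```python
# Definitional formula: SP = sum((X - M_X) * (Y - M_Y))
# Computational formula: SP = sum(X*Y) - sum(X) * sum(Y) / n
# The data below are made up for illustration.

def sp_definitional(xs, ys):
    n = len(xs)
    mx = sum(xs) / n  # mean of X
    my = sum(ys) / n  # mean of Y
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))

def sp_computational(xs, ys):
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n

X = [1, 2, 3, 4]
Y = [2, 4, 5, 9]
print(sp_definitional(X, Y))   # 11.0
print(sp_computational(X, Y))  # 11.0 -- same value, less arithmetic by hand
```

The computational formula avoids computing deviations from the mean, which is why it is usually preferred for hand calculation.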
Correlation Coefficient
The equation for Pearson's r:
r = SP / sqrt(SS_X × SS_Y)
Expanded form:
r = [ΣXY − (ΣX)(ΣY)/n] / sqrt[(ΣX² − (ΣX)²/n)(ΣY² − (ΣY)²/n)]
Example: What is the correlation between study time and test score?
Calculating the values to find SS and SP
Correlation Coefficient Interpretation
Coefficient Range   Strength of Relationship
.00–.20             Practically None
.20–.40             Low
.40–.60             Moderate
.60–.80             High Moderate
.80–1.00            Very High
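A small helper can apply these labels, assuming the conventional cutoffs of .20, .40, .60, and .80 (the exact boundary handling here is a design choice, not specified by the slide):

```python
# Map a correlation coefficient to the strength labels above.
# abs() is used because the sign gives direction, not strength.
def interpret_r(r):
    strength = abs(r)
    if strength < 0.20:
        return "Practically None"
    elif strength < 0.40:
        return "Low"
    elif strength < 0.60:
        return "Moderate"
    elif strength < 0.80:
        return "High Moderate"
    else:
        return "Very High"

print(interpret_r(0.15))   # Practically None
print(interpret_r(-0.85))  # Very High (negative sign means inverse direction)
```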
Methods of Studying Reliability
– Test-retest: the extent to which a test yields the same score when given to a student on two different occasions
– Alternate-forms: two different forms of the same test are given on two different occasions to determine the consistency of the scores
– Split-half: the test items are divided into two halves, and scores on the two halves are compared to determine the consistency of the test scores
Methods of Studying Reliability
– Interrater: the consistency of a test in measuring a skill, trait, or domain across examiners. This type of reliability is most important when responses are subjective or open-ended.
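The simplest interrater index is percent agreement, the fraction of responses two examiners score identically (more refined measures, such as Cohen's kappa, also correct for chance agreement). A minimal sketch with hypothetical ratings:

```python
# Percent agreement between two raters scoring the same 8 essays
# on a 1-5 scale. The ratings are hypothetical.
def percent_agreement(rater_a, rater_b):
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

rater_a = [3, 4, 2, 5, 3, 4, 1, 2]
rater_b = [3, 4, 3, 5, 3, 4, 2, 2]
print(percent_agreement(rater_a, rater_b))  # 0.75 -- agree on 6 of 8 essays
```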
Types of Validity
– Content: the test's ability to sample the content that is being measured
– Criterion-related:
  1. Concurrent: the relation between a test's scores and other currently available criteria
  2. Predictive: the relation between a test's scores and future performance
– Construct: the extent to which there is evidence that a test measures a particular construct