Chapter Overview
Norm-referenced and criterion-referenced tests
Comparing norm-referenced and criterion-referenced tests
The test blueprint
Objective test items
Constructing matching test items
Multiple-choice items
Essay test items
Validity and reliability
Marks and marking systems
Standardized tests
Performance assessment
Portfolio assessment
Norm-Referenced and Criterion-Referenced Tests
A test that determines a student's place or rank among other students is called a norm-referenced test (NRT). This type of test compares a student to a norm group (a large sample of pupils at the same age or grade).
A test that compares a student's performance to an absolute standard or criterion of mastery is called a criterion-referenced test (CRT). This tells whether a student needs additional instruction on some skill or set of skills.
Norm-Referenced Tests
The major advantage of an NRT is that it covers many different content areas on a single test.
The NRT measures a variety of specific and general skills at once, but cannot measure them thoroughly; a teacher is not as sure that individual students have mastered the individual skills in question.
The major disadvantage of an NRT is that it is too general to be useful in identifying individual strengths or weaknesses tied to individual texts or workbooks.
Criterion-Referenced Tests
The major advantage of a CRT is that it can yield highly specific information about individual skills or behaviors.
The major disadvantage of a CRT is that many such tests would be needed to make decisions about the many skills or behaviors typically taught in school.
Figure: Relationship of the purpose of testing and information desired to the type of test required. [Insert figure here]
The Test Blueprint
A test blueprint is a table that matches the test items to be written with the content areas and levels of behavioral complexity taught.
The test blueprint helps ensure that a test samples learning across:
The range of content areas covered.
The cognitive and/or affective processes considered important.
Constructing a Test Blueprint
Classify each instructional objective by level (knowledge, application, etc.).
Record the number of items to be constructed for each objective in the cell corresponding to its behavioral category.
Total the items for each instructional objective and record the number in the total row.
Total the number of items falling in each behavioral category and record the number at the bottom of the table.
Compute the column and row percentages by dividing each total by the number of items in the test.
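The tallying steps above can be sketched in a few lines of code. This is only an illustration; the objectives, behavioral categories, and item counts below are hypothetical.

```python
# Hypothetical test blueprint: rows are content objectives,
# columns are behavioral categories, cells are item counts.
blueprint = {
    "Fractions":   {"knowledge": 4, "comprehension": 3, "application": 3},
    "Decimals":    {"knowledge": 3, "comprehension": 2, "application": 2},
    "Percentages": {"knowledge": 2, "comprehension": 1, "application": 0},
}
categories = ["knowledge", "comprehension", "application"]

# Total the items for each objective (the total row for each content area).
row_totals = {obj: sum(cells.values()) for obj, cells in blueprint.items()}

# Total the items falling in each behavioral category (bottom of the table).
col_totals = {cat: sum(cells[cat] for cells in blueprint.values())
              for cat in categories}

# Divide each total by the number of items in the test to get percentages.
n_items = sum(row_totals.values())
row_pct = {obj: 100 * t / n_items for obj, t in row_totals.items()}
col_pct = {cat: 100 * t / n_items for cat, t in col_totals.items()}
```

The row percentages show how heavily each content area is weighted; the column percentages show the balance across behavioral levels, so both should match the emphasis given during instruction.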
Objective Test Items
Objective test item formats include:
True-false
Matching
Multiple choice
Completion or short answer
Two methods for reducing the effect of guessing in true-false items are:
To encourage all students to guess when they do not know the answer.
To require revision of statements that are false.
Constructing Matching Test Items
In constructing matching items:
Make lists homogeneous, with the same kind of events, people, or circumstances.
Place the shorter list first and list options in chronological, numbered, or alphabetical order.
Provide approximately three more options than descriptions to reduce correct guesses.
Write directions to identify what the lists contain and specify the basis for matching.
Check the options for multiple correct answers.
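The arithmetic behind the extra-options suggestion can be made concrete. The sketch below assumes a simplified case (each option usable once, every other description already matched correctly) to show why surplus options keep the last match from being a giveaway; the function and numbers are illustrative, not from the source.

```python
from fractions import Fraction

def chance_guess_last(extra_options: int) -> Fraction:
    """Chance of guessing the final unmatched description correctly,
    assuming each option is used at most once and all other
    descriptions have already been matched correctly (simplified case)."""
    remaining_options = 1 + extra_options  # the correct option plus leftovers
    return Fraction(1, remaining_options)

# With equal-length lists the last match is certain by elimination;
# with three extra options it drops to a one-in-four guess.
no_extras = chance_guess_last(0)
three_extras = chance_guess_last(3)
```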
Multiple-Choice Items: Things to Avoid
When constructing multiple-choice items, avoid:
Stem clues, in which the same word or a derivative appears in both the stem and an option.
Grammatical clues, in which an article, verb, or pronoun eliminates options from being correct.
Repeating across options words that could have been stated once in the stem.
Making response options of unequal length, which can signal that the longer option is correct.
Using "all of the above" (discourages discrimination) or "none of the above" (encourages guessing).
Multiple-Choice Items: Suggestions for Writing
For higher-level multiple-choice items, use:
Pictorial, graphical, or tabular stimuli.
Analogies that demonstrate relationships among items.
Previously learned principles or procedures.
For writing completion items:
Require a single-word answer.
Pose the question or problem in a brief, definite statement.
Make sure an accurate response is in the text, workbook, or notes.
Omit only one or two key words.
Put the blank at the end of the statement.
For numerical answers, indicate the units.
Essay Test Items
Extended-response essay test items allow students to determine response length and complexity.
Restricted-response essay test items pose specific problems for which students must recall and organize the proper information, derive defensible conclusions, and express them within a stated time or length.
Essay Test Items: When Should They Be Used?
Essay test items are most appropriate when:
Instructional objectives specify high-level cognitive processes; essay items require supplying information rather than simply recognizing it.
Relatively few tests (students) need grading. One recommendation is to use only one or two essay questions in conjunction with objective items.
Test security is a consideration.
Suggestions for Writing Essay Test Items
Identify the mental processes to be measured (e.g., application, analysis, decision making).
Unambiguously identify the student's task.
Begin essay questions with key words (e.g., "compare," "predict," "give reasons for").
Require evidence for controversial questions.
Avoid optional items.
Establish reasonable time and/or page limits.
Restrict essay items to what cannot be easily measured by multiple-choice items.
Relate each item to an objective on the test blueprint.
To Increase Accuracy and Consistency in Scoring Essay Tests
Specify the response length.
Use several restricted-response essay items instead of one extended-response item.
Prepare a scoring scheme in which you specify beforehand all ingredients necessary to achieve each of the grades that could be assigned.
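A pre-specified scoring scheme can be thought of as a fixed table of ingredients and point values applied identically to every response. The sketch below is a minimal illustration; the ingredient names and point values are hypothetical.

```python
# Hypothetical scoring scheme for one restricted-response item,
# specified before any responses are read.
scoring_scheme = {
    "states the definition": 2,
    "gives a correct example": 2,
    "explains the limitation": 1,
}

def score_response(ingredients_present: set) -> int:
    """Award points only for the ingredients specified beforehand,
    so every response is judged against the same standard."""
    return sum(points for ingredient, points in scoring_scheme.items()
               if ingredient in ingredients_present)
```

Because the scheme is fixed in advance, two readers (or the same reader on different days) applying it to the same response should arrive at the same score, which is the consistency the slide is after.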
Validity and Reliability
Validity refers to whether a test measures what it says it measures.
Three types of validity are:
Content
Concurrent
Predictive
Content, Concurrent, and Predictive Validity
Content validity is established by examining a test's contents.
Concurrent validity is established by correlating the scores on a new test with scores on an established test given to the same individuals.
Predictive validity is established by correlating the scores on a new test with some future behavior of the examinee that is representative of the test's content.
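The "correlating" in concurrent and predictive validity is ordinarily a Pearson correlation between the two sets of scores. A minimal sketch, with hypothetical scores for five students on a new test and an established test:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical data: the same five students on both tests.
new_test    = [55, 62, 70, 78, 85]
established = [50, 60, 68, 80, 82]
r = pearson_r(new_test, established)
```

A coefficient near +1 would support the new test's concurrent validity; the same calculation against later criterion scores (e.g., next year's grades) would bear on predictive validity.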
Reliability
Reliability refers to whether a test yields the same or similar scores consistently. Three types are:
Test-retest: established by giving the test twice to the same individuals and correlating the first set of scores with the second.
Alternative-form: established by giving two parallel but different forms of the test to the same individuals and correlating the two sets of scores.
Internal consistency: established by determining the extent to which the test measures a single basic concept.
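One common estimate of internal consistency (not named in the slide, but widely used) is Cronbach's alpha, which compares the sum of the individual item variances to the variance of the total scores. A minimal sketch:

```python
def variance(xs):
    """Population variance of a list of scores."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(item_scores):
    """Cronbach's alpha: item_scores is a list of items, each a list of
    per-student scores. Alpha near 1 suggests the items measure a
    single basic concept."""
    k = len(item_scores)
    totals = [sum(student) for student in zip(*item_scores)]
    item_var_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))
```

When every item ranks students identically, alpha reaches 1.0; items that behave unrelatedly pull it toward 0.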
Marks and Marking Systems
Marks are based upon comparisons, usually comparisons of students with one or more of the following:
Other students
Established standards
Aptitude
Actual versus potential effort
Actual versus potential improvement
Standardized Tests
Standardized tests are developed by test construction specialists to determine a student's performance level relative to others of similar age and grade.
They are standardized because they are administered and scored according to specific and uniform procedures.
Performance Assessment
A performance assessment asks learners to show what they know by measuring complex cognitive skills with authentic, real-world tasks.
Portfolio Assessment
A portfolio is a planned collection of learner achievement that documents what a student has accomplished and the steps taken to get there.