Creating Effective Achievement Tests

Moving beyond the publisher’s test bank
Often the curriculum you use will come with a publisher’s test bank. It is easy to simply take an exam provided in the test bank, photocopy it, and hand it out. But blindly using these exams can result in several problems, including 1) items in the test bank that do not function as intended, 2) an overabundance of items on material that the teacher did not sufficiently cover, and 3) concepts that the teacher adds to the curriculum that the test bank does not assess at all. Today’s discussion provides an overview of the systematic process of designing, developing, and implementing an exam that will assess the information you are trying to measure in a reliable and “valid” manner.
Goal: Students will understand the “big picture” of putting together an exam.
Objectives: Use the lesson’s learning objectives to guide the item-writing process. Develop a test blueprint. Obtain relevant test item pools. Synthesize the test.

4 steps of test construction
Clarify the learning goals.
Develop a test blueprint.
Obtain relevant test item pools.
Synthesize the test.
There are four basic steps to consider when putting together a test. We will discuss each of these in more detail on the following slides.

Clarify the learning goals.
Return to the unit plan.
Rank the objectives by their relative importance.
Weight each objective as a percentage of the learning goal.
Look over each objective of the lesson plan. If one objective is more important to goal achievement than another, wouldn’t it seem logical to have more questions assessing that objective? Similarly, wouldn’t it seem logical to have spent more time teaching to that objective? These two ideas go hand in hand: the more important something is, the more time we should spend on it and the more concerned we should be with measuring students’ achievement of it.
One simple way to weight the objectives is to divide 100% evenly among them. If all objectives were equally important, they would all be weighted the same. But some objectives will be more important than others, so you can make adjustments based on your professional expertise. For example, if I were to look over the objectives for the shoe tying exam, it would seem logical to me that objective 2 would be the most important, and objectives 1 and 3 would almost be a tie for second. If each objective were equal they would each get about 33% of the points, but since objective 2 is more important I might give it 40% and the other two 30% each. Or you may want to give objective 2 50% and the other two 25% each. The exact weighting will vary from person to person, but the order of importance should be somewhat consistent. Things that are essential to know should be rated highest (I like to use A, B, C… designations), and things that are not as critical should be rated lower.
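The weighting arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name and the idea of entering importance as relative ratings (3, 4, 3) are my assumptions, not part of the lesson materials.

```python
def weight_objectives(ratings):
    """Turn relative importance ratings into percentage weights that sum to 100."""
    total = sum(ratings.values())
    return {name: 100 * rating / total for name, rating in ratings.items()}


# Hypothetical ratings for the shoe tying exam: objective 2 matters most,
# objectives 1 and 3 tie for second.
weights = weight_objectives({"objective 1": 3, "objective 2": 4, "objective 3": 3})
print(weights)  # objective 2 gets 40%, the other two 30% each
```

Equal ratings would give each of three objectives about 33%, which matches the starting point described above.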

Develop a test blueprint.
Administration time
Item formats
Number of items
Difficulty of items
Easy: answered correctly by more than 75% of students
Moderate: answered correctly by 25% to 75% of students
Hard: answered correctly by fewer than 25% of students
With your objectives clarified, it is now time to make a test blueprint. The blueprint lists the features planned for the test.
Administration time. How much time do you have to administer the test? Can it be done in 15 minutes, or will it take an hour? How long an attention span do your students have for taking a test? Should the test be divided up over two or more days?
Scoring time. Remember that the longer a test is, the more time it will take you to score it. How much time will you need to score the exams and interpret the results? Remember this rule: ALWAYS GIVE AN EXAM BACK TO STUDENTS AND GO OVER EVERY ITEM AT THE NEXT CLASS MEETING. If you wait two weeks to give the results back, students will have already forgotten what was on the test to begin with. Handing the test back the next day and going over each item gives students an opportunity to learn from their mistakes and to ask questions or point out misconceptions that may have led to the answers they gave. It also serves as a quality control measure to make sure that you have accurately scored the test.
Item formats. The type of items you use (e.g. T/F, MC, short answer, essay, performance observations) will depend on the level of learning you designated in the learning objectives. As you plan the test, the blueprint should give you an idea of the formats for the items you expect to include.
Number of items. Once you have considered your time constraints, it is generally best to include as many items as you can. The more items you have, the more likely you are to have a test with acceptable levels of content validity. Having too few items will not give you enough information to make a valid conclusion. It would be like judging a person’s skill in basketball by watching him shoot one free-throw. You can’t get enough information without watching several free-throws. However, once you have seen shots, you probably won’t gain any further information by watching 100 shots.
Difficulty of items. You should be concerned from the outset with how many of the test’s items should be relatively easy, how many should be moderately difficult, and how many should be challenging even for students with advanced levels of achievement. The categories of easy, moderate, and hard are defined in the slide. One good way of designing the test is to have 25% of the items easy, 50% moderate, and 25% hard. Using this distribution allows you to assess minimal competence from the easy items (i.e. if a student doesn’t correctly answer 75% of the easy items, then you need to remediate or take other action). The moderate items will start to separate varying levels of achievement in the range which you hoped the students would achieve, and the hard items will show you who really knows something. Using this breakdown, the average score should be ~50% of the points possible, and no one should get 100%. I know that this is not how you are used to seeing tests today, but I assure you that there is nothing wrong with a test where no one scores 100%. In fact, a test where someone does score 100% has failed to assess the limits of that student’s achievement of the objectives. Most tests today are written such that 75-80% of the items are easy and the remaining items are moderate at best. Such tests resemble the mere “minimum competency tests” that James Popham so deftly discredited. We should not be teaching to minimum standards, but teaching to the level that we hope our students will rise up to. Minimum standards produce minimum results.
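To make the difficulty categories and the 25/50/25 split concrete, here is a minimal Python sketch. The function names are hypothetical. The slide does not say where an item answered correctly by exactly 25% or exactly 75% of students belongs, so treating those as moderate is my assumption.

```python
def classify_difficulty(proportion_correct):
    """Label an item from the share of students (0.0 to 1.0) who answered it correctly."""
    if proportion_correct > 0.75:
        return "easy"
    if proportion_correct < 0.25:
        return "hard"
    # Assumption: items at exactly 25% or 75% are counted as moderate.
    return "moderate"


def plan_item_counts(total_items, mix=(0.25, 0.50, 0.25)):
    """Split a planned number of items into easy / moderate / hard counts."""
    easy = round(total_items * mix[0])
    hard = round(total_items * mix[2])
    # Moderate takes whatever is left, so the counts always add up to the total.
    moderate = total_items - easy - hard
    return {"easy": easy, "moderate": moderate, "hard": hard}


print(classify_difficulty(0.9))   # easy
print(plan_item_counts(40))       # 10 easy, 20 moderate, 10 hard
```

The proportions fed to classify_difficulty would come from earlier administrations of an item, so a brand-new item can only be given an estimated category.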

Develop a test blueprint.
Approximation of total points
Number of points for each objective
Method of scoring
Test outline
Approximation of total points. Many teachers automatically set the total number of points for each test at 100. This practice is generally discouraged by educational measurement specialists, who prefer the maximum point total to be determined by the number and complexity of the items. If all your items were multiple choice, then it would be fine to have each item worth one point, but an essay worth one point demands far more work from a student for the same number of points as an MC item. While developing a blueprint, you should estimate the total number of points based on the item types. Only an approximate figure is needed for the blueprint.
Number of points for each objective. Because you want your test to emphasize more important objectives over less important objectives, you should distribute the points according to the table of specifications. Go back to your table and multiply the weight of the objective by the total number of points on the test to see approximately how many points each objective will receive. Again, the points for the objective should reflect the importance of the objective and the amount of time spent on that objective.
Method of scoring. If you used the 25/50/25 distribution I suggested before, you should be able to come up with a way to convert those scores to grades (e.g. 0-25% = F, 25-40% = D, 40-60% = C, 60-75% = B, 75-100% = A). These boundaries are not hard-and-fast, and can be adjusted based on the class performance. This is one of the advantages of using the 25/50/25 distribution method. If you insist on 90% = A, 80% = B, etc., you will have to adjust the items to meet the scale, rather than the scale to the items. This is a serious weakness of that scale. If a teacher writes a test that was “too hard” and no one got an A, then the next test is typically so easy that everyone gets 100% so they can bring their grade up and feel better about themselves. In effect, using this scale forces the teacher to artificially manipulate the grades rather than letting the students’ natural performance set the criteria, as the 25/50/25 method does.
Test outline. Finally, the blueprint should include an outline indicating sections and subsections of the test and where such things as directions will occur.
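The two calculations in this part of the blueprint, points per objective and score-to-grade conversion, can be sketched as follows. The function names and the 60-point total are made up for illustration. The example scale lists each boundary in two grades (25% is both the top of F and the bottom of D), so giving a boundary score the higher grade is my assumption, as is placing the A cutoff at 75%.

```python
def points_per_objective(weights, total_points):
    """Multiply each objective's percentage weight by the approximate point total."""
    return {name: round(total_points * weight / 100) for name, weight in weights.items()}


# Assumed cutoffs (minimum percent, grade), checked from highest to lowest.
GRADE_CUTOFFS = [(75, "A"), (60, "B"), (40, "C"), (25, "D")]


def percent_to_grade(percent, cutoffs=GRADE_CUTOFFS):
    """Convert a percent score to a letter grade; boundary scores get the higher grade."""
    for minimum, grade in cutoffs:
        if percent >= minimum:
            return grade
    return "F"


weights = {"objective 1": 30, "objective 2": 40, "objective 3": 30}
print(points_per_objective(weights, 60))  # 18, 24, and 18 points
print(percent_to_grade(50))               # C
```

Because the cutoffs are passed in as data, they can be adjusted after seeing the class performance without touching the rest of the code. Rounded point values may not add up exactly to the total, which is acceptable since the blueprint only needs an approximation.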

Obtain relevant test item pools:
Write items well in advance of the test date.
Build a set of questions for each objective.
Have more questions for each objective than you intend to use on the test.
Create multiple formats (e.g. MC, essay, SA, matching, oral response, performance, demonstrations, portfolio, etc.).
File the questions in an orderly and manageable way (including the directions for how to score each item).
An item pool is a collection of test items that are all designed to be relevant to the same learning objective. Building an item pool is a daunting task, and can be very difficult the first time you develop a test. However, once the pool is created, you only need to retrieve preexisting items each time you put together a test. Follow the suggestions listed on this slide when creating an item pool.
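If you keep your item pool electronically, one way to file items "in an orderly and manageable way" is sketched below in Python. Everything here (the Item fields, the ItemPool class, the sample items) is a hypothetical layout of my own, not a prescribed system; index cards in a file box serve the same purpose.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Item:
    objective: str            # the objective this item is matched to
    item_format: str          # e.g. "MC", "essay", "SA"
    prompt: str               # the question as students will see it
    scoring_directions: str   # how to score the item
    points: int = 1


class ItemPool:
    """Files items by objective so they can be retrieved when building a test."""

    def __init__(self):
        self._by_objective = defaultdict(list)

    def add(self, item):
        self._by_objective[item.objective].append(item)

    def for_objective(self, objective, item_format=None):
        """Return the items filed under an objective, optionally one format only."""
        items = self._by_objective.get(objective, [])
        if item_format is None:
            return list(items)
        return [item for item in items if item.item_format == item_format]


pool = ItemPool()
pool.add(Item("objective 2", "SA", "List the steps for tying a bow.", "1 point per step", 4))
pool.add(Item("objective 2", "MC", "Which step comes first?", "1 point for the keyed answer"))
print(len(pool.for_objective("objective 2")))        # 2
print(len(pool.for_objective("objective 2", "MC")))  # 1
```

Storing the scoring directions with each item means they travel with the question every time it is reused.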

Obtain relevant test item pools:
Make certain that the question can be matched to an objective and that it meets the stated learning level.
Write items at the appropriate reading level.
Do not give clues to the answer in the question.
Write each item so that the answer is one that most experts would agree upon.
NOTE!!! There are two critical items on this slide. First, making certain that each question can be matched to an objective and that it meets the stated learning level is crucial to validating any results you may get from the test. If you are ever challenged to defend your assessments, this information will go a long way to indicate that you are a competent professional who knows what you are doing. I have seen some good teachers get crushed because they did not do this. I have also seen some good teachers preempt unwarranted lawsuits before they were ever brought to trial because they DID have evidence of the items matching the objective listed in the state curriculum and doing so in a “valid” and reliable fashion.
Second, the correct answer should be the one that most experts agree on. You should be able to give the question to ten competent experts and have all ten come up with the same answer. If this is not possible, you may want to consider rewriting the item so that it is more objective, or so that it allows an opinion to be given but bases the grade on the supporting arguments and not the opinion itself (e.g. The sun revolves around the earth? Prove it!).

Synthesize the test:
Item interaction
Sequencing items
Directions
Finally, you can start to put the items on the test. We will discuss these three topics more at the end of this unit, but for now you should be aware of them. When taking the items from the item pool and putting them on the test, watch out for item interaction. Item interaction occurs when one item gives a clue to the answer to another item. Sequencing of items is also very important. You should start with the easiest items (e.g. TF, matching, short answer), then move to the more difficult items (e.g. MC, essay). Finally, don’t make the mistake of NOT providing proper directions. You have spent a lot of time putting this test together, so make sure students know what to do with it.
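The sequencing advice can be sketched as a simple sort in Python. The FORMAT_ORDER list just mirrors the order named in the notes above (TF, matching, short answer, then MC, essay); the function name and sample items are hypothetical, and you may well rank formats differently for your own students.

```python
# Assumed ordering, easiest format first, taken from the notes above.
FORMAT_ORDER = ["TF", "matching", "short answer", "MC", "essay"]


def sequence_items(items):
    """Order (item_format, prompt) pairs so easier formats come first.

    Python's sort is stable, so items of the same format keep their original order.
    """
    rank = {fmt: position for position, fmt in enumerate(FORMAT_ORDER)}
    return sorted(items, key=lambda item: rank[item[0]])


draft = [
    ("essay", "Explain why a double knot holds better."),
    ("TF", "The first step is to cross the laces."),
    ("MC", "Which loop is made first?"),
    ("TF", "A bow needs two loops."),
]
for item_format, prompt in sequence_items(draft):
    print(item_format, "-", prompt)
```

Checking for item interaction still has to be done by reading the assembled test, since it depends on what each item says rather than on its format.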
