Assessment Tools: Development and Validation

1 Assessment Tools: Development and Validation
Wendy K. Adams, Colorado School of Mines. Speaker notes: add more emphasis throughout on how identifying the specific components relates to creating a tool to measure them; they go together, of course, but people are going to be more interested initially in the components. Add more on teaching skills. Take out a little of the validation. Change the table so that it is just three main categories with some examples from my work under those; save the full table for later in the literature review.

2 Introduction Survey Tools: Formative Assessment of Instruction - FASI
Definition
Development and Validation (experts & students)
Interpretation

3 Definition FASI - Formative Assessment of Instruction:
Assessment of content knowledge: typically multiple choice. Examples: FCI, BEMA, CSEM, QMCS, CUE.
Assessment of perceptions: typically Likert scale. Examples: CLASS, PTaP.

4 Content Question (FCI - Force Concept Inventory)
Despite a very strong wind, a tennis player manages to hit a tennis ball with her racquet so that the ball passes over the net and lands in her opponent's court. Consider the following forces:
1. A downward force of gravity.
2. A force by the "hit".
3. A force exerted by the air.
Which of the above forces is (are) acting on the tennis ball after it has left contact with the racquet and before it touches the ground?
(a) 1 only. (b) 1 and 2. (c) 1 and 3. (d) 2 and 3. (e) 1, 2, and 3.

5 Perceptions Statements
After I study a topic in physics and feel that I understand it, I have difficulty solving problems on the same topic. (Strongly Disagree to Strongly Agree)
I would become a grade 7-12 teacher if the pay were equal to my other career options. (Strongly Disagree to Strongly Agree)

6 FASI Value – Content or Perceptions
Instructors agree that students should be able to answer these questions and expect students to value the subject in their daily life.
Students do poorly on these concept tests when they enter the course and disagree with experts about the subject applying to daily life and about how to learn.
Remarkably stubborn: conceptual understanding doesn't change much after instruction, and perceptions of application and how to learn often get less expert-like after instruction.

7 Introduction Survey Tools: Formative Assessment of Instruction - FASI
Definition
Development and Validation (experts & students)
Interpretation

8 Key References
AERA (American Educational Research Association), APA (American Psychological Association), & NCME (National Council on Measurement in Education). (1999). Standards for educational and psychological testing. Washington, DC: Author.
NRC (National Research Council). (2001). Knowing what students know: The science and design of educational assessment. J. W. Pellegrino, N. Chudowsky, & R. Glaser (Eds.), Committee on the Foundations of Assessment, Board on Testing and Assessment, Center for Education, Division of Behavioral and Social Sciences and Education. Washington, DC: National Academy Press.
Adams, W. K., & Wieman, C. E. (2011). Development and validation of instruments to measure learning of expert-like thinking. International Journal of Science Education, 33(9).

9 Development
Phase 1. Delineation of the purpose of the test and the scope of the construct or the extent of the domain to be measured.
Phase 2. Development and evaluation of the test specifications.
Phase 3. Development, field testing, evaluation, and selection of the items and scoring guides and procedures.
Phase 4. Assembly and evaluation of the test for operational use.

10 Development
Phase 2. Development and evaluation of the test specifications:
item format (forced answer)
desired psychometric properties (low stakes, selection of a few concepts/perceptions)
time restrictions (less than a class period / 10 minutes)
characteristics of the population
test procedures (e.g., pre/post)

11 Development
Phase 3. Development, field testing, evaluation, and selection of the items and scoring guides and procedures.
Phase 4. Assembly and evaluation of the test for operational use.
These two phases are the bulk of the work and constitute both development and validation.

12 Validation
Collecting "evidence of validity" rather than "validating" the instrument.
Evidence based on test content: how well does it represent the domain in question?
Evidence based on response processes (e.g., if measuring reasoning, is the test taker using reasoning or an algorithm to answer?)
Evidence based on internal structure: single dimension or several components?
Evidence based on relations to other variables.

13 Development & Validation
Expert interviews: establish topics that are important to teachers.
Interviews and observations to identify student thinking and the ways it can deviate from expert thinking.
Create open-ended survey questions to probe student thinking more broadly.
Create a forced-answer test.
Carry out validation interviews with both novices and experts on the test questions.
Administer to classes and experts; run statistical tests on the results.
Modify items as necessary.
(Cycle diagram: expert interviews → student interviews → collect data → modify → repeat)

14 Content Questions
"assessments ... should focus on making students' thinking visible to both their teachers and themselves so that instructional strategies can be selected to support an appropriate course of future learning" (NRC guidelines, 2001, p. 4)
Straightforward questions with clear language. Students think they understand, but they don't!
Distracters (possible answers): typically 3-5 per question; include common incorrect responses; do not include any options that are not commonly chosen.

15 Perceptions Statements
Quote from things experts say and things students say that you don't like to hear.
Straightforward language, ~4th grade level; avoid using "not".
Misconceptions about surveys: a positive and negative version of each statement (rarely works); you do not need to ask the same thing multiple times (less reliable); a survey does not need to be limited to one construct.

16 Student Interviews Think-aloud interviews
Check that students: interpret the questions/statements consistently; agree with the expert for expert-like reasons; choose the correct answer for the right reasons only.
Probe student thinking when they disagree with the expert or choose the wrong answer.
~30 interviews per version per population.
Only "valid" for the populations used for development and validation.

17 Consider the following forces: 2. A force by the "hit".
Despite a very strong wind, a tennis player manages to hit a tennis ball with her racquet so that the ball passes over the net and lands in her opponent's court. Consider the following forces:
1. A downward force of gravity.
2. A force by the "hit".
3. A force exerted by the air.
Which of the above forces is (are) acting on the tennis ball after it has left contact with the racquet and before it touches the ground?
(a) 1 only. (b) 1 and 2. (c) 1 and 3. (d) 2 and 3. (e) 1, 2, and 3.

18 Consider the following forces: 2. A force by the "hit".
Despite a very strong wind, a tennis player manages to hit a tennis ball with her racquet so that the ball passes over the net and lands in her opponent's court. Consider the following forces:
1. A downward force of gravity.
2. A force by the "hit".
3. A force exerted by the air.
Which of the above forces is (are) acting on the tennis ball after it has left contact with the racquet and before it touches the ground?
(a) 1 only. (b) 1 and 2. (c) 1 and 3. (d) 2 and 3. (e) 1, 2, and 3.
Very popular

19 Interviews find:
Definition of force: "needs a force to be moving"
Clarity of timing: isolating moments in time

20 A large box is pulled with a constant horizontal force. As a result, the box moves across a level floor at a constant speed. (Two other questions here) 28. If, instead, the horizontal force pulling the box is doubled, the box's speed:
(a) continuously increases.
(b) will be double the speed but still constant.
(c) is greater and constant, but not necessarily twice as great.
(d) is greater and constant for a while and increases thereafter.
(e) increases for a while and is constant thereafter.

21 A large box is pulled with a constant horizontal force. As a result, the box moves across a level floor at a constant speed. (Two other questions here) 28. If, instead, the horizontal force pulling the box is doubled, the box's speed:
(a) continuously increases.
(b) will be double the speed but still constant.
(c) is greater and constant, but not necessarily twice as great.
(d) is greater and constant for a while and increases thereafter.
(e) increases for a while and is constant thereafter.

22 Interviews find This one needs several concepts
Understanding that there's a difference between constant speed and speeding up.
Net force results in an acceleration.
"How can something just keep speeding up?"

23 Textbook cutoffs do not necessarily apply
Statistical Analyses: Reliability (are the results consistent?)
Alternate-form coefficients
Test-retest or stability coefficients
Internal consistency coefficients: Cronbach's alpha (only works if there is a single construct)
"The ideal approach to the study of reliability entails independent replication of the entire measurement process." (Standards, 1999, p. 27)
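As a concrete illustration of internal consistency, Cronbach's alpha can be computed directly from a 0/1 item-score matrix. A minimal Python sketch with made-up responses (not data from the talk):

```python
# Cronbach's alpha: internal-consistency reliability (assumes a single construct).
# Rows = students, columns = items scored 1 (correct) or 0 (incorrect).
def cronbach_alpha(scores):
    k = len(scores[0])                        # number of items
    def var(xs):                              # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    item_vars = [var([row[j] for row in scores]) for j in range(k)]
    total_var = var([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

responses = [  # hypothetical: 5 students, 4 items
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
print(round(cronbach_alpha(responses), 3))
```

As the slide warns, textbook cutoffs (e.g., alpha above 0.7) do not necessarily apply to low-stakes concept inventories.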

24 Textbook cutoffs do not necessarily apply
Statistical Analyses: Item Analysis
Item difficulty: percentage of students who got the item correct.
Item discrimination: how well an item differentiates between strong and weak students, as defined by their overall test score.
Point-biserial correlation: correlation between the item and students' test scores.
Item correlations: correlations between individual items (if > 0.6, discard one of them).
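The difficulty and point-biserial statistics above are straightforward to compute from the same 0/1 response matrix. A sketch with hypothetical data (note the classical caveat that the uncorrected point-biserial includes the item itself in the total score):

```python
# Classical item analysis: difficulty and point-biserial discrimination.
# Rows = students, columns = items scored 1 (correct) or 0 (incorrect).
def item_stats(scores):
    n, k = len(scores), len(scores[0])
    totals = [sum(row) for row in scores]             # each student's test score
    mean_t = sum(totals) / n
    sd_t = (sum((t - mean_t) ** 2 for t in totals) / n) ** 0.5
    stats = []
    for j in range(k):
        item = [row[j] for row in scores]
        p = sum(item) / n                             # difficulty: fraction correct
        sd_i = (p * (1 - p)) ** 0.5
        cov = sum((item[i] - p) * (totals[i] - mean_t) for i in range(n)) / n
        r_pb = cov / (sd_i * sd_t) if sd_i > 0 and sd_t > 0 else 0.0
        stats.append({"difficulty": p, "point_biserial": r_pb})
    return stats

responses = [  # hypothetical: 5 students, 4 items
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
for j, s in enumerate(item_stats(responses)):
    print(j, s)
```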

25 Statistical Analyses: Factor Analysis (a data-intensive technique)
Identifies which groups of questions/statements are answered consistently by students.
Concept tests: groups of concepts. Perceptions: empirical categories for analysis.
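Full factor analysis needs dedicated statistical software, but the underlying idea (finding groups of items that students answer consistently) can be previewed with inter-item correlations. A pure-Python sketch with simulated data in which items 0-1 tap one construct and items 2-3 another (all data hypothetical):

```python
import math
import random

random.seed(1)
n = 300
factor_a = [random.gauss(0, 1) for _ in range(n)]   # latent construct A
factor_b = [random.gauss(0, 1) for _ in range(n)]   # latent construct B
items = [
    [a + random.gauss(0, 0.5) for a in factor_a],   # item 0 loads on A
    [a + random.gauss(0, 0.5) for a in factor_a],   # item 1 loads on A
    [b + random.gauss(0, 0.5) for b in factor_b],   # item 2 loads on B
    [b + random.gauss(0, 0.5) for b in factor_b],   # item 3 loads on B
]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Items measuring the same construct correlate strongly; cross-construct pairs do not.
print(round(pearson(items[0], items[1]), 2))  # high
print(round(pearson(items[0], items[2]), 2))  # near zero
```

In a real factor analysis these correlation patterns are what the eigen-decomposition (plus rotation) extracts as components.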

26 Development & Validation
Expert interviews: establish topics that are important to teachers.
Interviews and observations to identify student thinking and the ways it can deviate from expert thinking.
Create open-ended survey questions to probe student thinking more broadly.
Create a forced-answer test.
Carry out validation interviews with both novices and experts on the test questions.
Administer to classes and experts; run statistical tests on the results.
Modify items as necessary.
(Cycle diagram: expert interviews → student interviews → collect data → modify → repeat)

27 Introduction Survey Tools: Formative Assessment of Instruction - FASI
Definition
Development and Validation (experts & students)
Interpretation

28 Scoring and Analysis: Course Results
Percent correct
Normalized gain: fraction of what they could have learned, <g> = (Post - Pre) / (100 - Pre)
Effect size: how many standard deviations of change, d = (Post - Pre) / (pooled SD)
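The two formulas above translate directly into code. A minimal sketch (the example scores are hypothetical):

```python
import math

def normalized_gain(pre, post):
    # <g> = (post - pre) / (100 - pre): fraction of possible improvement achieved
    return (post - pre) / (100.0 - pre)

def effect_size(pre_scores, post_scores):
    # d = (mean_post - mean_pre) / pooled standard deviation
    def mean(xs):
        return sum(xs) / len(xs)
    def svar(xs):                                    # sample variance
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    n1, n2 = len(pre_scores), len(post_scores)
    pooled_sd = math.sqrt(((n1 - 1) * svar(pre_scores) +
                           (n2 - 1) * svar(post_scores)) / (n1 + n2 - 2))
    return (mean(post_scores) - mean(pre_scores)) / pooled_sd

print(normalized_gain(40, 70))                       # class average: 40% pre, 70% post
print(effect_size([40, 50, 60], [60, 70, 80]))
```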

29 Pre-test Scores

30 Post-test Scores

31 Pre and Post

32 Percent Gain % Gain: the sheer volume of material learned, as measured by the FCI, is greater for UNC IE than for the other courses shown.

33 Normalized Gain Normalized gain: the normalized gain for IE is nearly equivalent for females and males. <g> = (%post - %pre) / (100% - %pre)

34 Effect Size Effect size: the effect size shows how much the mean has moved compared to the standard deviation of the population. These data show that UNC IE and Indiana have the largest shifts in means. d = (%post - %pre) / (pooled SD)

35 Scoring and Analysis Course or Population Results
Percent agreement with the expert

36 Course pre/post Shifts

37 Beliefs by Major
(Chart: % favorable score (pre), 0-100%, by population: 1st and 2nd yr grads; 2nd yr physics majors; 1st yr physics majors; whole class, mostly engineers; bio/physiology & chem; non-science majors; elementary teacher candidates; shown for the 'Personal Interest' and 'Overall' categories)

38 CLASS – Women vs. Men
One study asked TWO questions: What would a physicist say? What do you think?
(Chart: % favorable for each question)
"You" results are consistent with typical CLASS scores.

39 Conclusion Formative Assessments of Instruction
Characterize who's in your class; identify strengths and weaknesses of instruction; compare different types of instruction within and across institutions.
Development and Validation: student interviews, expert interviews, data collection, and iteration.
Interpretation: not just a single number.

