
1 Struggling for meaning in standards-based assessment
Mark Wilson, UC Berkeley

2 Outline
What do we mean by "standards-based" assessments?
Some current solutions to the problem of assessing standards
An alternative:
– Learning performances
– Learning progressions
– Progress variables

3 What do we mean by "standards-based" assessments?
What people often think they are getting:
– A useful result for each standard (the "ideal approach")
– The illusion of "standards-based" assessments
What they are usually getting:
– A single result that is somehow related to all, or a subset of, the standards
– The reality of "standards-based" assessments

4 How standards-based is "standards-based"?
"Fidelity": how well do the assessments match the standards?
High fidelity: each standard has its own usable result
Moderate fidelity: each standard is represented by at least one item in the assessments
Low fidelity: the items match only some of the standards

5 Why can't each standard be assessed?
[Chart: fidelity versus cost when total cost is fixed, in terms of number of items and $ per item]
That is, in the "ideal approach" we need so many items per standard that we cannot afford them.
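A back-of-the-envelope sketch of the cost constraint (the budget, per-item cost, and number of standards below are assumed purely for illustration; they are not figures from the presentation):

\[
\text{total items} = \frac{B}{c} = \frac{\$100{,}000}{\$1{,}000\ \text{per item}} = 100,
\qquad
\text{items per standard} = \frac{\text{total items}}{S} = \frac{100}{50} = 2.
\]

With only two items per standard, no individual standard can be reported with any reliability, which is why the "ideal approach" of a usable result for every standard becomes unaffordable once the total cost is fixed.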

6 Common Solutions: "Standards-based"
One item (more or less) per standard
– Not enough for actual assessment of the standards
– Also used to provide emphasis among standards (i.e., "gold standards")
Sample standards over time
Assess only a certain subset of the standards
Validate through "alignment review"
Decide to have a much smaller set of standards
– Popham's "instructionally sensitive assessments"

7 E.g. #1

8 E.g. #2

9 "Standards-based" assessments
Do not have high fidelity to standards
Are what can be afforded
Still maintain "threat" effect
– Although low density of items per standard means that the "threat" on any one standard is low

10 Thinking about an Alternative
"A mile wide and an inch deep"
– The now-classic criticism of US curricula in mathematics and science
Need for standards to be interpretable by educators, policy-makers, etc.
Need to enable a long-term view of student growth
Need to find a more efficient way to use item information than in the "ideal approach"

11 Learning Performances
Learning performances: a way of elaborating on content standards by specifying what students should be able to do when they achieve a standard
– E.g., students should be able to describe phenomena, use models to explain patterns in data, construct scientific explanations, or test hypotheses
– Reiser (2002), Perkins (1998)

12 Learning performance example
Benchmark (AAAS, 1993):
– [The student will understand that] Individual organisms with certain traits are more likely than others to survive and have offspring
LP expansion (Reiser et al., 2003):
– Students identify and represent mathematically the variation on a trait in a population.
– Students hypothesize the function a trait may serve and explain how some variations of the trait are advantageous in the environment.
– Students predict, supported with evidence, how the variation on the trait will affect the likelihood that individuals in the population will survive an environmental stress.

13 Learning progressions
Learning progressions: descriptions of the successively more sophisticated ways of thinking about an idea that follow one another as students learn
– Also known as learning trajectories, progressions of developmental competence, and profile strands
More than one path leads to competence
Need to engage in curriculum debate about which learning progressions are most important
– Try to choose them so that we end up with fewer standards per grade level

14 Learning progression examples
Evolutionary Biology
– Catley, K., Reiser, B., and Lehrer, R. (2005). Tracing a prospective learning progression for developing understanding of evolution.
Atomic-Molecular Theory
– Smith, C., Wiser, M., Anderson, C.W., Krajcik, J., and Coppola, B. (2004). Implications of research on children's learning for assessment: matter and atomic molecular theory.
Both available at: http://www7.nationalacademies.org/bota/Test_Design_K-12_Science.html

15 Progress Variables
Progress variable: the assessment expression of a learning progression
The aim is to use what we know about meaningful differences in item difficulty to make the interpretation of the results more efficient
– Borrow interpretative and psychometric strength from easier and more difficult items, so that we do not need as many items as the "ideal approach" does
Progress variables are a principal component of the BEAR Assessment System (Wilson, 2005; Wilson & Sloane, 2000).

16 The BEAR Assessment System
4 principles: 4 building blocks
Examples provided by:

17 Principle 1: Developmental Perspective
Building Block 1: Construct Map
Developmental perspective
– The assessment system should be based on a developmental perspective of student learning
Progress variable
– A visual metaphor for how students develop and for how we think their item responses might change

18 Example: Why things sink and float

19 Principle 2: Match between curriculum and assessment
Building Block 2: Items design
Instruction & assessment match
– There must be a match between what is taught and what is assessed
Items design
– A set of principles that allows one to observe the students under a set of standard conditions that span the intended range of the item contexts

20 Example: Why things sink and float
Please answer the following question. Write as much information as you need to explain your answer. Use evidence, examples, and what you have learned to support your explanations.
Why do things sink and float?

21 Principle 3: Interpretable by teachers
Building Block 3: Outcome space
Management by teachers
– Teachers must be the managers of the system, and hence must have the tools to use it efficiently and to use the assessment data effectively and appropriately
Outcome space
– The categories of student responses must make sense to teachers

22 Example: Why things sink and float

23 Principle 4: Evidence of quality
Building Block 4: Measurement model
Evidence of quality
– Reliability and validity evidence, evidence for fairness
Measurement model
– Multidimensional item response models, to provide links over time both longitudinally within cohorts and across cohorts
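The slides do not give the model equations; purely as a sketch, a Rasch-family item response model of the kind commonly paired with construct maps places each student p and each item i on the same logit scale, and a between-item multidimensional version gives each progress variable its own student dimension (the notation here is an assumption, not taken from the presentation):

\[
P(X_{pi}=1 \mid \theta_p, \delta_i) = \frac{\exp(\theta_p - \delta_i)}{1 + \exp(\theta_p - \delta_i)},
\qquad
P(X_{pi}=1 \mid \boldsymbol{\theta}_p, \delta_i) = \frac{\exp(\theta_{p,d(i)} - \delta_i)}{1 + \exp(\theta_{p,d(i)} - \delta_i)},
\]

where \(\theta_p\) is the student's location on the progress variable, \(\delta_i\) is the difficulty of item i, and \(d(i)\) indexes the dimension (progress variable) that item i measures. Because students and items share one scale, the calibrated item difficulties carry the interpretation: a student's estimated location indicates which kinds of responses they are likely to produce, which is the "borrowing of interpretative and psychometric strength" described under progress variables, and the multidimensional form is what supports links over time within and across cohorts.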

24 Example: Evaluate progress of a group

25 Evaluate a student's locations over time
Embedded assessments

26 BEAR Assessment System: Principles
Developmental Perspective: need a framework for communicating meaning
Match between Instruction and Assessment: need methods of gathering data that are acceptable and useful to all participants
Interpretable by Teachers: need a way to value what we see in student work
Evidence of Quality: need a technique of interpreting data that allows meaningful reporting to multiple audiences

27 In conclusion…
Achieving meaningful measures is tough under any circumstances, but especially so in an accountability situation,
– where the requirements for accountability and the scale of the evaluation make it very expensive.
Strategies like learning performances, learning progressions, and progress variables are needed to make meaning possible, and affordable.

28 References
American Association for the Advancement of Science (1993). Benchmarks for science literacy. New York: Oxford University Press.
Catley, K., Reiser, B., and Lehrer, R. (2005). Tracing a prospective learning progression for developing understanding of evolution. Commissioned paper prepared for the National Research Council's Committee on Test Design for K-12 Science Achievement, Washington, DC. (http://www7.nationalacademies.org/bota/Test_Design_K-12_Science.html)
Reiser, B.J., Krajcik, J., Moje, E., and Marx, R. (2003). Design strategies for developing science instructional materials. Paper presented at the National Association for Research in Science Teaching Annual Meeting, March, Philadelphia, PA.
Smith, C., Wiser, M., Anderson, C.W., Krajcik, J., and Coppola, B. (2004). Implications of research on children's learning for assessment: Matter and atomic molecular theory. Commissioned paper prepared for the National Research Council's Committee on Test Design for K-12 Science Achievement, Washington, DC. (http://www7.nationalacademies.org/bota/Test_Design_K-12_Science.html)
Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, NJ: Lawrence Erlbaum Associates. (https://www.erlbaum.com/shop/tek9.asp?pg=products&specific=0-8058-4785-5)
Wilson, M., & Bertenthal, M. (Eds.). (2005). Systems for state science assessment. Report of the Committee on Test Design for K-12 Science Achievement. Washington, DC: National Academy Press. (http://books.nap.edu/catalog/11312.html)
Wilson, M., and Sloane, K. (2000). From principles to practice: An embedded assessment system. Applied Measurement in Education, 13(2), 181–208. (http://www.leaonline.com/doi/pdfplus/10.1207/S15324818AME1302_4)

