
1 A Theoretical Basis for Assessment of Problem Solving
Harry O'Neil, University of Southern California and the Center for Research on Evaluation, Standards, and Student Testing (CRESST)
Annual Meeting of the American Educational Research Association, Montreal, Canada, April 19, 1999

2 CRESST MODEL OF LEARNING
[Diagram: Learning, comprising Content Understanding, Communication, Collaboration, Problem Solving, and Self-Regulation]

3 JUSTIFICATION: WORLD OF WORK
The justification for problem solving as a core demand can be found in analyses of both the workplace and academic learning.
– O'Neil, Allred, and Baker (1997) reviewed five major studies from the workplace readiness literature. Each of these studies identified the need for (a) higher order thinking skills, (b) teamwork, and (c) some form of technology fluency. In four of the studies, problem-solving skills were specifically identified as essential.

4 JUSTIFICATION: NATIONAL STANDARDS
New standards (e.g., the National Science Education Standards) suggest new assessment approaches rather than multiple-choice exams:
– Deeper or higher order learning
– More robust knowledge representations
– Integration of mathematics and science
– Integration of scientific information that students can apply to new problems in varied settings (i.e., transfer)
– Integration of content knowledge and problem solving
– More challenging science problems
– Learning conducted in groups

5 PROBLEM SOLVING DEFINITION
Problem solving is cognitive processing directed at achieving a goal when no solution method is obvious to the problem solver (Mayer & Wittrock, 1996).
Problem-solving components (Glaser et al., 1992; Sugrue, 1994):
– Domain-specific knowledge (content understanding)
– Problem-solving strategy
  – Domain-specific strategy in troubleshooting (e.g., malfunction probability [i.e., fix first the component that fails most often])
– Self-regulation (metacognition [planning, self-monitoring] + motivation [effort, self-efficacy])

6 [Diagram: Problem Solving comprises Content Understanding, Domain-Dependent Problem-Solving Strategies, and Self-Regulation; Self-Regulation combines Metacognition (Planning, Self-Monitoring) and Motivation (Effort, Self-Efficacy)]

7 CONCLUSION
To be a successful problem solver, one must know something (content knowledge), possess intellectual tricks (problem-solving strategies), be able to plan and monitor one's progress towards solving the problem (metacognition), and be motivated (effort, self-efficacy).

8 MEASURING PROBLEM SOLVING
– Think-aloud protocols
– Performance assessments with extensive human rater scoring
– Multiple-choice tests (direct or indirect)
– Computer-based assessments (CRESST, ETS)
– Observation systems
– Experiments
– Focus groups
– On-line monitoring
– Surveys

9 CRESST MEASUREMENT OF PROBLEM SOLVING
Context is program evaluation of technology-based innovations in DoDEA (CAETI).
Computer-based administration, scoring, and reporting:
– Content understanding is measured by a knowledge map.
– Domain-specific problem-solving strategies are measured on a search task by looking at search behavior and the use of the information found. Participants search for information on concepts they are uncertain of so as to improve their content understanding.
– Self-regulation is measured by a paper-and-pencil questionnaire.
Reporting:
– Problem solving would be reported as a profile of three scores (content understanding, problem-solving strategies, and self-regulation), as sketched below.
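
To make that three-score reporting profile concrete, here is a minimal sketch of such a record. The field names and 0-1 scales are assumptions for illustration only, not the CRESST report format.

```python
# Minimal sketch of a three-score problem-solving profile.
# Field names and score scales are illustrative assumptions, not the CRESST report format.
from dataclasses import dataclass

@dataclass
class ProblemSolvingProfile:
    content_understanding: float   # e.g., knowledge-map score
    strategies: float              # e.g., search-task strategy score
    self_regulation: float         # e.g., self-regulation questionnaire score

    def report(self) -> str:
        return (f"Content understanding:      {self.content_understanding:.2f}\n"
                f"Problem-solving strategies: {self.strategies:.2f}\n"
                f"Self-regulation:            {self.self_regulation:.2f}")

print(ProblemSolvingProfile(0.75, 0.60, 0.82).report())
```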

10 CONCEPT MAPS
Definition:
– A concept map is a graphical representation of information consisting of nodes and labeled lines:
  – nodes correspond to concepts within a particular subject area or domain
  – lines indicate a relationship between pairs of concepts (nodes)
  – labels on each line explain how two concepts are related
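
To make this definition concrete, the following minimal sketch represents a concept map as a list of labeled lines between concept nodes. The concepts and link labels are invented for illustration and are not drawn from the CRESST maps.

```python
# Minimal sketch of a concept map: nodes (concepts) joined by labeled lines.
# The concepts and link labels below are illustrative, not taken from the CRESST materials.
from typing import NamedTuple

class LabeledLine(NamedTuple):
    concept_a: str  # node: a concept in the domain
    label: str      # explains how the two concepts are related
    concept_b: str  # node: a second concept

classroom_map = [
    LabeledLine("teacher", "gives", "feedback"),
    LabeledLine("feedback", "improves", "learning"),
    LabeledLine("practice", "leads to", "learning"),
]

# The nodes of the map are all concepts that appear at either end of a labeled line.
nodes = {c for line in classroom_map for c in (line.concept_a, line.concept_b)}
print(sorted(nodes))  # ['feedback', 'learning', 'practice', 'teacher']
```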

11 SAMPLE CONCEPT MAP: "Learning in the Classroom" [figure]

12 CONCEPT MAPS (continued)
Administration: Concept maps have typically been constructed using paper-and-pencil formats, such that the person draws the concept map on a piece of paper and trained raters score it. Scoring such maps is therefore expensive.
– However, CRESST has developed software to provide such scoring based on an expert's map. The concepts and links are fixed, such that the person selects a concept from a list and a link from another list. Such scoring depends on a digital representation of the individual's paper-and-pencil concept map. (A simple illustration of this kind of expert-referenced scoring follows.)
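
The sketch below shows one way expert-referenced scoring of such maps could work, assuming (as described above) that concepts and links are selected from fixed lists. The simple overlap rule shown is an assumption for illustration, not the actual CRESST scoring algorithm.

```python
# Illustrative sketch of expert-referenced scoring of a knowledge map.
# Assumptions: a map is a set of (concept, link label, concept) propositions chosen
# from fixed lists, and the score is the proportion of the expert's propositions
# reproduced by the student. This overlap rule is NOT the actual CRESST algorithm.
from typing import Set, Tuple

Proposition = Tuple[str, str, str]  # (concept, link label, concept)

def overlap_score(student: Set[Proposition], expert: Set[Proposition]) -> float:
    """Proportion of the expert's propositions that also appear in the student's map."""
    if not expert:
        return 0.0
    return len(student & expert) / len(expert)

expert_map: Set[Proposition] = {
    ("handle", "moves", "piston"),
    ("piston", "compresses", "air"),
    ("inlet valve", "lets in", "air"),
    ("outlet valve", "releases", "air"),
}

student_map: Set[Proposition] = {
    ("handle", "moves", "piston"),
    ("piston", "compresses", "air"),
    ("outlet valve", "lets in", "air"),  # wrong link label, so it does not match
}

print(f"Overlap with expert map: {overlap_score(student_map, expert_map):.2f}")  # 0.50
```

In this example, one of the student's propositions carries the wrong link label, so only two of the expert's four propositions are matched.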

13 CONCEPT MAP SCORES (sample score report)
Fields reported for Individual and Group:
– Content score (ranges from 0 to 7)
– Structural score (ranges from 0 to 1)
– Number of terms you used
– Number of links you used
– Number of terms the expert used
– Number of links the expert used
Score information: the higher your content and structural scores, the more closely your map resembles an expert's. (Sample values shown on the slide: 4, 4, 4, 4, 7, 0.80.)

14 CONCLUSIONS
Computer-based problem-solving assessment is feasible.
– Process/product validity evidence is promising.
– Allows real-time scoring/reporting to students and teachers.
– Useful for the program evaluation and diagnostic functions of testing.
What's next?
– Generalizability study
– Structural equation modeling
– Measure of Internet literacy or technology fluency
However, it requires a computer to administer:
– Not feasible or cost-effective in some environments

15 INTERNATIONAL LIFE SKILLS SURVEY
Computer-based administration was not feasible.
Survey of adults (ages 16-65), a follow-up to the International Adult Literacy Survey (IALS).
Content understanding:
– Have participants represent a mechanical system (e.g., a tire pump) by creating a concept map that is scored with software.
Explanation lists to assess domain-dependent problem-solving strategies:
– Have participants respond via a written explanation to a prompt about a domain-specific problem-solving process (e.g., troubleshooting: "Imagine that the tire pump does not pump air to the hose. What could be wrong?").
Questionnaire to measure self-regulation.

16 TYPES OF KNOWLEDGE MAPS
– Paper administered, person scored, computer reported: traditional approach
– Computer administered, computer scored, computer reported: CRESST DoDEA schools, prior work
– Paper administered, computer scored, computer reported: today's symposium (CRESST ILSS project)

17 PROBLEM SOLVING: QUESTIONS
Questions:
– How many topics do we need to provide to be reliable?
– How many problem-solving questions do we need to provide to be reliable?
– What is the impact of providing scoring information?
– What is the role of gender?
– What is the role of knowledge?
– Control issues: What is the effect of counterbalancing? How do you represent procedures in knowledge maps?
– Revision of the self-regulation questionnaire
Samples:
– Taiwanese 10th-grade students (50% males, 50% females)
– College students (39% males, 61% females)
– Temporary workers (ages 16-65) (approximately 43% females, 57% males)

18 [Figure: Bicycle Tire Pump]

19 CONCLUSIONS
Allows paper-and-pencil administration with computer scoring/reporting, like multiple-choice tests.
Useful for program evaluation and classroom uses.
What's next?
– Generalizability study
– Comparative study of computer administration vs. paper-and-pencil administration

20 STRENGTHS
– Agreement that the skill is necessary for the world of work (high-performance workplaces)
– The construct is a synthesis of definitions from the workforce readiness, cognitive science, and social/developmental literatures
– Measurement is threefold and breaks new ground: content understanding via knowledge maps (performance assessment); problem-solving strategies via tasks to list activities (performance assessment); self-regulation via self-report techniques
– Leverages existing work at CRESST

21 WEAKNESSES
WEAKNESS: Breaking new ground in measurement of problem solving via performance assessment leads to few "items," which potentially leads to problems in reliability and content validity, particularly in the content understanding area via knowledge maps.
RELIABILITY FIX: Conduct two generalizability studies (Los Angeles in spring 1998; Taiwan in January 1998) to estimate the magnitude of potential reliability problems.
CONTENT VALIDITY FIX: Each student gets two problems; live with the reality that performance assessment and time constraints lead to few tasks.

22 WEAKNESSES (continued)
WEAKNESS: Too few items.
FIX: Concept map (10-15 items), strategies (6-8 items), self-regulation (32 items). For both the concept map and strategies, omitted "items" are scored as wrong.
WEAKNESS: Breaking new ground in measurement leads to new challenges in reporting. Should problem solving be reported as a single score or as multiple scores (a profile), and/or as levels, as in literacy?
FIX: Explore the use of latent variable structures (e.g., confirmatory factor analysis) as a replacement for, or supplement to, the existing literacy/numeracy framework (IRT).

23 WEAKNESSES (continued)
WEAKNESS: The nature of problem-solving tasks (e.g., troubleshooting) may favor males.
FIX: Via pilot studies, estimate the gender effect; select some tasks on which females would be expected to do better than males.

24 Back-Up Slides

25 RECOMMENDED APPROACH FOR MEASUREMENT OF PROBLEM SOLVING

26 RECOMMENDED APPROACH (continued)

27 SAMPLE METACOGNITIVE ITEMS
The following questions refer to the ways people have used to describe themselves. Read each statement below and indicate how you generally think or feel. There are no right or wrong answers. Do not spend too much time on any one statement. Remember, give the answer that seems to describe how you generally think or feel.
Note. Formatted as in Section E, Background Questionnaire, Canadian version of the International Adult Literacy Survey (1994). Item a is a planning item; item b is a self-checking item. Kosmicki (1993) reported alpha reliabilities of .86 and .78 for 6-item versions of these scales, respectively.

