Harry O'Neil, University of Southern California and The Center for Research on Evaluation, Standards, and Student Testing: A Theoretical Basis for Assessment of Problem Solving

Presentation transcript:

Harry O'Neil, University of Southern California and The Center for Research on Evaluation, Standards, and Student Testing
A Theoretical Basis for Assessment of Problem Solving
Annual Meeting of the American Educational Research Association, Montreal, Canada, April 19, 1999
AERA 19 Apr 99 v.2

C R E S S T / U S C
CRESST MODEL OF LEARNING
(Diagram) Learning, with five families of cognitive demands: Content Understanding, Communication, Collaboration, Problem Solving, Self-Regulation

AERA 19 Apr 99 v.2 C R E S S T / U S C
JUSTIFICATION: WORLD OF WORK
The justification for Problem Solving as a core demand can be found in analyses of both the workplace and academic learning.
–O'Neil, Allred, and Baker (1997) reviewed five major studies from the workplace readiness literature. Each of these studies identified the need for (a) higher order thinking skills, (b) teamwork, and (c) some form of technology fluency. In four of the studies, problem-solving skills were specifically identified as essential.

AERA 19 Apr 99 v.2 C R E S S T / U S C
JUSTIFICATION: NATIONAL STANDARDS
New standards (e.g., the National Science Education Standards) suggest new assessment approaches rather than multiple-choice exams:
–Deeper or higher order learning
–More robust knowledge representations
–Integration of mathematics and science
–Integration of scientific information that students can apply to new problems in varied settings (i.e., transfer)
–Integration of content knowledge and problem solving
–More challenging science problems
–Learning conducted in groups

AERA 19 Apr 99 v.2 C R E S S T / U S C
PROBLEM SOLVING DEFINITION
Problem solving is cognitive processing directed at achieving a goal when no solution method is obvious to the problem solver (Mayer & Wittrock, 1996).
Problem-solving components (Glaser et al., 1992; Sugrue, 1994):
–Domain-specific knowledge (content understanding)
–Problem-solving strategy: a domain-specific strategy in troubleshooting (e.g., malfunction probability [i.e., fix first the component that fails most often])
–Self-regulation (metacognition [planning, self-monitoring] + motivation [effort, self-efficacy])

AERA 19 Apr 99 v.2 C R E S S T / U S C
[Diagram] PROBLEM SOLVING comprises: Content Understanding; Domain-Dependent Problem-Solving Strategies; and Self-Regulation, which comprises Metacognition (Self-Monitoring, Planning) and Motivation (Effort, Self-Efficacy)

AERA 19 Apr 99 v.2 C R E S S T / U S C CONCLUSION To be a successful problem solver, one must know something (content knowledge), possess intellectual tricks (problem-solving strategies), be able to plan and monitor one’s progress towards solving the problem (metacognition), and be motivated (effort, self-efficacy).

AERA 19 Apr 99 v.2 C R E S S T / U S C
MEASURING PROBLEM SOLVING
–Think-aloud protocols
–Performance assessments with extensive human rater scoring
–Multiple-choice tests (direct or indirect)
–Computer-based assessments (CRESST, ETS)
–Observation systems
–Experiments
–Focus groups
–On-line monitoring
–Surveys

AERA 19 Apr 99 v.2 C R E S S T / U S C
CRESST MEASUREMENT OF PROBLEM SOLVING
Context is program evaluation of technology-based innovations in DoDEA (CAETI).
Computer-based administration, scoring, and reporting:
–Content understanding is measured by a knowledge map.
–Domain-specific problem-solving strategies are measured on a search task by looking at search behavior and the use of the information found. Students search for information on concepts they are uncertain of so as to improve their content understanding.
–Self-regulation is measured by a paper-and-pencil questionnaire.
Reporting
–Problem solving would be reported as a profile of three scores (content understanding, problem-solving strategies, and self-regulation).

AERA 19 Apr 99 v.2 C R E S S T / U S C
CONCEPT MAPS
Definition
–A concept map is a graphical representation of information consisting of nodes and labeled lines:
  nodes correspond to concepts within a particular subject area or domain
  lines indicate a relationship between pairs of concepts (nodes)
  labels on each line explain how two concepts are related
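
As a minimal illustration of this definition (a sketch with assumed names, not the CRESST software), a concept map can be represented as a set of labeled propositions, each linking two concept nodes:

from typing import NamedTuple, Set

class Proposition(NamedTuple):
    """One labeled line in a concept map: two concept nodes plus a relation label."""
    source: str  # node: a concept in the domain
    label: str   # label on the line: how the two concepts are related
    target: str  # node: the related concept

ConceptMap = Set[Proposition]

# Hypothetical example map for a bicycle tire pump domain
student_map: ConceptMap = {
    Proposition("tire pump", "contains", "piston"),
    Proposition("piston", "compresses", "air"),
    Proposition("air", "flows through", "hose"),
}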

AERA 19 Apr 99 v.2 C R E S S T / U S C SAMPLE CONCEPT MAP “Learning in the Classroom”

AERA 19 Apr 99 v.2 C R E S S T / U S C
CONCEPT MAPS (continued)
Administration: Concept maps have typically been constructed using paper-and-pencil formats, such that the person draws the concept map on a piece of paper and trained raters score it. Thus, scoring such maps is expensive.
–However, CRESST has developed software to provide such scoring based on an expert's map. The concepts and links are fixed, such that the person selects a concept from one list and a link from another list. Such scoring depends on a digital representation of the individual's paper-and-pencil concept map.

AERA 19 Apr 99 v.2 C R E S S T / U S C
Concept Map Scores (score report template with Individual and Group columns)
–Content score:
–Structural score:
–Number of terms you used:
–Number of links you used:
–Number of terms the expert used:
–Number of links the expert used:
Score Information: The content score ranges from 0 to 7. The structural score ranges from 0 to 1. The higher your content and structural scores, the closer your map is to an expert's.
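
As a rough sketch of expert-referenced scoring (an illustration consistent with the description above, not CRESST's actual algorithm), the structural score can be read as the proportion of the expert's propositions that the student's map reproduces:

from typing import Set, Tuple

# A proposition is (concept, relation label, concept); a map is a set of propositions.
Proposition = Tuple[str, str, str]

def structural_score(student: Set[Proposition], expert: Set[Proposition]) -> float:
    """Fraction of the expert's propositions also present in the student's map (0 to 1)."""
    if not expert:
        return 0.0
    return len(student & expert) / len(expert)

# Hypothetical usage: a higher score means the map is closer to the expert's.
expert_map = {("tire pump", "contains", "piston"), ("piston", "compresses", "air")}
student_map = {("piston", "compresses", "air")}
print(structural_score(student_map, expert_map))  # 0.5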

AERA 19 Apr 99 v.2 C R E S S T / U S C
CONCLUSIONS
Computer-based problem-solving assessment is feasible.
–Process/product validity evidence is promising.
Allows real-time scoring/reporting to students and teachers.
Useful for program evaluation and diagnostic functions of testing.
What's next?
–Generalizability study
–Structural equation modeling
–Measure of internet literacy or technology fluency
However, it needs a computer to administer:
–Not feasible or cost-effective in some environments

AERA 19 Apr 99 v.2 C R E S S T / U S C
INTERNATIONAL LIFE SKILLS SURVEY
Computer-based administration was not feasible.
The survey, with adults (ages 16-65), is a follow-up to the IALS (International Adult Literacy Survey).
Content understanding
–Have participants represent a mechanical system (e.g., a tire pump) by creating a concept map that is scored with software.
Explanation lists to assess domain-dependent problem-solving strategies
–Have participants respond via a written explanation to a prompt about a domain-specific problem-solving process (e.g., troubleshooting: Imagine that the tire pump does not pump air to the hose. What could be wrong?).
Questionnaire to measure self-regulation

AERA 19 Apr 99 v.2 C R E S S T / U S C
TYPES OF KNOWLEDGE MAPS
Paper administered, person scored, computer reported
–Traditional
Computer administered, computer scored, computer reported
–CRESST DoDEA Schools, prior work
Paper administered, computer scored, computer reported
–Today's symposium (CRESST ILSS project)

AERA 19 Apr 99 v.2 C R E S S T / U S C
PROBLEM SOLVING: QUESTIONS
Questions
–How many topics do we provide to be reliable?
–How many problem-solving questions do we provide to be reliable?
–What is the impact of providing scoring information?
–What is the role of gender?
–What is the role of knowledge?
–Control issues: What is the effect of counterbalancing? How do you represent procedures in knowledge maps?
–Revision of the self-regulation questionnaire
Samples
–Taiwanese 10th grade students (50% males, 50% females)
–College students (39% males, 61% females)
–Temporary workers (ages 16-65; approximately 43% females, 57% males)

Bicycle Tire Pump AERA 19 Apr 99 v.2

C R E S S T / U S C
CONCLUSIONS
Allows paper-and-pencil administration and computer scoring/reporting
–like multiple-choice tests
Useful for program evaluation and classroom uses
What's next?
–Generalizability study
–Comparative study of computer administration vs. paper-and-pencil administration

AERA 19 Apr 99 v.2 C R E S S T / U S C
STRENGTHS
–Agreement that the skill is necessary for the world of work (high-performance workplaces)
–Construct is a synthesis of definitions from the workforce readiness, cognitive science, and social/developmental literatures
–Measurement is threefold and breaks new ground: content understanding via knowledge maps (performance assessment); problem-solving strategies via tasks to list activities (performance assessment); self-regulation via self-report techniques
–Leverages existing work at CRESST

AERA 19 Apr 99 v.2 C R E S S T / U S C
WEAKNESSES
WEAKNESS: Breaking new ground in measurement of problem solving via performance assessment leads to few "items," which potentially leads to problems in reliability and content validity, particularly in the content understanding area via knowledge maps.
RELIABILITY FIX: Conduct two generalizability studies (Los Angeles in spring 1998; Taiwan in January 1998) to estimate the magnitude of potential reliability problems.
CONTENT VALIDITY FIX: Each student gets two problems; live with the reality that performance assessment and time constraints lead to few tasks.
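
For background on what such a generalizability study estimates (standard generalizability theory, not a formula taken from the slides): in a persons-by-tasks design, the projected generalizability coefficient for an assessment built from $n_t$ tasks is

\[
E\rho^2(n_t) \;=\; \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{pt,e}/n_t},
\]

where $\sigma^2_p$ is the person (true-score) variance component and $\sigma^2_{pt,e}$ is the person-by-task interaction plus error component. Estimating these components indicates how many tasks or topics would be needed for acceptable reliability.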

AERA 19 Apr 99 v.2 C R E S S T / U S C
WEAKNESSES (continued)
WEAKNESS: Too few items.
FIX: Concept map (10-15 items), strategies (6-8 items), self-regulation (32 items). For both the concept map and strategies, omitted "items" are scored as wrong.
WEAKNESS: Breaking new ground in measurement leads to new challenges in reporting. Should problem solving be reported as a single score vs. multiple scores, as a profile and/or levels, as in literacy?
FIX: Explore use of latent variable structures (e.g., confirmatory factor analysis) as a replacement for or supplement to the existing literacy/numeracy framework (IRT).

AERA 19 Apr 99 v.2 C R E S S T / U S C
WEAKNESSES (continued)
WEAKNESS: The nature of problem-solving tasks (e.g., troubleshooting) may favor males.
FIX: Via pilot studies, estimate the gender effect; select some tasks that females would be better at than males.

AERA 19 Apr 99 v.2 C R E S S T / U S C Back-Up Slides

RECOMMENDED APPROACH FOR MEASUREMENT OF PROBLEM SOLVING AERA 19 Apr 99 v.2

RECOMMENDED APPROACH (continued) AERA 19 Apr 99 v.2

C R E S S T / U S C
SAMPLE METACOGNITIVE ITEMS
The following questions refer to the ways people have used to describe themselves. Read each statement below and indicate how you generally think or feel. There are no right or wrong answers. Do not spend too much time on any one statement. Remember, give the answer that seems to describe how you generally think or feel.
Note. Formatted as in Section E, Background Questionnaire: Canadian version of the International Adult Literacy Survey (1994). Item a is a planning item; item b is a self-checking item. Kosmicki (1993) reported alpha reliabilities of .86 and .78 for 6-item versions of these scales, respectively.
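
For reference, the alpha reliabilities cited above are coefficient alpha values. For a scale of $k$ items, coefficient alpha is given by the standard formula (general background, not specific to Kosmicki's data):

\[
\alpha \;=\; \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^2_{Y_i}}{\sigma^2_X}\right),
\]

where $\sigma^2_{Y_i}$ is the variance of item $i$ and $\sigma^2_X$ is the variance of the total scale score; here $k = 6$ for the 6-item planning and self-checking scales.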