Inferences about School Quality using opportunity to learn data: The effect of ignoring classrooms. Felipe Martinez CRESST/UCLA CCSSO Large Scale Assessment.


Inferences about School Quality using opportunity to learn data: The effect of ignoring classrooms. Felipe Martinez CRESST/UCLA CCSSO Large Scale Assessment Conference Boston, MA; June 21, 2004

Introduction We focus on two related issues concerning the valid use of measures of school performance in accountability systems: differences in achievement (and OTL) between classrooms, and the impact on measures of school quality of ignoring the classroom context. Overview of research questions and studies: 1. Comparison of teacher and student reports of OTL. 2. Distribution and effects of OTL (as reported by students and teachers) on student achievement in Reading. 3. Effect of ignoring classroom nesting in multilevel models on model-based measures of school quality.

Research Questions 1. Comparison of teacher perceptions of the OTL they provide with student perceptions of the OTL they receive (Factor Analysis). 2. Distribution and effects of OTL, as perceived by teachers and students (Multilevel Models, HLM). 3. Distribution of student achievement and effect of ignoring classroom nesting on measures of school quality (Multilevel Models: HLM, Empirical Bayes estimates).

Factor Analysis Sample: Our sample consisted of 97,675 students attending elementary schools in grades 2nd to 5th. In addition, the sample included data from 6,902 teachers. Methods: A questionnaire was used to collect OTL data from students and their teachers in order to compare their perceptions of the educational activities that occurred in the classroom during the school year. The Student and Teacher OTL questionnaires were identical and consisted of seven four-point Likert items related to content exposure and other classroom practices theoretically related to student performance in Language Arts. Each item asks about the frequency with which the students performed a certain activity in the classroom during the school year. The scale ranges from 1 (almost never) to 4 (almost every day).
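A minimal sketch of the kind of factor analysis described above, on simulated data: seven four-point Likert items generated from two latent factors. The two-factor structure, the loadings, and the sample size here are illustrative assumptions for demonstration, not the study's actual data or results.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(42)

# Simulated responses to 7 four-point Likert items (1-4), as in the OTL
# questionnaire. The two latent factors (e.g. writing vs. teacher-led
# activities) and the loading pattern below are hypothetical.
n = 1000
f = rng.normal(size=(n, 2))                       # two latent factors
load = np.array([[0.8, 0.1], [0.7, 0.2], [0.9, 0.0],   # hypothetical loadings
                 [0.1, 0.8], [0.2, 0.7], [0.0, 0.9], [0.1, 0.8]])
latent = f @ load.T + rng.normal(scale=0.5, size=(n, 7))
items = np.clip(np.round(2.5 + latent), 1, 4)     # discretize to the 1..4 scale

fa = FactorAnalysis(n_components=2, random_state=0).fit(items)
print(np.round(fa.components_, 2))                # estimated loading matrix
```

With real Likert data one would typically also examine fit and consider methods designed for ordinal items; this sketch only shows the mechanics of extracting a two-factor solution.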

Factor Analysis Teacher and Student reports of Opportunity to Learn

Factor Analysis Teacher and Student reports of Opportunity to Learn

Factor Analysis Teacher and Student reports of Opportunity to Learn Differences exist in the way teachers and their students perceive the nature of teaching and learning activities conducted in the classroom. Like teachers, students clearly separated activities related to writing; however, students regarded activities led by the teacher, independently of whether these involved reading aloud to them or explaining grading criteria, as part of a single construct of Teacher activities. This may reflect a certain degree of confusion among students (or lack of direction from teachers) about the specific nature of the activity being carried out by the teacher: students may, for example, perceive an explanation of grading criteria as their teacher simply reading something to them.

Multilevel Analysis Sample: The sample for multilevel modeling includes 46,284 2nd to 5th grade students distributed across 4,972 classrooms (teachers) in 375 elementary schools. Methods: For each student, achievement scores (SAT9 Reading) and background information were available, as well as context data about classrooms and schools. The factors created in the previous step are used as indicators of student and teacher OTL. At this stage, three-level multilevel models (HLMs) were employed to correctly take into account the nested structure of the data: students nested within classrooms, which in turn were nested within schools.
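A three-level nested model of this kind can be sketched with statsmodels' MixedLM, treating schools as the grouping factor and classrooms as a variance component nested within each school. The data below are simulated toy data (the group counts, variances, and the intercept of 600 are assumptions), not the study's sample; the original analyses used HLM software.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Toy data: 20 schools x 5 classrooms x 10 students (all values assumed)
n_sch, n_cls, n_stu = 20, 5, 10
school = np.repeat(np.arange(n_sch), n_cls * n_stu)
classroom = np.repeat(np.arange(n_sch * n_cls), n_stu)
u_school = rng.normal(0, 2, n_sch)[school]        # school random effects
r_class = rng.normal(0, 3, n_sch * n_cls)[classroom]  # classroom random effects
score = 600 + u_school + r_class + rng.normal(0, 10, len(school))
df = pd.DataFrame({"score": score, "school": school, "classroom": classroom})

# Unconditional three-level model: students within classrooms within schools.
# Schools are the grouping factor; classrooms enter as a variance component.
model = smf.mixedlm(
    "score ~ 1", df, groups="school",
    vc_formula={"classroom": "0 + C(classroom)"},
)
fit = model.fit()
print(fit.summary())
```

The variance components in the summary correspond to the between-school and between-classroom variances whose relative sizes motivate the three-level specification.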

Multilevel Analysis Unconditional 2-Level Model
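The unconditional two-level model can be sketched in conventional HLM notation; the symbols below are the standard ones and are an assumption, not reproduced from the slide:

```latex
% Level-1 (student i in school j)
Y_{ij} = \beta_{0j} + r_{ij}, \qquad r_{ij} \sim N(0, \sigma^2)
% Level-2 (school j)
\beta_{0j} = \gamma_{00} + u_{0j}, \qquad u_{0j} \sim N(0, \tau_{00})
```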

Multilevel Analysis Unconditional 3-Level Model
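Likewise, a sketch of the unconditional three-level model in conventional HLM notation (symbols assumed, not taken from the slide), adding the classroom layer between students and schools:

```latex
% Level-1 (student i, classroom j, school k)
Y_{ijk} = \pi_{0jk} + e_{ijk}, \qquad e_{ijk} \sim N(0, \sigma^2)
% Level-2 (classroom j in school k)
\pi_{0jk} = \beta_{00k} + r_{0jk}, \qquad r_{0jk} \sim N(0, \tau_{\pi})
% Level-3 (school k)
\beta_{00k} = \gamma_{000} + u_{00k}, \qquad u_{00k} \sim N(0, \tau_{\beta})
```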

Multilevel Analysis Unconditional Two- and Three-Level Models of SAT9 Reading Scores

Multilevel Analysis Unconditional Three-Level Models of Student OTL

Multilevel Analysis Student- and Teacher-reported OTL as a predictor of student achievement
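A sketch, again in conventional HLM notation (assumed, not from the slides), of student-reported OTL entering as a level-1 predictor whose intercept and slope may vary across classrooms and schools:

```latex
% Level-1: student-reported OTL as a predictor
Y_{ijk} = \pi_{0jk} + \pi_{1jk}\,\mathrm{OTL}_{ijk} + e_{ijk}
% Level-2: intercept and OTL slope vary across classrooms
\pi_{0jk} = \beta_{00k} + r_{0jk}, \qquad \pi_{1jk} = \beta_{10k} + r_{1jk}
% Level-3: classroom-level coefficients vary across schools
\beta_{00k} = \gamma_{000} + u_{00k}, \qquad \beta_{10k} = \gamma_{100} + u_{10k}
```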

Multilevel Analysis Student- and Teacher-reported OTL as a predictor of student achievement

Multilevel Analysis Variation of Student-reported OTL effects across classrooms

Multilevel Analysis Correlations of EB School residuals from unconditional 2- and 3-level models

Multilevel Analysis Conditional two-level model with school context: Level-1 (Student), Level-2 (School)

Multilevel Analysis Conditional three-level model with classroom context: Level-1 (Student), Level-2 (Classroom), Level-3 (School), with school-level intercept equation β_00k = γ_000 + u_00k

Multilevel Analysis Correlations of EB School residuals from conditional 2- and 3-level models (considering classroom context)

Discussion Teachers and students do not necessarily perceive the same opportunities to learn within a classroom. Results agree with previous research (e.g., Muthén et al., 1995) suggesting that OTL information collected from teachers may not add significantly to what is known from information collected from students. Students' own perceptions of OTL are more closely linked to their achievement than are teachers' perceptions of the opportunities they provide those same students. Although OTL is provided to students at the classroom level, measuring student perceptions may be a more powerful (and accurate) indicator of OTL than teacher reports.

Discussion Results also support the notion that the classroom environment is at least as important as, if not more important than, the larger school as a determinant of student learning (see Kyriakides, Campbell & Gagatsis, 2000; Hill & Rowe, 1996; Anderson, 1987, among others). Student OTL slopes vary significantly across classrooms (Level-2), but not across schools (Level-3). This implies that the exact effect of OTL in any given classroom can differ considerably from the average. Model-based measures of school quality (in our case, Empirical Bayes school-level residuals) are affected by model choice: estimates for particular schools can differ considerably depending on whether the classroom environment is included via a three-level model or not.

Discussion Results emphasize the importance of using three-level models. In general, despite the fact that accountability systems are aimed at schools, results indicate that careful attention also needs to be paid to classroom differences within schools. Increasing attention to teacher effects is an encouraging sign, although the use of these estimates in high-stakes situations is problematic (McCaffrey et al., 2004). Even within three-level models, however, the use of EB residuals and other estimates of school quality for accountability purposes should be carefully considered, as they constitute "at best, Type A effects" not suitable for accountability (Willms & Raudenbush, 1989; Raudenbush, 2004). Furthermore, the estimates of school performance we produce are cross-sectional in nature. For an up-to-date view of the promise, but also the (sometimes overwhelming) complexity, of the models needed for longitudinal studies of teacher and school effectiveness, see the Spring 2004 issue of the Journal of Educational and Behavioral Statistics on value-added assessment.