ONR Advanced Distributed Learning: Impact of Language Factors on the Reliability and Validity of Assessment
CRESST ONR/NETC Meetings, 17-18 July 2003, v1


2 ONR Advanced Distributed Learning: Impact of Language Factors on the Reliability and Validity of Assessment for ELLs. Jamal Abedi, University of California, Los Angeles. National Center for Research on Evaluation, Standards, and Student Testing (CRESST). July 18, 2003.

3 Classical Test Theory: Reliability

σ²_X = σ²_T + σ²_E, where X is the observed score, T the true score, and E the error score.

ρ_XX′ = σ²_T / σ²_X = 1 − σ²_E / σ²_X

Textbook examples of possible sources that contribute to measurement error: rater, occasion, item, test form.
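The reliability coefficient above is just the ratio of true-score variance to observed-score variance. A minimal sketch, using made-up variance values (not from the slides):

```python
# Classical test theory: rho_XX' = var_T / var_X = 1 - var_E / var_X,
# where var_X = var_T + var_E. Numbers below are illustrative only.

def reliability(var_true: float, var_error: float) -> float:
    """Reliability coefficient given true-score and error-score variances."""
    var_observed = var_true + var_error  # sigma^2_X = sigma^2_T + sigma^2_E
    return var_true / var_observed

print(reliability(80.0, 20.0))  # -> 0.8
```

As error variance grows relative to true-score variance, reliability falls toward zero; a perfectly error-free test would have reliability 1.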

4 Generalizability Theory: Partitioning Error Variance into Its Components

σ²(X_pro) = σ²_p + σ²_r + σ²_o + σ²_pr + σ²_po + σ²_ro + σ²_pro,e, where p is the person, r the rater, and o the occasion.

Are there any sources of measurement error that may specifically influence ELL performance?
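The decomposition above says total observed-score variance is the sum of the facet variance components. A small sketch with hypothetical component values (not estimates from the studies cited here) showing how each facet's share of variance is computed:

```python
# Generalizability theory: total variance = sum of variance components for
# person (p), rater (r), occasion (o), their interactions, and the residual.
# Component values below are hypothetical, for illustration only.

components = {
    "p": 50.0, "r": 5.0, "o": 3.0,
    "pr": 4.0, "po": 2.0, "ro": 1.0, "pro,e": 10.0,
}

total_variance = sum(components.values())
shares = {facet: v / total_variance for facet, v in components.items()}

print(total_variance)          # -> 75.0
print(round(shares["p"], 3))   # -> 0.667 (person accounts for ~2/3 of variance)
```

Adding a new facet (such as language, as the last slide proposes) simply adds more terms to the decomposition, letting its contribution to error be estimated the same way.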

5 Validity of Academic Achievement Measures

We will focus on content and construct validity:

A test's content validity involves the careful definition of the domain of behaviors to be measured by a test and the logical design of items to cover all the important areas of this domain (Allen & Yen, 1979, p. 96).

A test's construct validity is the degree to which it measures the theoretical construct or trait that it was designed to measure (Allen & Yen, 1979, p. 108).

Examples: A content-based achievement test has construct validity if it measures the content that it is supposed to measure. A content-based achievement test has content validity if its content is representative of the content domain being measured.

6 Two major questions on the psychometrics of academic achievement tests for ELLs:

Are there any sources of measurement error that may specifically influence ELL performance?

Do achievement tests accurately measure ELLs' content knowledge?

7 Study #9: Impact of students' language background on content-based performance: analyses of extant data (Abedi & Leon, 1999). Analyses were performed on extant data, such as Stanford 9 and ITBS. Sample: over 900,000 students from four different sites nationwide.

Study #10: Examining ELL and non-ELL student performance differences and their relationship to background factors (Abedi, Leon, & Mirocha, 2001). Data were analyzed for the impact of language on the assessment and accommodation of ELL students. Sample: over 700,000 students from four different sites nationwide.

Findings:
- The higher the language demand of the test items, the larger the performance gap between ELL and non-ELL students.
- There was a large performance gap between ELL and non-ELL students in reading, science, and math problem solving (about 15 NCE score points).
- This performance gap was reduced to zero in math computation.

8 Normal Curve Equivalent Means and Standard Deviations for Students in Grades 10 and 11, Site 3 School District

                 Reading        Science        Math
                 M      SD      M      SD      M      SD
Grade 10
  SD only        16.4   12.7    25.5   13.3    22.5   11.7
  LEP only       24.0   16.4    32.9   15.3    36.8   16.0
  LEP & SD       16.3   11.2    24.8    9.3    23.6    9.8
  Non-LEP & SD   38.0   16.0    42.6   17.2    39.6   16.9
  All students   36.0   16.9    41.3   17.5    38.5   17.0
Grade 11
  SD only        14.9   13.2    21.5   12.3    24.3   13.2
  LEP only       22.5   16.1    28.4   14.4    45.5   18.2
  LEP & SD       15.5   12.7    26.1   20.1    25.1   13.0
  Non-LEP & SD   38.4   18.3    39.6   18.8    45.2   21.1
  All students   36.2   19.0    38.2   18.9    44.0   21.2

Note. LEP = limited English proficient. SD = students with disabilities.

9 The Disparity Index (DI) is an index of performance differences between LEP and non-LEP students.

Site 3 Disparity Index (DI): Non-LEP/Non-SD Students Compared to LEP-Only Students

Grade   Reading   Math Total   Math Calculation   Math Analytical
3        53.4      25.8         12.9               32.8
6        81.6      37.6         22.2               46.1
8       125.2      36.9         25.2               44.0
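The slides describe the DI only as "an index of performance differences between LEP and non-LEP"; they do not give its formula. One plausible form, shown here purely as an assumption, is the non-LEP advantage expressed as a percentage of the LEP mean. The sketch uses the Grade 10 reading means from the previous slide (non-LEP & SD 38.0, LEP only 24.0):

```python
# ASSUMED formula: the slides do not define DI mathematically. Percent
# difference relative to the LEP mean is one common disparity measure.

def disparity_index(non_lep_mean: float, lep_mean: float) -> float:
    """Hypothetical DI: non-LEP advantage as a percentage of the LEP mean."""
    return 100.0 * (non_lep_mean - lep_mean) / lep_mean

# Grade 10 reading NCE means from the table on the previous slide:
print(round(disparity_index(38.0, 24.0), 1))  # -> 58.3
```

Whatever its exact definition, the tabulated DI values tell the same story: the LEP/non-LEP gap is largest in reading (the most language-loaded subscale) and smallest in math calculation.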


13 Issues and problems in classification of students with limited English proficiency

14 Findings: The relationship between language proficiency test scores and LEP classification.

Since LEP classification is based on students' level of language proficiency, and because the LAS is a measure of language proficiency, one would expect to find a near-perfect correlation between LAS scores and LEP status (LEP versus non-LEP). The results of the analyses instead indicated a weak relationship between language proficiency test scores and language classification codes (LEP categories).

Correlation between LAS rating and LEP classification for Site 4

Grade            G2     G3     G4     G5     G6     G7     G8     G9     G10    G11    G12
Pearson r        .223   .195   .187   .199   .224   .261   .252   .265   .304   .272   .176
Sig. (2-tailed)  .000 for all grades
N                58772162110028039387961102945782836


16 Correlation coefficients between LEP classification code and ITBS subscales for Site 1

                   Reading   Math Concept    Math Problem   Math
                             & Estimation    Solving        Computation
Grade 3
  Pearson r        -.160     -.045           -.076           .028
  Sig. (2-tailed)   .000      .000            .000           .000
  N                36,006    35,981          35,948         36,000
Grade 6
  Pearson r        -.256     -.154           -.180          -.081
  Sig. (2-tailed)   .000      .000            .000           .000
  N                28,272    28,273          28,250         28,261
Grade 8
  Pearson r        -.257     -.168           -.206          -.099
  Sig. (2-tailed)   .000      .000            .000           .000
  N                25,362    25,336          25,333         25,342
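The coefficients above are Pearson correlations between a classification code and a continuous test score. A self-contained sketch of the computation, using a binary LEP code and fabricated scores for illustration (the values are not from the Site 1 data):

```python
# Pearson correlation from scratch: cov(x, y) / (sd_x * sd_y).
# Data below are made up to illustrate the computation.
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

lep_code = [1, 1, 1, 0, 0, 0]                    # 1 = LEP, 0 = non-LEP
reading  = [20.0, 25.0, 22.0, 36.0, 40.0, 38.0]  # hypothetical NCE scores

print(round(pearson_r(lep_code, reading), 3))    # -> -0.973
```

The negative sign matches the table: being classified LEP is associated with lower subscale scores, most strongly in reading and most weakly in math computation.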

17 Generalizability Theory: Language as an Additional Source of Measurement Error

σ²(X_prl) = σ²_p + σ²_r + σ²_l + σ²_pr + σ²_pl + σ²_rl + σ²_prl,e, where p is the person, r the rater, and l the language.

Are there any sources of measurement error that may specifically influence ELL performance?

