ACCESS for ELLs® Scores, Reliability and Validity Developed by the Center for Applied Linguistics Prepared by Dorry Kenyon, CAL ISBE Meeting, Chicago,


1 ACCESS for ELLs® Scores, Reliability and Validity. Developed by the Center for Applied Linguistics. Prepared by Dorry Kenyon, CAL. ISBE Meeting, Chicago, IL, February 21, 2007

2 ISBE Presentation 2/21/2007 © 2007 WIDA/CAL 2 Outline of my presentation 1. What do scores on ACCESS for ELLs® mean? 2. What do we know about the reliability of ACCESS for ELLs® scores? 3. What do we know about the validity of ACCESS for ELLs® scores? 4. So what does this mean for using scores on ACCESS for ELLs®?

3 1. What do scores on ACCESS for ELLs® mean?

4 Two types of scores: WIDA ACCESS for ELLs® Scale Scores = psychometrically derived measure; WIDA ACCESS for ELLs® Proficiency Level Scores = socially derived interpretation of the scale score in terms of the WIDA Standards Proficiency Level Definitions

5 What is measured? Scale Scores (and interpretive Proficiency Level Scores) are given for measures in the four domains: Listening, Speaking, Reading, Writing. Scale Scores are combined into four composite scores (which are also interpreted in Proficiency Level Scores): Oral (listening and speaking); Literacy (reading and writing); Comprehension (listening and reading); Overall Composite (listening, speaking, reading, and writing)

6 Weighting of the overall composite: Scale Scores of the four domains are weighted differently in the overall composite score: Listening (15%), Speaking (15%), Reading (35%), Writing (35%)
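The weighting on this slide can be illustrated with a short sketch. The four weights come from the slide; the function name `overall_composite`, the sample domain scores, and rounding to a whole number are assumptions made only for illustration and are not the operational scoring procedure.

```python
# Domain weights for the overall composite, as stated on the slide.
WEIGHTS = {"listening": 0.15, "speaking": 0.15, "reading": 0.35, "writing": 0.35}


def overall_composite(scale_scores):
    """Return the weighted sum of the four domain scale scores.

    Rounding to the nearest whole number is an assumption for illustration.
    """
    total = sum(WEIGHTS[domain] * scale_scores[domain] for domain in WEIGHTS)
    return round(total)


# Hypothetical domain scale scores for one student.
example = {"listening": 340, "speaking": 360, "reading": 300, "writing": 320}
print(overall_composite(example))
```

Because reading and writing together carry 70% of the weight, a change in a literacy score moves the composite more than the same change in an oral score.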

7 ACCESS administration times and composite score weights: Listening (15%): minutes, machine scored; Reading (35%): minutes, machine scored; Writing (35%): up to 1 hour, rater scored; Speaking (15%): up to 15 minutes, administrator scored

8 Scale Scores vs. Proficiency Level Scores: The WIDA ACCESS for ELLs® Scale Scores are the psychometrically derived measures of student proficiency. They range from 100 to 600. One scale applies to all grades through vertical equating of tests. The vertical scale score takes into account that assessment tasks taken by students in the grade 9-12 cluster are more challenging than the assessment tasks taken by students in the grade 1-2 cluster. Average scale scores consistently show an increase from grade to grade.

9 Overall Composite Scale Scores

10 Overall Composite Scale Scores

11 Scale Scores vs. Proficiency Level Scores: Proficiency Level Scores are socially derived interpretations of the WIDA ACCESS for ELLs® Scale Scores in terms of the six proficiency levels defined in the WIDA Standards. Each is composed of two numbers, e.g. 2.5. The first number indicates the proficiency level into which the student's scale score places him or her (e.g. 2 = Beginning). The second number indicates how far, in tenths, the student's scale score places him or her between the lower and the higher cut score of the proficiency level (e.g. 2.5 = 5/10 or ½ of the way between the cut score for level 2 and for level 3). The same scale score is interpreted differently based on what grade level cluster different students are in. The same proficiency level score corresponds to different scale scores based on the grade level cluster.
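The two-number interpretation described on this slide can be sketched as a lookup plus interpolation. Everything specific here is an assumption for illustration: the function name, both sets of cut scores (which are hypothetical, not WIDA's actual cuts), the use of the scale minimum of 100 as the lower bound of level 1, truncating rather than rounding the tenths, and reporting 6.0 for any score at or above the 5/6 cut.

```python
import bisect


def proficiency_level_score(scale_score, cuts, scale_min=100):
    """Interpret a scale score as a proficiency level score such as 2.5.

    cuts: the five ascending cut scores (1/2, 2/3, 3/4, 4/5, 5/6) for one
    grade level cluster. All cut values used below are HYPOTHETICAL.
    """
    cuts = list(cuts)
    bounds = [scale_min] + cuts  # lower bound of levels 1 through 6
    level = bisect.bisect_right(cuts, scale_score) + 1
    if level == 6:
        return 6.0  # assumption: no interpolation above the 5/6 cut
    lower, upper = bounds[level - 1], bounds[level]
    # Distance travelled between the two cuts, in tenths (truncated).
    tenths = int(10 * (scale_score - lower) / (upper - lower))
    tenths = max(0, min(tenths, 9))
    return round(level + tenths / 10, 1)


# Hypothetical cut scores for a lower and a higher grade level cluster.
lower_cluster_cuts = [250, 300, 340, 370, 400]
upper_cluster_cuts = [300, 350, 390, 420, 450]

# The same scale score is interpreted differently in the two clusters.
print(proficiency_level_score(350, lower_cluster_cuts))
print(proficiency_level_score(350, upper_cluster_cuts))
```

With these made-up cuts, a scale score of 350 lands in a higher proficiency level for the lower cluster than for the upper one, which is the pattern the slide describes.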

12 Example: Scale score of 350 (table of Overall cut scores 1/2, 2/3, 3/4, 4/5, 5/6 by grade cluster, beginning with grades 1-2)

13 Example: Overall composite proficiency level score (figure: scale running from easy items and less proficient students to hard items and more proficient students)

14 How are proficiency level scores derived? While Proficiency Level Scores are socially derived interpretations, they are not arbitrary: set by panels of content experts; set following best technical practices; set by consensus-building procedures (standard setting studies); set by carefully documented, replicable procedures. For WIDA ACCESS for ELLs®, these were set by panels of experts in April of 2004, for each grade level cluster (see WIDA Technical Report #1 for complete details).

15 Originally WIDA had grade level cluster cuts

16 Grade level cuts are being introduced this year

17 Cluster vs. grade level cuts

18 Overall Composite Scale Scores

19 Effect of grade level cut scores: Proficiency Level Score

20 What do we know about the reliability of ACCESS for ELLs® scores?

21 What is reliability? Psychometrically speaking, reliability refers to the consistency of test scores. What evidence is there that this test score result is not just a chance occurrence, but would have been obtained had the student been tested or scored on multiple occasions?

22 Multiple forms of ACCESS for ELLs®: In the Annual Technical Report, the reliability of each of the 44 separate test forms for ACCESS for ELLs® is reported (table of form counts by grade cluster for Listening, Reading, Writing, Speaking, and Total).

23 Types of reliability reported: For all test forms, internal consistency (coefficient alpha) is reported. For writing, agreement between operational raters is also reported (20%). For speaking, agreement between administrators from field test data is also given currently, but a larger study is underway. Reliabilities for domain scores based on the individual forms for Series 100 ( ) are within expected and acceptable ranges.
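Coefficient alpha, the internal consistency index named on this slide, can be computed from an examinee-by-item score matrix. The sketch below uses the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores); the function name and the tiny data set are hypothetical and are not taken from the Annual Technical Report.

```python
from statistics import variance


def coefficient_alpha(item_scores):
    """Cronbach's coefficient alpha for a list of examinee rows.

    item_scores: one row per examinee, one column per item.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(item_scores[0])  # number of items
    item_variances = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical right/wrong (1/0) responses of four examinees to three items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(coefficient_alpha(responses))
```

Values closer to 1 indicate that the items rank examinees more consistently; what counts as an acceptable value depends on the use of the scores.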

24 Reliability of the overall composite: Results indicate that the reliability of the overall composite score across tiers is similar and very high across all grade level clusters (Series 100) (table by grade cluster, beginning with K).

25 The most important reliability index: For tests like ACCESS for ELLs®, for which decisions are based on a student's classification into proficiency levels, the accuracy of classification is perhaps the most important reliability index. This index gives an estimate of how reliably a student was placed at or above a certain category (versus below that category).

26 Accuracy of classification indices (Series 100) (table of indices by cut score and grade cluster)

27 What do we know about the validity of ACCESS for ELLs® scores?

28 What is validity? Validity refers to an evaluative judgment of the degree to which theoretical rationales and empirical evidence support the adequacy and appropriateness of inferences and actions made on the basis of test scores.

29 Validity issues for ACCESS for ELLs®: Issues related to ACCESS for ELLs® include: Do the described proficiency levels exist? How does the test relate to other measures of English language proficiency? How confident are we that the cut scores that place students into the various levels really define the levels? Do we know that ACCESS for ELLs® tests the language needed for academic success and is not a content test? And so on…

30 Study 1: Do the levels of the Standards really exist? Reading and Listening selected response items. SI = Social and Instructional Language; LA = language of Language Arts; MA = language of Math; SC = language of Science; SS = language of Social Studies

31 The Standards guide test development: 1. ACCESS for ELLs® makes the WIDA Standards operational. 2. WIDA Standards provide a. Content (What?) b. Performance Levels (How well?)

32 Large-scale Standards: SC reading

33 Large-scale Standards: SC reading: Classify living organisms (such as birds and mammals) by using pictures or icons

34 Large-scale Standards: SC reading: Interpret data presented in text and tables in scientific studies

35 At the given level of English language proficiency, English language learners will process, understand, produce, or use: 1: pictorial or graphic representation of the language of the content areas; 2: general language of the content areas; 5: technical language of the content areas

36 Validation issues: Validity is about the adequacy and appropriateness of inferences about students made on the basis of test scores. The WIDA Standards make claims about what students at five different proficiency levels can do. Can those claims be substantiated empirically?

37 Research study questions: 1. Are the ACCESS for ELLs items empirically ordered by difficulty as predicted by the WIDA Standards? 2. Does that ordering differ by domain (listening or reading)? 3. Does that ordering differ by standard (SI, LA, MA, SC, SS)?

38 Data: Results from ACCESS for ELLs field test, Fall 2004; over 6,500 students in grades 1 to 12; 8 WIDA states; about 3.5% proportional representation

39 Method: Items were vertically scaled across grade levels using common item equating. Item difficulty was determined using the Rasch measurement model. Items that did not meet the requirements of the model were eliminated from the analysis. Average item difficulties were calculated by proficiency level.
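The last two steps of the method can be sketched briefly. Under the Rasch model, the probability of a correct response depends only on the difference between student ability and item difficulty (both in logits). The sketch below shows that function and the averaging of item difficulties by targeted proficiency level; the function names and all item difficulties are hypothetical, and the calibration and equating steps themselves are not reproduced here.

```python
import math
from collections import defaultdict


def rasch_probability(ability, difficulty):
    """Rasch model: P(correct) = exp(a - b) / (1 + exp(a - b)), in logits."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def mean_difficulty_by_level(items):
    """Average item difficulty for each targeted proficiency level.

    items: iterable of (targeted_level, difficulty_in_logits) pairs.
    """
    groups = defaultdict(list)
    for level, difficulty in items:
        groups[level].append(difficulty)
    return {level: sum(vals) / len(vals) for level, vals in sorted(groups.items())}


def is_strictly_increasing(values):
    """True if each value is larger than the one before it."""
    values = list(values)
    return all(a < b for a, b in zip(values, values[1:]))


# Hypothetical calibrated items: (targeted proficiency level, difficulty).
items = [(1, -2.0), (1, -1.0), (2, 0.0), (2, 1.0), (3, 1.5), (3, 2.5)]
means = mean_difficulty_by_level(items)
print(means)
print(is_strictly_increasing(means.values()))
```

If items written for higher proficiency levels are empirically harder, the level means rise from one level to the next, which is the ordering the first research question asks about.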

40 Number of items used = 651

41 Results

45 Conclusions

46 Are the ACCESS for ELLs items empirically ordered by difficulty as predicted by the WIDA Standards? Yes. WIDA Standards (MPIs) provided sufficient content and rationale to develop specifications that operationalized the five proficiency levels through listening and reading selected response items.

47 Does that ordering differ by domain (listening or reading)? No. The general ordering was similar across listening and reading. Some difference between listening level 5 and reading level 5 was observed.

48 Does that ordering differ by standard (SI, LA, MA, SC, SS)? Yes. SI (social and instructional language) items showed a clear tendency to be easier than items assessing language in the content areas, particularly at higher proficiency levels. Items assessing language in the content areas were similar except at level 5 where language arts appeared easier than expected.

49 Discussion 1. While many additional validation issues remain, this preliminary empirical analysis based on the field test data indicates that the WIDA Standards provide a strong basis for distinguishing among proficiency levels of ELLs.

50 Discussion 2. The operational plan for ongoing WIDA assessment item renewal and development provides opportunity to tighten item specifications based on empirical research while operationalizing the WIDA Standards.

51 Process of test development: 1. Theory and Research; 2. Standards; 3. Specifications; 4. Assessment

52 Study 2: Validation evidence from the bridge study. What can we learn about ACCESS for ELLs from the WIDA Consortium's bridge study? Study 1: What is the relationship between performances on the older English language proficiency tests and on ACCESS for ELLs? Study 2: What is the relationship between the cut score denoting the highest level of proficiency on the older tests and the predicted corresponding score on ACCESS for ELLs in terms of ACCESS proficiency levels?

53 Purpose of the bridge study: To help WIDA Consortium member states understand the performances of their ELLs in acquiring English on the older tests (for which they had data) in terms of the new test, especially to: meet compliance with Title III requirements; provide continuity of data flow for cohorts of English language learners identified in , the baseline year; provide information that may help determine Annual Measurable Achievement Objectives (AMAOs) for the established cohorts in the transitional year

54 The older tests: IDEA Proficiency Test (IPT); Language Assessment Scales (LAS); Language Proficiency Test Series (LPTS); Maculaitis II (MAC II). NOTE: The first three tests do NOT have separate scores for listening and speaking!

55 WIDA levels of English Language Proficiency: ENTERING, BEGINNING, DEVELOPING, EXPANDING, BRIDGING; 4.5

56 Participants: 4,985 students from IL and RI

57 Procedures: 2005 operational ACCESS administration (AL, ME, VT). Participating students in IL and RI were administered the older test and operational ACCESS within a 6-8 week window. Scoring of the older test took place within local districts following their standard procedures, and the scores were submitted to the ACCESS scoring vendor. Scoring of ACCESS was with Spring 2005 operational scoring. Data matched by ACCESS scoring vendor. Older test data cleaned at CAL. Analyses at CAL.

58 Analyses: Study 1: Pearson correlations between performances on each form of older test (raw or scale score) and ACCESS for ELLs scale scores. Because each form for the older tests was unique, 64 correlational analyses were performed: IPT (14), LAS (14), LPTS (16), MAC II (20). Summarized by averaging.
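The analysis described on this slide, one Pearson correlation per form followed by averaging, can be sketched as follows. The function names, form labels, and score pairs are all hypothetical; the simple arithmetic mean is an assumption about how the correlations were averaged.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_correlation(pairs_by_form):
    """Mean of the per-form correlations (simple averaging is an assumption)."""
    rs = [pearson_r(x, y) for x, y in pairs_by_form.values()]
    return sum(rs) / len(rs)


# Hypothetical matched scores: older-test raw score vs. ACCESS scale score.
pairs_by_form = {
    "form_A": ([1, 2, 3, 4], [1, 3, 2, 4]),
    "form_B": ([10, 20, 30], [300, 350, 400]),
}
for form, (older, access) in pairs_by_form.items():
    print(form, round(pearson_r(older, access), 3))
print(round(average_correlation(pairs_by_form), 3))
```

Correlating within each form first, and only then averaging, avoids mixing raw scores from forms that are on different scales.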

59 Results: Study 1 example (IPT Reading): IPT Reading Score with ACCESS Reading Scale Score
IPT Form (Read)   Measure              Pearson Correlation with ACCESS Read Scale Score   N
IPT_EL            IPT Read Raw Score   .741**                                             205
IPT_R_1AB         IPT Read Raw Score   .540**                                             250
IPT_R_2AB         IPT Read Raw Score   .618**                                             296
IPT_R_3AB         IPT Read Raw Score   .713**                                             317

60 Results: Study 1 summary range: Average Correlations (All Levels of Each Test within Domain) (table: rows IPT, LAS, LPTS, MAC II; columns List, Speak, Read, Write)

61 Results: Study 1 summary by test across domains: Average Correlations (All Levels of Each Test within Domain) (table: rows IPT, LAS, LPTS, MAC II; columns List, Speak, Read, Write)

62 Results: Study 1 summary by domain across tests: Average Correlations (All Levels of Each Test within Domain) (table: rows IPT, LAS, LPTS, MAC II; columns List, Speak, Read, Write)

63 Discussion: Study 1: Generally moderate to high correlations between ACCESS for ELLs® and older tests; ACCESS appears to be assessing a similar construct (criterion-related validity) but is not interchangeable with the older tests. Correlations across all tests with reading were highest; most familiar to students and test developers? Correlations across all tests with listening were lowest; but three tests did not have separate scores for listening and speaking! Correlations across domains between LPTS and ACCESS for ELLs® were highest; LPTS is the newest of the older generation.

64 Analyses: Study 2: From predicted scores tables, found for each grade level the ACCESS for ELLs® proficiency level score corresponding to the cut score of the highest proficiency level on the older test. Summarized findings by calculating averages and standard deviations.
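The presentation later lists the use of linear regression among its methodological caveats, so a predicted scores table of this kind can be sketched as a simple least-squares line from older-test score to ACCESS scale score. The function names, the data, and the example cut score below are hypothetical; the actual regression coefficients are not given in the transcript.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, older_score):
    """Predicted ACCESS scale score for a given older-test score."""
    return intercept + slope * older_score


# Hypothetical matched data: older-test raw scores and ACCESS scale scores.
older_raw = [10, 20, 30]
access_scale = [300, 350, 400]
intercept, slope = fit_line(older_raw, access_scale)

# Hypothetical cut score for the highest proficiency level on the older test.
older_top_cut = 25
print(predict(intercept, slope, older_top_cut))
```

The predicted scale score at the older test's top cut would then be read as an ACCESS proficiency level score for the relevant grade, and those values averaged across grades.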

65 Predicted scores table example: Predicted ACCESS = * LAS. LAS RW 2AB Writing Raw Score to WIDA ACCESS Writing Scale Score (table columns: LAS RW 2AB Raw Score; LAS Proficiency Level (by grade); Predicted ACCESS Score; ACCESS Proficiency Level (by grade))

66 Finding the WIDA proficiency level score example: Predicted ACCESS = * LAS. LAS RW 2AB Writing Raw Score to WIDA ACCESS Writing Scale Score (table columns: LAS RW 2AB Raw Score; LAS Proficiency Level (by grade); Predicted ACCESS Score; ACCESS Proficiency Level (by grade))

67 Truncated example results: Listening (grades K, 1, 2, 3, … 11, 12): IPT … 4.8; LAS … 4.4; LPTS … 3.0; MAC II … 2.9

68 Results: Study 2 summary range: Average Proficiency Level Score (Standard Deviation)
Test     List         Speak        Read         Write
IPT      4.9 (0.80)   4.0 (0.36)   3.9 (0.97)   2.9 (0.64)
LAS      4.8 (0.67)   5.1 (0.81)   3.1 (1.11)   3.1 (0.67)
LPTS     3.5 (0.53)   2.9 (0.79)   5.3 (0.71)   3.9 (0.74)
MAC II   3.7 (0.78)   3.5 (0.74)   3.5 (0.76)   3.0 (0.40)

69 Interpretation: Highest test and domain: LPTS Reading (shown on the scale ENTERING, BEGINNING, DEVELOPING, EXPANDING, BRIDGING)

70 Interpretation: Lowest test and domain: IPT Writing (shown with LPTS Reading on the scale ENTERING, BEGINNING, DEVELOPING, EXPANDING, BRIDGING)

71 Results: Study 2 High and low by test across domains: Average Proficiency Level Score (Standard Deviation) (same table as slide 68)

72 Results: Study 2 High and low by domain across tests: Average Proficiency Level Score (Standard Deviation) (same table as slide 68)

73 Discussion: Study 2 (1 of 3): Results varied widely, from a close relationship to the WIDA proficiency span (LPTS Reading) to much lower, though in general, cut scores on older tests tended to be much lower than the WIDA 6.0. Were ELLs exited too early under the older tests? Do ACCESS for ELLs standards and performance level definitions better align with levels of English proficiency needed for academic success? With a single test across districts within a state, states will have clearer data to better understand the development of English proficiency in ELLs and its relationship to academic achievement.

74 Discussion: Study 2 (2 of 3): Results varied widely across tests and domains; LPTS, with the highest cut scores in reading and writing, had the lowest cut scores in listening and speaking; but three tests did not have separate scores for listening and speaking, including LPTS! LPTS had only fluent/non-fluent listening and speaking categories?

75 Discussion: Study 2 (3 of 3): Across tests, writing had the lowest cut scores for three of four tests; is writing on ACCESS for ELLs unduly hard, or is it more indicative of what is needed for academic success?

76 Important considerations in interpretations: CONTENT differences between all five tests include: degree of alignment with English language proficiency and academic content standards; number and types of items in each subsection or language domain; depth of knowledge of the items; inclusion of the language of math, science, and social studies; ceiling levels of the measures; rubrics used for interpreting speaking and writing. METHODOLOGICAL caveats include: use of linear regression across all analyses; sometimes small numbers of students in subgroups; distribution of observed scores (Spring testing).

77 Preliminary conclusions: Correlational data show strong support for ACCESS for ELLs as a measure of English proficiency (criterion-related validity). Comparison of cut scores indicates that the WIDA Standards, as operationalized by ACCESS for ELLs, describe a longer proficiency continuum than the older tests. Additional studies are needed to explore the relationship between that extended continuum and academic achievement.

78 Validity evidence from the grade level cut score review study: 75 teachers from 14 WIDA states. Examined test items and (for writing and speaking) examinee performances in light of the WIDA Standards model Performance Indicators and the Standards performance level descriptors. Through a structured process came up with proposed grade level cut scores (based on empirical proposed scores based on current cluster level cut scores). As in the original standard setting study, evaluated the confidence they had in the cut scores representing the different performance levels. Results: Confidence increased greatly over first study.

79 Evaluations from grade level cut score review: Averages across all participants. How confident are you in the cut scores? (4 = hi, 1 = lo). Red = below 3.10 / Black = 3.11 to 3.40 / Green = above 3.40 (table: Orig and Rev ratings for Read, Write, List, Speak, by cut score)

83 Other validity studies underway at CAL: Some ongoing internal research at CAL: (1) What do we learn from the results of the technical analyses of Series 100 to improve item and form specifications? (2) How do we improve the construction of items appropriate (both from content and empirical results) to their targeted proficiency levels? (3) What evidence do we have that ACCESS for ELLs tests the language of the content areas and not knowledge of the content areas?

84 #1 Example from Series 100 analyses

85 #1 Example from Series 100 analyses

86 #2 Example 3-5 Read Prof Level 2

87 #2 Example 3-5 Read Prof Level 5

88 Interaction of Performance Level Descriptions and model Performance Indicators: Language Proficiency (Performance Level Descriptions): 1 Entering, 2 Beginning, 3 Developing, 4 Expanding, 5 Bridging. PIs: L1, L2, L3, L4, L5. Linguistic Complexity, Vocabulary Usage, Language Control

89 #3 Confirmatory Factor Analyses (SEM) (path diagram: reading scores R SI, R LA, R MA, R SC, R SS; listening scores L SI, L LA, L MA, L SC, L SS; List Score and Read Score; factors L-prof, R-prof, Eng prof, and SI, LA, MA, SC, SS)

90 Other research (and possibilities): 1. Native speaker studies (Alabama data) 2. Relationship between performance on ACCESS for ELLs and state content tests (?)

91 Logistic regression with state data? (figure: Yes/No outcomes plotted against ACCESS Scale Score from low to high, with probability on the other axis and an 80% marker)

92 So what does this mean for using scores on ACCESS for ELLs®? Be sure to understand the meaning of scale scores and proficiency level scores. Have confidence using scores knowing that: the reliability (consistency) of the scale scores is high, in particular for the overall composite score; the accuracy of classification based on the overall composite is also high; initial validity studies strongly support the use of ACCESS for ELLs® test scores as a valid indicator of levels of proficiency in accordance with the WIDA Standards; the WIDA Consortium supports a rigorous program of ongoing test improvement, supported by research; the WIDA Consortium continues to collect evidence in support of the validity of the use of test scores.

93 For more information, please contact the WIDA Hotline: or World Class Instructional Design and Assessment, Center for Applied Linguistics, Metritech, Inc.,

