Aligning Program Goals, Instructional Practices, and Outcomes Assessment Dr. Ray T. Clifford BILC Conference, Budapest 29 May 2006.

What connects the instructional components that are included in this year’s conference theme? Program Goals, Instructional Practices, Outcomes Assessment

What connects the instructional components that are included in this year’s conference theme? Standards

BILC Standards-Based Projects The BILC-developed interpretation of STANAG 6001 was approved as an official part of that STANAG. A BILC Working Group has prepared descriptors for optional plus levels. A survey was conducted on the desirability of producing a BILC-sponsored STANAG 6001 “benchmark” test with advisory ratings.

Participation in the Survey 16 countries responded to the survey: Austria, Bulgaria, Canada, Denmark, Estonia, Finland, Germany, Hungary, Italy, Latvia, Lithuania, Poland, Romania, Spain, Sweden, Turkey

Survey Results 1. Would your country use a Benchmark Test if one were available? Definitely yes: 8 Probably yes: 5 Perhaps: 2 Most likely not: 0 Definitely not: 1

Survey Results 2. Does your country use “plus levels” when assigning STANAG ratings? Definitely yes: 3 Probably yes: 0 Perhaps: 1 Most likely not: 1 Definitely not: 11

Survey Results 3. Would you like to have plus levels incorporated into a Benchmark Test? Definitely yes: 5 Probably yes: 5 Perhaps: 2 Most likely not: 2 Definitely not: 2
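The three tallies above can be checked with a short script. This is only a sketch: the counts are copied from the slides, while the question labels, the names `RESULTS` and `favorable_count`, and the choice to treat "Definitely yes" plus "Probably yes" as favorable are assumptions made for illustration.

```python
# Response order on every slide:
# Definitely yes, Probably yes, Perhaps, Most likely not, Definitely not.
RESULTS = {
    "Q1 would use a benchmark test": [8, 5, 2, 0, 1],
    "Q2 uses plus levels now": [3, 0, 1, 1, 11],
    "Q3 wants plus levels in the benchmark test": [5, 5, 2, 2, 2],
}


def favorable_count(counts):
    """Number of 'Definitely yes' and 'Probably yes' answers (an assumed grouping)."""
    return counts[0] + counts[1]


for question, counts in RESULTS.items():
    total = sum(counts)
    print(f"{question}: {favorable_count(counts)} of {total} favorable")
```

Under that grouping, 13 of the 16 responding countries are favorable to a benchmark test, which is consistent with the statement that most countries would welcome one.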

Summary A “benchmark” test would be welcomed by most countries. The scores should be advisory in nature. Providing “plus” level ratings would allow those ratings to be used or ignored. BILC should proceed with plans to: –Develop a benchmark STANAG test of reading comprehension. –Explore internet delivery options.

ACT will Assist with Funding ACT has approved funding to support the development of the BILC Advisory Test (Reading): –Part-time project coordinator. –Computer programming and server support. –Travel expenses for the next meeting of the Test Working Group. Work is underway. –Test specifications have been completed. –Texts and items are being reviewed.

A Comparison of Testing Standards STANAG 6001 The Common European Framework of Reference for Languages: Learning, teaching, assessment

Every Performance Standard has Three Essential Components
Task: A statement of what is to be done or accomplished.
Conditions: A description of the conditions under which (or context in which) the task is to be performed. For language this includes the topics to be addressed.
Accuracy: A definition of how well the task must be performed under the conditions stated.

STANAG 6001 - Speaking (Summarized) as a Standard
Level 5. Tasks: All expected of an educated NS. Context/Topics: All subjects. Accuracy: Accepted as an educated NS.
Level 4. Tasks: Tailor language, counsel, motivate, persuade, negotiate. Context/Topics: Wide range of professional needs. Accuracy: Extensive, precise, and appropriate.
Level 3. Tasks: Support opinions, hypothesize, explain, deal with unfamiliar topics. Context/Topics: Practical, abstract, special interests. Accuracy: Errors never interfere with communication & rarely disturb.
Level 2. Tasks: Narrate, describe, give directions. Context/Topics: Concrete, real-world, factual. Accuracy: Intelligible even if not used to dealing with non-NS.
Level 1. Tasks: Q & A, create with the language. Context/Topics: Everyday survival. Accuracy: Intelligible with effort or practice.
Level 0. Tasks: Use memorized phrases. Context/Topics: Random. Accuracy: Unintelligible.

STANAG 6001 Scale Validation Exercise Conducted at Sofia, Bulgaria 13 October 2005

Instructions On the top of a blank piece of paper, write the following information: 1. Your current work assignment: Teacher, Tester, Administrator, Other______ 2. Your first (or dominant) language: _________ 3. You do not need to write your name!

Instructions Next, write the numbers: 0 1 2 3 4 5 down the left side of the paper.

Instructions You will now be shown 6 descriptions of language speaking proficiency. Each description will be labeled with a color.

Instructions Rank the descriptions according to their level of difficulty by writing their color designation next to the appropriate number: 0 (easiest) = Color ? 1 (next easiest) = Color ? 2 (next easiest) = Color ? 3 (next easiest) = Color ? 4 (next easiest) = Color ? 5 (most difficult) = Color ?

Ready? The descriptions will now be presented… –One at a time, –In a random sequence, –For 15 seconds each. You will see each of the descriptors 4 times. Thank you for participating in this experiment.

STANAG 6001 Scale Validation: A Timed Exercise Without Training 74 people turned in their rankings. They marked their current work assignments as: –Administrator 49 –Teacher 26 –Tester 19 –Other 1

Results of the STANAG Scale Validation (n = 74)

The CEF can also be presented as a standard by dividing each of the descriptions into the three components of… Task(s) Conditions/Topics Accuracy expectations

CEF: OVERALL ORAL PRODUCTION (CEF, p. 58)
Level A1. Task: Produce simple phrases. Context/Topic: About people and places. Accuracy: Mainly isolated phrases.
Level A2. Task: Give a simple description or presentation. Context/Topic: Of people, living or working conditions, daily routines, likes/dislikes, etc. Accuracy: A short series of simple phrases and sentences linked into a list.

CEF: OVERALL ORAL PRODUCTION (CEF, p. 58)
Level B1. Task: Sustain a straightforward description. Context/Topic: One of a variety of subjects within his/her field of interest. Accuracy: Reasonably fluent, linear sequence of points.

CEF: OVERALL ORAL PRODUCTION (CEF, p. 58)
Level B2.1. Task: Give descriptions and presentations, expand and support ideas with subsidiary points and examples. Context/Topic: Wide range of subjects related to his/her field of interest. Accuracy: Clear, detailed, and relevant.

CEF: OVERALL ORAL PRODUCTION (CEF, p. 58)
Level B2.2. Task: Give descriptions and presentations, with highlighting of significant points, and supporting detail. Context/Topic: (not specified). Accuracy: Clear, systematically developed, appropriate, and relevant.

CEF: OVERALL ORAL PRODUCTION (CEF, p. 58)
Level C1. Task: Give descriptions and presentations on complex subjects, integrate sub-themes, develop particular points, round off with a conclusion. Context/Topic: (not specified). Accuracy: Clear, detailed, appropriate.

CEF: OVERALL ORAL PRODUCTION (CEF, p. 58)
Level C2. Task: Produce …speech with an effective logical structure. Context/Topic: (not specified). Accuracy: Clear, smoothly flowing, well-structured, which helps the recipient to notice and remember significant facts.

Why are topics not specified at the higher ability levels? The CEF manual gives the answers…

Ambiguity of Expectations Three types of “proficiency” are recognized in CEF “communicative testing” (Pages 180 and 184): –“Emerging competence” in relevant situations. –Competence on tasks in a “relevant syllabus”. –“The generalisable competencies” evidenced by a candidate’s overall performance. For STANAG 6001, only the last type of generalisable competence is considered “proficiency”.

Ambiguity of Expectations CEF acknowledges a third “blended” category, between achievement and real-world proficiency, but does not label it. (p. 184) STANAG tester training documents label this “in-between” category as “rehearsed performance” or “pro-chievement” ability to distinguish it from unrehearsed, general ability.

Some Other Examples “Table 2. Common Reference Levels: Self Assessment” (CEF p. 24) –Contains almost no accuracy statements. “Table 3. Common Reference Levels: qualitative aspects of spoken language use” (CEF pp. 28 and 29) –Contains accuracy statements not only under the column labeled “ACCURACY”, but also interwoven in the descriptions found under the columns labeled “RANGE”, “FLUENCY”, “INTERACTION”, and “COHERENCE”.

Why not combine two CEF scales to match the “standard” format? This evidently creates too many rating options for the CEF developers. However, every testing system should decide how to deal with the complexity of the interactions between two factors: –The difficulty of the Communication Tasks tested. –The varying levels of competency demonstrated by the test candidates.

Example # 1 Consider for instance, the combination of the CEF “Overall Oral Production” scale and the “General Linguistic Range” scale. (pp. 58 and 110) –The “Overall Oral Production” scale has 7 defined levels. –The “General Linguistic Range” has 9 defined levels. –The combination could yield 63 different rating combinations.

Options for Reducing Complexity Select a progressive subset of the possible combinations as major progress milestones. Conclude as the CEF does that… –It is not “practical” to “use all the scales at all levels”. (p. 192) –The test rating criteria should be linked to the learner’s textbook and defined by criteria that are appropriate to the “requirements of the assessment task concerned”. (p. 193)

The CEF Approach to Handling Language Complexity Therefore, the CEF suggests… –“Features need to be combined, renamed and reduced into a smaller set of assessment criteria appropriate to the needs of learners”. (p. 193) [Emphasis added] –Test rating criteria should be restricted to those criteria that are appropriate to the “style of the pedagogic culture concerned”. (p. 193) [Emphasis added]

Example # 2 Compare this approach with how STANAG 6001 deals with rating complexity. –6 task levels. –6 content levels. –6 accuracy levels. –The combination could yield 216 different rating combinations.
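The counts in Example # 1 and Example # 2 are simple products of the number of levels on each scale. A minimal Python sketch of that arithmetic follows. The placeholder labels for the nine “General Linguistic Range” levels and all variable names are assumptions made for illustration; they are not taken from the CEF or from STANAG 6001.

```python
from itertools import product

# CEF "Overall Oral Production" levels as listed on the slides (7 levels).
oral_production = ["A1", "A2", "B1", "B2.1", "B2.2", "C1", "C2"]

# The slides state that "General Linguistic Range" has 9 levels but do not
# name them, so these labels are hypothetical placeholders.
linguistic_range = [f"range_level_{i}" for i in range(1, 10)]

# Every pairing of an oral production level with a linguistic range level.
cef_combinations = list(product(oral_production, linguistic_range))

# STANAG 6001: six levels (0 to 5) on each of task, content, and accuracy.
stanag_levels = range(6)
stanag_combinations = list(product(stanag_levels, repeat=3))

print(len(cef_combinations))     # 7 * 9 = 63
print(len(stanag_combinations))  # 6 * 6 * 6 = 216
```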

STANAG 6001 Approach to Handling Language Complexity Therefore, STANAG 6001… –Combined, renamed and reduced features into a smaller set of assessment criteria appropriate to the needs of employers. –Reduced rating complexity by aligning each task level with an appropriate level of expanding content areas, and an increasing level of accuracy that corresponds to the type of tasks being tested. –Stipulated that (as with other performance standards) all of the task, condition, and accuracy statements for a given level must be satisfied before that level of proficiency can be awarded.
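The last stipulation describes a conjunctive rating rule: a level is awarded only when its task, condition, and accuracy statements are all met. The sketch below illustrates that rule under stated assumptions. The `Evidence` class, the `awarded_level` function, the treatment of levels as cumulative (rating stops at the first level that is not fully met), and the omission of plus levels are all illustrative choices, not part of STANAG 6001 or of any official rating procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Whether a speaker met each component of one level (hypothetical record)."""
    task: bool
    conditions: bool
    accuracy: bool

    def satisfied(self) -> bool:
        # All three components must be met for the level to count.
        return self.task and self.conditions and self.accuracy


def awarded_level(evidence_by_level: dict[int, Evidence]) -> int:
    """Return the highest level 1-5 that is fully met, with all lower levels also met.

    Assumption: levels are cumulative, so the check stops at the first level
    that is missing or not fully satisfied. Level 0 is the default.
    """
    level = 0
    for candidate in range(1, 6):
        evidence = evidence_by_level.get(candidate)
        if evidence is None or not evidence.satisfied():
            break
        level = candidate
    return level


# A speaker who handles level 3 tasks and topics but not level 3 accuracy.
speaker = {
    1: Evidence(task=True, conditions=True, accuracy=True),
    2: Evidence(task=True, conditions=True, accuracy=True),
    3: Evidence(task=True, conditions=True, accuracy=False),
}
print(awarded_level(speaker))  # 2
```

In this sketch, strength in one component does not compensate for a shortfall in another, which is what collapses the 216 possible combinations into a single six-level scale.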

STANAG 6001 - Speaking (Summarized) as a Standard
Level 5. Tasks: All expected of an educated NS. Context/Topics: All subjects. Accuracy: Accepted as an educated NS.
Level 4. Tasks: Tailor language, counsel, motivate, persuade, negotiate. Context/Topics: Wide range of professional needs. Accuracy: Extensive, precise, and appropriate.
Level 3. Tasks: Support opinions, hypothesize, explain, deal with unfamiliar topics. Context/Topics: Practical, abstract, special interests. Accuracy: Errors never interfere with communication & rarely disturb.
Level 2. Tasks: Narrate, describe, give directions. Context/Topics: Concrete, real-world, factual. Accuracy: Intelligible even if not used to dealing with non-NS.
Level 1. Tasks: Q & A, create with the language. Context/Topics: Everyday survival. Accuracy: Intelligible with effort or practice.
Level 0. Tasks: Use memorized phrases. Context/Topics: Random. Accuracy: Unintelligible.

Technically, STANAG 6001 also adheres to the recommendations of the CEF, because it… –Is a “metasystem”. (CEF, pp. 192 - 196) –Has combined features to create a reduced set of assessment criteria… That match the tasks being assessed. (CEF, p. 193) With between 4 and 7 rating levels. (CEF, p. 193) –Meets the needs of “employers” by testing for generalisable, real-world proficiency. (CEF, p. 183)

STANAG 6001 Diverges from the recommendations of the CEF, because it… –Assigns ratings based on employment needs without considering “the needs of the learners concerned” or “the style of the pedagogic culture concerned.” (CEF, p. 193) –Uses criterion-referenced grading of a “task, topics, and accuracy” hierarchy – rather than a norm-referenced scalar analysis. (CEF, p. 185) –Rates ability based on one’s unrehearsed, real-world proficiency in the language being tested.

A Summary of the Major Contrasts
STANAG 6001: The primary purpose is to test individuals’ general proficiency across a wide range of topics regardless of their course of study. The primary users of the information are employers and administrators. By design, STANAG 6001 is under-specified for measuring step-by-step progress within a specific curriculum.
CE Framework of Reference: The primary purpose is to check learners’ progress in developing communicative competence within a specific course of study. The primary users of the information are the teachers and students. By design, the CE Framework of Reference is under-specified for testing of general, real-world proficiency.

These contrasts are not a problem! No single test or testing framework can meet both the formative needs of learners and the summative needs of employers, so… –Use the CE Framework of Reference for designing curriculum-appropriate achievement and performance tests. –Use the STANAG 6001 assessment as a culminating, independent measure of graduates’ general, real-world ability.

[Figure: STANAG 6001 Proficiency Scale, levels 1 to 5. CEF focus: “Emerging competency” and “Competence in a relevant syllabus”. STANAG 6001 focus: “General, unrehearsed, real-world proficiency”.]

What happens when you compare rehearsed performance ratings with unrehearsed proficiency ratings? Those who can pass an unrehearsed, general proficiency test can also pass a curriculum-based performance test. Those who can pass a rehearsed performance test may or may not be able to pass a general, unrehearsed proficiency test.

Conclusion “The solutions to our problems should be as simple as possible, but no simpler.” Albert Einstein Language tests should match the purpose for which the results will be used. –Use achievement tests for testing mastery of lessons in a textbook. –Use performance tests for checking rehearsed abilities. –Use proficiency tests for determining general, unrehearsed ability in real-world situations.
