1 Measurement, Evaluation, and Assessment

2 • Evaluation: the systematic collection and analysis of the data needed to make decisions. It involves assessing the strengths and weaknesses of programs, policies, personnel, products, and organizations in order to improve their effectiveness. {A remedy is suggested.}
• Assessment: the process of gathering information to make decisions. It is the process of observing learning: describing, collecting, recording, scoring, and interpreting information about a student's or one's own learning. {No remedy is suggested.}
• Measurement: a process that uses a variety of tools to collect numerical data about the learning process. Its tools include tests, observation sheets, interviews, and questionnaires.
• Tests: one tool of measurement. A test can be oral, written, or task-based.

3 Evaluation Assessment Measuring testing

4 1. Test (الاختبار): a set or series of questions or tasks that learners are asked to answer in writing or orally, and sometimes by other means such as role play and games. A test is expected to include a representative sample of all the possible questions and tasks related to the knowledge and skills it measures.
2. Measurement (القياس): the process by which learners' performance on various kinds of knowledge, skills, and traits is estimated using a suitable instrument or scale. Measurement is expressed as a numerical value. It is therefore broader than testing: it can also be carried out with tools other than tests, such as observation, rating scales, or any other means that yields information in quantitative form. Measurement refers to the process of quantitative estimation, or the score, and does not include a value judgment on the result.
3. Assessment (التقييم): a process in which value is estimated, the strengths and weaknesses of the learners' level or of the teaching methods are identified, and judgments are made about them using a variety of methods and tools.
4. Evaluation (التقويم): a systematic process in which the results of measurement, or any information obtained by other suitable means, are used to make judgments about learners' performance in aspects of the curriculum or aspects of learners' behavior. Its purpose is to determine how far performance agrees with the objectives, or how far the actual learning outcomes agree with the expected ones, and to identify the learner's strengths and weaknesses.

5 Examples of tests. Simple Responses: Color the paint set.

6 Simple Responses

7 Multiple Choice: Look at the first picture in each row and say its name. Circle the picture whose name rhymes with it.

8 Multiple Choice: Look at the picture in each row. Circle the picture that goes in a different direction from the others. Then color the pictures.

9 Multiple Choice: Circle the number that tells how many. Then color the picture.

10 Matching Items: Look at each picture and say its name. Draw a line to match each rhyming picture.

11 Matching Items: Match the picture of each mother to her baby.

12 Matching Items: Look at each picture. Draw a line to match the pictures that look the same.

13 Choose the right answer.
1. This is a circle. True / False
2. The sky is green. True / False
3. This is a square. True / False
4. There are seven days in a week. True / False
5. We eat ten times a day. True / False

14 Fill-in-the-Blank Look at the pattern in each row. Draw the shapes that continue the patterns. Then color the shapes.

15 Fill-in-the-Blank Look at the pictures at the top of the page. Count those objects in the large picture. Then write how many. 11 21

16 Fill-in-the-Blank How many objects do you see in each big box? Write the numeral in the small box. Circle the sets that show 7. 75 77

17 Fill-in-the-Blank Read the following passage and fill in the blanks with the appropriate words:
Arabic, flag, white, green, Madinah, city, Islam, Muslim, Saudi Arabia, ruler, Saudi, king, kingdom, North, peninsula, East, center, South, West, desert, Arabian, wash
I am a _______; my religion is ______. My nationality is _____. I am from ___________. Saudi Arabia is a _______ located in the Arabian _________. ____ Abdullah is the _____ of the country. Most of my country is a ______. It is bordered by Yemen and Oman from the ______, Qatar, United Arab Emirates, and the _______ Gulf from the _____, Kuwait, Iraq, and Jordan from the ______, and the Red Sea from the _____. Its capital, Riyadh, is located in the ______. Its ____ is colored _____ and _____. I live in ________, the second holy ___ for Muslims.
Answers, in order: Muslim, Islam, Saudi, Saudi Arabia, country, Peninsula, King, ruler, desert, South, Persian, East, North, West, center, flag, green, white, Madinah, city

18 • Complete the following words:
1. B__g
2. Un_ver_ity
3. Sch__ol
4. __ark__t
5. Tea__her
6. Int__rna__ion__l
7. __ist__r
Answers: 1. i; 2. i, s; 3. o; 4. M, e; 5. c; 6. e, t, a; 7. S, e

19 1. Finding out about progress: how well testees have mastered the content, area, or skills they have been taught.
2. Encouraging: showing students the progress they have made and the goals they have reached, to increase their motivation.
3. Finding out about learning difficulties: using test results to diagnose areas of weakness so that they can be treated effectively, enriching both teaching and learning.

20 4. Finding out about achievement: how well students have learned over a long period.
5. Placing students: sorting students into groups according to ability at the beginning of a course.
6. Selecting students: selecting certain candidates for a job or for a place in a course.
• Selection can be both academic and non-academic, whereas placement is for academic purposes only.

21 • PROFICIENCY TESTS: designed to measure students' ability regardless of any prior training. They measure the suitability of candidates to perform a certain task or to follow a specific course.

22 • ACHIEVEMENT TESTS: directly related to language courses. They aim to find out how successful students have been in achieving the objectives of a language course or program.

23 • DIAGNOSTIC TESTS: identify the strengths and weaknesses of both the students and the teaching, so that teachers can adjust their way of teaching and create a suitable plan for remedial teaching.

24 • ENCOURAGING TESTS: teacher-made tests whose purpose is to show students their progress and to ease their anxiety about knowing their level, especially when learning a second or foreign language. Hence, they:
A. do not have to be marked in the same way as the other evaluation tests;
B. should be marked for the students' benefit and not be included in the final grade of the course;
C. should have letter marks (A, B, C, D, and no Fs) rather than numerical ones.

25 • DISCRETE POINT vs INTEGRATIVE TESTING:
1. DISCRETE POINT: tests one element at a time, item by item. Testing particular grammatical structures is a good example.
2. INTEGRATIVE TESTING: requires the testee to combine many language elements to complete a certain task, so it tends to be direct. Composition, note-taking, and dictation are good examples of this technique.

26 • OBJECTIVE vs SUBJECTIVE TESTING:
1. OBJECTIVE TESTING: does not require the scorer's judgment. It uses question forms such as multiple choice, true-false, matching, ordering, and re-arrangement.
2. SUBJECTIVE TESTING: usually judged by one or more examiners. Examples include compositions, reports, letters, written comprehension questions, discussions, conversations, talks, and speeches.
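Because objective items need no scorer's judgment, marking them reduces to comparing responses against a key. The following is a minimal sketch of that idea, not a method given in the slides: the function name `score_objective`, the one-mark-per-item rule, and the sample key are all assumptions made for illustration.

```python
from typing import Dict


def score_objective(answer_key: Dict[int, str], responses: Dict[int, str]) -> int:
    """Return the number of items answered exactly as the key says.

    Assumption: every item is worth one mark and unanswered items score zero.
    Comparison ignores case and surrounding whitespace.
    """
    return sum(
        1
        for item, correct in answer_key.items()
        if responses.get(item, "").strip().lower() == correct.strip().lower()
    )


# Hypothetical key for three true/false items and one student's responses.
answer_key = {1: "False", 2: "True", 3: "False"}
student = {1: "false", 2: "True", 3: "True"}

print(score_objective(answer_key, student))
```

Any two scorers applying the same key reach the same total, which is the sense in which the scoring is objective.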

27 • Comparing objective and subjective techniques, we realize that:
A. Objective scoring is preferred by examiners because it is not time-consuming and because of its reliability, rather than for its validity.
B. Subjective scoring is mostly criticized because it lacks reliability (the score can change from one examiner to another, and even with the same examiner in a different situation) and because it is time-consuming: it takes a long time to mark and score.

28 1. Validity: the degree to which a test measures what it is supposed to measure.
– Content validity: the degree to which the test items conform to the instructional objectives.

29 • Reliability: the degree to which a test measures consistently.
Test-retest reliability: the correlation between two administrations of the same test to the same students.
Factors that affect reliability:
– Tests that are too easy or too difficult tend to have lower reliability.
– The more heterogeneous the group being tested, the higher the reliability tends to be.
– The more items a test has, the higher its reliability tends to be.
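Test-retest reliability as defined above can be estimated with a Pearson correlation between the two sets of scores. The sketch below is illustrative only: the scores are invented, and the Spearman-Brown formula used to show the effect of test length is a standard psychometric result that the slides themselves do not name.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired lists with at least two scores each")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary in both administrations")
    return cov / sqrt(var_x * var_y)


def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when the test is made length_factor times longer."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Invented scores of the same five students on two administrations.
first_sitting = [12, 15, 9, 18, 14]
second_sitting = [13, 14, 10, 17, 15]

test_retest = pearson_r(first_sitting, second_sitting)
print(f"test-retest reliability: {test_retest:.2f}")
print(f"predicted reliability if the test were doubled: {spearman_brown(test_retest, 2):.2f}")
```

A value close to 1 means students keep roughly the same ranking on both sittings. The second function only predicts the effect of adding comparable items; it says nothing about item quality.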

30 • Possible Testing Errors
– Administrative: poor directions, physical discomfort, unnecessary anxiety.
– Scoring
– Student errors: distractions, poor health, or anxiety.
– Test errors: trick questions, poorly phrased questions, ambiguous items, or items that discriminate poorly.

31 • A test should measure what it is intended to measure (valid); therefore each item needs to be carefully analyzed to determine the kind of thinking the students will use to answer it.
• Each individual should receive the same ranking with repeated administrations of the same test (reliable).
• The difficulty level of the items should be chosen so that the range of scores is fair to students' abilities.

32 • Items should be varied but consistent with the objectives.
• Avoid ambiguity in every item: use mostly descriptive vocabulary, and technical jargon only when needed.
• No trick questions, hidden expectations, or multiple answers in a question that expects one answer.
• Avoid optional questions, except in multiple forms of the same test or at different levels of difficulty.

33 • Essay: measures organizational skills and all cognitive levels, and takes less time to prepare. Hence it should:
– require answers of short duration
– measure one kind of understanding within one question
– include detailed criteria for scoring
– avoid ambiguity

34 • 20 marks: an answer that earns the full mark will have the following characteristics:
– the overall meaning is represented in good Arabic
– well-built sentence structure in Arabic
– good choice of vocabulary
– appropriate reformulation of ideas
– use of cohesive devices as they are used in Arabic
– no deviation from the ideas in the source text
– no dropped or missing ideas or sentences in the translation
• 5 marks: an answer at this level will have the following characteristics:
– the overall meaning is misunderstood and misrepresented
– the meaning of expressions and vocabulary items is misrepresented
– the structure in Arabic is distorted
– the translation consists of chunks that render individual words and expressions in the sentence
– a marked deviation from the ideas in the source text
– half the passage is left untranslated or simply copied from the source text

35 • Multiple-Choice: measures a wide variety of cognitive levels in less response time per item, can be improved through statistical analysis, and is simple and accurate to score. However, it is very time-consuming to construct and requires good training to write items that go beyond memory and recognition. In addition, it is not cost-efficient, and test-wise students can score higher. Hence, to be effective, items should:
– have the same number of alternatives
– have alternatives of about the same length
– have alternatives listed in a column
– be stated positively
– have equally plausible distractors
– use reasonable vocabulary
– have alternatives logically related to the stem
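The statistical analysis mentioned for multiple-choice items is usually a classical item analysis. The sketch below assumes two common indices that the slides do not spell out: the difficulty index (proportion of students answering correctly, so higher means easier) and the discrimination index (proportion correct in the top-scoring group minus the bottom-scoring group). The 27% group size is a widely used convention taken here as an assumption, and the response matrix is invented.

```python
from typing import List, Tuple


def item_analysis(
    responses: List[List[int]], item: int, fraction: float = 0.27
) -> Tuple[float, float]:
    """Return (difficulty, discrimination) for one item.

    responses[s][i] is 1 if student s answered item i correctly, else 0.
    """
    # Rank students by total score, best first.
    ranked = sorted(responses, key=sum, reverse=True)
    group_size = max(1, round(len(ranked) * fraction))
    upper = ranked[:group_size]
    lower = ranked[-group_size:]

    difficulty = sum(row[item] for row in responses) / len(responses)
    p_upper = sum(row[item] for row in upper) / group_size
    p_lower = sum(row[item] for row in lower) / group_size
    return difficulty, p_upper - p_lower


# Invented results: six students (rows) on four items (columns).
results = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

for i in range(4):
    difficulty, discrimination = item_analysis(results, i)
    print(f"item {i + 1}: difficulty={difficulty:.2f}, discrimination={discrimination:.2f}")
```

Items with a discrimination index near zero or below are the ones the slide on testing errors describes as discriminating poorly, and they are candidates for revision.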

36 • True-False: easily scored items that take a short response time, but are difficult to write and easy to guess. Therefore, items should:
– not be statements of absolutes
– not be ambiguous
– be evenly distributed between true and false
– be stated positively
– be relatively short sentences
– not be composed of phrases taken directly from the textbook

37 • Matching: easily scored items with a short response time; however, it is mostly limited to simple recognition with clear and embedded clues. Therefore, it should:
– be composed of numbered items on the left and lettered choices on the right
– be composed of stems and choices dealing with similar content
– have a larger number of choices than stems
– have longer statements as the stems and shorter statements as the choices
– have the directions for responding at the beginning of the item

38 • Completion: easily scored items that take a short response time; however, it only tests association of words and phrases, takes more time than the other objective question types, and invites ambiguity. Therefore, it should:
– have the same length of blank space for each item
– not include questions about measured quantities
– not have more than one blank per question unless all items are uniform
– not be composed of phrases taken directly from the textbook

39 • Copies of any test should be distributed with minimum confusion.
• Provide a quiet, pleasant environment before and during the examination.
• Avoid distractions by allowing only examination materials.

40 • Functions of grades: communicate students' academic achievement and predict future academic success, both to the students themselves and to other parties: parents, institutions, businesses, etc.
• Characteristics of grades:
– They are not directly earned by students, but are assigned by teachers.
– They have serious limitations in communicating the factors that determine the grade.

41 • Assigning grades to achievement tests
– Make a scoring key prior to test administration.
– Determine a passing cut-off score, as well as other cut-off scores, based on an analysis of the questions that measure understanding.
– Adjust cut-off scores after test administration based on the distribution of scores and the standard error of measurement.
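The slides do not say how the standard error of measurement (SEM) is obtained. One standard formula is SEM = SD x sqrt(1 - reliability); the sketch below uses it to put a band of uncertainty around a cut-off score. The scores, the reliability of 0.84, the cut-off of 60, the one-SEM band width, and the use of the population standard deviation are all assumptions made for illustration.

```python
from math import sqrt
from statistics import pstdev
from typing import Sequence, Tuple


def standard_error_of_measurement(scores: Sequence[float], reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability), using the population standard deviation."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    return pstdev(scores) * sqrt(1 - reliability)


def cutoff_band(cutoff: float, sem: float) -> Tuple[float, float]:
    """Scores within one SEM of the cut-off are too close to call with confidence."""
    return cutoff - sem, cutoff + sem


# Invented class scores and an assumed reliability estimate.
class_scores = [40, 50, 60, 70, 80]
sem = standard_error_of_measurement(class_scores, reliability=0.84)
low, high = cutoff_band(60, sem)

print(f"SEM = {sem:.2f}")
print(f"review borderline scores between {low:.1f} and {high:.1f}")
```

A teacher adjusting a cut-off after administration might look closely at students whose scores fall inside this band before finalizing the pass mark.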

42 1. Each type of testing is designed to serve an area of evaluation for both the learning and the teaching process.
2. There is no such thing as a perfect test; every test is good for what it is designed to assess.
3. To choose a type of testing, one must:
– decide what one wants to test and why;
– choose the testing and scoring techniques that serve that purpose (validity or reliability), especially when analytical results are required.
4. A good tool used for the wrong purpose can have a bad effect on the students, the teacher, and the learning process as a whole.

43 5. Even though tests are good indicators of the learning and teaching process, one must use them with care because of their harmful backwash.
6. Do not use a test from the market without analyzing whether it meets your students' needs and the objectives of your course.
7. Tests are most useful when used to:
– enhance the learning process;
– help in selecting the proper teaching material;
– show the areas of weakness of students and syllabus for remedial purposes.

44 • Select a unit in one of the schools' textbooks.
• Examine it carefully and develop an achievement test on that unit.
• Your test should have:
– clear instructions
– objective and open-ended items
– a scoring key
– marks (scores) for each question item.
• It must NOT be copied from the teacher's guide or any other resource.
• YOU MUST DEVELOP IT BY YOURSELF.

45 • Assignments should be sent no later than 9:00 pm on Monday.
• The email subject and the file title should contain your name.
• DO NOT write the assignment in the body of the message.
• You have to deliver two things: the test and the answer key to the test.
• For essay (writing) items, you need to develop a scoring scheme in the answer key.

