Reliability for Teachers: Reliability = Consistency. Kansas State Department of Education, Assessment Literacy Project.

Presentation transcript:

Reliability for Teachers (Kansas State Department of Education, Assessment Literacy Project)
Reliability = Consistency

Essential Questions
What is test reliability?
What are some of the issues related to reliability?
What are the three types of reliability?
How can teachers increase their classroom tests' reliability?

Reliability is Essential
Tests must be reliable to be valid, but a test can be reliable and still not be valid.

Reliability Represents Consistency
Test reliability represents the consistency of test measurement.
Unreliable tests can't be trusted.

Tests - Reliable / Results - Valid
Results of a test are valid or invalid.
The test itself is either reliable or unreliable.
A test must be both reliable and valid to be useful.

What is Reliability?
Reliability is the consistency, stability, accuracy, and precision of the scores that a test yields.

Three Varieties of Reliability

Reliability Depends on Correlational Analyses

Correlation Coefficients

No Pre-Determined Reliability Coefficient
There is no predetermined reliability coefficient that a test must attain in order to show that its scores are consistent.

Activity One
Take time to answer the essential question: What is reliability?

Classification Consistency Reliability
Classification consistency reliability is a representation of the proportion of students who are classified identically on two different test forms or two different administrations of the same test.
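The proportion described above can be sketched in a few lines of Python. This is only an illustration: the function name `classification_consistency` and the six student classifications are hypothetical and do not come from the slides.

```python
def classification_consistency(first, second):
    """Proportion of students placed in the same category on both occasions."""
    if len(first) != len(second):
        raise ValueError("both lists must cover the same students")
    if not first:
        raise ValueError("at least one student is required")
    # Count the students whose category did not change.
    matches = sum(1 for a, b in zip(first, second) if a == b)
    return matches / len(first)


# Hypothetical classifications for six students (not data from the slides).
first_admin = ["upper", "upper", "middle", "middle", "lower", "lower"]
second_admin = ["upper", "middle", "middle", "middle", "lower", "upper"]

consistency = classification_consistency(first_admin, second_admin)
print(f"Classified the same way both times: {consistency:.0%}")
```

Four of the six hypothetical students keep their category, so the proportion is about 0.67. The same function applies whether the two lists come from two test forms or from two administrations of one form.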

Example of Classification Consistency (Good Reliability)
[Table: Test-Retest Reliability Classification Table. Rows: 1st administration (upper, middle, lower third). Columns: 2nd administered test (upper, middle, lower third). Most cell counts did not survive in the transcript; the only fragments are "3552" in the upper-third row and "4326" in the middle-third row.]

Example of Classification Consistency (Poor Reliability)
[Table: Test-Retest Reliability Classification Table. Rows: 1st administration (upper, middle, lower third). Columns: 2nd administered test (upper, middle, lower third). Cell counts did not survive in the transcript.]

Issues Related to Classification Consistency: Inter-Rater Reliability
Inter-rater agreement is the degree of agreement in the ratings that two or more observers assign when evaluating the same behavior or performance.
The focus of inter-rater reliability is the accuracy of the ratings.
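One way to put a number on inter-rater agreement is sketched below. The slides do not name a specific index, so both choices here are assumptions: simple percent agreement, and Cohen's kappa, a widely used index that corrects agreement for chance. The rubric scores are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of performances on which the two raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of work")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal proportions.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical rubric scores (1-4) from two teachers on eight essays.
teacher_1 = [4, 3, 3, 2, 4, 1, 2, 3]
teacher_2 = [4, 3, 2, 2, 4, 1, 3, 3]

agreement = percent_agreement(teacher_1, teacher_2)
kappa = cohens_kappa(teacher_1, teacher_2)
print(f"percent agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

In this made-up example the teachers agree on six of eight essays (0.75), while kappa is lower (about 0.65) because some of that agreement would be expected by chance.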

Issues Related to Classification Consistency
Two types of errors are likely to occur when cut scores on tests are used to classify students.
The first error is setting cut scores too high.
The second error is setting cut scores too low.

For tests used with cut scores, get answers to the following questions:
What proportion of students would be classified the same way if they had taken a different form of the same test?
What proportion of students would be classified the same way if they had taken the same form on a different day (assuming no changes in knowledge)?
What proportion of students would be classified the same way if their responses to the constructed-response questions, such as essays, had been scored by different people?

Reliability of Classification is Not Perfect
Every test is only a sample of all the questions that could be asked.
Test takers are not likely to be equally knowledgeable about all of the possible questions that could be asked, so differences from one test form to another are likely.
Day-to-day fluctuations in students' attention, memory, health, and so on also affect the reliability of classification.

Cut Scores and Classification Consistency
Most tests cannot distinguish well between students whose scores are very close to one another.
Whenever a cut score is used, however, students with scores just above the cut score and students with scores just below it will be classified differently.
This means that students who score near the cut score may pass or fail a test because of random fluctuations.

Reliable Tests = Classification Consistency
The more reliable a test is, the less likely it is that the scores will be affected by large random fluctuations.
– Longer tests are more reliable than shorter tests.
– Objectively scored tests are more reliable than subjectively scored tests.
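The claim that longer tests are more reliable can be illustrated with the Spearman-Brown formula from classical test theory. The slides do not mention this formula, so its use here is an assumption, as is its own requirement that the added items be comparable to the existing ones. The starting reliability of 0.60 is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length by length_factor.

    Assumes the added (or removed) items are comparable to the originals.
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (
        1 + (length_factor - 1) * reliability
    )


# Hypothetical quiz with reliability 0.60, doubled in length.
doubled = spearman_brown(0.60, 2)
print(f"Predicted reliability after doubling: {doubled:.2f}")
```

Under these assumptions, doubling the hypothetical quiz raises the predicted reliability from 0.60 to 0.75, and halving it (a factor of 0.5) lowers it.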

Activity Two
Let's stop now and participate in Activity Two, where we will answer the essential question: What are some of the issues related to reliability?

Stability Reliability
Definition: assessment results are consistent over time (over occasions).
Why might test results NOT be consistent over time?

Determining Stability Reliability
Test-retest reliability
– Compute the correlation between a first and a later administration of the same test.
Classification consistency
– Compute the percentage of consistent student classifications over time.
The main concern is the stability of the assessment over time.
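A test-retest coefficient of the kind described above can be sketched as a Pearson correlation between the two sets of scores. The slides do not say which correlation is used, so Pearson's r is an assumption here, and the five students' scores are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with 2+ scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for five students on two administrations.
first_scores = [60, 70, 80, 90, 100]
later_scores = [70, 60, 90, 80, 100]

test_retest = pearson_r(first_scores, later_scores)
print(f"test-retest correlation = {test_retest:.2f}")
```

For these made-up scores the coefficient is 0.80. A value closer to 1.00 would indicate that students kept more nearly the same relative standing across the two administrations.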

Standard Error of Measurement (SEM)
SEM is an estimate of how consistent a student's score would be if the student retook the same test over and over again.

Standard Deviation (SD)
The standard deviation of test scores is a statistical indicator of how spread out a set of test scores is.

Formula for Computing Standard Error of Measurement (SEM)
$SEM = S_x \sqrt{1 - r_{xx}}$
Where:
$SEM$ = the standard error of measurement
$S_x$ = the standard deviation of the test scores
$r_{xx}$ = the reliability of the test
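A worked example may help here. The sketch below assumes the standard classical-test-theory formula, SEM = S_x * sqrt(1 - r_xx), which uses the same symbols the slide defines. The standard deviation of 10, the reliability of 0.91, and the score of 75 are hypothetical values chosen to make the arithmetic easy.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = S_x * sqrt(1 - r_xx), assuming 0 <= r_xx <= 1 and S_x >= 0."""
    if sd < 0:
        raise ValueError("standard deviation cannot be negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical test: SD of 10 points, reliability coefficient of 0.91.
sem = standard_error_of_measurement(10, 0.91)

# A band of one SEM on either side of a hypothetical observed score of 75.
observed = 75
band = (observed - sem, observed + sem)
print(f"SEM = {sem:.1f}; band = {band[0]:.1f} to {band[1]:.1f}")
```

With these values the SEM is about 3 points, so the band around a score of 75 runs from roughly 72 to 78. Lowering the reliability to 0.75 with the same standard deviation would raise the SEM to 5 points.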

Alternate-Form Reliability
Concerned with the question: Are two supposedly equivalent forms of an assessment actually equivalent?
The two forms do not have to yield identical scores.
The correlation between two or more forms of the assessment should be reasonably substantial.

Determining Alternate-Form Reliability
Administer two forms of the assessment to the same individuals and correlate the results.
Determine the extent to which the same students are classified the same way by the two forms.
Alternate-form reliability is established by evidence, not by proclamation.

Internal Consistency Reliability
Concerned with:
The extent to which the items of an assessment function consistently.
The extent to which the items in an assessment measure a single attribute.
– For example, consider a math problem-solving test. To what extent does reading comprehension play a role? What is actually being measured?

Formulae for Computing Internal Consistency (Terminology)
Kuder-Richardson (K-R) formula: used for items scored right/wrong, such as multiple choice.
Cronbach's coefficient alpha: used for items on which students are given points, such as essay questions.
Dichotomous items: scored right/wrong.
Polytomous items: multiple possible responses or scores.

Most Common is the Kuder-Richardson or K-R Formula
The closer the K-R value is to 1.00, the higher the internal consistency reliability.
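The slides name the K-R formula and coefficient alpha without showing them, so the sketch below assumes the usual textbook versions: KR-20 for right/wrong items and Cronbach's alpha for items scored in points, both computed with population variances. The five students' item scores are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """Coefficient alpha; rows are students, columns are item scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two items are required")
    item_variance_sum = sum(pvariance(col) for col in zip(*rows))
    total_variance = pvariance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores must vary across students")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def kr20(rows):
    """KR-20 for dichotomous items scored 1 (right) or 0 (wrong)."""
    k, n = len(rows[0]), len(rows)
    if k < 2:
        raise ValueError("at least two items are required")
    pq_sum = 0.0
    for col in zip(*rows):
        p = sum(col) / n  # proportion of students answering correctly
        pq_sum += p * (1 - p)
    total_variance = pvariance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores must vary across students")
    return k / (k - 1) * (1 - pq_sum / total_variance)


# Hypothetical right/wrong results: five students, four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

kr_value = kr20(responses)
alpha_value = cronbach_alpha(responses)
print(f"KR-20 = {kr_value:.2f}, alpha = {alpha_value:.2f}")
```

Both functions give 0.80 for this made-up data set, which reflects the fact that KR-20 is the special case of alpha for right/wrong items. For essay or rubric items, only `cronbach_alpha` applies.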

Activity Three
Take time now to participate in Activity Three to answer the essential question: What are the three types of reliability?

Review of Reliability
Reliability is the consistency, stability, accuracy, and precision of the scores that a test yields.
Validity refers to what inferences can be made about the test's results.
A test can have high reliability but not have validity. However, a test cannot have validity unless it is reliable.
Reliability depends upon correlational analyses, which address score consistency and classification consistency.

Review of Reliability
There are three different types of reliability:
Stability reliability is the consistency of results between two time-separated tests.
Alternate-form reliability is the consistency of results between two different forms of a test.
Internal consistency reliability is the consistency in the way a test's items function.

Review of Reliability
The standard deviation of test scores is a statistical indicator of how spread out a set of test scores is.
The standard error of measurement (SEM) is an estimate of how consistent a student's score would be if the student retook the same test over and over again.
A small standard error of measurement is a good thing. The standard error of measurement is smaller when the standard deviation of the test scores is small and the reliability coefficient is large.

Improving Classroom Tests' Reliability
Always remember to encourage students to perform their best.
Match the assessment difficulty to the students' ability levels.
Have scoring criteria that are available and well understood by students before they start an assignment or assessment.

Improving Classroom Tests' Reliability
Allow enough time to complete the assessment.
For objective assessments like multiple-choice tests:
– Have enough items; longer tests are more reliable than shorter tests.

Improving Classroom Tests' Reliability
For papers, essays, and projects:
– Have clear directions for all students.
– Have a systematic scoring procedure that the students are familiar with.
– Have multiple markers (scorers) when possible.

Activity Four
Take time now to participate in Activity Four to answer the essential question: How can teachers increase their classroom tests' reliability?