Reliability.

Welcome everyone. Instructor: Dr. Çise Çavuşoğlu. Presented by: Saman A. Hasan.

Reliability and Validity

Example 1: A test designed to measure typing ability. If the test is reliable, we would expect a student who receives a high score the first time he takes the test to receive a high score the next time he takes it; the two scores should be close.

Example 2: If you test someone and he scores 100, then test him again after a period of time: getting the same result is a matter of reliability; whether the score reflects what the test is meant to measure is a matter of validity.

Example 3: Whether all the students give the same answer to your question is a matter of reliability; whether their answers reflect what the question is meant to assess is a matter of validity.

Example 4: "How consistently does this question get answered?" is a reliability question; "Does this question actually measure what I intend, such as the time needed to complete a master's degree or to learn Turkish?" is a validity question.

Reliability and Validity

Example 5: Suppose a researcher gave a group of eighth graders two forms of a test designed to measure their knowledge of the Constitution of the United States and found their scores to be consistent: those who scored high on form A also scored high on form B, and so on. We would say that the scores are reliable.

Example 6: Whether repeated measurements of a human organ give the same reading is a question of reliability; whether our observations of human behavior capture what we intend is a question of validity.

Example 7: How consistently does the test measure? (Reliability) What is the test supposed to measure? (Validity)

Definition

Reliability refers to the consistency of scores or answers from one set of items to another. Reliability of research concerns the replicability and consistency of the methods, conditions, and results. Reliable data is evidence that you can trust: if someone else did the same experiment, they would get the same result. "Your evidence will be more reliable if you repeat your readings."

Types of reliability

External reliability involves the extent to which independent researchers working in the same or a similar context would obtain consistent results. Instruments such as psychometric tests and questionnaires can be assessed using the test-retest method. This involves testing the same participant twice, over a period of time, on the same test. Similar scores would suggest that the test has external reliability.

Types of reliability

Internal reliability involves the extent to which researchers concerned with the same data and constructs would be consistent in matching them. Instruments such as psychometric tests and questionnaires can be assessed using the split-half method. This involves splitting a test into two and having the same participant complete both halves. If the two halves of the test produce similar results, this suggests that the test has internal reliability.

Errors of measurement

Whenever people take the same test twice, they will seldom perform exactly the same; that is, their scores or answers will not usually be identical. This may be due to a variety of factors. Here's a crazy (but true) example: many years ago, people believed that if you had a large brain, you were intelligent. Suppose you also believed this theory and went around measuring the circumference of your friends' heads. Is the size of a person's head a reliable measure? (Think first!) The answer is yes: if I measured the size of your head today and then next week, I would get the same number. Therefore, it is reliable. However, the whole idea is wrong! Because we now know that larger-headed people are not necessarily smarter than smaller-headed ones, we know that the theory behind the measure is invalid.

Reliability coefficient

Expresses a relationship, but this time between scores of the same individuals on the same instrument at two different times, or between two parts of the same instrument. Reliability is related to these parts. If scores have large error components, reliability is low; but if there is little error in the scores, reliability is high. A reliability coefficient can take on values from 0 to 1.0, inclusive. Conceptually, if a reliability coefficient were 0, there would be no "true" component in the observed score; the observed score would consist entirely of error. On the other hand, if the reliability coefficient were 1.0, the observed score would contain no error.

Procedures for Estimating Reliability (the three best-known ways to obtain a reliability coefficient)

Test-Retest Method: the test-retest method involves administering the same test twice to the same group after a certain time interval has elapsed. A reliability coefficient is then calculated to indicate the relationship between the two sets of scores obtained. If the test is reliable, the scores that each student receives on the first administration should be similar to the scores on the second. The reliability coefficient will be affected by the length of time that elapses between the two administrations of the test: the longer the time interval, the lower the reliability coefficient is likely to be.
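As a minimal sketch (not from the original slides, and with hypothetical scores), the test-retest coefficient is simply the Pearson correlation between the two administrations:

```python
# Sketch: test-retest reliability as the Pearson correlation between two
# administrations of the same test. The score lists are hypothetical.
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

first_admin  = [85, 90, 70, 60, 95, 75]   # scores at time 1 (hypothetical)
second_admin = [82, 93, 72, 58, 96, 71]   # same students, some weeks later
print(round(pearson_r(first_admin, second_admin), 2))
```

A coefficient near 1.0 here would indicate stable scores over time; a longer interval between administrations would typically lower it.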

Procedures for Estimating Reliability

Equivalent-Forms Method (also called alternate or parallel forms): when the equivalent-forms method is used, two different but equivalent forms of an instrument are administered to the same group of individuals during the same time period. Although the questions are different, they should sample the same content, and they should be constructed separately from each other. A reliability coefficient is then calculated between the two sets of scores obtained.

Inter-Rater Reliability

Whenever observations of behavior are used as data in research, we want to ensure that these observations are reliable. One way to determine this is to have two or more observers rate the same subjects and then correlate their observations. If, for example, rater A observed a child act out aggressively eight times, we would want rater B to observe the same number of aggressive acts. If rater B witnessed 16 aggressive acts, then we know at least one of these two raters is incorrect. If their ratings are positively correlated, however, we can be reasonably sure that they are measuring the same construct of aggression. This does not, however, assure that they are measuring it correctly, only that they are both measuring it the same way.
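A minimal sketch of this idea, with hypothetical counts for two raters observing the same five children: we can check both exact agreement and whether the raters rank the children in the same order.

```python
# Sketch (hypothetical data): a simple inter-rater check. Two observers
# count aggressive acts for the same five children.
rater_a = [8, 3, 5, 0, 6]
rater_b = [8, 4, 5, 1, 6]

# Proportion of children on whom the raters agree exactly.
exact_agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

# Same rank ordering suggests the raters are measuring the same construct
# consistently, even where the raw counts differ by one.
same_ordering = (sorted(range(len(rater_a)), key=lambda i: rater_a[i])
                 == sorted(range(len(rater_b)), key=lambda i: rater_b[i]))
print(exact_agreement, same_ordering)
```

In practice a correlation coefficient (or an agreement statistic such as Cohen's kappa) would be computed, but the logic is the same: consistent raters produce consistent orderings.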

There are several internal-consistency methods of estimating reliability

Split-half procedure: the split-half procedure involves scoring two halves of a test separately for each person and then calculating the correlation coefficient for the two sets of scores. The coefficient indicates the degree to which the two halves of the test provide the same results, and hence describes the internal consistency of the test. The reliability of scores on the total test is then obtained with the Spearman-Brown formula:

Reliability of scores on total test = (2 × reliability for half test) / (1 + reliability for half test)

Thus, if we obtained a correlation coefficient of .56 by comparing one half of the test items to the other half, the reliability of scores for the total test would be:

Reliability of scores = (2 × .56) / (1 + .56) = .72

This illustrates an important characteristic of reliability: the reliability of a test can generally be increased by increasing its length, if the items added are similar to the original ones.
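The Spearman-Brown step-up above is a one-line calculation; this sketch reproduces the slide's .56 → .72 example:

```python
# Sketch: the Spearman-Brown step-up used in the split-half procedure.
# Given the correlation r between the two half-tests, the estimated
# reliability of the full-length test is 2r / (1 + r).
def spearman_brown(half_test_r):
    return 2 * half_test_r / (1 + half_test_r)

# The slide's example: a half-test correlation of .56 steps up to about .72.
print(round(spearman_brown(0.56), 2))
```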

Kuder-Richardson Approaches

Perhaps the most frequently employed method for determining internal consistency is the Kuder-Richardson approach, particularly formulas KR20 and KR21. These formulas require only three pieces of information:
The number of items in the test (K).
The mean of the test scores (M).
The standard deviation of the test scores (SD).

Note that formula KR21 can be used only if it can be assumed that the items are of equal difficulty. The formula is:

KR21 reliability coefficient = (K / (K − 1)) × (1 − M(K − M) / (K × SD²))

where K = number of items in the test, M = mean of the set of test scores, and SD = standard deviation of the set of test scores.

For example, if K = 50, M = 40, and SD = 4, the reliability coefficient would be calculated as shown below:

Reliability = (50 / 49) × (1 − 40(50 − 40) / (50 × 4²)) = (1.02) × (1 − .50) = (1.02)(.50) = .51

Thus the reliability estimate for scores on this test is .51.

Formula KR20 does not require the assumption that all items are of equal difficulty, although it is more difficult to calculate. Computer programs for doing so are commonly available, however, and should be used whenever a researcher cannot assume that all items are of equal difficulty.
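The KR21 arithmetic above can be sketched directly; this reproduces the slide's K = 50, M = 40, SD = 4 example:

```python
# Sketch of the KR21 computation (assumes items of equal difficulty):
# K items, mean M, standard deviation SD of the set of test scores.
def kr21(k, mean, sd):
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * sd ** 2))

# The slide's example: K = 50, M = 40, SD = 4 gives a reliability of about .51.
print(round(kr21(50, 40, 4), 2))
```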

How do you know whether a reliability estimate of .51 is good or bad, high or low? First, we can compare a given coefficient with the extremes that are possible. As you will recall, a coefficient of .00 indicates a complete absence of a relationship, hence no reliability at all, whereas 1.00 is the maximum possible coefficient that can be obtained. Second, we can compare a given reliability coefficient with the sorts of coefficients that are usually obtained for measures of the same type. The reported reliability coefficients for many commercially available achievement tests, for example, are typically .90 or higher when Kuder-Richardson formulas are used. Many classroom tests report reliability coefficients of .70 and higher. Compared to these figures, our obtained coefficient must be judged rather low. For research purposes, a useful rule of thumb is that reliability should be at least .70 and preferably higher.

Alpha Coefficient

Another check on the internal consistency of an instrument is to calculate an alpha coefficient (frequently called Cronbach's alpha, after the man who developed it). This coefficient (α) is a general form of the KR20 formula, used to calculate the reliability of items that are not scored right versus wrong, as in some essay tests where more than one answer is possible. You might see a problem in that the split-half method picks two halves at random. Why not take into account all possible split halves? Wouldn't that give you a better estimate? In fact, that is what Cronbach's alpha does.
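As a minimal sketch with hypothetical data (4 students × 3 items, not from the slides), Cronbach's alpha can be computed from the item variances and the variance of the total scores:

```python
# Sketch: Cronbach's alpha from an item-score matrix (hypothetical data).
# alpha = (k / (k - 1)) * (1 - sum(item variances) / variance(total scores))
def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(item_scores):
    """item_scores: list of rows, one row of item scores per person."""
    k = len(item_scores[0])  # number of items
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

scores = [[3, 4, 3],   # each row: one student's scores on the three items
          [2, 2, 3],
          [5, 4, 5],
          [1, 2, 1]]
print(round(cronbach_alpha(scores), 2))
```

With dichotomous (right/wrong) items, this same computation reduces to KR20.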

Summary of the three methods of estimating reliability

Method               | Type of Information Provided
---------------------|--------------------------------------------------------------
Test-retest          | Stability of test scores over time
Equivalent forms     | Consistency of test scores over two different forms of an instrument
Internal consistency | Consistency of test scores over two different parts of an instrument

An example of the importance of reliability

Consider the use of measuring devices in Olympic track and field events. For the vast majority of people, ordinary measuring rulers and their degree of accuracy are reliable enough. However, for an Olympic event such as the discus throw, the slightest variation in a measuring device (whether it is a tape, a clock, or another device) could mean the difference between the gold and silver medals. Additionally, it could mean the difference between a new world record and outright failure to qualify for an event. Olympic measuring devices, then, must be reliable from one throw or race to another and from one competition to another. They must also be reliable when used in different parts of the world, as temperature, air pressure, humidity, interpretation, or other variables might affect their readings.

References

Fraenkel, J. R., & Wallen, N. E. (1990). How to design and evaluate research in education. New York.

Jones, J. E., & Bearley, W. L. (1996, Oct 12). Reliability and validity of training instruments. Organizational Universe Systems. Available: http://ous.usa.net/relval.htm

Wiersma, W., & Jurs, S. G. (2005). Research methods in education. Boston: Pearson Education.

Thanks for your attention NEAR EAST UNIVERSITY Department of ELT 22-Nov-2011