Experimental Research Methods in Language Learning Chapter 12 Reliability and Reliability Analysis.


Leading Questions What is reliability? How do we know that a research instrument is reliable? Why do you think reliability is important for experimental research?

The Reliability and Validity of a Measure Researchers should not assume that their instruments are reliable merely because they have already piloted them, or because they adopt them from trusted researchers in the field who originally reported high reliability estimates of the instruments. A reliability estimate largely depends on the participants taking the tests, and the context in which they take them, and the test items or tasks that have been used.

The Reliability and Validity of a Measure Test validity was originally defined as the degree to which a measure captures what it claims to measure. Test validity is related to theory and how a construct is defined. Reliability is a necessary but insufficient condition for measurement validity. It is essential to present evidence of a high reliability coefficient, which implies a good level of precision and consistency in the instruments used.

Estimating a Reliability Coefficient Reliability is a complex issue because there are different aspects we need to take into account. Reliability needs to be understood together with validity. Reliability is typically described as the consistency of scoring (e.g., language tests or productive tasks), coding (e.g., coding think-aloud or interview data), or rating (e.g., Likert-scale questionnaires and quantitative observations).

What Does a Reliability Estimate Tell Us?

Unreliable and invalid: Archery results would be both unreliable and invalid if our arrows missed the circular target altogether or hit it at random without once landing in the goal (the centre). Reliable but invalid: In this case, arrows hit at or around the same spot on the target but never hit the goal. Reliable and valid: This is when arrows hit the goal consistently.

A Reliability Estimate A reliability estimate ranges between 0 and 1. A reliability estimate of 0.90 for a language test indicates that students who score 60 out of 100 are 90% likely to obtain a similar score when they take a similar test. If the reliability estimate is 0.50, these same students are only 50% likely to obtain a similar score on a similar test, which suggests less certainty about the result than an estimate of 0.90.

A Reliability Estimate A reliability coefficient of 0.70 or above (70% or more of the items consistently collect information about the target construct) is acceptable, but one of 0.90 or above is desirable for research (Dörnyei 2007). A reliability estimate tells us the extent to which a research instrument, an observation, or a coding system is free from measurement error.

Classical True Score Theory Theoretically speaking, an observed score (e.g., 5 out of 10, 70 out of 100) is composed of a true score, which is due to a learner’s true level of ability, and an error score, which is due to factors other than a learner’s level of ability. Observed score = true score + error score
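The decomposition above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the mean of 60, the true-score SD of 10, the error SD of 5, and the function names are all hypothetical, and the code relies on the standard classical-test-theory result (not stated on the slides) that reliability equals true-score variance divided by observed-score variance. Two parallel forms share each learner’s true score but have independent errors, so their correlation should come out close to 100 / (100 + 25) = 0.8.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_parallel_forms(n_learners=20000, true_sd=10.0, error_sd=5.0, seed=0):
    """Correlate two simulated parallel forms (all parameters are hypothetical)."""
    rng = random.Random(seed)
    # Each learner has one true score ...
    true_scores = [rng.gauss(60.0, true_sd) for _ in range(n_learners)]
    # ... and each form adds its own independent error score.
    form_a = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    form_b = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    return pearson(form_a, form_b)


estimated = simulate_parallel_forms()
theoretical = 10.0 ** 2 / (10.0 ** 2 + 5.0 ** 2)  # true variance / observed variance
print(f"estimated reliability: {estimated:.2f}, theoretical: {theoretical:.2f}")
```

Increasing `error_sd` in the sketch lowers the correlation, which is one way to see why a larger error score means a less reliable instrument.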

Standard Error of Measurement (SEM) A reliability coefficient tells us about score consistency for a group of students. But it does not directly tell us whether an individual learner’s score is within a reasonable range. A standard error of measurement (SEM) score tells us a range of possible true scores for a learner. SEM is related to the reliability coefficient of a research instrument in a specific use.

Standard Error of Measurement (SEM) If the reliability coefficient of a test is 1.0, we know for sure that there is no error score in this test because it has perfect reliability. Given this, the standard error of measurement is zero by default. However, a reliability estimate of 1 is very rare and most unlikely. SEMs are computed using the reliability estimate and the standard deviation of the test scores.

Standard Error of Measurement (SEM) SEM = SD × √(1 − reliability coefficient), where SD is the standard deviation of the test scores. For example, if a test has a reliability coefficient of 0.82 and an SD of 5.29, we can compute the SEM as follows: SEM = 5.29 × √(1 − 0.82) = 5.29 × √0.18 ≈ 5.29 × 0.42 ≈ 2.24. If a participant’s score was 28 out of 40, we simply add the SEM to and subtract it from the test score (i.e., 28 ± 2.24). His/her true score would be within a range of 25.76 to 30.24.
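The SEM calculation above can be reproduced in a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the inputs (SD = 5.29, reliability = 0.82, observed score = 28) are the ones used in the worked example.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), for a reliability between 0 and 1."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed, sem):
    """Range obtained by subtracting and adding one SEM to an observed score."""
    return observed - sem, observed + sem


sem = standard_error_of_measurement(sd=5.29, reliability=0.82)
low, high = score_band(observed=28, sem=sem)
print(f"SEM = {sem:.2f}; band = {low:.2f} to {high:.2f}")
# prints: SEM = 2.24; band = 25.76 to 30.24
```

Calling the first function with a reliability of 1.0 returns 0.0, matching the point that a perfectly reliable test has no error score.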

Factors Influencing a Reliability Coefficient There are interrelated factors that influence the reliability coefficient of a test or a measure, including: objective scoring; the nature of the construct of interest; the number of participants (sample size); test or measure length; and the heterogeneity of participants’ abilities and attributes.
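The effect of test length, one of the factors listed above, can be sketched with the Spearman-Brown prophecy formula, r_new = k·r / (1 + (k − 1)·r), where k is the factor by which the test is lengthened. The sketch assumes that the added items are parallel to the existing ones; the starting reliability of 0.60 and the function name are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is made `length_factor` times as long.

    Assumes the added (or removed) items are parallel to the existing ones.
    """
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)


current = 0.60  # hypothetical reliability of the existing test
for k in (0.5, 1, 2, 3):
    print(f"length x{k}: predicted reliability = {spearman_brown(current, k):.2f}")
# length x0.5: 0.43, x1: 0.60, x2: 0.75, x3: 0.82
```

Under these assumptions, doubling the test raises the predicted reliability from 0.60 to 0.75, while halving it lowers the estimate to about 0.43.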

Types and Methods of Calculation of Reliability Coefficients Split-half reliability coefficient; Spearman-Brown prophecy coefficient; Cronbach’s alpha coefficient; rater agreement in percentages; Cohen’s kappa coefficient.
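The coefficients listed above can be sketched in plain Python using their standard textbook formulas. Everything here is illustrative: the function names, the odd-even split, the five hypothetical respondents answering four Likert items, and the two hypothetical raters are assumptions, not material from the slides.

```python
from statistics import pvariance


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_half(rows):
    """Odd-even split-half correlation, stepped up with Spearman-Brown (2r / (1 + r))."""
    odd_half = [sum(row[0::2]) for row in rows]
    even_half = [sum(row[1::2]) for row in rows]
    r_half = pearson(odd_half, even_half)
    return 2 * r_half / (1 + r_half)


def cronbach_alpha(rows):
    """alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)."""
    k = len(rows[0])
    item_variances = [pvariance([row[i] for row in rows]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def percent_agreement(rater_1, rater_2):
    """Proportion of cases on which two raters assign the same category."""
    return sum(a == b for a, b in zip(rater_1, rater_2)) / len(rater_1)


def cohen_kappa(rater_1, rater_2):
    """kappa = (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(rater_1)
    p_observed = percent_agreement(rater_1, rater_2)
    categories = set(rater_1) | set(rater_2)
    p_chance = sum((rater_1.count(c) / n) * (rater_2.count(c) / n) for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical data: 5 respondents x 4 Likert items, and 10 scripts rated by 2 raters.
responses = [[4, 5, 4, 5], [2, 3, 2, 2], [5, 5, 4, 4], [1, 2, 1, 2], [3, 3, 4, 3]]
rater_a = ["pass", "pass", "fail", "fail", "pass", "fail", "pass", "pass", "fail", "pass"]
rater_b = ["pass", "pass", "fail", "pass", "pass", "fail", "pass", "fail", "fail", "pass"]

print(f"split-half (Spearman-Brown corrected): {split_half(responses):.2f}")
print(f"Cronbach's alpha: {cronbach_alpha(responses):.2f}")
print(f"percent agreement: {percent_agreement(rater_a, rater_b):.0%}")
print(f"Cohen's kappa: {cohen_kappa(rater_a, rater_b):.2f}")
```

With these hypothetical ratings the two raters agree on 8 of 10 scripts (80%), yet kappa is about 0.58 because part of that agreement would be expected by chance. This is one reason kappa is often reported alongside raw agreement percentages.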

Discussion What is the difference between a correlation coefficient and a reliability coefficient? What does a reliability coefficient mean to you? What do you think would be a problem in an experimental study if the researchers did not analyze the reliability of their research instruments before running inferential statistics?