 A test is said to be valid if it measures accurately what it is supposed to measure and nothing else.  For example: "Is photography an art or a science? Discuss." This is not a valid writing test. Why? Because performance depends at least as much on knowledge of photography and general reasoning as on writing ability.

Validity
 Face validity
 Content validity
 Construct validity
 Criterion-related validity
   Concurrent validity
   Predictive validity

 A test is said to have face validity if it looks as if it measures what it is supposed to measure.  For example, does an oral test actually look like a test of speaking?  How is it checked? Show the test to colleagues and friends.

 A test is said to have content validity if its content constitutes a representative sample of the language skills, structures, etc., with which it is concerned.  The test should include a proper sample of the relevant structures.

 How?  We need a specification of the skills or structures that the test is meant to cover.  The specification should be clear and specific.  Give a percentage weighting to each part.  Why is this important? It yields an accurate measure of what the test is supposed to measure, and it avoids a harmful backwash effect.
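To make the percentage weighting concrete, here is a minimal sketch in Python; the skill names, percentages, and item count are invented for illustration:

```python
# A hypothetical test specification: each skill gets a percentage
# weighting, and the weightings determine how many items it receives.
specification = {
    "reading comprehension": 30,
    "grammar": 25,
    "vocabulary": 20,
    "writing": 25,
}

# The weightings must account for the whole test.
assert sum(specification.values()) == 100

# Allocate items to each skill in proportion to its weighting.
total_items = 40
allocation = {skill: total_items * pct // 100
              for skill, pct in specification.items()}
print(allocation)  # {'reading comprehension': 12, 'grammar': 10, ...}
```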

 Sometimes the content of the test is determined by what is EASY to test rather than by what is IMPORTANT to test.

 Criterion-related validity is a matter of how far results on the test agree with those provided by some independent and highly dependable assessment of the candidates' ability.  This independent assessment is the criterion measure against which the test is validated.

Criterion-related validity
 Concurrent validity
 Predictive validity

 Concurrent validity is established when the test and the criterion are administered at about the same time.  For example, a short oral test: a random sample of candidates also takes the full 45-minute version, rated by four scorers, and the two sets of scores are compared.
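In practice the level of agreement is usually expressed as a correlation between the two sets of scores. A minimal sketch, with all scores invented for illustration:

```python
# Concurrent validation: correlate scores on the short test with
# scores on the criterion (here, the full 45-minute version taken
# by a random sample of candidates). Scores are invented.
from statistics import correlation  # Python 3.10+

short_test_scores = [12, 15, 9, 18, 14, 11, 16, 13]
criterion_scores = [55, 68, 40, 80, 61, 47, 71, 58]

# A coefficient close to 1 indicates strong agreement between
# the test and the criterion measure.
r = correlation(short_test_scores, criterion_scores)
print(f"concurrent validity coefficient: {r:.2f}")
```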

 If the test is not valid, then it cannot be used as a dependable measure of achievement with respect to the objectives of the test.

 The criterion for concurrent validation is not necessarily a proven, longer test. A test might be validated against teachers' assessments of their students, provided that the assessments themselves can be relied on.

 Predictive validity concerns the degree to which a test can predict candidates' future performance.  For example, for a proficiency test the criterion might be an assessment of the student's English as perceived by his or her supervisor at the university, or it could be the outcome of the course (pass/fail).
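A minimal sketch of how such a prediction could be checked; with a pass/fail criterion coded as 1/0, the Pearson correlation is the point-biserial coefficient. All numbers are invented:

```python
# Predictive validation: correlate proficiency-test scores with a
# later course outcome (1 = pass, 0 = fail). Data are invented.
from statistics import correlation  # Python 3.10+

proficiency_scores = [52, 75, 63, 48, 81, 69, 57, 90]
course_outcome = [0, 1, 1, 0, 1, 1, 0, 1]

# With a dichotomous criterion this is the point-biserial coefficient.
r_pb = correlation(proficiency_scores, course_outcome)
print(f"predictive validity coefficient: {r_pb:.2f}")
```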

 A test is said to have construct validity if it can be demonstrated that it measures just the ability which it is supposed to measure. The word "construct" refers to any underlying ability which is hypothesized in a theory of language ability.

 How? It is a matter of empirical research to establish whether or not such a distinct ability exists, can be measured, and is indeed measured by the test.

 Is construct validation appropriate for practical test situations?

 Gross, commonsense constructs like "reading ability" and "writing ability" pose no problem when we can measure them directly: even without evidence from research, we can be confident that we are measuring a distinct and meaningful ability.

However, we cannot be so sure when we test them indirectly. In that case, we have to look to a theory of writing ability for guidance on how to construct the test.

 For example, a multiple-choice test of writing.  How can it be validated?
- Pilot the test.
- Obtain extensive samples of the writing of the same group.
- Score the writing samples.
- Compare the pilot-test scores with the writing scores.
- Examine the level of agreement.

 However, even if we develop a satisfactory indirect test of writing, have we demonstrated the reality of the underlying constructs (e.g. punctuation)?

 How? Administer a series of specially constructed tests, measuring each of the constructs by a number of different methods. The writing samples can also be scored separately with respect to each hypothesized construct.

 If the coefficients between scores on the same construct are higher than those between scores on different constructs, we have evidence that we are indeed measuring separate and identifiable constructs.
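A minimal sketch of that comparison, with construct names and scores invented for illustration: if two methods of measuring the same construct agree more strongly than measures of different constructs do, that is evidence for separate constructs.

```python
# Compare coefficients: same construct measured by two methods
# versus different constructs. All scores are invented.
from statistics import correlation  # Python 3.10+

punctuation_method_a = [7, 9, 4, 8, 6, 10, 5, 7]
punctuation_method_b = [6, 9, 5, 8, 6, 9, 4, 8]
organization_method_a = [5, 4, 8, 6, 9, 5, 7, 6]

same_construct = correlation(punctuation_method_a, punctuation_method_b)
different_constructs = correlation(punctuation_method_a, organization_method_a)

# Higher same-construct coefficients are evidence that the
# hypothesized constructs are separately measurable.
print(f"same construct:       {same_construct:.2f}")
print(f"different constructs: {different_constructs:.2f}")
```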

 Write explicit specifications for the test (the constructs to be measured; a representative sample of the content).  Use direct testing where possible.  Make sure that the scoring of responses relates directly to what is being tested.  Make the test reliable: it cannot be valid unless it is reliable.
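The slides do not name a reliability statistic, but one common choice is Cronbach's alpha, sketched below with invented item scores (rows are candidates, columns are items):

```python
# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance
# of candidates' total scores). Item scores are invented.
from statistics import pvariance

item_scores = [
    [1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 1, 0, 1],
    [0, 0, 0, 0, 1],
]

k = len(item_scores[0])                     # number of items
items = list(zip(*item_scores))             # scores grouped by item
totals = [sum(row) for row in item_scores]  # each candidate's total

alpha = (k / (k - 1)) * (1 - sum(pvariance(i) for i in items) / pvariance(totals))
print(f"Cronbach's alpha: {alpha:.2f}")  # about 0.70 for these data
```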