Validity and Reliability II: The Basics


Validity and Reliability II: The Basics
EDU 300 | Newberry College
Jennifer Morrison
Picture: http://commons.wikimedia.org/wiki/File:Honda_Civic_1995.jpg

Validity and Reliability II
What is the difference between reliability and validity?
Why are they important concepts?
How can you make your assessments more reliable?
How can you make your inferences from those assessments more valid?
By the end of class you should be able to answer these questions.

Reliability Definition #1
Statisticians say an assessment is reliable when it gets consistent results over time. That means if Susie takes Assessment X in January, she will get the same results when she takes the assessment (even if it's a different version) in June, given that she does not learn anything that's on Assessment X between tests.
Discuss the statistical definition of reliability.

Reliability Definition #2
For our purposes, let's use James Popham's definition of reliability: that the assessment measures what it is supposed to measure. That means when Susie takes Assessment X, it actually assesses what Susie learned in Course X.
Do your assessments measure what they are supposed to measure? How do you know?
Reveal the first question and ask participants, "Are your classroom assessments reliable? In other words, do they measure what they are supposed to measure?" Participants will probably answer yes. Ask participants what our classroom assessments are supposed to measure. The general consensus should be SC Academic Standards and objectives. Reveal the second question, "How do you know?", and state that most teachers' classroom assessments are not actually reliable because they are not on target with SC Academic Standards or even the teacher's stated objectives. This is one good reason to write your standards and objectives on your assessments.

There is a general feeling that data from teachers' classroom tests are not reliable. Why?
Reliable is defined by many as meaning trustworthy.
Difference between students' grades and achievement on standardized tests
Curriculum and assessment inconsistency between teachers
Teachers teach and test what they like.
Tests we've seen don't look good.
No one knows what really happens in a teacher's classroom (the door is shut).
Students tell parents they aren't learning anything.
Parents aren't getting any other information.
No one can figure out where grades come from.
Picture: http://www.tripwire.com/blog/2009/06/

On Target vs. On Topic
For classroom assessments to be reliable in regard to state standards, assessment items must be on target, not just on topic.
Discuss these two points. Ask participants to give examples of how teachers' assessments might be on topic but not on target.

Example
Standard: Use context clues to determine the meaning of technical terms and other unfamiliar words. (SC E4-3.1)
Something used to confine a dog is… A) a cage, B) training, or C) identity tags.
How would the assessment question have to be structured in order to assess the standard effectively? How would the assessment question have to be structured in order to assess the standard at an advanced level?
Show and discuss this example. The example question is not on target because there are no context clues for students to use to determine the meaning of the word "confine." In fact, in this instance, the word had been previously taught and memorized. Therefore, even if context were given, students would simply have to remember what the word meant, not use context clues. This standard cannot be met unless students must apply their knowledge of context clues to determine the meaning of an unfamiliar word. To assess the standard effectively, the word must be unfamiliar and students must use context clues in a sentence or passage to determine meaning; the sentence or passage must contain the context clues students should use. The assessment might also ask the student to explain what context clues he or she used. To assess the standard at an advanced level, the student would have to create context clues for unfamiliar words or evaluate the ways in which an author supplies context clues through word choice. The student could also design a mental tool for locating and analyzing context clues.

How can we make sure our assessments are reliable?
Be sure assessment items are on target (not just on topic).
Follow the rules when designing your assessment items.
Assess your assessment.
DIY: Do an item analysis; have two people evaluate (inter-rater reliability).
Use statistics: Find the reliability coefficient (using test-retest, equivalent forms, or internal consistency methods).
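The test-retest method above can be sketched numerically: the reliability coefficient is simply the Pearson correlation between scores from two administrations of the same (or an equivalent-form) assessment. A minimal Python sketch, using invented scores for eight hypothetical students:

```python
# Hypothetical example: test-retest reliability as the Pearson correlation
# between two administrations of an assessment. All scores are invented.
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

january = [72, 85, 90, 64, 78, 88, 95, 70]  # first administration
june    = [75, 83, 92, 60, 80, 85, 97, 73]  # retest with an equivalent form

r = pearson(january, june)
print(f"test-retest reliability coefficient: r = {r:.2f}")
```

A coefficient near 1.0 means students kept roughly the same relative standing across administrations; a value much below about 0.8 would suggest the assessment is not producing consistent results.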

SEM
The standard error of measurement (SEM) is a statistic that indicates the amount of error to allow for when interpreting assessment scores. The SEM shows how many points we must add to or subtract from an individual's test score in order to estimate the range of that individual's true score, or score free from error. We use the SEM to create a score band, or confidence band.
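The SEM is computed from two numbers the slide's definition implies: the spread of the scores and the test's reliability, via SEM = SD × √(1 − reliability). A short sketch with assumed values (SD of 10, reliability of 0.91, and an observed score of 78 — all hypothetical):

```python
# Hypothetical numbers: SEM = SD * sqrt(1 - reliability coefficient),
# then a 68% confidence band of observed score +/- 1 SEM.
from math import sqrt

sd = 10.0           # standard deviation of test scores (assumed)
reliability = 0.91  # reliability coefficient (assumed)

sem = sd * sqrt(1 - reliability)
print(f"SEM = {sem:.1f} points")

observed = 78       # one student's observed score (assumed)
low, high = observed - sem, observed + sem
print(f"68% confidence band: {low:.0f} to {high:.0f}")
```

With these numbers the SEM is 3 points, so we would report that the student's true score most likely falls between 75 and 81 rather than treating 78 as exact. Note that higher reliability shrinks the band.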

Validity
Are the inferences we make from the data accurate? We are most likely to draw valid inferences when we know…
the assessment and assessment procedure.
how the assessment results were determined and what they mean.
what was assessed.
the consequences of using the assessment.

How can we make sure our inferences are valid?
Make sure the assessment is reliable (that it assesses what it is supposed to assess).
Make sure the assessment has an adequate sample (items, tasks).
Have a procedure and rationale in place for scoring. The "numbers" need to be useful and mean something.
Think about your interpretations.
DIY: Compare performance to other measures, class averages, and across multiple groups; go over the assessment and have students share their thought processes.
Use statistics: Correlation coefficient (degree of relationship between two measures); expectancy table.
Examples: correlation between SAT scores and college grades; correlation between absences and course grades.
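The expectancy table mentioned above can be sketched as a simple cross-tabulation: score bands on the assessment against a later outcome, such as the final course grade. The records below are invented purely for illustration:

```python
# Hypothetical expectancy table: cross-tabulating assessment score bands
# against a later outcome (final course grade). All data are invented.
from collections import Counter

# (assessment score, final course grade) pairs -- made-up data
records = [(92, "A"), (88, "A"), (81, "B"), (77, "B"),
           (74, "C"), (69, "C"), (66, "B"), (58, "D")]

def band(score):
    """Assign a score to a 10-point band label, e.g. '80-89'."""
    low = (score // 10) * 10
    return f"{low}-{low + 9}"

table = Counter((band(score), grade) for score, grade in records)
for (b, grade), count in sorted(table.items(), reverse=True):
    print(f"scores {b}: {count} student(s) earned {grade}")
```

If high score bands line up with high outcomes, that supports a valid inference from the assessment; scattered cells (like the 66 earning a B here) are exactly the cases worth investigating.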

What’s Due?