REVIEW I Reliability Index of Reliability Theoretical correlation between observed & true scores Standard Error of Measurement Reliability measure Degree.

Presentation transcript:

REVIEW I: Reliability. Index of Reliability: the theoretical correlation between observed and true scores. Standard Error of Measurement: a reliability measure; the degree to which an observed score fluctuates due to measurement error. Factors affecting reliability. A test must be RELIABLE to be VALID.
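The two reliability quantities named on this slide can be sketched numerically. The formulas used here are the standard textbook ones (index of reliability = sqrt(r_xx), SEM = s * sqrt(1 - r_xx)); the slide itself does not state them, and the standard deviation and reliability coefficient below are hypothetical values chosen only for illustration.

```python
import math


def index_of_reliability(r_xx: float) -> float:
    """Theoretical correlation between observed and true scores: sqrt(r_xx)."""
    if not 0.0 <= r_xx <= 1.0:
        raise ValueError("reliability coefficient must lie in [0, 1]")
    return math.sqrt(r_xx)


def standard_error_of_measurement(sd: float, r_xx: float) -> float:
    """SEM = sd * sqrt(1 - r_xx): expected fluctuation of an observed score."""
    if sd < 0:
        raise ValueError("standard deviation must be non-negative")
    if not 0.0 <= r_xx <= 1.0:
        raise ValueError("reliability coefficient must lie in [0, 1]")
    return sd * math.sqrt(1.0 - r_xx)


if __name__ == "__main__":
    # Hypothetical test: score SD of 10 points, reliability coefficient of 0.84.
    sd, r_xx = 10.0, 0.84
    print(f"index of reliability: {index_of_reliability(r_xx):.3f}")
    print(f"SEM: {standard_error_of_measurement(sd, r_xx):.2f} points")
```

A higher reliability coefficient shrinks the SEM, which is the sense in which the SEM serves as a reliability measure.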

REVIEW II: Types of validity. Content-related (face): represents important/necessary knowledge; use “experts” to establish. Criterion-related: evidence of a statistical relationship with the trait being measured; alternative measures must be validated against a criterion measure. Construct-related: validates unobservable theoretical measures.

REVIEW III: Standard Error of Estimate: a validity measure; the degree of error in estimating a score based on the criterion. Methods of obtaining a criterion measure: actual participation; perform criterion; predictive measures. Interpreting “r”.
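As a rough illustration of the Standard Error of Estimate, the sketch below uses the standard textbook formula SEE = s_y * sqrt(1 - r^2), which the slide does not spell out. The criterion standard deviation and validity coefficient are hypothetical.

```python
import math


def standard_error_of_estimate(sd_criterion: float, r: float) -> float:
    """SEE = s_y * sqrt(1 - r^2): typical error when predicting the criterion."""
    if sd_criterion < 0:
        raise ValueError("standard deviation must be non-negative")
    if not -1.0 <= r <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    return sd_criterion * math.sqrt(1.0 - r * r)


if __name__ == "__main__":
    # Hypothetical: criterion SD of 5 units, validity coefficient r = 0.6.
    print(f"SEE: {standard_error_of_estimate(5.0, 0.6):.2f}")
```

The closer r is to 1 (or -1), the smaller the error in estimating the criterion score from the field measure.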

Criterion-Referenced Measurement: Poor / Sufficient / Better. It’s all about me: did I get ‘there’ or not?

Criterion-Referenced Testing (aka Mastery Learning). Standard development: Judgmental: use experts (typical in human performance). Normative: theoretically accepted criteria. Empirical: cutoff based on available data. Combination: experts and norms typically combined.

Advantages of Criterion-Referenced Measurement: Represent specific, desired performance levels linked to a criterion. Independent of the % of the population that meets the standard. If not met, specific diagnostic evaluations can be made. Degree of performance is not important; reaching the standard is. Performance linked to specific outcomes. Individuals know exactly what is expected of them.

Limitations of Criterion-Referenced Measurement: Cutoff scores always involve subjective judgment. Misclassifications can be severe. Motivation can be impacted (frustrated/bored).

Setting a Cholesterol “Cut-Off” [Figure: number of deaths plotted against cholesterol (mg/dl)]


Statistical Analysis of CRTs: Nominal data (categorical: major, gender, pass/fail, etc.). Contingency table development (2x2 chi-square). Chi-square analysis (used with categorical variables). Proportion of agreement (see next slide). Phi coefficient (correlation for dichotomous (yes/no) variables).
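The phi coefficient and chi-square statistic for a 2x2 contingency table can be sketched as follows. The cell labels n1..n4 are read row by row, with n1 and n4 as the agreeing cells, and the counts are hypothetical. The chi-square shown is the uncorrected Pearson statistic, which for a 2x2 table equals N times phi squared; whether the course applied a continuity correction is not stated on the slide.

```python
import math


def phi_coefficient(n1: int, n2: int, n3: int, n4: int) -> float:
    """Phi for a 2x2 table [[n1, n2], [n3, n4]] of two dichotomous variables."""
    row1, row2 = n1 + n2, n3 + n4
    col1, col2 = n1 + n3, n2 + n4
    denom = math.sqrt(row1 * row2 * col1 * col2)
    if denom == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (n1 * n4 - n2 * n3) / denom


def chi_square_2x2(n1: int, n2: int, n3: int, n4: int) -> float:
    """Uncorrected Pearson chi-square for a 2x2 table: N * phi^2."""
    n = n1 + n2 + n3 + n4
    return n * phi_coefficient(n1, n2, n3, n4) ** 2


if __name__ == "__main__":
    # Hypothetical pass/fail counts, not taken from the slides.
    table = (40, 10, 10, 40)
    print(f"phi = {phi_coefficient(*table):.2f}")
    print(f"chi-square = {chi_square_2x2(*table):.1f}")
```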

Proportion of Agreement (P): Sum the correctly classified cells and divide by the total: P = (n_1 + n_4)/(n_1 + n_2 + n_3 + n_4). Examples on board.

Considerations with CRT: The same as norm-referenced testing. Reliability (consistency). Equivalence: is the PACER equivalent to the 1-mi run/walk? Stability: does the same test result in consistent findings? Validity (truthfulness of measurement). Criterion-related: concurrent or predictive. Construct-related: establish cut scores (see Fig. 7.3).

Meeting Criterion-Referenced Standards: Possible Decisions

                           Truly Below Criterion   Truly Above Criterion
Did not achieve standard   Correct Decision        False Positive
Did achieve standard       False Negative          Correct Decision

CRT Reliability: Test/retest of a single measure

               Day 2 Fail   Day 2 Pass
Day 1 Fail     n_1          n_2
Day 1 Pass     n_3          n_4

P = (n_1 + n_4)/(n_1 + n_2 + n_3 + n_4)
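A minimal sketch of the test/retest tally, assuming each person has a pass/fail result on Day 1 and on Day 2. The function name `retest_agreement`, the row-by-row assignment of n2 and n3, and the sample data are all assumptions made for illustration.

```python
from collections import Counter
from typing import Sequence, Tuple


def retest_agreement(
    day1: Sequence[bool], day2: Sequence[bool]
) -> Tuple[Tuple[int, int, int, int], float]:
    """Tally a 2x2 pass/fail table (True = pass) and return (cells, P).

    cells = (n1, n2, n3, n4) with n1 = fail/fail, n2 = fail then pass,
    n3 = pass then fail, n4 = pass/pass.
    """
    if len(day1) != len(day2) or len(day1) == 0:
        raise ValueError("day1 and day2 must be non-empty and of equal length")
    counts = Counter(zip(day1, day2))
    n1 = counts[(False, False)]
    n2 = counts[(False, True)]
    n3 = counts[(True, False)]
    n4 = counts[(True, True)]
    p = (n1 + n4) / (n1 + n2 + n3 + n4)
    return (n1, n2, n3, n4), p


if __name__ == "__main__":
    # Hypothetical results for eight participants.
    day1 = [True, True, False, False, True, False, True, True]
    day2 = [True, False, False, False, True, True, True, True]
    cells, p = retest_agreement(day1, day2)
    print(cells, p)
```

The same tally applies to the validity case if the two sequences are the field test result and the criterion result instead of two test days.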

CRT Validity: Use of a field test and criterion measure

                   Field Test Fail   Field Test Pass
Criterion Fail     n_1               n_2
Criterion Pass     n_3               n_4

Example 1: FITNESSGRAM Standards (1987)

Run/walk test result (did not achieve / did achieve the standard) classified against criterion VO2max (below / above the criterion).
Cell counts as listed on the slide: 24 (4%), 21 (4%), 64 (11%), 472 (81%)

P = (24 + 472)/(24 + 21 + 64 + 472) = 496/581 = 85%

Example 2: AAHPERD Standards (1988)

Run/walk test result (did not achieve / did achieve the standard) classified against criterion VO2max (below / above the criterion).
Cell counts as listed on the slide: 130 (22%), 23 (4%), 201 (35%), 227 (39%)

P = (130 + 227)/(130 + 23 + 201 + 227) = 357/581 = 61%

Compare Examples 1-2: FITNESSGRAM (81%) is a better predictor of VO2max than the AAHPERD standards (39%).
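The proportions of agreement for the two examples can be checked with a few lines of code. The cell counts are the ones given on the slides; treating the first and last count of each example as the correctly classified cells (n_1 and n_4) is an assumption that matches the 496/581 and 357/581 totals shown.

```python
from typing import Sequence


def proportion_of_agreement(cells: Sequence[int]) -> float:
    """P = (n1 + n4) / (n1 + n2 + n3 + n4) for cells given as (n1, n2, n3, n4)."""
    if len(cells) != 4:
        raise ValueError("expected exactly four cell counts")
    total = sum(cells)
    if total == 0:
        raise ValueError("table is empty")
    return (cells[0] + cells[3]) / total


# Counts from the slides, in the order they are listed.
FITNESSGRAM_1987 = (24, 21, 64, 472)
AAHPERD_1988 = (130, 23, 201, 227)

if __name__ == "__main__":
    for name, cells in [("FITNESSGRAM", FITNESSGRAM_1987), ("AAHPERD", AAHPERD_1988)]:
        print(f"{name}: P = {proportion_of_agreement(cells):.0%}")
```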

Criterion-Referenced Measurement. Find a friend: Explain one thing that you learned today and share WHY IT MATTERS to you as a future professional.