Measurement Concepts & Interpretation

Scores on tests can be interpreted by comparing a client to peers in the norm group to determine how different the client is from that group (inter-individual).
–Scores are provided in norm tables
–The score in the norm table usually indicates how the client compares with peers in the same age group or grade
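
To make the inter-individual comparison concrete, here is a minimal Python sketch; the norm-group mean of 100, the SD of 15, and the client's raw score are all assumed, IQ-style values:

    from statistics import NormalDist

    # Assumed norm-group parameters (IQ-style metric).
    norm_mean, norm_sd = 100.0, 15.0
    client_raw = 112  # hypothetical client score

    # z-score: how many SDs the client falls from the norm-group mean.
    z = (client_raw - norm_mean) / norm_sd

    # Percentile rank, assuming normally distributed norm-group scores.
    percentile = NormalDist().cdf(z) * 100
    print(f"z = {z:.2f}, percentile = {percentile:.0f}")  # z = 0.80, percentile = 79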

Interpretation, cont. Comparing a client with his or her own performance (intra-individual)

Define:
–Mean
–Median
–Mode
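
For a quick numeric check of all three, Python's statistics module does the work; the score list is made up for illustration:

    from statistics import mean, median, mode

    scores = [85, 90, 90, 95, 100, 105, 130]  # made-up raw scores
    print(mean(scores))    # arithmetic average: ~99.3
    print(median(scores))  # middle value when sorted: 95
    print(mode(scores))    # most frequent value: 90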

So you wanna use psychological tests… Um, CAREFULLY review the test manual. Consider these aspects:
–Theoretical Orientation of the test/instrument
–Practical Considerations
–Standardization
–Reliability
–Validity
(Groth-Marnat, 2003)

Theoretical Orientation
Do you adequately understand the theoretical construct the test is supposed to be measuring?
–If not, do some research.
Do the test items correspond to the theoretical description of the construct?
–Manuals usually provide individual analyses of the items…are the items relevant?

Practical Considerations
If reading is required of the examinee, does his or her ability match the level required by the test?
–Tests vary in the level of education they require
How appropriate is the length of the test?
–Some are too damn long, and who likes that?
You can always get additional training for some tests so you become Über good at them.

Standardization (adequacy of norms)
Is the population to be tested similar to the population the test was standardized on?
Was the size of the standardization sample adequate?
Have specialized subgroup norms been established?
How adequately do the instructions permit standardized administration?

Norms!

Reliability
The reliability of a test refers to its degree of stability, consistency, predictability, and accuracy.
Are reliability estimates sufficiently high? (Correlations generally around .90 for clinical decision making and around .70 for research purposes.)
What implications do the relative stability of the trait, the method of estimating reliability, and the test format have on reliability?

You tell me…
Test-Retest Reliability
–The reliability coefficient is calculated by correlating the scores obtained by the same person on two different administrations.
Alternate Forms
–The trait is measured several times on the same individual using parallel/alternate forms of the test – the different measurements should produce similar results.
Split-Half Reliability
–The test is given only once (the items are split in half, and the two halves are correlated).
Interscorer Reliability
–Used when scoring is based partially on the judgment of the examiner (e.g., Rorschach): two people score one client's responses, and their scores are correlated.
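
Two of these estimates in a minimal Python sketch, with made-up scores for five examinees; statistics.correlation (Python 3.10+) computes Pearson's r, and the Spearman-Brown formula steps the half-test correlation up to the full-length test:

    from statistics import correlation  # Pearson's r (Python 3.10+)

    # Test-retest: the same five people take the test twice.
    time1 = [12, 18, 25, 30, 41]
    time2 = [14, 17, 27, 28, 40]
    r_test_retest = correlation(time1, time2)

    # Split-half: one administration, odd items vs. even items...
    odd_half  = [6, 9, 13, 15, 20]
    even_half = [6, 9, 12, 15, 21]
    r_half = correlation(odd_half, even_half)

    # ...then correct to full length with Spearman-Brown.
    r_full = (2 * r_half) / (1 + r_half)

    print(f"test-retest r = {r_test_retest:.2f}; split-half r = {r_full:.2f}")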

All tests have a degree of error
The inevitable, natural variation in human performance
–Measures of ability usually have less variability than measures of personality…why?
Psychological testing methods are necessarily imprecise
–Constructs in psychology are measured indirectly

Standard Error of Measurement
Test scores consist of both truth and error.
The SEM provides a range of scores to indicate how extensive that error is likely to be.
–The higher the reliability, the narrower the range of error
The SEM is a standard deviation score.
–An SEM of 3 on an IQ test would indicate that the individual's score has a 68% chance of falling within +/- 3 IQ points of the estimated true score – refer back to the normal distribution curve
–The SEM is a statistical index of how a person's repeated scores on a specific test would fall around a normal distribution (also referred to as a confidence interval)
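
The standard formula is SEM = SD × √(1 − reliability). A worked sketch in Python, assuming an IQ-style scale (SD = 15) and a reliability of .96, which reproduces the SEM of 3 from the example above:

    import math

    sd = 15.0    # scale SD (IQ-style metric, assumed)
    rxx = 0.96   # reliability coefficient (assumed for illustration)

    sem = sd * math.sqrt(1 - rxx)  # SEM = SD * sqrt(1 - reliability)
    print(sem)  # 3.0

    observed = 110  # hypothetical obtained score
    # ~68% band: +/- 1 SEM; ~95% band: +/- 1.96 SEM
    print(observed - sem, observed + sem)                # 107.0 113.0
    print(observed - 1.96 * sem, observed + 1.96 * sem)  # 104.12 115.88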

Validity
Whereas reliability addresses issues of consistency, validity assesses what the test is supposed to be accurate about.
What criteria and procedures were used to validate the test?
Will the test produce accurate measurements in the context and for the purpose for which you would like to use it?
–A psychological test is not valid in any abstract or absolute sense. It must be valid in a particular CONTEXT and for a specific group of people.

Face Validity
Face validity is present if the test looks good to the persons taking it, to the policymakers who decide to include it in their programs, and to other untrained personnel.

Criterion Validity
Concurrent validity
–Measurements taken at the same, or approximately the same, time as the test
–Concurrent validation is preferable if an assessment of the client's current status is required
Predictive validity
–Outside measurements taken some time after the test scores were derived. For example, predictive validity may be evaluated by correlating test scores with scores from similar measures taken a year after the initial testing
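
A minimal sketch of a predictive-validity check, with made-up intake scores and a hypothetical criterion measure collected a year later:

    from statistics import correlation  # Pearson's r (Python 3.10+)

    # Hypothetical screening-test scores at intake...
    test_scores = [52, 61, 48, 70, 66, 55]
    # ...and a criterion measure collected a year later.
    criterion_later = [50, 64, 45, 72, 60, 58]

    # The predictive-validity coefficient is simply their correlation.
    print(f"r = {correlation(test_scores, criterion_later):.2f}")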

Construct Validity
The extent to which the test measures a theoretical construct or trait
–First, the trait must be carefully analyzed
–Consider the ways in which the trait should relate to other variables
–Test the hypothesized relationships
Does the test converge with variables that are theoretically similar to it?
Does it discriminate from variables that are dissimilar to it?
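
A sketch of the convergent/discriminant check, with made-up scores: a hypothetical new anxiety measure should correlate highly with an established anxiety measure and near zero with an unrelated variable:

    from statistics import correlation  # Pearson's r (Python 3.10+)

    new_anxiety_test = [10, 14, 9, 20, 17, 12]       # hypothetical new measure
    established_anxiety = [11, 15, 8, 21, 16, 13]    # theoretically similar
    reading_speed = [210, 240, 200, 230, 180, 260]   # theoretically unrelated

    r_convergent = correlation(new_anxiety_test, established_anxiety)
    r_discriminant = correlation(new_anxiety_test, reading_speed)

    # Support for construct validity: high convergent r, near-zero discriminant r.
    print(f"convergent r = {r_convergent:.2f}, discriminant r = {r_discriminant:.2f}")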

Incremental Validity
For a test to be considered useful and efficient, it must produce accurate results above and beyond the results that could be obtained with greater ease and less expense.
Hey, self-assessments are pretty handy!

Beck Depression Inventory II (BDI-II)
Add up the score for each of the twenty-one questions to obtain the total. The highest score on each question is three, so the highest possible total for the whole test is sixty-three; the lowest possible score is zero. Only add one score per question (the highest rated, if more than one is circled).
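
A minimal scoring sketch following these rules (the function name and input format are illustrative, not part of the published BDI-II materials):

    def score_bdi_ii(item_responses):
        """Total a BDI-II protocol: 21 items, each rated 0-3."""
        assert len(item_responses) == 21, "the BDI-II has twenty-one questions"
        total = 0
        for response in item_responses:
            if isinstance(response, int):
                response = [response]  # normalize a single circled rating
            assert all(0 <= r <= 3 for r in response), "ratings run 0-3"
            total += max(response)  # only the highest circled rating counts
        return total  # 0 (minimum) to 63 (maximum)

    # Example: nineteen items rated 0, one double-circled item, one rated 3.
    answers = [0] * 19 + [[1, 2], 3]
    print(score_bdi_ii(answers))  # 2 + 3 = 5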

“So what does my BDI-II score mean?”
Below 4 = possible denial of depression, faking good
4-9 = these ups and downs are considered normal (i.e., suck it up)
10-18 = mild to moderate depression
19-29 = moderate to severe depression
30-63 = severe depression
Over 44 = pretty damn high even for severely depressed persons; possible exaggeration of symptoms

Same for the BAI
0-21 = low anxiety
22-35 = moderate anxiety
Over 36 = high anxiety, may be severe
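
A tiny helper that maps a BAI total to these bands (the function name is illustrative; the 22-35 middle band is inferred from the cutoffs above):

    def interpret_bai(total):
        """Map a BAI total (0-63) to the interpretive bands above."""
        if not 0 <= total <= 63:
            raise ValueError("BAI totals range from 0 to 63")
        if total <= 21:
            return "low anxiety"
        if total <= 35:
            return "moderate anxiety"
        return "high anxiety, may be severe"

    print(interpret_bai(28))  # moderate anxiety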