PSYCHOMETRICS. SPHS 5780, LECTURE 6: PSYCHOMETRICS, “STANDARDIZED ASSESSMENT”, NORM-REFERENCED TESTING.


PSYCHOMETRICS

Psychometrics
“psycho” + “metric”
Who is responsible for assessing psychometric adequacy? Is this the case even when a test or procedure is mandated? Who is affected when a psychometrically inadequate measure is used?

Psychometrics
Discussion: What makes a test or diagnostic procedure “good”?

Psychometrics
When assessing psychometric adequacy, we are assessing the rigor of assessment.
Validity: the test actually assesses what you think it is assessing (what it is supposed to be assessing).
Reliability: testing is consistent across time and across different clients.

Psychometrics
Is assessment of psychometric adequacy (reliability and validity) needed for:
- Formal, norm-referenced tests?
- Criterion-referenced tests?
- Behavioral observation?

Psychometrics
Rigor of assessment for informal testing:
Validity: a clear definition of the test domain, on which experts agree, as evidenced through “mastery by masters” and “non-mastery by non-masters.”
Reliability: consistency in decision outcome, in “distance” from the cut-off score, and in scores on closely related versions of the measure.
Rigor of assessment for formal tests will be addressed in more detail in the current lecture.
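The “consistency in decision outcome” notion above can be quantified as the proportion of clients who receive the same mastery/non-mastery decision on two closely related forms of a criterion-referenced measure. A minimal sketch, assuming hypothetical scores and a made-up cut-off of 8:

```python
# Decision-consistency check for a criterion-referenced measure:
# the proportion of clients classified the same way (mastery vs.
# non-mastery) by two closely related forms of the test.

CUTOFF = 8  # hypothetical mastery cut-off score

def decision_consistency(form_a, form_b, cutoff=CUTOFF):
    """Fraction of paired scores yielding the same pass/fail decision."""
    same = sum((a >= cutoff) == (b >= cutoff) for a, b in zip(form_a, form_b))
    return same / len(form_a)

# Hypothetical scores for five clients on two parallel forms
form_a = [10, 7, 9, 5, 8]
form_b = [9, 8, 9, 4, 8]

print(decision_consistency(form_a, form_b))  # 0.8: one client's decision flips
```

A value near 1.0 means the measure sorts clients into the same category regardless of which form is administered; clients scoring near the cut-off are the ones most likely to flip.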

STANDARDIZATION

Standardization
Standardization ensures that all people taking the test, regardless of time or place of administration and regardless of clinician:
- receive the same experience
- are expected to perform the same task with the same materials
- receive the same amount of assistance from the examiner
- are evaluated according to a standard set of criteria
WELL-CONCEIVED STANDARDIZED TESTS ALLOW FOR MEANINGFUL COMPARISON AMONG CHILDREN.
Ideally, standardized tests are developed by giving the test to a large group of children so that the test makers may compute an acceptable range of variation in the scores for the age range covered. If the same materials, procedures, and scoring are not used with all children, then the results will have limited comparability across children.
Example: putting together a three-piece puzzle. Some examiners might show the child the completed puzzle first; some might not; some might impose a one-minute time limit; others might allow unlimited time.
Strict adherence to standardized procedure is fundamentally important. This is a problem with some children with disabilities, which may call for informal use of a standardized test.

Standardization
“Standardized test” is a term commonly used to refer to norm-referenced tests. Why? Does this mean that other types of tests are not standardized?

Standardization
How does standardization support a test’s validity? A test’s reliability?

Review of the formal/informal distinction

Purpose of Formal vs. Informal Testing
To determine eligibility for services, a norm-referenced test needs to be sensitive to the presence of a disorder (it should not miss those who have the disorder) and specific in its referral decisions (it should not refer those who do not have the disorder). The same requirements do not hold for informal testing. Do you see why?
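Sensitivity and specificity can be computed directly from screening outcomes. A minimal sketch, using made-up counts purely for illustration:

```python
# Sensitivity and specificity of an eligibility decision, computed
# from hypothetical screening outcomes (all counts are invented).

def sensitivity(true_pos, false_neg):
    """Proportion of clients WITH the disorder that the test flags."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Proportion of clients WITHOUT the disorder that the test clears."""
    return true_neg / (true_neg + false_pos)

# Hypothetical results: 50 children with a disorder, 50 without
tp, fn = 45, 5    # disorder present: 45 flagged, 5 missed
tn, fp = 40, 10   # disorder absent: 40 cleared, 10 over-referred

print(sensitivity(tp, fn))  # 0.9  (misses 10% of true cases)
print(specificity(tn, fp))  # 0.8  (over-refers 20% of typical children)
```

Note the trade-off implied here: lowering a test’s cut-off score raises sensitivity (fewer missed cases) at the cost of specificity (more over-referrals), and vice versa.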

Examples of Formal vs. Informal Testing
Formal: CELF
Informal: augmentative communication evaluation

NORM-REFERENCED vs. CRITERION-REFERENCED MEASURES
Function:
  Norm-referenced: identify clients who need help.
  Criterion-referenced: plan the nature of that help.
Normative group performance:
  Norm-referenced: establishes the group of scores against which the client is compared.
  Criterion-referenced: establishes the performance standard (the “criterion”). NOTE: the performance standard may be the client’s own past performance!
Interpretation of the client’s performance:
  Norm-referenced: relative to the range of performance of others.
  Criterion-referenced: relative to the standard (“criterion”): mastery vs. non-mastery.
When to use:
  Norm-referenced: when norms are available; when broad content (a “trait”) is of interest.
  Criterion-referenced: when norms are unavailable or inappropriate; when specific skills/behaviors are of interest.
Focus:
  Norm-referenced: specific Dx.
  Criterion-referenced: Tx planning.

NORM-REFERENCED vs. CRITERION-REFERENCED MEASURES (continued)
Level of detail for a given area:
  Norm-referenced: low detail (broad area).
  Criterion-referenced: high detail (narrow area).
Summary scores:
  Norm-referenced: converted scores.
  Criterion-referenced: raw scores.
Choice of test items:
  Norm-referenced: items on which test-takers perform variably.
  Criterion-referenced: items always passed by masters and items always failed by non-masters.
Decision to be defended:
  Norm-referenced: relative ranking compared to the normative group.
  Criterion-referenced: dichotomous (mastery vs. non-mastery).
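The “items on which test-takers perform variably” criterion can be made concrete with item difficulty values (the proportion of test-takers passing each item). A minimal sketch with an invented response matrix:

```python
# Item difficulty (p-value = proportion of test-takers passing an item).
# Norm-referenced tests favor items with p near 0.5, because those items
# spread test-takers apart; items that everyone passes (or everyone fails)
# carry no ranking information. The response matrix below is made up.

responses = [            # rows = test-takers, columns = items (1 = pass)
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [1, 0, 1, 1],
]

n_takers = len(responses)
p_values = [sum(row[i] for row in responses) / n_takers
            for i in range(len(responses[0]))]

print(p_values)  # [1.0, 0.5, 0.5, 1.0]
# Items 2 and 3 (p = 0.5) help rank test-takers; items 1 and 4,
# passed by everyone, would be kept on a criterion-referenced test
# (masters should pass them) but add nothing to a norm-referenced ranking.
```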

Formal (Norm-referenced) Tests: Nature

Purpose and Procedure of Norm-Referenced Tests
The fundamental purpose is to rank individuals: “to determine if an individual obtains a score similar to the group average or, if not, how far away from average the score is” (H&P; M&P), and “(to determine) if there is a problem, or a significant enough difference from standard performance to warrant concern with regard to normalcy.”
Performance is summarized with reference to the “standardization sample” (normative sample), through conversion of raw scores to percentile scores and standard scores.
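The raw-score conversion can be sketched with a standard (z) score and its corresponding percentile under a normal distribution. The mean and SD below are hypothetical; a real test manual supplies its own norms tables:

```python
import math

# Converting a raw score to a standard score (z) and a percentile,
# given the normative sample's mean and standard deviation.
# The mean/SD values here are invented for illustration.

def z_score(raw, norm_mean, norm_sd):
    """Distance of a raw score from the normative mean, in SD units."""
    return (raw - norm_mean) / norm_sd

def percentile(z):
    """Percent of the normative sample at or below z (normal CDF)."""
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical norms: mean raw score 60, SD 10; client's raw score 45
z = z_score(45, 60, 10)
print(round(z, 2))              # -1.5
print(round(percentile(z), 1))  # 6.7 -> roughly the 7th percentile
```

A z of -1.5 places the client 1.5 SD below the normative mean, which is exactly the kind of “how far from average” statement the slide above describes; many eligibility criteria are phrased in these SD units.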

Advantages of Norm-Referenced Tests
- Objectivity
- Replicability
- Elimination of unwanted/uncontrolled variation

Disadvantages of Norm-Referenced Tests
- For some domains (e.g., language) it is difficult to find a test that is both valid and reliable.
- The normative sample needs to represent your client.
- NRTs do not accommodate cultural variation.
- NRTs do not accommodate individual variation.

Norm-referenced tests cannot:
- Measure treatment progress
- Guide the creation of therapy goals through analysis of individual test items
- Richly describe performance
- Accommodate cultural variation
Discussion of WHY this is the case…

LECTURE 06A ENDS HERE. THIS IS WHERE MATERIAL FOR EXAM 2 ENDS.