Measurement MANA 4328 Dr. Jeanne Michalski
Employment Tests Employment Test: An objective and standardized measure of a sample of behavior that is used to gauge a person’s knowledge, skills, abilities, and other characteristics (KSAOs) in relation to other individuals. Pre-employment testing also carries the potential for lawsuits.
Classification of Employment Tests Cognitive Ability Tests Aptitude tests Measures of a person’s capacity to learn or acquire skills. Achievement tests Measures of what a person knows or can do right now. Personality and Interest Inventories “Big Five” personality factors: Extroversion, agreeableness, conscientiousness, neuroticism, openness to experience.
Classification of Employment Tests (cont’d) Physical Ability Tests Must be related to the essential functions of job. Job Knowledge Tests An achievement test that measures a person’s level of understanding about a particular job. Work Sample Tests Require the applicant to perform tasks that are actually a part of the work required on the job.
Reliability: Basic Concepts Observed score = true score + error. Error is anything that affects test scores other than the characteristic being measured. Reliability reflects the amount of error: the lower the error, the better the measure. Things that can be observed directly are easier to measure than things that must be inferred.
Basic Concepts of Measurement 1. Variability and comparing test scores: Mean / Standard Deviation 2. Correlation coefficients 3. Standard Error of Measurement 4. The Normal Curve: the distribution of scores when many people take a test; Z scores and Percentiles
EEOC Uniform Guidelines Reliability – consistency of the measure. If the same person takes the test again, will he/she earn the same score? Potential contaminations: test taker's physical or mental state, environmental factors, test forms, multiple raters.
Reliability Test Methods Test – retest Alternate or parallel form Inter-rater Internal consistency Methods of calculating correlations between test items, administrations, or scoring.
Correlation How strongly are two variables related? Correlation coefficient (r) ranges from −1.00 to +1.00. Shared variation = r². If two variables are correlated at r = .6, then they share .6² = .36, or 36%, of the total variance. Illustrated using scatter plots. Used to test the consistency and accuracy of a measure.
Correlation Scatterplots Figure 5.3
Summary of Types of Reliability
Objective measures (test items): compare scores within T1 → internal consistency or alternate form; compare scores across T1 and T2 → test-retest.
Subjective ratings: interrater reliability compares different raters; intrarater reliability compares the same rater at different times.
Standard Error of Measure (SEM) An estimate of the potential error for an individual test score. Uses variability AND reliability to establish a confidence interval around a score. A 95% Confidence Interval (CI) means that if one person took the test 100 times, about 95 of the scores would fall within the upper and lower bounds. SEM = SD * √(1 − reliability). A score falling outside the CI has less than a 5% chance of differing by measurement error alone, so the difference is considered “significant.”
Standard Error of Measure (SEM) SEM = SD * √(1 − reliability). Assume a mathematical ability test has a reliability of .9 and a standard deviation of 10: SEM = 10 * √(1 − .9) = 3.16. If an applicant scores a 50, the SEM is the degree to which the score would vary if she were retested on another day. Plus or minus 2 SEM gives a ~95% confidence interval: 50 + 2(3.16) = 56.32 and 50 − 2(3.16) = 43.68, so the interval runs from 43.68 to 56.32.
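The worked example above (reliability .9, SD 10, observed score 50) can be reproduced directly; this is a minimal sketch of the slide's own formula.

```python
import math

def sem(sd: float, reliability: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

# Values from the slide: reliability .9, SD 10, observed score 50
s = sem(10, 0.9)
lower, upper = 50 - 2 * s, 50 + 2 * s  # ~95% confidence interval

print(f"SEM = {s:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
# SEM = 3.16, 95% CI = [43.68, 56.32]
```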
Standard Error of Measure If an applicant scores 2 points above a passing score and the SEM is 3.16 – then there is a good chance of making a bad selection choice. If two applicants score within 2 points of one another and the SEM is 3.16 then it is possible that the difference is due to chance.
Standard Error of Measure The higher the reliability, the lower the SEM. [Table: SEM values for combinations of standard deviation and reliability (r)]
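The relationship stated above is easy to verify numerically. The standard deviation of 10 below is an assumed value (the slide's exact table entries are not shown); the pattern holds for any SD.

```python
import math

sd = 10  # assumed standard deviation for illustration
sems = []
for rel in (0.50, 0.70, 0.90, 0.95):
    sems.append(sd * math.sqrt(1 - rel))
    print(f"r = {rel:.2f}  SEM = {sems[-1]:.2f}")
# SEM shrinks as reliability rises: 7.07, 5.48, 3.16, 2.24
```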
Confidence Intervals [Figure: scores for Jim (40), Mary (50), and Jen, each shown with a −2 SEM to +2 SEM band] Do the applicants differ when SEM = 2? Do the applicants differ when SEM = 4?
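The two questions above can be answered by checking whether the ±2 SEM bands overlap. This sketch uses Jim's and Mary's scores from the slide (Jen's score is not shown, so she is omitted).

```python
def band(score: float, sem: float) -> tuple[float, float]:
    """Return the -2 SEM to +2 SEM confidence band around an observed score."""
    return (score - 2 * sem, score + 2 * sem)

def overlaps(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """Two intervals overlap when each starts before the other ends."""
    return a[0] <= b[1] and b[0] <= a[1]

jim, mary = 40, 50  # scores from the slide
results = {}
for sem in (2, 4):
    results[sem] = overlaps(band(jim, sem), band(mary, sem))
    print(f"SEM = {sem}: bands overlap -> {results[sem]}")
# With SEM = 2 the bands are distinct (the scores differ);
# with SEM = 4 they overlap (the difference may be chance).
```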
Validity Accuracy of the measure Are you measuring what you intend to measure? OR Does the test measure a characteristic related to job performance? Types of test validity Criterion – test predicts job performance Predictive or Concurrent Content – test representative of the job
Approaches to Validation Content validity The extent to which a selection instrument, such as a test, adequately samples the knowledge and skills needed to perform a particular job. Example: typing tests, driver’s license examinations, work sample Construct validity The extent to which a selection tool measures a theoretical construct or trait. Example: creative arts tests, honesty tests
Approaches to Validation Criterion-related Validity The extent to which a selection tool predicts, or significantly correlates with, important elements of work behavior. A high score indicates high job performance potential; a low score is predictive of low job performance. Two types of Criterion-related validity Concurrent Validity Predictive Validity
Approaches to Validation Concurrent Validity The extent to which test scores (or other predictor information) match criterion data obtained at about the same time from current employees. High or low test scores for employees match their respective job performance. Predictive Validity The extent to which applicants’ test scores match criterion data obtained from those applicants/ employees after they have been on the job for some indefinite period. A high or low test score at hiring predicts high or low job performance at a point in time after hiring.
Tests of Criterion-Related Validity Predictive validity (“Future Employee or Follow-up Method”): test applicants at Time 1, then measure the performance of those hired at Time 2, several months later. Concurrent validity (“Present Employee Method”): test existing employees AND measure their performance at the same time (Time 1).
Types of Validity [Diagram: Job Duties and KSAs are sampled by Selection Tests, which predict Job Performance; content-related validity links tests back to job duties and KSAs, criterion-related validity links tests to job performance]
Reliability vs. Validity Validity coefficients: reject below .11; very useful above .21; rarely exceed .40. Reliability coefficients: reject below .70; very useful above .90; rarely approach 1.00. Why the difference?
More About Comparing Scores
The Normal Curve Rounded cumulative percentiles at whole z scores: z = −3 → 0.1%, z = −2 → 2%, z = −1 → 16%, z = 0 → 50%, z = +1 → 84%, z = +2 → 98%, z = +3 → 99.9%. Note: Not to Scale
Variability How did an individual's score compare to others'? How can scores be compared across different tests? [Table: raw scores for Bob, Jim, Sue, and Linda on Test 1 and Test 2; Test 1 mean = 48, SD = 2.5; Test 2 mean = 46, SD = .80]
Z Score or “Standard” Score Z score = (Score − Mean) / Std. Dev. [Table: raw scores, means, standard deviations, and resulting z scores for Bob, Jim, Sue, and Linda on Test 1 (mean 48, SD 2.5) and Test 2 (mean 46, SD .80)]
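Standardizing makes scores from different tests comparable. The sketch below uses the means and standard deviations from the slides; the raw scores of 51 and 47 are assumed for illustration, since the table's individual scores are not shown.

```python
def z_score(score: float, mean: float, sd: float) -> float:
    """Standardize a raw score: (score - mean) / SD."""
    return (score - mean) / sd

# Means and SDs from the slide; raw scores below are hypothetical
z1 = z_score(51, 48, 2.5)   # hypothetical Test 1 score of 51
z2 = z_score(47, 46, 0.80)  # hypothetical Test 2 score of 47

# 47 on Test 2 (z = 1.25) is a slightly stronger result than 51 on Test 1 (z = 1.2)
print(f"Test 1 z = {z1:.2f}, Test 2 z = {z2:.2f}")
```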
The Normal Curve [Figure: Jim, Bob, Linda, and Sue's scores plotted on the normal curve; not to scale]
Z scores and Percentiles Look up z scores in a “standard normal table”; the table entry corresponds to the proportion of area under the normal curve below that z. Linda has a z score of 1.45; the standard normal table gives .9265, a percentile score of 92.65%: Linda scored better than 92.65% of test takers. [Table: z scores and corresponding percentiles, e.g., z = −1 → 15.9%]
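The table lookup described above can be replicated with Python's built-in normal distribution; this sketch reproduces the z = −1 entry that appears in the slide's table.

```python
from statistics import NormalDist

def percentile(z: float) -> float:
    """Percent of test takers scoring below a given z score."""
    return NormalDist().cdf(z) * 100

print(round(percentile(-1), 1))  # 15.9 (matches the slide's table entry)
print(round(percentile(0), 1))   # 50.0 (the mean is the 50th percentile)
```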
Proportion Under the Normal Curve [Figure: area under the normal curve below each applicant's z score (Jim, Bob, Linda, Sue); not to scale]