
1 Week 4 Slides

2 Conscientiousness was the most highly voted-for construct We will also give other measures – Protestant work ethic and turnover intentions – Measure of academic performance? GPA / SAT?

3

4 Validity Traditional definition: the extent to which the test measures what it claims to measure or predicts what it claims to predict. (Diagram: Construct – Test)

5 Historical Terms Construct validity: Does it measure, or is it related to, what we would expect? Content validity: Does the test accurately measure the construct? – Is each facet represented? Criterion-related validity: Does the test predict what it is supposed to predict? – E.g., integrity tests predicting bad behaviors

6 Historical view of validity (diagram): Construct validity – convergent & discriminant evidence (correlations & MTMM matrix), factorial evidence (factor analysis), reliability; Content validity – specific steps, CVR; Criterion-related validity – predictive vs. concurrent

7 Validity – Current: Are the inferences based on test scores appropriate? – The newer definition emphasizes the role of the test in relation to other variables besides the construct (e.g., predicting the criteria)

8 New Conceptualization of Validity Validity is a unitary concept – There are not different types of validity New definition: Are the inferences I am making from a test appropriate? – Is it appropriate to use this test to measure this construct? – Is it appropriate to use these items to assess this population? – Is it appropriate to use this test to predict these behaviors?

9 Current view of validity (diagram): Validity – Test content (specific steps, CVR); Relationships with other variables (correlations & MTMM matrix; criterion-related, predictive vs. concurrent); Internal structure (reliability, factor analysis); Response processes (ask, experiment, observe); Consequences

10 Sources of Evidence Evidence based on… 1. Test Content 2. Relationships with Other Variables 3. Internal Structure 4. Response Processes 5. Consequences of Testing

11 Content Validity Extent to which items on a test are representative of the construct Two ways to demonstrate – DURING test development – AFTER test development

12 Content Validity During Test Development 1. Define the testing universe – Interview experts, review old tests, review the literature, define the construct – What is the testing universe of a test that measures physical fitness? 2. Develop test specifications – A blueprint: what are the content areas, and how many questions for each?

13 Content Validity 3. Establish a test format – Written test? CAT? Practical test? – What type of questions? MC? T/F? Matching? – Assessment centers 4. Construct test questions – Carefully write questions according to the blueprint – Make sure questions represent the content area – This is done by subject matter experts

14 Content Validity After Test Development – Examine the extent to which subject matter experts (SMEs) agree on the content validity of the items SMEs rate how essential test items are to the attribute E.g., essential; useful but not essential; not necessary for success on the job The Content Validity Ratio is calculated Items below a minimum CVR value are dropped

15 Content Validity Ratio Judges rate each item on a scale of importance – Essential; useful but not essential; not necessary CVR_i = (n_e – (N/2)) / (N/2) – CVR_i = value of the item – n_e = number of experts saying the item is essential – N = total number of experts
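The CVR formula above can be sketched in a few lines; a minimal example with hypothetical panel sizes:

```python
# Lawshe's content validity ratio: CVR = (n_e - N/2) / (N/2),
# where n_e = judges rating the item "essential", N = total judges.

def cvr(n_essential: int, n_judges: int) -> float:
    half = n_judges / 2
    return (n_essential - half) / half

# 8 of 10 judges rate the item essential:
print(cvr(8, 10))   # 0.6
print(cvr(5, 10))   # 0.0 -> exactly half say essential
print(cvr(10, 10))  # 1.0 -> unanimous
```

CVR ranges from -1 (no one says essential) to +1 (everyone does); items below the minimum value for the panel size are dropped.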

16 Sources of Evidence Evidence based on… 1. Test Content 2. Relationships with Other Variables A. Construct Validity B. Criterion-Related Validity 3. Internal Structure 4. Response Processes 5. Consequences of Testing

17 Multitrait-Multimethod Design Pick variables that are theoretically unrelated Measure each variable with several different types of measurement (e.g., forced choice, true-false, etc.) Each variable should correlate highly with other measures of the same construct (convergent validity)

18 Multitrait-Multimethod Design

19 Correlations between different variables measured with the same method assess method bias (using the same method inflates correlations regardless of construct) – Low same-method correlations can also be evidence of discriminant validity Different variables should not be highly correlated regardless of the method (discriminant validity)
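The MTMM logic can be illustrated with simulated data; this is a toy sketch with made-up traits and noise levels, not a full MTMM matrix:

```python
# Toy MTMM check: two methods measuring the same trait should correlate
# highly (convergent), while measures of unrelated traits should not
# (discriminant). Data are simulated; all effect sizes are assumptions.
import random

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

random.seed(0)
trait_a = [random.gauss(0, 1) for _ in range(500)]
trait_b = [random.gauss(0, 1) for _ in range(500)]   # theoretically unrelated
# Two methods measuring trait A = true score + method-specific noise:
a_likert = [t + random.gauss(0, 0.5) for t in trait_a]
a_forced = [t + random.gauss(0, 0.5) for t in trait_a]
b_likert = [t + random.gauss(0, 0.5) for t in trait_b]

print(round(pearson(a_likert, a_forced), 2))  # high: convergent evidence
print(round(pearson(a_likert, b_likert), 2))  # near zero: discriminant evidence
```

In a real MTMM matrix the same-trait/different-method cells should be large and the different-trait cells small, whichever method is used.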

20 Sources of Evidence Evidence based on… 1. Test Content 2. Relationships with Other Variables A. Construct Validity B. Criterion-Related Validity 3. Internal Structure 4. Response Processes 5. Consequences of Testing

21 Criterion-Related Validity (diagram: the Construct underlies both the Test and the Criteria; the correlation between Test and Criteria is the validity coefficient)

22 Issues with Criterion Objective criterion: observable & measurable – How many sales did an employee make? – Less error because no subjectivity – Scope is narrow and does not get at motivation Subjective criterion: a matter of judgment – Supervisor rating of performance – More error – Can take circumstances into account

23 Issues with Criterion Criterion deficiency – When the criterion measure does not assess every area of the testing universe of the criterion (construct) Criterion contamination – When the criterion measures extraneous variables that are not part of the testing universe of the criterion

24 Issues with Criterion (Venn diagrams: Sales vs. Job Performance; Job Performance vs. Sales & Management Test) Job performance of a retailer: customer service; successful sales; dependability

25 Jeremy’s Intelligence Test Concurrent study – Give test to employees → measure performance – Little evidence of causality Predictive study – Give test to applicants → measure performance – Range restriction

26

27

28 Indirect range restriction If you select employees based on some criterion measure – Make sure it is not correlated with the test you are trying to gather evidence for – For instance: a personality test used to make hiring decisions would be bad to use if you are interested in the validity of an integrity test
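The range-restriction problem from the predictive study can be demonstrated with a quick simulation; the sample size, selection rule, and true validity below are illustrative assumptions:

```python
# Sketch of direct range restriction: a predictive study observes performance
# only for hired (high-scoring) applicants, which shrinks the observed
# validity coefficient relative to the full applicant pool. Simulated data.
import random

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

random.seed(1)
test = [random.gauss(0, 1) for _ in range(2000)]
perf = [0.5 * t + random.gauss(0, 1) for t in test]    # true validity ~ .45

r_full = pearson(test, perf)                           # whole applicant pool
hired = [(t, p) for t, p in zip(test, perf) if t > 0]  # top-scoring half hired
r_restricted = pearson([t for t, _ in hired], [p for _, p in hired])

print(round(r_full, 2), round(r_restricted, 2))  # restricted r is smaller
```

The same shrinkage occurs indirectly when selection is on any variable correlated with the test under study, which is why the slide warns against validating an integrity test in a sample hired on a correlated personality test.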

29 Overlap of Predictors (Venn diagram: Predictors 1, 2, and 3 each overlapping the Criterion)

30 Single Predictor (Venn diagram: Jealousy overlapping the Criterion)

31 Overlapping Predictors (Venn diagram: Jealousy, Envy, and Anger overlapping the Criterion – and each other)
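The Venn-diagram point can be made numerically: a predictor that overlaps an existing one adds little, while a distinct predictor adds real incremental validity. The variable names, effect sizes, and data below are illustrative assumptions:

```python
# Sketch of overlapping predictors: jealousy and envy share most of their
# variance, so adding envy to jealousy barely raises R-squared; a distinct
# predictor (anger) adds more. Simulated data, two-predictor R^2 formula.
import random

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def r2_two(r1, r2, r12):
    """Squared multiple correlation for two predictors of one criterion."""
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)

random.seed(2)
n = 1000
core = [random.gauss(0, 1) for _ in range(n)]        # shared jealousy/envy variance
jealousy = [c + random.gauss(0, 0.4) for c in core]
envy = [c + random.gauss(0, 0.4) for c in core]      # overlaps jealousy heavily
anger = [random.gauss(0, 1) for _ in range(n)]       # distinct predictor
criterion = [0.6 * c + 0.4 * a + random.gauss(0, 1)
             for c, a in zip(core, anger)]

r_j = pearson(jealousy, criterion)
r_e = pearson(envy, criterion)
r_a = pearson(anger, criterion)

print(round(r_j ** 2, 2))                                   # jealousy alone
print(round(r2_two(r_j, r_e, pearson(jealousy, envy)), 2))  # + envy: little gain
print(round(r2_two(r_j, r_a, pearson(jealousy, anger)), 2)) # + anger: bigger gain
```

This is why test batteries favor predictors that each cover a different slice of the criterion rather than several measures of nearly the same trait.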

32 Sources of Evidence Evidence based on… 1. Test Content 2. Relationships with Other Variables A. Construct Validity B. Criterion-Related Validity 3. Internal Structure 4. Response Processes 5. Consequences of Testing

33 Evidence from Internal Structure Test responses should follow the pattern that is theoretically expected – If the test is thought to be increasingly more difficult, there should be evidence to support that claim – If the test is thought to be homogeneous or heterogeneous, there should be evidence to support that claim – Items may function differently for certain sub-groups Differential item functioning
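One common internal-structure check for a scale claimed to be homogeneous is Cronbach's alpha; a minimal sketch with a hypothetical 4-item scale:

```python
# Cronbach's alpha from a respondents-by-items score matrix:
# alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores)).
# The scale and scores below are made up for illustration.

def cronbach_alpha(data):
    """data: list of respondent rows, each a list of item scores."""
    k = len(data[0])
    items = list(zip(*data))                      # transpose to item columns
    def var(x):                                   # sample variance
        m = sum(x) / len(x)
        return sum((v - m) ** 2 for v in x) / (len(x) - 1)
    item_vars = sum(var(col) for col in items)
    totals = [sum(row) for row in data]
    return (k / (k - 1)) * (1 - item_vars / var(totals))

scores = [  # 5 respondents x 4 items, 1-5 Likert responses
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
print(round(cronbach_alpha(scores), 2))  # high alpha -> items hang together
```

A high alpha supports a claim of homogeneity, but it does not rule out differential item functioning, which needs group-level item analysis.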

34 Sources of Evidence Evidence based on… 1. Test Content 2. Relationships with Other Variables A. Construct Validity B. Criterion-Related Validity 3. Internal Structure 4. Response Processes 5. Consequences of Testing

35 Evidence from Response Process Ask test takers about their decision process – Are they using traditional or atypical strategies to answer the items? Monitor the response process – Keystroke analysis – Have them show their work This also applies to raters or judges – Look for evidence of consistency across judgments What about consistently inaccurate judgments?

36 Sources of Evidence Evidence based on… 1. Test Content 2. Relationships with Other Variables A. Construct Validity B. Criterion-Related Validity 3. Internal Structure 4. Response Processes 5. Consequences of Testing

37 Consequences of Testing Consequences of the test can help set guidelines about acceptable evidence of validity Test for serious diseases – What are the false positive vs. false negative rates? What about differential item functioning (i.e., sub-group differences)? If a test can place you in a better job, it had better be able to prove it. Online or paper-and-pencil?

38 Validity Generalization Can be done via – Meta-analyses – Synthetic validity Please be careful!

39 Class Assignment Define conscientiousness – Use the literature and be specific Define the testing universe – What are some sub-facets that we can reasonably measure? Write 5 multiple-choice items that will adequately assess the testing universe What demographics should we include? What variables should we control for?

40 Writing Good Items GOOD survey questions – Straightforward & unambiguous How well did your student do? – This is both indirect and ambiguous » Better: How well did Ted do on the last exam? Try to be concise & specific – Long sentences can easily be misread – Use appropriate response options How many times in the past month have you sabotaged? – All the time, occasionally, sometimes, never » We should use objective frequencies here instead of subjective accounts of frequency

41 Writing Good Items – Ask only one question Did you like my party last weekend? If so, please indicate how likely you would be to attend another one of my events. This is double-barreled – Easy to read Make sure to use an appropriate reading level – Most surveys are safe at a 6th-grade level unless the sample is children

