VALIDITY vs. RELIABILITY by: Ivan Prasetya
Because measuring social phenomena is not as easy as measuring physical phenomena, and because there is a lot of MEASUREMENT ERROR in measuring social phenomena, we should do VALIDITY & RELIABILITY TESTING.
VALIDITY vs. RELIABILITY Measurement VALIDITY: How well an empirical indicator and the conceptual definition of the construct that the indicator is supposed to measure ‘fit’ together. Measurement RELIABILITY: The dependability or consistency of the measure of a variable.
Different tools are used for measuring different things: A tape measure is used to measure length or distance. A thermometer is used to measure temperature. A stopwatch is used to measure time.
If we want to measure weight/mass, we can use these tools: A digital scale has the highest VALIDITY. (1 litre of water ≈ 1 kg)
However, even though a digital scale has the highest validity, the result will not be accurate if the subject is not cooperative: If you keep jumping around while measuring your weight, you won't get an accurate result for your body weight. And if you carry extra loads while stepping on a digital scale, you won't get an accurate result either.
GOODNESS OF MEASURES
GOODNESS OF DATA
  RELIABILITY (Accuracy in measurement)
    STABILITY
      TEST-RETEST RELIABILITY
      PARALLEL-FORM RELIABILITY
    CONSISTENCY
      INTERITEM CONSISTENCY RELIABILITY
      SPLIT-HALF RELIABILITY
  VALIDITY (Are we measuring the right thing?)
    LOGICAL VALIDITY (CONTENT)
      FACE VALIDITY
    CRITERION-RELATED VALIDITY
      PREDICTIVE
      CONCURRENT
    CONGRUENT VALIDITY (CONSTRUCT)
      CONVERGENT
      DISCRIMINANT
RELIABILITY Indicates the extent to which the measure is without bias (error free) and hence offers consistent measurement across time and across the various items in the instrument. In other words, the reliability of a measure indicates the stability and consistency with which the instrument measures the concept, and helps to assess the “goodness” of the measure.
STABILITY OF MEASURES TEST-RETEST RELIABILITY: The reliability coefficient obtained by repeating the same measure on a second occasion. PARALLEL-FORM RELIABILITY: Established when responses on two comparable sets of measures tapping the same construct are highly correlated.
TEST-RETEST RELIABILITY If the infrastructure in a country is not good, your company will cancel its decision to do Foreign Direct Investment (FDI) in that country. Answer (now): strongly disagree Answer (20 days later): agree
PARALLEL-FORM RELIABILITY Do you think that Susi Similikiti is beautiful? Answer: YES Do you think that Tukul’s wife is beautiful? Answer: NO Fact: Susi Similikiti is Tukul’s wife.
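Both stability checks above come down to correlating two sets of scores from the same respondents. The following is a minimal sketch in Python, assuming the Pearson correlation is used as the reliability coefficient; the function name `pearson_r` and all scores are hypothetical and only illustrate the idea.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical data: the same 6 respondents answer the same
# 5-point item on two occasions (assumption, not from the slides).
first_occasion = [1, 2, 3, 4, 5, 4]
second_occasion = [2, 2, 3, 5, 5, 4]

# Test-retest reliability coefficient: correlation across occasions.
test_retest_r = pearson_r(first_occasion, second_occasion)
print(round(test_retest_r, 3))
```

For parallel-form reliability the same function could be applied to the scores from the two comparable forms instead of the two occasions. A coefficient close to 1 suggests stable answers; a respondent who moves from "strongly disagree" to "agree" pulls it down.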
CONSISTENCY INTERITEM CONSISTENCY: This is the test of the consistency of respondents’ answers to all the items in the measure. SPLIT-HALF RELIABILITY: Reflects the correlation between two halves of an instrument.
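The two consistency checks can be sketched in Python as follows. This is an illustration under stated assumptions: Cronbach's alpha is used as the interitem consistency statistic, the split is made between odd-position and even-position items (other splits are possible), and the response data are hypothetical.

```python
from statistics import mean, variance

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5

def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)

def split_half_r(rows):
    """Correlate totals of odd-position items with even-position items."""
    odd_half = [sum(r[0::2]) for r in rows]
    even_half = [sum(r[1::2]) for r in rows]
    return pearson_r(odd_half, even_half)

# Hypothetical 4-item scale answered by 5 respondents (assumption).
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(responses)
half_r = split_half_r(responses)
print(round(alpha, 3), round(half_r, 3))
```

The sample variance (dividing by n - 1) is used for both the items and the total score, so the ratio inside alpha is consistent.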
PROBLEMS and PITFALLS in VALIDITY
VALIDITY in QUESTIONNAIRE DESIGN How diligent are you? 5 = very diligent 4 = diligent 3 = indifferent 2 = lazy 1 = very lazy Because respondents have to evaluate themselves, they tend to choose (4) or (5), or at least (3).
To measure a student's degree of diligence, we can also use this questionnaire: How much time do you spend studying every day? How many books do you read every week? How often do you go to the library every week? The answers to the questions above will tend to measure the degree of diligence more accurately.
Avoid questions that ask two things at once; you won’t know which ‘bit’ people are answering:
Avoid ambiguity: Have you bought the new democracy history book? What exactly is new??? The book...??? The history...??? Or... the democracy???
Avoid jargon/abbreviations/slang: A higher current GDP in a country will encourage your company to do more FDI in that country. A higher current Per Capita Income in a country will make your company do more FDI in that country. If the level of corruption in a country increases, your company will reduce its amount of FDI in that country.
Avoid options that are not mutually exclusive: I’m 20 years old now. What age are you?
Problems and Pitfalls: Avoid making the questionnaire too long. Avoid typographical / spelling errors.
VALIDITY: Evidence that the instrument, technique, or process used to measure a concept does indeed measure the intended concept. CONTENT VALIDITY: Ensures that the measure includes an adequate & representative set of items that tap the concept. This is a function of how well the dimensions & elements of a concept have been delineated. FACE VALIDITY: Indicates that the items that are supposed to measure a concept do, on the face of it, look like they measure the concept.
CRITERION-RELATED VALIDITY: Is established when the measure differentiates individuals on a criterion it is expected to predict. This can be done by establishing concurrent validity or predictive validity. CONCURRENT VALIDITY: Is established when the scale discriminates between individuals who are known to be different. PREDICTIVE VALIDITY: Indicates the ability of the measuring instrument to differentiate among individuals with respect to a future criterion.
CONSTRUCT VALIDITY: Testifies to how well the results obtained from the use of the measure fit the theories around which the test is designed. This is assessed through convergent validity and discriminant validity. CONVERGENT VALIDITY: Is established when the scores obtained by two different instruments measuring the same concept are highly correlated. DISCRIMINANT VALIDITY: Is established when, based on theory, two variables are predicted to be uncorrelated, and the scores obtained by measuring them are indeed empirically found to be so.
TYPES OF VALIDITY
VALIDITY: DESCRIPTION
Content Validity: Does the measure adequately measure the concept?
Face Validity: Do “experts” validate that the instrument measures what its name suggests it measures?
Criterion-related Validity: Does the measure differentiate in a manner that helps to predict a criterion variable?
Concurrent Validity: Does the measure differentiate in a manner that helps to predict a criterion variable currently?
Predictive Validity: Does the measure differentiate individuals in a manner that helps to predict a future criterion?
Construct Validity: Does the instrument tap the concept as theorized?
Convergent Validity: Do 2 instruments measuring the concept correlate highly?
Discriminant Validity: Does the measure have a low correlation with a variable that is supposed to be unrelated to this variable?
Using SPSS: Analyze >>> Scale >>> Reliability Analysis >>> Statistics >>> Scale if item deleted >>> Continue >>> OK. If Corrected Item-Total Correlation > r-table: VALID. If Cronbach’s Alpha if Item Deleted > r-table: RELIABLE.
Using SPSS: Item-Total Statistics
Item | Scale Mean if Item Deleted | Scale Variance if Item Deleted | Corrected Item-Total Correlation | Cronbach's Alpha if Item Deleted
VAR… | … | …,0299 | 0,2785 | 0,9368
VAR… | … | …,7540 | 0,3568 | 0,9366
VAR… | … | …,5506 | 0,3113 | 0,9364
VAR… | … | …,9989 | 0,4354 | 0,9357
VAR… | … | …,1655 | 0,3300 | 0,9364
VAR… | … | …,6793 | 0,4432 | 0,9356
VAR… | … | …,0644 | 0,4555 | 0,9356
VAR… | … | …,3851 | 0,5454 | 0,9349
VAR… | … | …,7345 | 0,3929 | 0,9361
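The last two columns of an Item-Total Statistics table can be approximated outside SPSS. The sketch below is an illustration, not SPSS's own implementation: it assumes the corrected item-total correlation is the Pearson correlation between an item and the total of the remaining items, and that alpha-if-deleted is Cronbach's alpha recomputed without that item. The data, the function names, and the `R_TABLE` value are hypothetical; the critical value must be looked up for the actual sample size.

```python
from statistics import mean, variance

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5

def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    item_var_sum = sum(variance(item) for item in zip(*rows))
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)

def item_total_statistics(rows):
    """Per-item corrected item-total correlation and alpha if item deleted.

    Needs at least 3 items so that alpha can be computed on the rest.
    """
    stats = []
    for i in range(len(rows[0])):
        item = [r[i] for r in rows]
        rest = [r[:i] + r[i + 1:] for r in rows]   # scale without item i
        rest_total = [sum(r) for r in rest]
        stats.append({
            "item": i + 1,
            "corrected_item_total_r": pearson_r(item, rest_total),
            "alpha_if_deleted": cronbach_alpha(rest),
        })
    return stats

# Hypothetical 4-item scale answered by 5 respondents (assumption).
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
# Assumed critical r (two-tailed, 5%, df = n - 2 = 3); look up your own.
R_TABLE = 0.878

stats = item_total_statistics(responses)
for row in stats:
    # Decision rule as stated on the slide: compare against r-table.
    verdict = "VALID" if row["corrected_item_total_r"] > R_TABLE else "NOT VALID"
    print(row["item"], round(row["corrected_item_total_r"], 4),
          round(row["alpha_if_deleted"], 4), verdict)
```

The item itself is excluded from the total before correlating, which is what makes the correlation "corrected"; otherwise each item would be correlated partly with itself.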