Measuring variables – Reliability and Validity

Chapter 5 Measuring variables – Reliability and Validity

Scientific Vocabulary
Variable: Anything that takes on different values at different times, in different places, or in different individuals.
Constant: Anything that remains the same for all individuals, at all times, and in all places during the study.

Measurement: The assignment of symbols or numbers to something according to a set of rules.
Nominal Scales – differ only in name
Ordinal Scales – differ in order (i.e., rank)
Interval Scales – the measure has equal intervals
Ratio Scales – equal intervals plus a true (natural) zero point

Construct: Concept or Abstract Schema
E.g., What is Self-Esteem? How could you measure it? In psychology research we have taken on the challenge of coming up with ways to use measurements to define constructs that are of interest to us, such as intelligence, self-esteem, hand-eye coordination or Mindsight ability!

Properties of Good Measurement
Reliability: Consistency or Stability of scores. Reliability Coefficient – a correlation that measures the consistency of measurements over repeated testing. Coefficient should be higher than +.70 to indicate a strong relationship between measures.

Methods of Measuring Reliability
Test-Retest Reliability: The correlation between a set of measures taken on at least two occasions over time. E.g., Mirror tracing as measure of Hand-Eye Coordination. Want to try?
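A test-retest coefficient is simply the correlation between the two testing occasions. The sketch below is illustrative only: the mirror-tracing scores are made up, and `pearson_r` is a helper written here for the example, not something from the chapter.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical mirror-tracing error counts for six people, tested twice.
time1 = [12, 18, 9, 22, 15, 11]
time2 = [13, 17, 10, 20, 16, 10]

r = pearson_r(time1, time2)
is_acceptable = r > 0.70  # the +.70 rule of thumb from the slides
print(f"test-retest r = {r:.2f}, acceptable: {is_acceptable}")
```

The same function would work for Equivalent Forms Reliability by passing scores from Form A and Form B instead of two occasions.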

Equivalent Forms Reliability: The correlation between a set of measures taken on at least two forms of the same test. E.g., Mindsight – is this a stable measure?

Inattentional Blindness
Change Blindness Stimuli

Multi-item questionnaires are often used to measure a construct.

Internal Consistency Reliability: Measures how well items on a test correlate with each other. Are they all measuring the same Construct? Reliability increases with higher numbers of items on a test.

Cronbach’s alpha (α) is the most commonly reported measure of Internal Consistency. It is based on the average correlation among the items on the test and on the number of items. Cronbach’s alpha higher than +.70 is generally considered acceptable.
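As a rough illustration, alpha can be computed with the standard variance form, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The questionnaire responses below are invented for the example, and `cronbach_alpha` is a hypothetical helper name.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding every respondent's score."""
    k = len(item_scores)
    # Sum of the sample variances of the individual items.
    sum_item_variances = sum(variance(item) for item in item_scores)
    # Each respondent's total score across all items.
    totals = [sum(responses) for responses in zip(*item_scores)]
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))

# Hypothetical 3-item self-esteem questionnaire answered by 5 people (1-5 scale).
item1 = [4, 3, 5, 2, 4]
item2 = [5, 3, 4, 2, 4]
item3 = [4, 2, 5, 3, 5]

alpha = cronbach_alpha([item1, item2, item3])
print(f"Cronbach's alpha = {alpha:.2f}")
```

With real data, a statistics package would normally be used; this sketch only shows where the number comes from.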

Interrater Reliability (Agreement): Degree of consistency or agreement between 2 or more judges, observers or raters.
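One simple way to quantify agreement between two raters is the proportion of cases they code the same way. Cohen's kappa, which corrects that proportion for chance agreement, is a commonly used alternative; it is not named in the slides and is shown here only as one option. The ratings are invented.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Proportion of cases on which the two raters gave the same code."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical codes from two observers watching the same 10 children.
rater_a = ["agg", "agg", "calm", "calm", "agg", "calm", "calm", "agg", "calm", "calm"]
rater_b = ["agg", "calm", "calm", "calm", "agg", "calm", "agg", "agg", "calm", "calm"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```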

Construct Validity: Accuracy of inferences, interpretations or actions made on the basis of test scores. Validity must be established with reference to the particular use for which the test is being considered.

Operationalization: the way a construct is represented and measured in a particular study (AKA Operational Definition). Is the measure a correct and appropriate way to define a construct for the purposes of the study?

Validation Methods
Content Validity - expert opinion on the degree to which the measure adequately represents the construct
1) Face Validity - do items on a test appear to represent the construct the researcher is attempting to measure?
2) Is the construct fully represented or are some aspects of the construct missing?
3) Do the items include things that are irrelevant to the construct under study?

Multidimensional Construct Measures
Measure different aspects of a construct. The Self-Perception Profile for College Students (Neeman & Harter, 2012)

Statistical Validation of Multidimensional Tests
Internal Structure
Factor Analysis – correlational analysis that determines whether the items that should be correlated strongly are correlated strongly. E.g., the items on the Social Acceptance subscale should all correlate highly with each other, but not as highly with the items assessing other dimensions of Self-Esteem.

Homogeneity Tests – the degree to which a set of items measures a single construct. Homogeneity tests look at the correlations among items (note this is also done for reliability). If the items are all highly correlated, they are likely all validly assessing a construct.

Criterion-Related Validity
How does the measure you are using relate to other known and accepted measures?
Predictive Validity – using your procedure to predict some future construct-related behavior. E.g., can a test of adolescent aggression accurately predict which members of a hockey club will have high penalty minutes during a regular season?

Concurrent Validity – Comparing measures to other well accepted measures of the same construct. E.g., Enright Depression Scale. Validate by comparing to Beck Depression scale or to diagnosis given by panels of experienced therapists.

Convergent Validity Evidence - how well the measure correlates with independent measures of the same construct. This can and does include Predictive and Concurrent validity evidence.

Discriminant Validity Evidence – evidence that the test is not correlated with other theoretically different constructs. e.g., Social Desirability. Are people just telling you what they think you want to hear?

Known Groups Validity Evidence – does the test discriminate between members of known groups? E.g., Graduate Record Exam. Do scores discriminate between those who do well in graduate programs and those who do not?

Using reliability and Validity Information
Norming Groups – the people the measurement was tested on. Is the test age appropriate? gender appropriate? culturally appropriate? - is it dated?


Sampling WHO is in the study?

Who to include in your study!
Your Sample should represent the population you want to apply the findings to.

No Sample will represent everyone in the Universe.

Research has found that Ginkgo aids memory. Should You use it???
WHO is in the study greatly affects WHO the findings apply to.

Scientists at the New York Institute for Medical Research reported that one-third of the Alzheimer's patients taking ginkgo improved in tasks involving memory, such as remembering the date or the names of relatives. About half of the group didn't experience improved memory, but showed no signs of increased memory loss.

Sample, Population, and Sampling Bias
Sample: Collection of participants used in a study.
Population: Larger collection of people to which we want to generalize the results of the study.
Sampling Bias: When the sample is not representative of the larger population.

Population Sample Element

US Elections Predictions
235,248,000 people of voting age
55.9% voter turnout
Polls are trying to estimate the voting behaviors of 129 million people

Who is in the study?
Small numbers of subjects can be used to estimate the behavior of a larger group…
Real Clear Polls
But the results will be VALID only if the sample is a good mini-version of the population.

Statistics Vs. Parameters
Parameters are the true characteristics of the population. Statistics are estimates of those characteristics, computed from a sample.

Sampling Error
The difference between the value of a sample statistic and the value of the population parameter it estimates. Sample statistics are estimates, and all estimates have some potential for error. The only sample that can perfectly estimate a population parameter is one that includes ALL members of the population (Census).

If I repeatedly take samples of the same size from a population, will I always get the same estimate?

No, if I repeatedly took samples I would get a distribution of estimates. Most would be pretty close to the population parameter. The more inaccurate the estimate, the less likely I would be to get that estimate. The mean of the distribution of sample means is an estimate of the parameter. Error is normally distributed. The standard deviation of this distribution is the sampling error (also called the standard error).
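The idea of a distribution of estimates can be checked with a small simulation. Everything here is an assumption made for illustration: the population is 10,000 simulated heights with mean 170 and standard deviation 10, and 500 samples of 100 people are drawn from it.

```python
import random
from statistics import mean, stdev

def sampling_distribution(population, sample_size, n_samples, seed=0):
    """Draw repeated random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [mean(rng.sample(population, sample_size)) for _ in range(n_samples)]

# Hypothetical population of 10,000 heights (cm).
pop_rng = random.Random(42)
population = [pop_rng.gauss(170, 10) for _ in range(10_000)]

sample_means = sampling_distribution(population, sample_size=100, n_samples=500)

parameter = mean(population)          # the true population value
center = mean(sample_means)           # mean of the sample means
standard_error = stdev(sample_means)  # spread of the sample means

print(f"parameter = {parameter:.2f}")
print(f"mean of sample means = {center:.2f}")
print(f"sampling (standard) error = {standard_error:.2f}")
```

Individual sample means differ from one another, but they cluster tightly around the parameter, and their standard deviation is the sampling error.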

The larger the sample size, the lower the sampling error
Larger samples give better estimates!
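For a sample mean, the usual formula for the sampling error is the population standard deviation divided by the square root of the sample size. The sketch below assumes a hypothetical standard deviation of 10 and shows how the error shrinks as the sample grows.

```python
from math import sqrt

def standard_error(population_sd, sample_size):
    """Sampling error of a mean: sd / sqrt(n)."""
    return population_sd / sqrt(sample_size)

population_sd = 10.0  # hypothetical value, chosen for illustration
errors = {n: standard_error(population_sd, n) for n in (25, 100, 400)}

for n, se in errors.items():
    print(f"n = {n:>3}: sampling error = {se}")
```

Note that quadrupling the sample size only halves the sampling error, so very large samples give diminishing returns.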

Remember z-scores! On a Normal distribution, 95% of scores will fall between -1.96 and +1.96 standard deviations of the estimate (sampling errors). We can use the Sampling Error to give a range of confidence. E.g., I sample 100 people on campus and measure their height. The mean of these measures (X̄) is a statistic that estimates the parameter of height of students on campus. If the sampling error is 3, I can say I am 95% sure the parameter is within X̄ +/- (1.96 × 3), or about X̄ +/- 5.9.
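A 95% range of confidence is the estimate plus or minus about 1.96 sampling errors. The sample mean of 170 below is a made-up value; the sampling error of 3 matches the height example.

```python
def confidence_interval_95(sample_mean, standard_error):
    """95% confidence interval, using the normal-distribution cutoff of 1.96."""
    margin = 1.96 * standard_error
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample mean height of 170 with a sampling error of 3.
low, high = confidence_interval_95(sample_mean=170.0, standard_error=3.0)
print(f"95% confident the parameter is between {low:.2f} and {high:.2f}")
```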

Equal Probability of Selection Method
Each member of the population has an equal, unbiased chance of being included in the sample. Ex. The school board takes a random sample of students at a military academy. 75% of enrolled students are male and 25% are female. What percentage of students in the sample should be female?

If a sample is truly random, then the sample should be a good approximation to the characteristics of the population.

In order to have a true random sample, there can be no systematic reason why one member of the population is included or excluded. Advantage: Even if you do not know the demographics of the population you are sampling, random sampling will give a non-biased sample.

Simple Random Sampling
1. Identify all members of the population.
2. Use a random process to select sample members.
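The two steps can be sketched in a few lines. The roster of 200 students is hypothetical, and `simple_random_sample` is a helper name made up for this example; Python's `random.sample` draws without replacement, so no one is picked twice.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Every member has the same chance of selection; no one is chosen twice."""
    rng = random.Random(seed)
    return rng.sample(population, sample_size)

# Step 1: identify all members of the population (hypothetical roster).
roster = [f"student_{i:03d}" for i in range(1, 201)]

# Step 2: use a random process to select sample members.
sample = simple_random_sample(roster, 20, seed=1)
print(sample[:5])
```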

Stratified Random Sampling
Population divided into strata (mutually exclusive groups), e.g., males and females. Ex. The school board takes a random sample of students at a military academy. 75% of enrolled students are male and 25% are female.

Proportional Stratified Sampling
The proportion of sample members from each stratum matches the proportions in the population. From the military school example: 25% female and 75% male. Ensures that sampling error does not cause unequal sampling from each sex.
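A minimal sketch of proportional stratified sampling, assuming a hypothetical academy of 300 male and 100 female students and a desired sample of 40. Each stratum is sampled randomly, with its share of the sample set by its share of the population.

```python
import random

def proportional_stratified_sample(strata, sample_size, seed=None):
    """Randomly sample each stratum in proportion to its population share."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Rounding can leave the overall total off by one for awkward proportions.
        quota = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, quota)
    return sample

# Hypothetical enrollment: 75% male, 25% female.
strata = {
    "male": [f"m{i}" for i in range(300)],
    "female": [f"f{i}" for i in range(100)],
}
sample = proportional_stratified_sample(strata, 40, seed=1)
print({name: len(members) for name, members in sample.items()})
```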

Disproportional Stratified Sampling
From the military school example: 50% male and 50% female. Why might you want to use this method? Will it give an unbiased estimate of the population?
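One way to think about the second question: with equal-sized strata, a plain average over-represents the smaller group, but weighting each stratum mean by its population share can recover an estimate for the whole population. The scores below are invented, and the weighting approach is offered as an illustration rather than as the chapter's own answer.

```python
from statistics import mean

def weighted_population_mean(stratum_scores, population_shares):
    """Weight each stratum's mean by that stratum's share of the population."""
    return sum(
        population_shares[name] * mean(scores)
        for name, scores in stratum_scores.items()
    )

# Hypothetical scores from a 50/50 sample of a 75/25 population.
scores = {"male": [70, 72, 68, 74], "female": [80, 82, 78, 84]}
shares = {"male": 0.75, "female": 0.25}

unweighted = mean(scores["male"] + scores["female"])  # treats the sample as 50/50
weighted = weighted_population_mean(scores, shares)   # restores the 75/25 balance

print(f"unweighted = {unweighted}, weighted = {weighted}")
```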

Cluster Random Sampling
Randomly select from a set of predefined groups or units (classrooms, neighborhoods, teams). E.g., I may have 15 general psych classes I can choose among for a study. I could randomly choose 3. Sample members are not selected independently; it is a random selection of classes.
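In code, cluster sampling randomizes over the groups rather than over individuals. The 15 classes of 30 students each are hypothetical, as is the helper name `cluster_sample`.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters; everyone in a chosen cluster is included."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [person for name in chosen for person in clusters[name]]
    return chosen, members

# Hypothetical: 15 general psych classes with 30 students each.
classes = {
    f"psych_{c:02d}": [f"psych_{c:02d}_student_{s}" for s in range(30)]
    for c in range(1, 16)
}

chosen, members = cluster_sample(classes, 3, seed=1)
print(chosen, len(members))
```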

Systematic Sampling
List all members of the population. N = population size.
Determine what size sample you want. n = sample size.
N/n = k (sampling interval). E.g., N = 100, n = 10, k = 10.
Randomly select a starting number from 1 to k.
From that starting point, select every kth person from the population list (here, every 10th).
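The systematic sampling procedure can be sketched as follows, using the same numbers as the example (N = 100, n = 10, so k = 10). The population is just the numbers 1 to 100 standing in for a list of people, and the sketch assumes N divides evenly by n.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick a random start from 1..k, then take every kth member of the list."""
    rng = random.Random(seed)
    k = len(population) // sample_size  # sampling interval
    start = rng.randint(1, k)           # random start, inclusive of both ends
    positions = range(start, len(population) + 1, k)
    return [population[p - 1] for p in positions][:sample_size]

population = list(range(1, 101))  # hypothetical list of 100 people
sample = systematic_sample(population, 10, seed=1)
print(sample)
```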

Disadvantage: You have to be able to identify all members of the population (Sampling Frame) and all randomly selected elements must agree to be included in the sample. If I do a telephone survey do you think there are some people who would be more likely to agree to participate than others? Response Rate – percentage of people selected who actually participate.

Non-Random Sampling Techniques
Convenience Sampling: Use who is available. Many Psychology studies use General Psychology Students. Is this a problem? It depends on how similar the General Psych students are to the population I want to generalize to.

Replication in Different Populations

e.g., What is the average weight of these two rats?
Trying to average very diverse groups often ends up in measures that are not very representative of any individual in the group.

According to the book, a majority of Americans:
• Eats peanut butter at least once a week
• Prefers smooth peanut butter over chunky
• Can name all Three Stooges
• Lives within a 20-minute drive of a Wal-Mart
• Eats at McDonald's at least once a year
• Takes a shower for approximately 10.4 minutes a day
• Never sings in the shower
• Lives in a house, not an apartment or condominium
• Has a home valued between $100,000 and $300,000
• Has fired a gun
• Is between 5 feet and 6 feet tall
• Weighs 135 to 205 pounds
• Is between the ages of 18 and 53
• Believes gambling is an acceptable entertainment option
• Grew up within 50 miles of current home

Non-Random Sampling Techniques
Q. If I do a telephone survey, is this a random sample or convenience?

Quota Samples
Used when you are trying to produce a sample that matches the demographics of a known population you wish to generalize to, e.g., opinion, attitude or political polls. Determine the number of people in each specific demographic that you need. Then use convenience sampling to fill each quota.

The resulting sample is not random but it matches the demographics of the population you are trying to learn about.
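A quota sample can be sketched as taking whoever is available, in the order they show up, until each demographic quota is full. The age groups, quotas, and arrival order below are all hypothetical.

```python
def quota_sample(arrivals, quotas):
    """Accept people in arrival order until every group's quota is filled."""
    remaining = dict(quotas)
    sample = []
    for person, group in arrivals:
        if remaining.get(group, 0) > 0:
            sample.append((person, group))
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas are full
    return sample

# Hypothetical convenience stream of volunteers and the quotas to fill.
arrivals = [
    ("p1", "under_30"), ("p2", "under_30"), ("p3", "30_plus"),
    ("p4", "under_30"), ("p5", "30_plus"), ("p6", "30_plus"),
    ("p7", "under_30"),
]
quotas = {"under_30": 2, "30_plus": 2}

sample = quota_sample(arrivals, quotas)
print(sample)
```

Nothing in this procedure is random: who ends up in the sample depends entirely on who happened to arrive first.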

Random Selection vs. Random Assignment
Random Selection is used to ensure that the sample is an unbiased representation of the population. Random Assignment is used to ensure that there are no systematic differences between the individuals who serve in each condition of your experiment.
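Random assignment happens after the sample has been chosen. A minimal sketch, assuming 20 hypothetical participants and two made-up condition names: shuffle the participants, then deal them out to the conditions in turn.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them out evenly across conditions."""
    rng = random.Random(seed)
    shuffled = participants[:]  # copy so the original list is left untouched
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(person)
    return groups

participants = [f"participant_{i}" for i in range(1, 21)]
groups = randomly_assign(participants, ["treatment", "control"], seed=1)
print({name: len(members) for name, members in groups.items()})
```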
