Probability

Population: The set of all individuals of interest (e.g. all women, all college students).
Sample: A subset of individuals selected from the population, from whom data is collected.

What we learned from Probability
The mean of a sample can be treated as a random variable. By the central limit theorem, sample means will have an approximately normal distribution (for n > 30) with mean µ_X̄ = µ and standard error σ_X̄ = σ/√n. Because of this, we can find the probability that a given population might randomly produce a particular range of sample means. Use table E.10.
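As a minimal sketch of finding such a probability, the snippet below uses Python's standard-library `statistics.NormalDist` in place of a printed z table. The population values (µ = 100, σ = 15), the sample size (n = 36), and the range 95 to 105 are made-up numbers for illustration, not values from the slides.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical population and sample size (assumptions for illustration).
mu, sigma, n = 100, 15, 36

# By the central limit theorem, sample means are approximately normal
# with mean mu and standard error sigma / sqrt(n).
standard_error = sigma / sqrt(n)
sampling_dist = NormalDist(mu, standard_error)

# Probability that a random sample of size n has a mean between 95 and 105.
p_in_range = sampling_dist.cdf(105) - sampling_dist.cdf(95)

print(f"standard error = {standard_error:.2f}")
print(f"P(95 < sample mean < 105) = {p_in_range:.4f}")
```

Here the range is two standard errors on either side of µ, so the probability comes out close to 95%.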

Inferential statistics
Population: The set of all individuals of interest (e.g. all women, all college students).
Sample: A subset of individuals selected from the population, from whom data is collected.

Once we’ve got our sample
The key question in statistical inference: Could random chance alone have produced a sample like ours?
Patterns may not be indicative of some underlying factor.
Patterns may be natural fluctuations.

Once we’ve got our sample
Distinguishing between two interpretations of patterns in the data:
Random causes: fluctuations of chance.
Systematic causes plus random causes: true differences in the population, or bias in the design of the study.
Inferential statistics separates these two interpretations.

Reasoning of hypothesis testing
1. Make a statement (the null hypothesis) about some unknown population parameter.
2. Collect some data.
3. Assuming the null hypothesis is true, what is the probability of obtaining data at least as extreme as ours? (This is the “p-value”.)
4. If this probability is small, then reject the null hypothesis.

Step 1: Stating hypotheses
Null hypothesis H0: the straw man, “Nothing interesting is happening.”
Alternative hypothesis Ha: what a researcher thinks is happening. May be one- or two-sided.

Step 1: Stating hypotheses
Hypotheses are stated in terms of population parameters.
One-sided: H0: µ = 110, H1: µ < 110
Two-sided: H0: µ = 110, H1: µ ≠ 110

Step 2: Set decision criterion
Decide what p-value would be “too unlikely”. This threshold is called the alpha level. When the p-value for a sample statistic falls below this level (equivalently, when the test statistic is more extreme than the critical value), the result is said to be significant. Typical alpha levels are .05 and .01.

More on setting a criterion
The retention region: the range of sample mean values that are “likely” if H0 is true. If your sample mean is in this region, retain the null hypothesis.
The rejection region: the range of sample mean values that are “unlikely” if H0 is true, that is, the extreme sample mean values that are unlikely to be obtained by chance when the “treatment” mean is the same as the population mean. If your sample mean is in this region, reject the null hypothesis.

Setting a criterion
[Figure: the null distribution, with a central “Accept H0” region between the two Zcrit values and a “Reject H0” region in each tail.]

Step 3: Compute sample statistics
A test statistic (e.g. Ztest, Ttest, or Ftest) is information we get from the sample that we use to make the decision to reject or keep the null hypothesis. A test statistic converts the original measurement (e.g. a sample mean) into units of the null distribution (e.g. a z-score), so that we can look up probabilities in a table.

Test Statistics
[Figure: the null distribution with Zcrit marked in each tail, a central “Accept H0” region, and “Reject H0” regions beyond each Zcrit. Where does Ztest fall?]

[Figure: the null distribution with a single Zcrit separating the “Accept H0” and “Reject H0” regions.]
If we want to know where our sample mean lies in the null distribution, we convert X-bar to our test statistic, Ztest. If an observed sample mean were lower than z = -1.65, it would be in a critical region where it was more extreme than 95% of all sample means that might be drawn from that population.

Step 4: Make a decision
If our sample mean turns out to be extremely unlikely under the null distribution, maybe we should revise our notion of µH0. We never really “accept” the null. We either reject it, or fail to reject it.

Steps of hypothesis testing
1. State hypotheses (H0, HA)
2. Select a criterion (alpha, Zcrit)
3. Compute the test statistic (Ztest) and get a p-value
4. Make a decision
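The four steps can be sketched as one small function. This is an illustrative sketch rather than code from the presentation: the function name `z_test`, its parameters, and the example numbers (H0: µ = 50, σ = 10, n = 25, sample mean 53) are all assumptions, and the standard-library `statistics.NormalDist` stands in for the z table.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05, tail="right"):
    """One-sample z-test with known sigma; tail is 'left', 'right', or 'two'."""
    if tail not in ("left", "right", "two"):
        raise ValueError("tail must be 'left', 'right', or 'two'")
    std_normal = NormalDist()  # the null distribution in z units

    # Step 2: criterion. A two-tailed test splits alpha between the tails.
    tail_area = alpha / 2 if tail == "two" else alpha
    z_crit = std_normal.inv_cdf(1 - tail_area)

    # Step 3: test statistic and p-value.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    if tail == "right":
        p_value = 1 - std_normal.cdf(z)
    elif tail == "left":
        p_value = std_normal.cdf(z)
    else:
        p_value = 2 * (1 - std_normal.cdf(abs(z)))

    # Step 4: decision.
    return {"z": z, "z_crit": z_crit, "p_value": p_value,
            "reject_h0": p_value < alpha}

# Step 1 (hypothetical): H0: mu = 50, HA: mu > 50.
result = z_test(sample_mean=53, mu0=50, sigma=10, n=25, alpha=0.05, tail="right")
print(result)
```

With these made-up numbers Ztest is 1.5, which falls short of the right-tailed criterion, so the function reports a failure to reject H0.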

Z as test statistic
The Z test statistic converts a sample mean into a z-score from the null distribution: Ztest = (X-bar - µH0) / σ_X̄, where the denominator σ_X̄ = σ/√n is the standard error.
Zcrit is the criterion value of Z that defines the rejection region.
Ztest is the value of Z that represents the sample mean you calculated from your data.
The p-value is the probability of getting a Ztest at least as extreme as yours under the null distribution.

Z as test statistic
All test statistics are fundamentally a comparison between what you got and what you’d expect to get from chance alone:
test statistic = (deviation you got) / (deviation from chance alone)
If the numerator is considerably bigger than the denominator, you have evidence for a systematic factor on top of random chance.

Example I
Tim believes that his “true weight” is 187 lbs with a standard deviation of 3 lbs. Tim weighs himself once a week for four weeks. The average of these four measurements is 190.5 lbs. Are the data consistent with Tim’s belief?

Example I
H0: µ = 187, HA: µ > 187
Criterion? Let’s say alpha = .05. That would be Zcrit = 1.65. An X-bar of 190.5 is what Ztest? What is the probability of getting a Ztest as high as ours? If H0 were true, there would be only about a 1% chance of randomly obtaining the data we have. Reject H0.

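Example I illustrated
Ztest = (190.5 - 187) / (3 / √4) = 3.5 / 1.5 = 2.33
[Figure: null distribution of sample means with µ_X̄ = 187 and σ_X̄ = 1.5. The sample mean 190.5 (Ztest = 2.33) lies beyond Zcrit = 1.65 in the “Reject H0” region; the area beyond Ztest is about 0.01.]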
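The arithmetic for Example I can be checked with a short script. It is only a sketch: it uses the standard-library `statistics.NormalDist` instead of a z table, and the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 187, 3, 4   # Tim's believed weight, its SD, number of weighings
sample_mean = 190.5         # average of the four measurements
alpha = 0.05

standard_error = sigma / sqrt(n)               # 3 / 2 = 1.5
z_test = (sample_mean - mu0) / standard_error  # 3.5 / 1.5
z_crit = NormalDist().inv_cdf(1 - alpha)       # right-tailed criterion
p_value = 1 - NormalDist().cdf(z_test)         # area to the right of Ztest

print(f"Ztest = {z_test:.2f}, Zcrit = {z_crit:.2f}, p = {p_value:.4f}")
print("Reject H0" if z_test > z_crit else "Fail to reject H0")
```

The p-value comes out at roughly 1%, in line with the slide's conclusion.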

Exercise
We have a sample of 500 students whose average score on some standardized test is 461. We think they are a particularly gifted bunch. Assume the general student population has a distribution of scores that is approximately normal with µ = 450 and σ = 100. Does our sample come from a population with a mean of 450? Or are they a better test-taking species?
H0: µ = 450, H1: µ > 450

Exercise
How to proceed? Let’s:
1. Select a criterion
2. Calculate a z-score
3. Compare our sample z with our criterion
4. Make a decision

Exercise illustrated
Ztest = (461 - 450) / (100 / √500) = 11 / 4.47 ≈ 2.46
We reject the null hypothesis because sample means of 461 or larger have a very small probability if H0 is true. (We expect such large means less than 1% of the time.)
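One way to picture this decision is to convert the criterion back into test-score units and ask how large a sample mean would need to be to land in the rejection region. The sketch below does that; the slide does not state an alpha for this exercise, so alpha = .05 is an assumption, as are the variable names.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 450, 100, 500
sample_mean = 461
alpha = 0.05  # assumed criterion; not given on the slide

standard_error = sigma / sqrt(n)
z_crit = NormalDist().inv_cdf(1 - alpha)

# Smallest sample mean that falls in the right-tail rejection region.
critical_mean = mu0 + z_crit * standard_error

z_test = (sample_mean - mu0) / standard_error
p_value = 1 - NormalDist().cdf(z_test)

print(f"standard error = {standard_error:.2f}")
print(f"reject H0 for sample means above {critical_mean:.1f}")
print(f"Ztest = {z_test:.2f}, p = {p_value:.4f}")
```

Because the sample is large, the standard error is small, so even an 11-point difference from 450 lands well inside the rejection region.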

When we reject a null hypothesis, it is because
(a) if we believe the null hypothesis, there is only a small probability of getting data like ours by chance alone.
(b) if we believe our data, and don’t think it came from an unlikely chance event, the null hypothesis is probably not true.

One-tailed tests
If HA states µ is < some value, the critical region occupies the left tail.
If HA states µ is > some value, the critical region occupies the right tail.
If the observed p-value is less than α, reject H0.
If the observed p-value is greater than or equal to α, do not reject H0.

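Right-tailed tests
H0: µ = 100, H1: µ > 100 (the inequality points right)
[Figure: null distribution centered at 100 with Zcrit in the right tail. Values beyond Zcrit, an area of alpha, differ “significantly” from 100 (reject H0); all other values fall in the “fail to reject H0” region.]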

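Left-tailed tests
H0: µ = 100, H1: µ < 100 (the inequality points left)
[Figure: null distribution centered at 100 with Zcrit in the left tail. Values beyond Zcrit, an area of alpha, differ “significantly” from 100 (reject H0); all other values fall in the “fail to reject H0” region.]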

One- vs. two-tailed tests
In theory, you should use a one-tailed test when:
1. Change in the opposite direction would be meaningless
2. Change in the opposite direction would be uninteresting
3. No rival theory predicts change in the opposite direction
By convention/default in the social sciences, two-tailed is standard. Why? Because it is a more stringent criterion (as we will see): a more conservative test.

Two-tailed hypothesis testing
HA is that µ is either greater or less than µH0.
HA: µ ≠ µH0
α is divided equally between the two tails of the critical region.

Two-tailed hypothesis testing
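[Figure: null distribution centered at 100 with Zcrit in each tail. Means beyond either Zcrit differ significantly from 100 (reject H0), with alpha split between the two tails; means between the two Zcrit values fall in the “fail to reject H0” region.]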

[Figure: one-tailed versus two-tailed rejection regions around 100. One tail: the full .05 lies in a single tail beyond Zcrit. Two tail: .025 lies in each tail beyond Zcrit. Values in the rejection regions differ “significantly” from 100.]
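To see why the two-tailed test is the more stringent one, it helps to compare the critical values directly. The sketch below is illustrative only: it uses the standard-library `statistics.NormalDist` in place of a z table, and the test statistic z = 1.80 is a made-up value chosen to fall between the two criteria.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Critical z for each alpha: one tail gets all of alpha, two tails split it.
crit = {}
for alpha in (0.05, 0.01):
    one_tail = std_normal.inv_cdf(1 - alpha)
    two_tail = std_normal.inv_cdf(1 - alpha / 2)
    crit[alpha] = (one_tail, two_tail)
    print(f"alpha = {alpha}: one-tailed Zcrit = {one_tail:.3f}, "
          f"two-tailed Zcrit = ±{two_tail:.3f}")

# Hypothetical test statistic that lies between the two alpha = .05 criteria.
z = 1.80
reject_one_tailed = z > crit[0.05][0]
reject_two_tailed = abs(z) > crit[0.05][1]
print(f"z = {z}: one-tailed reject = {reject_one_tailed}, "
      f"two-tailed reject = {reject_two_tailed}")
```

The same z is significant under the one-tailed criterion but not under the two-tailed one, which is the sense in which the two-tailed test is more conservative.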

Example
We have a sample of 36 children of geniuses. They have an average IQ of 110. We want to know whether they are significantly different from the general population of children, who have µ = 100 and σ = 25. Test the hypothesis that the mean of this group is higher than that of the population.
What is Ztest? What is Zcrit for alpha = .05? For alpha = .01? Do we reject the null for either case? What is the exact p-value for this test?

Example
Ztest = (110 - 100) / (25 / √36) = 10 / 4.17 = 2.4
alpha = .05, Zcrit = 1.64; alpha = .01, Zcrit = 2.33
P(Z > 2.4) = .008
Reject H0 in both cases.

Example
We have a sample of 36 children of geniuses. They have an average IQ of 110. We want to know whether they are significantly different from the general population of children, who have µ = 100 and σ = 25. Test the hypothesis that the mean of this group is not equal to that of the population.
What is Ztest? What is Zcrit for alpha = .05? For alpha = .01? Do we reject the null for either case? What is the exact p-value for this test?

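Example
Ztest = (110 - 100) / (25 / √36) = 10 / 4.17 = 2.4
alpha = .05, Zcrit = ±1.96; alpha = .01, Zcrit = ±2.58
P(|Z| > 2.4) = .016
Reject H0 at alpha = .05; fail to reject H0 at alpha = .01.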
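The one-tailed and two-tailed versions of this example can be compared side by side. The script below is a sketch under one assumption: the sample mean of 110 is inferred from the numerator of 10 in the Ztest calculation. Probabilities come from the standard-library `statistics.NormalDist` rather than a z table.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 100, 25, 36
sample_mean = 110  # assumed: implied by the numerator of 10 in Ztest

z = (sample_mean - mu0) / (sigma / sqrt(n))
p_one_tailed = 1 - NormalDist().cdf(z)             # HA: mu > 100
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))  # HA: mu != 100

print(f"Ztest = {z:.2f}")
print(f"one-tailed p = {p_one_tailed:.4f}, two-tailed p = {p_two_tailed:.4f}")

# Decision for each combination of alpha and number of tails.
decisions = {}
for alpha in (0.05, 0.01):
    decisions[(alpha, "one")] = p_one_tailed < alpha
    decisions[(alpha, "two")] = p_two_tailed < alpha
    print(f"alpha = {alpha}: one-tailed reject = {decisions[(alpha, 'one')]}, "
          f"two-tailed reject = {decisions[(alpha, 'two')]}")
```

The two-tailed p-value is double the one-tailed one, so the same data reject H0 at alpha = .01 only under the one-tailed hypothesis.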