1 Lecture 7: Bivariate Statistics

2 Properties of Standard Deviation. Variance is just the square of the S.D. If a constant is added to all scores, it has no impact on the S.D. If every score is multiplied by a constant, the dispersion changes: the S.D. is multiplied by the absolute value of that constant (and the variance by its square). The sample standard deviation is S = √(Σ(X − M)² / (n − 1)), where S = standard deviation, X = individual score, M = mean of all scores, n = sample size (number of scores).
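These two properties are easy to check numerically. The sketch below uses a small made-up data set (the numbers are illustrative, not from the lecture):

```python
# Verify the stated properties of the standard deviation on toy data.
import statistics

scores = [2, 4, 6, 8, 10]
s = statistics.stdev(scores)            # sample S.D. (n - 1 denominator)
variance = statistics.variance(scores)  # variance is the S.D. squared

# Adding a constant shifts every score but leaves the spread unchanged.
shifted_s = statistics.stdev([x + 7 for x in scores])

# Multiplying every score by a constant scales the spread by that constant.
scaled_s = statistics.stdev([x * 3 for x in scores])
```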


4 Distributions and Standard Deviations. Example: a distribution has a mean of 40 and a standard deviation of 5. 68% of the distribution can be found between what two values? 95% of the distribution can be found between what two values?
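For a normal distribution these intervals follow directly from the empirical rule (about 68% within one S.D. of the mean, about 95% within two):

```python
# Apply the empirical rule to the slide's example: mean 40, S.D. 5.
mean, sd = 40, 5

# ~68% of a normal distribution lies within one S.D. of the mean.
within_one = (mean - sd, mean + sd)

# ~95% lies within two S.D.s of the mean.
within_two = (mean - 2 * sd, mean + 2 * sd)
```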

5 Standard Error of the Mean. Standard error is an estimate of how much the mean would vary over many samples drawn from the same population. It is calculated from a single sample; it is an estimate of the standard deviation of the sampling distribution of the mean. A smaller S.E. suggests that our sample mean is likely a good estimate of the population mean.
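The standard error is computed from a single sample as the sample S.D. divided by the square root of the sample size. A minimal sketch, with a made-up sample:

```python
# Standard error of the mean from one sample: S.E. = s / sqrt(n).
import math
import statistics

sample = [12, 15, 11, 14, 13, 16, 12, 15]  # illustrative data
n = len(sample)
se = statistics.stdev(sample) / math.sqrt(n)
```

Because n appears under the square root, quadrupling the sample size halves the standard error.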

6 Common Data Representations. Histograms: simple graphs of the frequency of groups of scores. Stem-and-Leaf Displays: another way of displaying dispersion, particularly useful when you do not have large amounts of data. Box Plots: yet another way of displaying dispersion; the box spans the 25th to 75th percentile range, the line within the box shows the median, and the “whiskers” show the range of values (min and max).

7 Estimation and Hypothesis Tests: The Normal Distribution. A key assumption for many variables (or specifically, their scores/values) is that they are normally distributed. In large part, this is because the most common statistics (the chi-square, t, and F tests) rest on this assumption.

8 Why do we make this assumption? The Central Limit Theorem. Errors can be viewed as a sum of many independent random effects, thus individual scores will tend to be normally distributed. Even if Y is not normally distributed, the distribution of the sample mean will tend to be normal as the sample size increases. Y = µ + ε: a given score (Y) is the sum of the mean of the population (µ) and some error (ε).
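The Central Limit Theorem can be seen in a quick simulation. The population below is exponential (strongly skewed, not normal at all), yet the means of repeated samples cluster around the population mean µ = 1 with spread close to σ/√n. The sample sizes and seed are arbitrary choices for illustration:

```python
# Simulate the CLT: sample means from a skewed population look normal.
import random
import statistics

random.seed(0)

# 2000 samples of n = 50 draws each from an exponential population (mu = 1).
sample_means = [
    statistics.mean(random.expovariate(1.0) for _ in range(50))
    for _ in range(2000)
]

center = statistics.mean(sample_means)   # close to the population mean, 1
spread = statistics.stdev(sample_means)  # close to sigma / sqrt(n) = 1 / sqrt(50)
```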

9 The z-score. Infinitely many normal distributions are possible, one for each combination of mean and variance, but all are related to a single distribution: the standard normal. Standardizing a group of scores changes the scale to one of standard deviation units: z = (X − M) / S. This allows for comparisons with scores that were originally on a different scale.

10 z-scores (continued). A z-score tells us where a score is located within a distribution; specifically, how many standard deviation units the score is above or below the mean. Properties: the mean of a set of z-scores is zero (why?), and the variance (and therefore the standard deviation) of a set of z-scores is 1.
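Both properties hold for any set of scores, whatever the original scale. A quick check with illustrative values:

```python
# Standardize a set of scores; z-scores always have mean 0 and S.D. 1.
import statistics

scores = [55, 60, 65, 70, 75]  # arbitrary example values
m = statistics.mean(scores)
s = statistics.stdev(scores)
z = [(x - m) / s for x in scores]
```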

11 Area under the normal curve. Example: you have a variable x with a mean of 500 and an S.D. of 15. How common is a score of 525? z = (525 − 500) / 15 ≈ 1.67. If we look up the z-statistic of 1.67 in a z-score table, we find that the proportion of scores less than our value is .9525. In other words, a score of 525 exceeds .9525 of the population (p < .05).
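Instead of a printed z-table, the same lookup can be done with the standard normal CDF in the Python standard library (Python 3.8+):

```python
# The slide's example: what proportion of scores fall below 525?
from statistics import NormalDist

z = (525 - 500) / 15           # about 1.67
p_below = NormalDist().cdf(z)  # proportion of the population below 525
```

Using the unrounded z gives about .952, matching the table value of .9525 for z = 1.67.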

12 Issues with Normal Distributions: Skewness and Kurtosis

13 Correlation. Hypothesis testing for an association between two metric variables.

14 Checking for simple linear relationships. Pearson's correlation coefficient measures the extent to which two variables are linearly related. Basically, the correlation coefficient is the average of the cross products of the corresponding z-scores.
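That definition translates directly into code: standardize both variables, multiply the paired z-scores, and average (dividing by n − 1 to match the sample S.D.). The function name is mine:

```python
# Pearson's r as the average cross product of paired z-scores.
import statistics

def pearson_r(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    sx, sy = statistics.stdev(x), statistics.stdev(y)
    cross = sum(((a - mx) / sx) * ((b - my) / sy) for a, b in zip(x, y))
    return cross / (len(x) - 1)
```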

15 Correlations. Pearson's r ranges from −1 to +1, where ±1 indicates a perfect linear relationship between the two variables and 0 indicates no linear relationship. Remember: correlation ONLY measures linear relationships, not all relationships!

16 Correlation Example. General Social Survey 1993: Education and Age.

17 The t-test. Hypothesis testing for the equality of means between two independent groups.

18 Alternative Hypotheses Revisited. Alternative hypotheses: H1: μ1 < μc; H1: μ1 > μc; H1: μ1 ≠ μc (in each case against the null hypothesis H0: μ1 = μc). How do we test whether the means of two sample populations are, in fact, different?

19 The t-test. t = (M1 − M2) / SDM, with df = N1 + N2 − 2, where: M = mean; SDM = standard error of the difference between means; N = number of subjects in a group; s = standard deviation of a group; df = degrees of freedom.
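A minimal sketch of the pooled-variance (equal variances assumed) independent-samples t-test matching the legend on the slide; the function name is mine and the groups in the test are made up:

```python
# Pooled-variance independent-samples t-test.
import math
import statistics

def t_test(group1, group2):
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    df = n1 + n2 - 2
    # Pool the two sample variances, weighting by each group's df.
    pooled_var = ((n1 - 1) * statistics.variance(group1) +
                  (n2 - 1) * statistics.variance(group2)) / df
    # SDM: standard error of the difference between the two means.
    sdm = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / sdm, df
```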

20 Degrees of freedom. d.f. = the number of independent pieces of information from the data collected in a study. Example: choosing 10 numbers that add up to 100. Once the first nine are chosen, the tenth is fixed, so only nine choices are free. This kind of restriction is the same idea: we had 10 choices but the restriction reduced our independent selections to N − 1. In statistics, further restrictions reduce the degrees of freedom. In the t-test, since we deal with two means, the degrees of freedom are reduced by two (df = N1 + N2 − 2).
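The "10 numbers that add up to 100" restriction is easy to make concrete; the nine free picks below are arbitrary:

```python
# Nine numbers can be chosen freely; the tenth is forced by the sum constraint.
free_choices = [5, 10, 15, 20, 5, 10, 10, 5, 10]  # 9 independent picks
last = 100 - sum(free_choices)                    # no freedom left here
numbers = free_choices + [last]
```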

21 Z-distribution versus t-distribution

22 t distribution. As the degrees of freedom increase (towards infinity), the t distribution approaches the z distribution (i.e., a normal distribution). Because N plays such a prominent role in the calculation of the t-statistic, note that for very large values of N the sample standard deviation (s) begins to closely approximate the population standard deviation (σ).

23 Assumptions Underlying the Independent Sample t-test. Assumption of normality; assumption of homogeneity of variance. The outputs for the t-test in SPSS correspond to the standard t-test (equal variances assumed) and a separate-variance t-test (equal variances not assumed).
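The "separate variance" line SPSS reports when equal variances are not assumed is a Welch t-test, which replaces the pooled variance with each group's own variance and uses the Welch-Satterthwaite approximation for the degrees of freedom. A stdlib sketch (the function name is mine):

```python
# Welch (separate-variance) t-test: equal variances NOT assumed.
import math
import statistics

def welch_t(group1, group2):
    n1, n2 = len(group1), len(group2)
    v1 = statistics.variance(group1) / n1
    v2 = statistics.variance(group2) / n2
    t = (statistics.mean(group1) - statistics.mean(group2)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximate degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df
```

When the two groups happen to have equal sizes and variances, the Welch df reduces to the familiar N1 + N2 − 2.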

24 Practical Example: Do men and women watch different amounts of TV per week? General Social Survey 1993.

