Basic Quantitative Methods in the Social Sciences (AKA Intro Stats) 02-250-01 Lecture 3.


1 Basic Quantitative Methods in the Social Sciences (AKA Intro Stats) 02-250-01 Lecture 3

2 Variation Variability: The extent to which numbers in a data set are dissimilar (different) from each other. When all elements measured receive the same score (e.g., everyone in the data set is the same age, in years), there is no variability in the data set. As the scores in a data set become more dissimilar, variability increases.

3 Variation: Range The range tells us the span over which the data are distributed, and is only a very rough measure of variability. Range: The difference between the maximum and minimum scores. Example: The youngest student in a class is 19 and the oldest is 46. Therefore, the age range of the class is 46 − 19 = 27 years.

4 Variation This is an example of data with NO variability:
X    X − mean
5    0.00
5    0.00
5    0.00
5    0.00
5    0.00
ΣX = 25, n = 5, mean = 5

5 Variation This is an example of data with low variability:
X    X − mean
6    +1.00
4    −1.00
6    +1.00
5    0.00
4    −1.00
ΣX = 25, n = 5, mean = 5

6 Variation This is an example of data with higher variability:
X    X − mean
8    +3.00
1    −4.00
9    +4.00
5    0.00
2    −3.00
ΣX = 25, n = 5, mean = 5

7 Note: Let’s say we wanted to figure out the average deviation from the mean. Normally, we would want to sum all deviations from the mean and then divide by n, i.e., $\frac{\sum (X - \bar{X})}{n}$, where $\bar{X}$ is the mean (recall: look at your formula for the mean from last lecture). BUT: We have a problem. $\sum (X - \bar{X})$ will always add up to zero.

8 Variation However, if we square each of the deviations from the mean, we obtain a sum that is not equal to zero (unless every score is identical). This is the basis for the measures of variance and standard deviation, the two most common measures of variability of data.

9 Variation
X    X − mean    (X − mean)²
8    +3.00       9.00
1    −4.00       16.00
9    +4.00       16.00
5    0.00        0.00
2    −3.00       9.00
ΣX = 25, Σ(X − mean) = 0.00, Σ(X − mean)² = 50.00
Note: Σ(X − mean)² is called the Sum of Squares.
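The table above can be reproduced with a few lines of Python. This is only a sketch for checking the arithmetic; the variable names are illustrative and not part of the lecture.

```python
# Scores from the higher-variability example.
scores = [8, 1, 9, 5, 2]

mean = sum(scores) / len(scores)             # 25 / 5 = 5.0
deviations = [x - mean for x in scores]      # +3, -4, +4, 0, -3
squared = [d ** 2 for d in deviations]       # 9, 16, 16, 0, 9

sum_of_deviations = sum(deviations)          # deviations from the mean cancel out
sum_of_squares = sum(squared)                # the Sum of Squares

print(mean, sum_of_deviations, sum_of_squares)  # 5.0 0.0 50.0
```

Squaring removes the signs, which is why the second sum no longer cancels to zero.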

10 Variance of a Population VARIANCE OF A POPULATION: the sum of squared deviations from the mean divided by the number of scores (sigma squared): $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$

11 Population Standard Deviation Square root of the variance: $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$

12 Sample Variance The sum of squared deviations from the mean divided by the number of degrees of freedom, n − 1 (an estimate of the population variance): $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$

13 Sample Standard Deviation Square root of the sample variance $s^2$: $s = \sqrt{s^2}$
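As a minimal sketch, the definitional formulas for population and sample variance can be written directly in Python. The function names are hypothetical, chosen here for illustration; the data are the five scores used in the variation examples.

```python
import math

def sum_of_squares(scores):
    """Sum of squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

def population_variance(scores):
    # Divide by the number of scores, N.
    return sum_of_squares(scores) / len(scores)

def sample_variance(scores):
    # Divide by the degrees of freedom, n - 1.
    return sum_of_squares(scores) / (len(scores) - 1)

scores = [8, 1, 9, 5, 2]
print(population_variance(scores))                       # 10.0
print(sample_variance(scores))                           # 12.5
print(round(math.sqrt(population_variance(scores)), 2))  # 3.16
print(round(math.sqrt(sample_variance(scores)), 2))      # 3.54
```

The only difference between the two variances is the denominator, so the sample value is always the larger of the two for the same data.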

14 Why use Standard Deviation and not Variance? Normally, you will only calculate variance in order to calculate standard deviation, as standard deviation is what we typically want. Why? Because standard deviation expresses variability in the same units as the data. Example: Standard deviation of ages in a class is 3.7 years.

15 Variance The above formulae are definitional: they are the mathematical representation of the concepts of variance and standard deviation. When calculating variance and standard deviation (especially when doing so by hand), the following computational formulae are easiest to use (trust us, they really are easier to use; you should, however, have a good understanding of the definitional formulae):

16 Population Variance Computational Formula: $\sigma^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}$

17 Population Standard Deviation Computational Formula: $\sigma = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}}$

18 Sample Variance Computational Formula: $s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}$

19 Sample Standard Deviation Computational Formula: $s = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}}$

20 Sample Standard Deviation Example Data:
X    X²
8    64
1    1
9    81
5    25
2    4
ΣX = 25, ΣX² = 175, n = 5, mean = 5
s² = (175 − (25)²/5) / 4 = 12.50
s = √12.50 = 3.54

21 Computing Standard Deviation When calculating standard deviation, create a table that looks like this:
X     X²
X₁    X₁²
X₂    X₂²
X₃    X₃²
X₄    X₄²
…     …
ΣX    ΣX²

For example:
X    X²
2    4
4    16
7    49
9    81
ΣX = 22, ΣX² = 150

22 Computing Standard Deviation The values are then entered into the formula as follows: ΣX² = 150, (ΣX)² = 22² = 484, n = 4, n − 1 = 3

24 Computing Standard Deviation The values are then entered into the formula as follows: $s = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}} = \sqrt{\frac{150 - \frac{484}{4}}{3}} = \sqrt{\frac{29}{3}} \approx 3.11$
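The hand calculation for the scores 2, 4, 7, 9 can be checked with a short sketch of the computational formula for the sample standard deviation. The function name is hypothetical and used only for illustration.

```python
import math

def sample_sd_computational(scores):
    """Sample standard deviation via the computational formula."""
    n = len(scores)
    sum_x = sum(scores)                      # sum of X
    sum_x_sq = sum(x ** 2 for x in scores)   # sum of X squared
    variance = (sum_x_sq - sum_x ** 2 / n) / (n - 1)
    return math.sqrt(variance)

scores = [2, 4, 7, 9]
print(sum(scores), sum(x ** 2 for x in scores))   # 22 150
print(round(sample_sd_computational(scores), 2))  # 3.11
```

Only two running totals are needed, which is what makes this form convenient by hand: there is no need to compute the mean or each deviation first.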

25 Degrees of Freedom Degrees of Freedom: The number of independent observations, or the number of observations that are free to vary. In our data example above, there are 5 numbers that total 25 (ΣX = 25, n = 5).

26 Degrees of Freedom Many combinations of numbers can total 25, but only the first 4 can be any value. The 5th number cannot vary if ΣX = 25. This example has 4 degrees of freedom, as four of the five numbers are free to vary. A standard deviation computed from a sample with n in the denominator tends to underestimate the population standard deviation. Using n − 1 in the denominator corrects for this and gives us a better estimate of the population standard deviation.
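The "free to vary" idea can be illustrated with a small sketch: once the total is fixed, choosing four values forces the fifth. The helper name `last_value` is hypothetical.

```python
def last_value(free_values, total):
    """Return the one value that is forced once the others are chosen."""
    return total - sum(free_values)

# Any four numbers may be chosen freely...
first_four = [8, 1, 9, 5]
# ...but the fifth must make the total come out to 25.
fifth = last_value(first_four, total=25)
print(fifth)  # 2

# A different free choice forces a different fifth value.
print(last_value([10, 10, 10, 10], total=25))  # -15
```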

27 Degrees of Freedom Degrees of freedom are usually n − 1 (the total # of data points minus one)

28 Time for an example Seven people were asked to rate the taste of McDonald's french fries on a scale of 1 to 10. Their ratings are as follows: 8, 4, 6, 2, 5, 7, 7. Calculate the population standard deviation. Calculate the sample variance. Class discussion: When would this be a population, and when would it be a sample?
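One way to check a hand calculation for this exercise is Python's standard `statistics` module, where `pstdev` divides by N and `variance` divides by n − 1. This is a sketch for verification, not a substitute for working through the formulas.

```python
import statistics

ratings = [8, 4, 6, 2, 5, 7, 7]

# Treating the seven raters as the whole population: divide by N.
population_sd = statistics.pstdev(ratings)

# Treating them as a sample from a larger group: divide by n - 1.
sample_var = statistics.variance(ratings)

print(round(population_sd, 2))  # 1.92
print(round(sample_var, 2))     # 4.29
```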

29 Why is Standard Deviation so Important? What does the standard deviation really tell us? Why would a sample’s standard deviation be small? Why would a sample’s standard deviation be large?

30 An Example You’re sitting in the CAW Student Centre with 4 of your friends. A member of the opposite sex walks by, and you and your friends rate this person’s attractiveness on a scale from 1 to 10 (where 1=very unattractive and 10=drop dead gorgeous).

31 Food for thought
1) What would it mean if all five of you rated this person a 9 on 10?
2) What would it mean if all five of you rated this person a 5 on 10?
3) What would it mean if the five of you produced the following ratings: 1, 10, 2, 9, and 3 (note that the mean rating would be 5)?
Why would scenario #3 happen instead of scenario #2? What factors would lead to these different ratings?
These questions form the basis of why statisticians like to “explain variability”.

32 An In-Depth Look at Scenario #3 So if the five of you produced the following ratings: 1, 10, 2, 9, and 3, what is the standard deviation of these ratings? Calculate! What is the standard deviation in Scenario #2? Calculate!
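A worked version of the Scenario #3 calculation is sketched below. The slide does not say whether the five raters should be treated as a sample or a population, so both results are shown; that choice is an assumption.

```latex
\begin{align*}
\bar{X} &= \frac{1 + 10 + 2 + 9 + 3}{5} = 5 \\
\sum (X - \bar{X})^2 &= (-4)^2 + 5^2 + (-3)^2 + 4^2 + (-2)^2 = 16 + 25 + 9 + 16 + 4 = 70 \\
s &= \sqrt{\frac{70}{5 - 1}} = \sqrt{17.5} \approx 4.18 \quad \text{(sample)} \\
\sigma &= \sqrt{\frac{70}{5}} = \sqrt{14} \approx 3.74 \quad \text{(population)}
\end{align*}
```

In Scenario #2 every rating is 5, so every deviation is 0 and the standard deviation is 0 under either formula.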

33 Normal Distribution The normal distribution is a theoretical distribution. “Normal” does not mean typical or average; it is a technical term given to this mathematical function. The normal distribution is unimodal and symmetrical, and is often referred to as the Bell Curve.

34 Normal Distribution [Figure: normal curve with the mean, median, and mode marked]

35 Normal Distribution We study the normal distribution because many naturally occurring events yield a distribution that approximates the normal distribution

36 Properties of Area Under the Normal Distribution One of the properties of the Normal Distribution is the fixed area under the curve. If we split the distribution in half, 50% of the scores lie to the left of the mean (or median, or mode), and 50% of the scores lie to the right of the mean (or median, or mode).

37 Properties of Area Under the Normal Distribution The mean, median, and mode always cut the Normal Distribution in half, and are equal since the Normal Distribution is unimodal and symmetrical:

38 Properties of Area Under the Normal Distribution [Figure: normal curve divided at the mean, median, and mode, labelled "50% of scores"]

39 Properties of Area Under the Normal Distribution The entire area under the normal curve can be considered to be a proportion of 1.0000. Thus, half, or .5000, of the scores lie in the bottom half (i.e., left of the mean) of the distribution, and half, or .5000, of the scores lie in the top half (i.e., right of the mean).
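The half-and-half split can be checked numerically with `statistics.NormalDist` from the Python standard library (3.8 or later). The mean of 100 and standard deviation of 15 are hypothetical values picked for illustration; any normal distribution gives the same proportions.

```python
from statistics import NormalDist

# Hypothetical parameters; any mean and standard deviation give the same result.
dist = NormalDist(mu=100, sigma=15)

below_mean = dist.cdf(100)        # proportion of scores left of the mean
above_mean = 1 - below_mean       # proportion of scores right of the mean

print(round(below_mean, 4), round(above_mean, 4))  # 0.5 0.5
```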

40 Properties of Area Under the Normal Distribution [Figure: normal curve divided at the mean, median, and mode, labelled ".5000 of scores"]

41 Z-scores Z-Scores (or standard scores) are a way of expressing a raw score’s place in a distribution. Z-score formula: $z = \frac{X - \mu}{\sigma}$

42 Z-scores The mean and standard deviation are always notated in Greek letters (μ and σ). Z-scores only reflect a data point’s position relative to the overall data set (so you’re now treating the data as a population, as you’re not looking to infer to a greater population). This means you use the population formula for standard deviation, rather than the sample formula, whenever you calculate z.

43 Z-scores A z-score is a better indicator of where your score falls in a distribution than a raw score. A student could get a 75/100 on a test (75%) and consider this to be a very high score.

44 Z-scores If the average of the class marks is 89 and the (population) standard deviation is 5.2, then the z-score for a mark of 75 would be: μ = 89, σ = 5.2, z = (75 − 89)/5.2 = (−14)/5.2 = −2.69

45 Z-scores This means that a mark of 75% is actually 2.69 standard deviations BELOW the mean. The student would have done poorly on this test, as compared to the rest of the class.

46 Z-scores z = 0 represents the mean score (which would be 89 in this example). z < 0 represents a score less than the mean (which would be less than 89). z > 0 represents a score greater than the mean (which would be greater than 89).

47 Z-scores For any set of scores, the z-scores will: sum to zero ($\sum z = 0.00$), have a mean equal to zero ($\mu_z = 0.00$), and have a standard deviation equal to one ($\sigma_z = 1.00$).
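These three properties can be checked on any data set. The sketch below reuses the scores 8, 1, 9, 5, 2 from the variation examples and treats them as a population, so it uses `statistics.pstdev`; comparisons use a tolerance because floating-point sums are rarely exactly zero.

```python
import math
import statistics

scores = [8, 1, 9, 5, 2]
mu = statistics.mean(scores)        # population mean
sigma = statistics.pstdev(scores)   # population standard deviation

z_scores = [(x - mu) / sigma for x in scores]

print(math.isclose(sum(z_scores), 0.0, abs_tol=1e-9))              # True
print(math.isclose(statistics.mean(z_scores), 0.0, abs_tol=1e-9))  # True
print(math.isclose(statistics.pstdev(z_scores), 1.0))              # True
```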

48 Z-scores A z-score expresses the position of the raw score above or below the mean in standard-deviation-sized units. E.g., z = +1.50 means that the raw score is 1 and one-half standard deviations above the mean; z = −2.00 means that the raw score is 2 standard deviations below the mean.

49 Z-score Example If you write two exams, in Math and English, and get the following scores: Math 70% (class μ = 55, σ = 10); English 60% (class μ = 50, σ = 5). Which test mark represents the better performance (relative to the class)?

50 Z-score Example cont. Math mark: z = (70 − 55)/10 = +1.50. English mark: z = (60 − 50)/5 = +2.00.

51 Z-score Example Illustration [Figure: distribution with the mean at Z = 0.00 and the two marks at Z = 1.50 and Z = 2.00]

52 The Answer Because Z = +2.00 is greater than Z = +1.50, the English class mark of 60% reflects a better performance relative to that class than does the Math class mark of 70%.
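The comparison can be sketched as a small function that converts each raw mark to a z-score using its own class mean and standard deviation. The function name `z_score` is illustrative.

```python
def z_score(x, mu, sigma):
    """Position of raw score x in standard-deviation units."""
    return (x - mu) / sigma

math_z = z_score(70, mu=55, sigma=10)
english_z = z_score(60, mu=50, sigma=5)

print(math_z, english_z)  # 1.5 2.0
better = "English" if english_z > math_z else "Math"
print(better)             # English
```

Putting both marks on the same standard-deviation scale is what allows marks from two different classes to be compared at all.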

53 Z-score: Solving for X The z-score formula can be rearranged to solve for X: $X = z\sigma + \mu$

54 Z-scores: Solving for X This formula is used when you know the z-score of a data point, and want to solve for the raw score.

55 Example E.g., if a class midterm exam has μ = 65 and σ = 5, what exam mark has a z-score value of 1.25? X = (1.25)(5) + 65 = 6.25 + 65 = 71.25. So, a person whose test is 1.25 standard deviations above the mean obtained a score of 71.25%.
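A minimal sketch of solving for X, with a round trip back to z as a sanity check. The function name `raw_score` is hypothetical.

```python
def raw_score(z, mu, sigma):
    """Convert a z-score back to a raw score: X = z * sigma + mu."""
    return z * sigma + mu

x = raw_score(1.25, mu=65, sigma=5)
print(x)  # 71.25

# Round trip: converting the raw score back should recover z.
print((x - 65) / 5)  # 1.25
```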

56 Z-scores Z-score problems ask you to solve for X or solve for z. Review both types of problems!

