Introduction to Statistics for the Social Sciences SBS200, COMM200, GEOG200, PA200, POL200, or SOC200 Lecture Section 001, Spring 2015 Room 150 Harvill.


1

2 Introduction to Statistics for the Social Sciences SBS200, COMM200, GEOG200, PA200, POL200, or SOC200 Lecture Section 001, Spring 2015 Room 150 Harvill Building 8:00 - 8:50 Mondays, Wednesdays & Fridays. http://www.youtube.com/watch?v=ne6tB2KiZuk http://onlinestatbook.com/stat_sim/sampling_dist/index.html

3

4 Labs continue this week with Exam 2 review

5 Schedule of readings
Before next exam (March 6th):
Please read chapters 5, 6, & 8 in Ha & Ha
Please read Chapters 10, 11, 12 and 14 in Plous
Chapter 10: The Representativeness Heuristic
Chapter 11: The Availability Heuristic
Chapter 12: Probability and Risk
Chapter 14: The Perception of Randomness

6 Exam 2 – This Friday (3/6/15) Study guide is online now Bring 2 calculators (remember only simple calculators, we can’t use calculators with programming functions) Bring 2 pencils (with good erasers) Bring ID

7 By the end of lecture today 3/4/15 Use this as your study guide Central Limit Theorem Law of Large Numbers Dan Gilbert Readings Review for Exam 2

8 No homework due Just study for Exam 2

9 Dan Gilbert Reading Hand in homework

10 Central Limit Theorem
Proposition 1: If sample size (n) is large enough (e.g. 100), the mean of the sampling distribution will approach the mean of the population. As n ↑, x̄ will approach µ.
Proposition 2: If sample size (n) is large enough (e.g. 100), the sampling distribution of means will be approximately normal, regardless of the shape of the population. As n ↑, the curve will approach a normal shape.
Proposition 3: The standard deviation of the sampling distribution equals the standard deviation of the population divided by the square root of the sample size. As n increases, SEM decreases. As n ↑, curve variability gets smaller.
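The three propositions can be checked with a small simulation. The sketch below is illustrative only: the skewed population (exponential, mean about 10), the sample sizes, and the helper name sample_means are assumptions chosen for the demonstration, not part of the lecture.

```python
import random
import statistics


def sample_means(population, n, num_samples, rng):
    """Draw num_samples random samples of size n and return the mean of each."""
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(num_samples)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical, strongly right-skewed population (assumption for this demo).
population = [rng.expovariate(1 / 10) for _ in range(20_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

for n in (2, 5, 25, 100):
    means = sample_means(population, n, 2_000, rng)
    print(
        f"n={n:>3}  mean of means={statistics.fmean(means):6.2f} (mu={mu:.2f})  "
        f"SD of means={statistics.stdev(means):5.2f}  "
        f"sigma/sqrt(n)={sigma / n ** 0.5:5.2f}"
    )
```

For each n, the mean of the sample means should sit close to µ (Proposition 1), and the SD of the sample means should track σ/√n and shrink as n grows (Proposition 3). Plotting a histogram of `means` for each n would show the shape becoming more bell-like (Proposition 2).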

11 Central Limit Theorem
Proposition 1: If sample size (n) is large enough (e.g. 100), the mean of the sampling distribution will approach the mean of the population.
Law of large numbers: As the number of measurements increases, the data become more stable and a better approximation of the true (theoretical) probability. Larger sample sizes tend to be associated with stability. As the number of observations (n) increases, or the number of times the experiment is performed, the estimate will become more accurate.
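The law of large numbers can be illustrated with simulated coin flips. This is a minimal sketch assuming a fair coin (true probability .50); the function name proportion_of_heads and the flip counts are made up for illustration.

```python
import random


def proportion_of_heads(num_flips, rng):
    """Flip a simulated fair coin num_flips times; return the share of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips


rng = random.Random(42)  # fixed seed so the run is repeatable

for n in (10, 100, 1_000, 10_000, 100_000):
    p = proportion_of_heads(n, rng)
    print(f"{n:>7} flips: proportion of heads = {p:.4f} (off by {abs(p - 0.5):.4f})")
```

With few flips the observed proportion can be far from .50; with many flips it tends to settle close to the theoretical value.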

12 Proposition 2: If sample size (n) is large enough (e.g. 100), the sampling distribution of means will be approximately normal, regardless of the shape of the population.
Population: take a sample (n = 5), get the mean, and repeat over and over.
[Figure: a population distribution shown with sampling distributions for n = 2, n = 4, n = 5, n = 25 and n = 30]

13 Central Limit Theorem Proposition 2: If sample size (n) is large enough (e.g. 100) The sampling distribution of means will be approximately normal, regardless of the shape of the population

16 Central Limit Theorem
Proposition 3: The standard deviation of the sampling distribution equals the standard deviation of the population divided by the square root of the sample size. As n increases, SEM decreases.
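Proposition 3 is the formula SEM = σ / √n. The short sketch below applies it for a few sample sizes; σ = 10 and the function name standard_error are illustrative assumptions.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: population SD divided by sqrt(sample size)."""
    return sigma / math.sqrt(n)


sigma = 10  # assumed population standard deviation, for illustration only

for n in (1, 4, 25, 100):
    print(f"n={n:>3}  SEM={standard_error(sigma, n):.2f}")
```

Because n sits under a square root, quadrupling the sample size only halves the SEM.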

18 Central Limit Theorem: If random samples of a fixed N are drawn from any population (regardless of the shape of the population distribution), as N becomes larger, the distribution of sample means approaches normality, with the overall mean approaching the theoretical population mean.
Animation for creating sampling distribution of sample means: http://onlinestatbook.com/stat_sim/sampling_dist/index.html
[Figure labels: Eugene, Melvin; Mean for sample 12; Mean for sample 7; Distribution of Raw Scores (distribution of single sample); Sampling Distribution of Sample means]

19 Central Limit Theorem: If random samples of a fixed N are drawn from any population (regardless of the shape of the population distribution), as N becomes larger, the distribution of sample means approaches normality, with the overall mean approaching the theoretical population mean.
Sampling distribution for continuous distributions
[Figure labels: Melvin, Eugene; 2nd sample; 23rd sample; Distribution of Raw Scores; Sampling Distribution of Sample means]

21 Writing Assignment: Writing a letter to a friend
Imagine you have a good friend (pick one). This is a good friend whom you consider to be smart and interested in stuff generally. They are teaching themselves stats (hoping to test out of the class) but need your help on a couple of ideas. For this assignment please write your friend/mom/dad/favorite cousin a letter answering these five questions: (Feel free to use diagrams and drawings if you think that can help)
Dear Friend,
1. I’m struggling with this whole Central Limit Theorem idea. Could you describe for me the difference between a distribution of raw scores, and a distribution of sample means?
2. I also don’t get the “three propositions of the Central Limit Theorem”. They all seem to address sample size, but I don’t get how sample size could affect these three things. If you could help explain it, that would be really helpful.

23

24 Remember, no recording devices. These questions are protected Set your clicker to channel 50

25 Let’s try one

26 Let’s try one
Alia is looking at a normal distribution and wants to know what proportion of the distribution falls between 1 and -1 standard deviations from the mean. What is the correct proportion?
a. .3413
b. .6826
c. .9544
d. .9970

27 This will be so helpful now that we know these by heart
1 sd above and below mean: 68%
2 sd above and below mean: 95%
3 sd above and below mean: 99.7%
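These three percentages can be reproduced from the standard normal curve. The sketch below uses Python's statistics.NormalDist; the helper name area_within is an assumption for illustration.

```python
from statistics import NormalDist


def area_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} sd of the mean: {area_within(k):.4f}")
```

The computed values can differ slightly in the last decimal place from figures read off a printed z table (such as .6826 or .9544), because table entries are rounded before being doubled.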

28 When Stephan created this normal distribution, what did he plot on the “y” axis? (Remember to draw a picture.) a. memory test performance b. age of preschoolers c. frequency d. products advertised during their shows Let’s try one Correct Answer

29 Variability and means 38 40 44 48 52 56 58 40 44 48 52 56 Remember, there is an implied axis measuring frequency f f

30 Movie Packages
We sampled 100 movie theaters (two tickets, large popcorn and 2 drinks)
Mean = $37
Range = $27 - $47
What’s the ‘typical’ or standard deviation? Standard Deviation = 3.5
[Histogram: x-axis is Price per Movie Package, $27 to $47; y-axis is Frequency, 0 to 12]

31 Number correct on exam
We tested 100 students (counted number correct on a 100-point test)
Mean = 80
Range = 55 - 100
What’s the largest possible deviation? 100 - 80 = 20; 55 - 80 = -25
What’s the ‘typical’ or standard deviation? Standard Deviation = 10

32 Let’s try one
Amount of soda in 2-liter containers (measured amount of soda in 2-liter bottles)
Mean = 2.0
Range = 1.894 - 2.109
What’s the largest possible deviation? 1.894 - 2.0 = -0.106; 2.109 - 2.0 = 0.109
The best estimate of the population standard deviation is
a. 0.106
b. 0.109
c. 0.044
d. 2.0
Standard Deviation = 0.044

33 “If random samples of a fixed N are drawn from any population, as N becomes larger, the distribution of sample means approaches normality, with the overall mean approaching the theoretical population mean.” This is an example of? a. law of large numbers b. hypothesis test c. central limit theorem d. multiplication law for independent events Let’s try one Correct Answer

35 Law of large numbers: As the number of measurements increases the data becomes more stable and a better approximation of the true (theoretical) probability. Larger sample sizes tend to be associated with stability. As the number of observations ( n ) increases or the number of times the experiment is performed, the estimate will become more accurate.

36 Elle Woods is applying to law school and hopes to attend Harvard Law. She took the LSAT (Law School Admission Test). Which of the following percentile ranks would represent the best score on this test? a. 2 percentile b. 45 percentile c. 75 percentile d. 98 percentile Let’s try one Correct Answer

37 When estimating the population mean (µ) using a “point estimation”, one should use a. standard deviation b. standard error of the mean of the sample c. the mean of the sample d. the population mean (µ) cannot be estimated using the “point estimation” Let’s try one Correct Answer

38 Which of the following is inconsistent with the Central Limit Theorem: If random samples of a fixed N are drawn from any population as N becomes larger, a. the distribution of sample means approaches normality (regardless of the shape of the population distribution) b. the overall mean approaches the theoretical population mean c. The standard deviation of the sampling distribution equals the standard deviation of the population divided by the square root of the sample size. d. the dispersion of scores also becomes larger Let’s try one Correct Answer

40 Let’s try one
Imagine that Fatima takes a random sample from the population and calculates the mean of that sample. Then the procedure is repeated with a new sample (of the same size), generating a new mean, and then repeated again several times. If Fatima drew a graph that represented the frequency of all of these means, the graph would be called a
a. regularity distribution of central limits
b. sampling distribution of sample means
c. mean distribution of samples
d. variance distribution of deviations

41 Sampling distribution: a theoretical probability distribution of the possible values of some sample statistic that would occur if we were to draw an infinite number of same-sized samples from a population.
Sampling distributions: sample means; theoretical distribution; we are plotting means of samples (figure labels: 2nd sample, 23rd sample)
Frequency distributions of individual scores: derived empirically; we are plotting raw data; this is a single sample (figure labels: Melvin, Eugene)

42 According to the Central Limit Theorem, for very large sample sizes, the mean of the sampling distribution of means a. underestimates the population mean b. equals the population mean c. overestimates the population mean d. has no relationship with the population mean Let’s try one Correct Answer

44 According to the Central Limit Theorem, for very large sample sizes, the distribution of sample means from a skewed population is a. skewed b. approximately normal c. binomial d. bimodal Let’s try one Correct Answer

46 According to the Central Limit Theorem as sample sizes get larger, the variability of the sampling distribution of means _______. a. becomes more positively skewed b. becomes smaller c. remains the same d. becomes larger Let’s try one Correct Answer

48 Maryanne gave an exam in her 10th grade art history class and was discouraged about how badly her class did. The scores were out of 100. The mean score was 50, the minimum score was 25 (deviation score = -25), while the maximum score was 70 (deviation score = 20). The best estimate of the population standard deviation for her whole class was: a. 50 b. 25 c. 10 d. 0.5 Let’s try one Correct Answer

49 Let’s try one
Which of the following is not one of the three propositions of the Central Limit Theorem?
a. If sample size (n) is large enough, the mean of the sampling distribution will approach the mean of the population
b. If sample size (n) is large enough, the sampling distribution of means will approach normality
c. If sample size (n) is large enough, the standard deviation of the sampling distribution equals the standard deviation of the population plus the square root of the sample size
d. As n increases SEM decreases

51 Let’s try one
Fred was examining his company’s historical real estate data in an attempt to predict what percentage of households built pools after buying a property without one. He counted all of the households that sold in the past couple of years, and found that 60% of households actually built pools after buying houses without one. This is an example of:
a. classic probability
b. subjective probability
c. empirical probability
d. conditional probability

52 What is probability
1. Empirical probability: relative frequency approach
Probability = (number of observed outcomes) / (number of observations)
Probability of hitting the corvette = (number of carts that hit corvette) / (number of carts rolled) = 182 / 200 = .91, a 91% chance of hitting a corvette

53 2. Classic probability: a priori probabilities based on logic rather than on data or experience. We assume we know the entire sample space as a collection of equally likely outcomes (deductive rather than inductive).
Probability = (number of outcomes of specific event) / (number of all possible events)
In throwing a die, what is the probability of getting a “2”? (number of sides with a 2) / (number of sides) = 1/6, about a 17% chance of getting a two
In tossing a coin, what is the probability of getting a tail? (number of sides with a tail) / (number of sides) = 1/2, a 50% chance of getting a tail

54 3. Subjective probability: based on someone’s personal judgment (often an expert), and often used when empirical and classic approaches are not available. There is a 50% chance that AT&T will merge with Cingular Bob says he is 90% sure he could swim across the river
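The first two kinds of probability are simple ratios, which the sketch below works through with the numbers from the slides (182 of 200 carts, a six-sided die, a coin, one ace of spades in 52 cards). The function names are illustrative assumptions; subjective probability has no formula to compute, so it is left out.

```python
from fractions import Fraction


def empirical_probability(observed_outcomes, observations):
    """Relative frequency: how often the outcome was actually observed."""
    return Fraction(observed_outcomes, observations)


def classic_probability(favorable_outcomes, possible_outcomes):
    """A priori probability, assuming all outcomes are equally likely."""
    return Fraction(favorable_outcomes, possible_outcomes)


carts = empirical_probability(182, 200)  # carts that hit the corvette
die = classic_probability(1, 6)          # rolling a "2"
coin = classic_probability(1, 2)         # tossing a tail
card = classic_probability(1, 52)        # drawing the ace of spades

for label, p in [("carts", carts), ("die", die), ("coin", coin), ("card", card)]:
    print(f"{label}: {p} = {float(p):.5f}")
```

The arithmetic is identical in both cases; what differs is where the counts come from: observed data for empirical probability, logical enumeration of the sample space for classic probability.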

55 There are 60 students in Kaisa’s physics class. The grades on the first midterm followed a normal distribution. What percent of the students were not within two standard deviations of the mean? a. 2.5% b. 5% c. 50% d. 95% Let’s try one Correct Answer

56 Louisa is excited that she scored 2.5 standard deviations above the mean on her chemistry test. The average was 50, and the standard deviation was 10. What grade did she receive? a. 75 b. 85 c. 95 d. 100 Let’s try one Correct Answer

57 Let’s try one
Arthur knows that the probability of drawing the ace of spades from a standard deck of 52 is .01923, even though he has never actually tried it himself. What kind of probability is this?
a. classic probability
b. subjective probability
c. empirical probability
d. conditional probability

61 Afra was interested in whether caffeine affects time to complete a cross-word puzzle, so she randomly assigned people to two groups. One group drank caffeine and the other group did not. She then timed them to see how quickly they could complete a crossword puzzle. This is an example of a a. correlation b. t-test c. one-way ANOVA d. two-way ANOVA Let’s try one Correct Answer

62 An advertising firm wanted to know whether the size of an ad in the margin of a website affected sales. They compared 4 ad sizes (tiny, small, medium and large). They posted the ads and measured sales. This is an example of a a. correlation b. t-test c. one-way ANOVA d. two-way ANOVA Let’s try one Correct Answer

63 Stephan is researching the television-watching behavior of preschoolers. He gave them a memory test for products advertised during their favorite shows. He used these test results to create a normal distribution. This distribution had a mean of 30 and a standard deviation of 2. Find the raw score associated with the 13th percentile. (Remember to draw a picture.) a. 27.74 b. 29.26 c. 29.34 d. 32.26 Let’s try one Correct Answer

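64 13th percentile: the area between the mean and z is .5000 - .1300 = .3700
Go to the table with .3700: nearest z = -1.13 (negative because the 13th percentile is below the mean)
x = mean + z σ = 30 + (-1.13)(2) = 27.74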
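The table lookup followed by x = mean + z σ can be cross-checked in code. This is a minimal sketch using statistics.NormalDist.inv_cdf in place of the printed z table; the function name raw_score_at_percentile is an assumption for illustration.

```python
from statistics import NormalDist


def raw_score_at_percentile(percentile, mean, sd):
    """Convert a percentile rank (0-100) to a raw score on a normal distribution."""
    z = NormalDist().inv_cdf(percentile / 100)  # z score that cuts off this percentile
    return mean + z * sd                        # x = mean + z * sigma


print(f"13th percentile, mean 30, sd 2:  {raw_score_at_percentile(13, 30, 2):.2f}")
print(f"77th percentile, mean 50, sd 10: {raw_score_at_percentile(77, 50, 10):.2f}")
print(f"33rd percentile, mean 50, sd 10: {raw_score_at_percentile(33, 50, 10):.2f}")
```

Results may differ from the hand calculation in the second decimal place, because the table method rounds z to two decimals (for example -1.13) before multiplying.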

65 Let’s try one
Taylor is attending a conference about social networking, and the people’s ages in the room have a mean of 26 and a standard deviation of 3. What proportion of people’s ages were between 20 and 32? (Remember to draw a picture.)
a. .3413
b. .6826
c. .9544
d. .9970

66

67 Let’s try one
A distribution has a mean of 50 and a standard deviation of 10. Find the raw score associated with the 77th percentile. (Hint: it may be helpful to draw a picture)
a. .74
b. 42.6
c. 57.4
d. 57.7

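68 77th percentile: the area between the mean and z is .7700 - .5000 = .2700
Go to the table with .2700: nearest z = .74
With mean 30 and σ = 2: x = mean + z σ = 30 + (.74)(2) = 31.48
With mean 50 and σ = 10: x = mean + z σ = 50 + (.74)(10) = 57.4 (correct answer for test)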

69 Let’s try one
A distribution has a mean of 50 and a standard deviation of 10. Find the raw score associated with the 33rd percentile. (Hint: it may be helpful to draw a picture)
a. .44
b. 40.5
c. 45.6
d. 54.4

70 33rd percentile: go to the table with .1700; nearest z = -.44
x = mean + z σ = 50 + (-.44)(10) = 45.6

71 Let’s try one
A distribution has a mean of 50 and a standard deviation of 10. Find the area under the curve associated with a score of 35 and above. (Hint: it may be helpful to draw a picture)
a. .0668
b. 1.5
c. .9332
d. .4332
Notice only one is bigger than .50

72 A distribution has a mean of 50 and a standard deviation of 10. Find the area under the curve associated with a score of 35 and above. (Hint: it may be helpful to draw a picture)
z = (35 - 50) / 10 = -1.5
Go to the table with z = 1.5: the area between the mean and z is .4332
Area for 35 and above = .4332 + .5000 = .9332
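The "score and above" area can be checked directly from the normal cumulative distribution. A minimal sketch, assuming a normal distribution with mean 50 and standard deviation 10; the function name area_above is made up for illustration.

```python
from statistics import NormalDist


def area_above(score, mean, sd):
    """Proportion of a normal distribution at or above the given raw score."""
    return 1 - NormalDist(mean, sd).cdf(score)


z = (35 - 50) / 10  # z score for a raw score of 35
print(f"z = {z}")
print(f"area at 35 and above = {area_above(35, 50, 10):.4f}")
```

Since 35 is below the mean, the area above it has to include the entire upper half of the curve, which is why the result must be larger than .50.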

73

74

