Unit 1: Mr. Lang’s AP Statistics PowerPoint


1 Unit 1 Mr. Lang’s AP Statistics PowerPoint

2 Homework Assignment
• For the A: 1, 3, 5, 7, 8, 11–25 odd, 27–32, 37–59 odd, 60, 69–74, 79–105 odd (except 85, 99, 101), 107–110, R1–R10
• For the C: 1, 3, 5, 8, 11–25 odd, 37–59 odd, 79–103 odd (except 85, 99, 101), R1–R10
• For the D-: 1, 3, 5, 11, 15, 19, 23, 37, 41, 45, 49, 79, 83, 87, 91, 97, 103, R1–R10
All problems must be complete, including explanations in complete sentences and/or work shown where the question asks for it. All multiple-choice problems will be graded for correctness.

3 Statistics
• the science of collecting, analyzing, and drawing conclusions from data

4 Descriptive statistics
• the methods of organizing & summarizing data

5 Inferential statistics
• involves making generalizations from a sample to a population

6 Population
• the entire collection of individuals or objects about which information is desired

7 Sample
• a subset of the population, selected for study in some prescribed manner

8 Variable
• any characteristic whose value may change from one individual to another

9 Data
• observations on a single variable or simultaneously on two or more variables

10 Types of variables

11 Categorical variables
• or qualitative
• identifies basic differentiating characteristics of the population

12 Numerical variables
• or quantitative
• observations or measurements take on numerical values
• makes sense to average these values
• two types: discrete & continuous

13 Discrete (numerical)
• listable set of values
• usually counts of items

14 Continuous (numerical)
• data can take on any values in the domain of the variable
• usually measurements of something

15 Classification by the number of variables
• Univariate: data that describes a single characteristic of the population
• Bivariate: data that describes two characteristics of the population
• Multivariate: data that describes more than two characteristics (beyond the scope of this course)

16 Identify the following variables as numerical or categorical:
1. the income of adults in your city
2. the color of M&M candies selected at random from a bag
3. the number of speeding tickets each student in AP Statistics has received
4. the area code of an individual
5. the birth weights of female babies born at a large hospital over the course of a year

17 Self Check #1

18 Assignment #1

19 Graphs for categorical data

20 Bar Graph
• Used for categorical data
• Bars do not touch
• Categorical variable is typically on the horizontal axis
• To describe: comment on which occurred the most often or least often
• May make a double bar graph or segmented bar graph for bivariate categorical data sets

21 Using class survey data: graph birth month graph gender & handedness

22 Pie (Circle) graph
• Used for categorical data
• To make:
– Multiply each category's proportion by 360° to get its angle
– Using a protractor, mark off each part
• To describe: comment on which occurred the most often or least often

23 Graphs for numerical data

24 Dotplot
• Used with numerical data (either discrete or continuous)
• Made by putting dots (or X’s) on a number line
• Can make comparative dotplots by using the same axis for multiple groups

25 Stemplots (stem & leaf plots)
• Used with univariate, numerical data
• Must have a key so that we know how to read the numbers
• Can split stems when you have a long list of leaves
• Can have a comparative stemplot with two groups
Would a stemplot be a good graph for the number of pieces of gum chewed per day by AP Stat students? Why or why not?
Would a stemplot be a good graph for the number of pairs of shoes owned by AP Stat students? Why or why not?

26 Example: The following data are price per ounce for various brands of dandruff shampoo at a local grocery store.
0.32 0.21 0.29 0.54 0.17 0.28 0.36 0.23
Can you make a stemplot with this data?
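One way to answer the shampoo question is to treat the tenths digit as the stem and the hundredths digit as the leaf. The sketch below is only an illustration: it assumes the slide's run of digits reads as eight prices (0.32, 0.21, 0.29, 0.54, 0.17, 0.28, 0.36, 0.23), and the helper name `stemplot_lines` is made up for this example.

```python
from collections import defaultdict

def stemplot_lines(values_in_cents):
    """Build text rows of a stemplot: tens digit is the stem, ones digit is the leaf."""
    stems = defaultdict(list)
    for value in sorted(values_in_cents):
        stems[value // 10].append(value % 10)
    lines = []
    # Include empty stems so gaps in the data stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}")
    return lines

# Assumed reading of the slide's data, in dollars per ounce.
prices = [0.32, 0.21, 0.29, 0.54, 0.17, 0.28, 0.36, 0.23]
cents = [round(p * 100) for p in prices]  # work in whole cents to avoid float issues

for line in stemplot_lines(cents):
    print(line)
print("Key: 3 | 2 means $0.32 per ounce")
```

The empty stem 4 is kept on purpose: it shows the gap between the cluster of cheaper brands and the one at $0.54.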

27 Example: Tobacco use in G-rated Movies Total tobacco exposure time (in seconds) for Disney movies: 223176548371585129937 111657492623206 9 Total tobacco exposure time (in seconds) for other studios’ movies: 20516261117591155 245517 Make a comparative stemplot.

28 Graphing Activity

29 Self Check #2

30 Assignment #2

31 Histograms
• Used with numerical data
• Bars touch on histograms
• Two types
– Discrete: bars are centered over discrete values
– Continuous: bars cover a class (interval) of values
• For comparative histograms, use two separate graphs with the same scale on the horizontal axis
Would a histogram be a good graph for the fastest speed driven by AP Stat students? Why or why not?
Would a histogram be a good graph for the number of pieces of gum chewed per day by AP Stat students? Why or why not?

32 Cumulative Relative Frequency Plot (Ogive)
• ... is used to answer questions about percentiles.
• Percentiles are the percent of individuals that are at or below a certain value.
• Quartiles are located every 25% of the data. The first quartile (Q1) is the 25th percentile, while the third quartile (Q3) is the 75th percentile. What is the special name for Q2?
• Interquartile Range (IQR) is the range of the middle half (50%) of the data. IQR = Q3 – Q1
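The quartile and IQR definitions can be tried on a small data set. This is a minimal sketch, not the course's prescribed method: it assumes the "median of each half" convention for quartiles (the slides do not say which convention to use, and software packages differ), and the function name `quartiles` and the data values are made up for illustration.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3), taking Q1 and Q3 as medians of the lower and upper halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # observations below the median position
    upper = data[half + (n % 2):]  # observations above the median position
    return median(lower), median(data), median(upper)

values = [2, 3, 4, 6, 8, 12]       # illustrative data only
q1, q2, q3 = quartiles(values)
iqr = q3 - q1                      # range of the middle half of the data
print("Q1 =", q1, "median =", q2, "Q3 =", q3, "IQR =", iqr)
```

When n is odd, this convention leaves the middle observation out of both halves; other conventions include it, which is one reason calculators and software sometimes report slightly different quartiles.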

33 Ogive Activity

34 Self Check #3

35 Multiple Choice Test #1

36 Types (shapes) of Distributions

37 Symmetrical
• refers to data in which both sides are (more or less) the same when the graph is folded vertically down the middle
• bell-shaped is a special type
– has a center mound with two sloping tails

38 Uniform
• refers to data in which every class has equal or approximately equal frequency

39 Skewed (left or right)
• refers to data in which one side (tail) is longer than the other side
• the direction of skewness is on the side of the longer tail

40 Bimodal (multi-modal)
• refers to data in which two (or more) classes have the largest frequency & are separated by at least one other class

41 Distribution Activity...

42 Self Check #4

43 How to describe a numerical, univariate graph

44 What strikes you as the most distinctive difference among the distributions of exam scores in classes A, B, & C?

45 1. Center
• discuss where the middle of the data falls
• three types of central tendency
– mean, median, & mode

46 What strikes you as the most distinctive difference among the distributions of scores in classes D, E, & F?

47 2. Spread
• discuss how spread out the data is
• refers to the variability of the data
– range, standard deviation, IQR

48 What strikes you as the most distinctive difference among the distributions of exam scores in classes G, H, & I?

49 3. Shape
• refers to the overall shape of the distribution
• symmetrical, uniform, skewed, or bimodal

50 What strikes you as the most distinctive feature of the distribution of exam scores in class K?

51 4. Unusual occurrences
• outliers: values that lie away from the rest of the data
• gaps
• clusters
• anything else unusual

52 5. In context
• You must write your answer in reference to the specifics in the problem, using correct statistical vocabulary and using complete sentences!

53 Features of the Distribution Activity

54 Means & Medians

55 Parameter
• Fixed value about a population
• Typically unknown

56 Statistic
• Value calculated from a sample

57 Measures of Central Tendency
• Median: the middle of the data; 50th percentile
– Observations must be in numerical order
– Is the single middle value if n is odd
– Is the average of the middle two values if n is even
NOTE: n denotes the sample size

58 Measures of Central Tendency
• Mean: the arithmetic average
– Use μ to represent a population mean (a parameter)
– Use x̄ to represent a sample mean (a statistic)
– Formula: x̄ = Σx / n
– Σ is the capital Greek letter sigma; it means to sum the values that follow

59 Measures of Central Tendency
• Mode: the observation that occurs the most often
– Can be more than one mode
– If all values occur only once, there is no mode
– Not used as often as mean & median

60 Suppose we are interested in the number of lollipops that are bought at a certain store. A sample of 5 customers buys the following number of lollipops. Find the median. 2 3 4 8 12 The numbers are in order & n is odd – so find the middle observation. The median is 4 lollipops!

61 Suppose we have a sample of 6 customers that buy the following numbers of lollipops: 2 3 4 6 8 12. The numbers are in order & n is even, so find the middle two observations (4 and 6). Now, average these two values: (4 + 6)/2 = 5. The median is 5 lollipops!

62 Suppose we have a sample of 6 customers that buy the following numbers of lollipops: 2 3 4 6 8 12. Find the mean. To find the mean number of lollipops, add the observations and divide by n: (2 + 3 + 4 + 6 + 8 + 12)/6 = 35/6 ≈ 5.83 lollipops.
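The lollipop calculations can be checked with a few lines of code. This is just a sketch using Python's standard `statistics` module in place of the calculator the slides refer to; the variable names are made up for illustration.

```python
from statistics import mean, median

odd_sample = [2, 3, 4, 8, 12]       # n = 5, from the first lollipop example
even_sample = [2, 3, 4, 6, 8, 12]   # n = 6, from the second lollipop example

median_odd = median(odd_sample)     # middle observation
median_even = median(even_sample)   # average of the two middle observations
mean_even = mean(even_sample)       # sum of the observations divided by n

print("median (n odd): ", median_odd)
print("median (n even):", median_even)
print("mean (n even):  ", round(mean_even, 2))
```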

63 Using the calculator...

64 What would happen to the median & mean if the 12 lollipops were 20? 2 3 4 6 8 20 The median is... 5 The mean is... 7.17 What happened?

65 What would happen to the median & mean if the 20 lollipops were 50? 2 3 4 6 8 50 The median is... 5 The mean is... 12.17 What happened?


67 Resistant
• Statistics that are not affected by outliers
• Is the median resistant? YES
• Is the mean resistant? NO
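Resistance shows up clearly when the largest lollipop count is replaced by more and more extreme values, as in the preceding slides. The sketch below simply repeats that experiment; the variable names are made up for illustration.

```python
from statistics import mean, median

base = [2, 3, 4, 6, 8]              # the five smallest observations stay fixed
results = []
for largest in (12, 20, 50):        # only the largest observation changes
    sample = base + [largest]
    results.append((largest, median(sample), round(mean(sample), 2)))

for largest, med, avg in results:
    print(f"largest = {largest:2d}   median = {med}   mean = {avg}")
```

The median stays at 5 in every case while the mean climbs, which is what "the median is resistant, the mean is not" means in practice.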

68 Look at the following data set: 22 23 24 25 25 26 29 30. Find the mean. Now find how each observation deviates from the mean (each difference is a deviation from the mean). What is the sum of the deviations from the mean? 0. Will this sum always equal zero? YES
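The claim that the deviations always sum to zero can be checked numerically. This sketch assumes the slide's run of digits reads as the eight observations 22, 23, 24, 25, 25, 26, 29, 30; the variable names are made up for illustration.

```python
# Assumed reading of the slide's data set.
data = [22, 23, 24, 25, 25, 26, 29, 30]

mean_value = sum(data) / len(data)            # arithmetic average
deviations = [x - mean_value for x in data]   # each observation minus the mean
total = sum(deviations)

print("mean:", mean_value)
print("deviations:", deviations)
print("sum of deviations:", total)
```

The sum is zero because Σ(x − x̄) = Σx − n·x̄, and n·x̄ is Σx by the definition of the mean. With less convenient numbers, floating-point rounding can leave a tiny nonzero remainder.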

69 Look at the following data set. Find the mean & median.
21 23 23 24 25 25 26 26 26 27 27 27 27 28 30 30 30 31 32 32
Create a histogram with the data (use x-scale of 2). Then find the mean and median.
Mean = 27, Median = 27
Look at the placement of the mean and median in this symmetrical distribution.

70 Look at the following data set. Find the mean & median.
22 29 28 22 24 25 28 21 25 23 24 23 26 36 38 62 23
Create a histogram with the data (use x-scale of 8). Then find the mean and median.
Mean = 28.176, Median = 25
Look at the placement of the mean and median in this right skewed distribution.

71 Look at the following data set. Find the mean & median.
21 46 54 47 53 60 55 55 60 56 58 58 58 58 62 63 64
Create a histogram with the data. Then find the mean and median.
Mean = 54.588, Median = 58
Look at the placement of the mean and median in this skewed left distribution.
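The pull of the tail on the mean can be confirmed by computing both measures for the two skewed data sets. This sketch assumes the slides' runs of digits read as seventeen two-digit observations each; the variable names are made up for illustration.

```python
from statistics import mean, median

# Assumed readings of the two skewed data sets from the slides.
right_skewed = [22, 29, 28, 22, 24, 25, 28, 21, 25,
                23, 24, 23, 26, 36, 38, 62, 23]
left_skewed = [21, 46, 54, 47, 53, 60, 55, 55, 60,
               56, 58, 58, 58, 58, 62, 63, 64]

summary = {}
for name, data in (("right skewed", right_skewed), ("left skewed", left_skewed)):
    summary[name] = (round(mean(data), 3), median(data))
    print(f"{name}: mean = {summary[name][0]}, median = {summary[name][1]}")
```

In the right skewed set the mean sits above the median, and in the left skewed set it sits below: in both cases the mean moves toward the long tail.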

72 Recap:
• In a symmetrical distribution, the mean and median are equal (or nearly so).
• In a skewed distribution, the mean is pulled in the direction of the skewness.
• In a symmetrical distribution, you should report the mean!
• In a skewed distribution, the median should be reported as the measure of center!

73 Trimmed mean
To calculate a trimmed mean:
• Multiply the % to trim by n
• Truncate that many observations from BOTH ends of the distribution (when listed in order)
• Calculate the mean with the shortened data set

74 Find a 10% trimmed mean with the following data.
12 14 19 20 22 24 25 26 26 35
10%(10) = 1, so remove one observation from each side!
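The three trimmed-mean steps translate directly into a short function. This is a minimal sketch: the name `trimmed_mean` and its `trim_percent` parameter are made up for this example, and it rounds the trim count down when the percentage of n is not a whole number, which is an assumption the slides do not address.

```python
def trimmed_mean(values, trim_percent):
    """Mean after dropping trim_percent % of the observations from EACH end."""
    data = sorted(values)
    k = (trim_percent * len(data)) // 100   # observations to drop per end (rounded down)
    kept = data[k:len(data) - k]
    return sum(kept) / len(kept)

data = [12, 14, 19, 20, 22, 24, 25, 26, 26, 35]
result = trimmed_mean(data, 10)             # drops 12 and 35, averages the other eight
print("10% trimmed mean:", result)
print("ordinary mean:   ", sum(data) / len(data))
```

Here the 10% trimmed mean is 176/8 = 22, slightly below the ordinary mean of 22.3, because trimming removes the high value 35 along with the low value 12.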

75 Matching Graphs Activity

76 Mean and Median Assignment

77

78 Why use boxplots?
• ease of construction
• convenient handling of outliers
• construction is not subjective (like histograms)
• used with medium or large size data sets (n > 10)
• useful for comparative displays

79 Disadvantage of boxplots
• does not retain the individual observations
• should not be used with small data sets (n < 10)

80 How to construct
• find five-number summary: Min, Q1, Med, Q3, Max
• draw box from Q1 to Q3
• draw median as center line in the box
• extend whiskers to min & max

81 Modified boxplots
• display outliers
• fences mark off mild & extreme outliers
• whiskers extend to largest (smallest) data value inside the fence
ALWAYS use modified boxplots in this class!!!

82 Inner fence
Q1 – 1.5·IQR and Q3 + 1.5·IQR
Any observation outside this fence is an outlier! Put a dot for the outliers.
Interquartile Range (IQR) is the range (length) of the box: IQR = Q3 – Q1

83 Modified Boxplot... Draw the “whisker” from the quartiles to the observation that is within the fence!

84 Outer fence
Q1 – 3·IQR and Q3 + 3·IQR
Any observation outside this fence is an extreme outlier! Any observation between the fences is considered a mild outlier.

85 For the AP Exam... you just need to find outliers; you DO NOT need to identify them as mild or extreme. Therefore, you just need to use the 1.5·IQR fences.

86 A report from the U.S. Department of Justice gave the following percent increase in federal prison populations in 20 northeastern & mid-western states in 1999.
5.9 1.3 5.0 5.9 4.5 5.6 4.1 6.3 4.8 6.9 4.5 3.5 7.2 6.4 5.5 5.3 8.0 4.4 7.2 3.2
Create a modified boxplot. Describe the distribution. Use the calculator to create a modified boxplot.
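The numbers behind a modified boxplot for the prison data can be worked out in code before drawing it. This is a sketch under two assumptions: the slide's data reads as twenty one-decimal values, and quartiles are taken as medians of the lower and upper halves (the slides do not specify a convention). The function name `five_number_summary` is made up for illustration.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using medians of the two halves for Q1 and Q3."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = median(data[:half])
    q3 = median(data[half + (n % 2):])
    return data[0], q1, median(data), q3, data[-1]

# Assumed reading of the slide's data (percent increase, 20 states).
increases = [5.9, 1.3, 5.0, 5.9, 4.5, 5.6, 4.1, 6.3, 4.8, 6.9,
             4.5, 3.5, 7.2, 6.4, 5.5, 5.3, 8.0, 4.4, 7.2, 3.2]

minimum, q1, med, q3, maximum = five_number_summary(increases)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr        # inner fences only, as for the AP Exam
upper_fence = q3 + 1.5 * iqr

outliers = sorted(x for x in increases if x < lower_fence or x > upper_fence)
inside = [x for x in increases if lower_fence <= x <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

print("five-number summary:", minimum, round(q1, 2), round(med, 2), round(q3, 2), maximum)
print("fences:", round(lower_fence, 2), round(upper_fence, 2))
print("outliers:", outliers)
print("whiskers end at:", whisker_low, whisker_high)
```

Under these assumptions the only outlier is 1.3, so the lower whisker stops at 3.2 instead of reaching the minimum.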

87 Evidence suggests that a high indoor radon concentration might be linked to the development of childhood cancers. The data that follows is the radon concentration in two different samples of houses. The first sample consisted of houses in which a child was diagnosed with cancer. Houses in the second sample had no recorded cases of childhood cancer. (see data on note page) Create parallel boxplots. Compare the distributions.

88 [Figure: parallel boxplots of radon concentration for the Cancer and No Cancer groups; radon axis marked at 100 and 200]
The median radon concentration for the no cancer group is lower than the median for the cancer group. The range of the cancer group is larger than the range for the no cancer group. Both distributions are skewed right. The cancer group has outliers at 39, 45, 57, and 210. The no cancer group has outliers at 55 and 85.

89 Matching Box Plots, Histograms, and Summary Statistics Activity

90 Self Check #5

91 Comparative Boxplots Assignment

92

93 Why is the study of variability important?
• Allows us to distinguish between usual & unusual values
• In some situations, want more/less variability
– scores on standardized tests
– time bombs
– medicine

94 Measures of Variability
• range (max – min)
• interquartile range (Q3 – Q1)
• deviations
• variance
• standard deviation (σ, the lower case Greek letter sigma)

95 Suppose that we have these data values: 24 34 26 30 37 16 28 21 35 29. Find the mean. Find the deviations. What is the sum of the deviations from the mean?

96 24 34 26 30 37 16 28 21 35 29
Square the deviations. Find the average of the squared deviations.

97 The average of the deviations squared is called the variance.
• Population (parameter): σ² = Σ(x – μ)² / n
• Sample (statistic): s² = Σ(x – x̄)² / (n – 1)

98 Calculation of variance of a sample: s² = Σ(x – x̄)² / (n – 1), where the denominator n – 1 is the degrees of freedom (df)

99 Degrees of Freedom (df)
• n deviations contain (n – 1) independent pieces of information about variability

100 A standard deviation is a measure of the average deviation from the mean.
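The variance and standard deviation steps can be followed one at a time in code. This sketch assumes the slides' data reads as the ten values 24, 34, 26, 30, 37, 16, 28, 21, 35, 29, and that the sample variance divides by n − 1 (the degrees of freedom); the variable names are made up for illustration.

```python
from statistics import mean, stdev, variance

# Assumed reading of the slide's data set.
data = [24, 34, 26, 30, 37, 16, 28, 21, 35, 29]

x_bar = mean(data)
deviations = [x - x_bar for x in data]         # these sum to zero
squared = [d ** 2 for d in deviations]         # squaring removes the signs

population_variance = sum(squared) / len(data)       # divide by n
sample_variance = sum(squared) / (len(data) - 1)     # divide by n - 1 (df)
sample_sd = sample_variance ** 0.5                   # square root of the variance

print("mean:", x_bar)
print("sum of deviations:", sum(deviations))
print("population variance:", round(population_variance, 3))
print("sample variance:", round(sample_variance, 3))
print("sample standard deviation:", round(sample_sd, 3))

# Cross-check against the standard library, which also divides by n - 1.
print(round(variance(data), 3), round(stdev(data), 3))
```

Dividing by n − 1 instead of n gives a slightly larger value, which matters most when the sample is small.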

101 Use calculator

102 Which measure(s) of variability is/are resistant?

103 Mean and Variance Activity

104 Mean and Variance Worksheet

105 Self Check #6

106 Show me the Money Assignment

107 Multiple Choice Test #2

108 Assignment #3

109 Linear transformation rule
• When adding a constant to a random variable, the mean changes but the standard deviation does not.
• When multiplying a random variable by a constant, both the mean and the standard deviation change.

110 An appliance repair shop charges a $30 service call to go to a home for a repair. It also charges $25 per hour for labor. From past history, the average length of repairs is 1 hour 15 minutes (1.25 hours) with standard deviation of 20 minutes (1/3 hour). Including the charge for the service call, what are the mean and standard deviation for the charges for labor?
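The repair shop question is a direct use of the linear transformation rule: the total charge is 30 + 25 × (hours), so the $30 shifts the mean only, while the $25 rate scales both the mean and the standard deviation. The sketch below works it through; the variable names are made up for illustration.

```python
service_call = 30.0     # dollars, added to every bill (shifts the mean only)
hourly_rate = 25.0      # dollars per hour (scales both mean and SD)
mean_hours = 1.25       # 1 hour 15 minutes
sd_hours = 1 / 3        # 20 minutes

mean_charge = service_call + hourly_rate * mean_hours
sd_charge = abs(hourly_rate) * sd_hours   # the added constant does not affect spread

print("mean charge: $", round(mean_charge, 2))
print("SD of charge: $", round(sd_charge, 2))
```

This gives a mean charge of $61.25 and a standard deviation of about $8.33.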

111 Rules for Combining two variables
• To find the mean for the sum (or difference), add (or subtract) the two means
• To find the standard deviation of the sum (or difference), ALWAYS add the variances, then take the square root
• Formulas (the standard deviation formula applies if the variables are independent):
– μ(X ± Y) = μ(X) ± μ(Y)
– σ(X ± Y) = √(σ(X)² + σ(Y)²)

112 Bicycles arrive at a bike shop in boxes. Before they can be sold, they must be unpacked, assembled, and tuned (lubricated, adjusted, etc.). Based on past experience, the times for each setup phase are independent with the following means & standard deviations (in minutes). What are the mean and standard deviation for the total bicycle setup times?
Phase       Mean    SD
Unpacking    3.5   0.7
Assembly    21.8   2.4
Tuning      12.3   2.7
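The bicycle question applies the combining rules to three independent phases: the means add, and the variances (not the standard deviations) add. This sketch assumes the table reads as Unpacking 3.5 and 0.7, Assembly 21.8 and 2.4, Tuning 12.3 and 2.7; the variable names are made up for illustration.

```python
from math import sqrt

# (mean, standard deviation) in minutes for each phase, as read from the slide's table.
phases = {
    "Unpacking": (3.5, 0.7),
    "Assembly": (21.8, 2.4),
    "Tuning": (12.3, 2.7),
}

total_mean = sum(m for m, _ in phases.values())             # means add
total_variance = sum(sd ** 2 for _, sd in phases.values())  # variances add (independence)
total_sd = sqrt(total_variance)                             # then take the square root

print("mean total setup time:", round(total_mean, 1), "minutes")
print("SD of total setup time:", round(total_sd, 2), "minutes")
```

Simply adding the three standard deviations would give 5.8 minutes, which overstates the spread; adding the variances first gives about 3.68 minutes.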

113 Self Check #7

