+ CHAPTER 2 Descriptive Statistics SECTION 2.1 FREQUENCY DISTRIBUTIONS.


1 + CHAPTER 2 Descriptive Statistics SECTION 2.1 FREQUENCY DISTRIBUTIONS

2 + Section 2.1: Frequency Distributions and Their Graphs GOAL: explore many ways to organize and describe a data set Center, variability (or spread), and shape

3 + FREQUENCY DISTRIBUTION A table that shows classes or intervals of data entries with a count of the number of entries in each class. The frequency f of a class is the number of data entries in the class. Frequency – how often Distribution – how spread out/concentrated Example: Pg. 40

4 + Example of a Frequency Distribution
Class      Frequency, f
1 – 5      5
6 – 10     8
11 – 15    6
16 – 20    8
21 – 25    5
26 – 30    4
Lower Class Limit – least number that can belong to a class
Upper Class Limit – greatest number that can belong to a class
Class Width – the distance between lower (or upper) limits of consecutive classes
Range – difference between the maximum and minimum data entries

5 + Guidelines for Creating a Frequency Distribution 1. Determine the range of the data. 2. Determine the number of classes to use. 3. Determine the class width. 4. Find Class Limits. 5. Find the Class Midpoints. 6. Find the Class Boundaries. 7. Tally up the data in each class. 8. Get the FREQUENCY for each class.
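The guidelines above can be sketched in Python. This is only an illustration: the data list is made up, the function names are hypothetical, and the width rule (divide the range by the number of classes, then go up to the next whole number) is an assumed rounding convention for whole-number data.

```python
def class_limits(minimum, maximum, num_classes):
    """Return (lower, upper) class limits for whole-number data."""
    # Assumed convention: the width is the smallest whole number strictly
    # larger than range / num_classes, so the maximum always fits.
    width = (maximum - minimum) // num_classes + 1
    return [(minimum + i * width, minimum + (i + 1) * width - 1)
            for i in range(num_classes)]


def tally(data, limits):
    """Count how many data entries fall inside each class."""
    return [sum(1 for x in data if lower <= x <= upper)
            for lower, upper in limits]


# Hypothetical data, not taken from the textbook exercises.
data = [0, 3, 7, 8, 12, 15, 15, 20, 26, 31, 33, 39]
limits = class_limits(min(data), max(data), 5)
frequencies = tally(data, limits)

for (lower, upper), f in zip(limits, frequencies):
    print(f"{lower} to {upper}: f = {f}")
```

With a minimum of 0, a maximum of 39, and five classes, this rule gives a class width of 8.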

6 + Definitions – Additional Features of Frequency Distributions Class Midpoint – Sum of the lower and upper limits of a class divided by two (also known as class mark) Relative Frequency – portion or percentage of the data that falls in that class. Take the frequency (f) divided by the sample size (n). Cumulative Frequency – sum of the frequency for that class and all previous classes. The cumulative frequency of the last class is equal to the sample size n

7 + Class Example 1 Page 41

8 + Class Activity/HW Pg. 51 #27, #28 We’ll be using these frequency distributions again, so make sure to hold onto them. HAVE DONE FOR TOMORROW, WE NEED THEM! DO ON SEPARATE PAPER

9 + Graphs of Frequency Distributions Frequency Histogram – a bar graph that represents the frequency distribution of a data set Properties of a Frequency Histogram 1. The horizontal scale is quantitative and measures the data values 2. The vertical scale measures the frequencies of the classes 3. Consecutive bars MUST touch

10 + Other Types of Graphs FREQUENCY POLYGON A line graph that emphasizes the continuous change in frequencies RELATIVE FREQUENCY HISTOGRAM Has the same shape/horizontal scale as frequency histogram Vertical scale measures RELATIVE frequencies CUMULATIVE FREQUENCY GRAPH (OGIVE) Line graph that displays the cumulative frequency of each class at its upper class boundary

11 + #27: Newspaper Reading Times (min)
Class      Frequency   Midpoint   Relative f   Cumulative f
0 – 7      8           3.5        0.32         8
8 – 15     8           11.5       0.32         16
16 – 23    3           19.5       0.12         19
24 – 31    3           27.5       0.12         22
32 – 39    3           35.5       0.12         25
n = 25
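As a quick check, the midpoint, relative frequency, and cumulative frequency columns for #27 can be recomputed from the class limits and frequencies alone. The sketch below uses the limits and frequencies from the slide; the variable names are illustrative.

```python
# Class limits and frequencies as given on the #27 slide.
limits = [(0, 7), (8, 15), (16, 23), (24, 31), (32, 39)]
frequencies = [8, 8, 3, 3, 3]

n = sum(frequencies)  # sample size

# Midpoint: (lower limit + upper limit) / 2
midpoints = [(lower + upper) / 2 for lower, upper in limits]

# Relative frequency: f / n
relative = [f / n for f in frequencies]

# Cumulative frequency: running total of the frequencies
cumulative = []
running = 0
for f in frequencies:
    running += f
    cumulative.append(running)

for row in zip(limits, frequencies, midpoints, relative, cumulative):
    print(row)
```

The last cumulative frequency equals n, which is a useful sanity check on any hand-built table.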

12 + Class Activity/HW Using Frequency Distribution you created for #28 from page 51 complete the following: ON GRAPH PAPER: 1. Frequency Histogram 2. Frequency Polygon 3. Relative Frequency Histogram 4. Ogive **MAKE SURE TO LABEL GRAPHS AND WRITE NEATLY! (TURN IN WITH FREQUENCY DISTRIBUTION FOR WRITTEN FEEDBACK) DUE TOMORROW!!!!

13 + #28 Book Spending Per Semester ($)
Class        Frequency   Midpoint   Relative f   Cumulative f
30 – 113     5           71.5       0.1724       5
114 – 197    7           155.5      0.2414       12
198 – 281    8           239.5      0.2759       20
282 – 365    2           323.5      0.0690       22
366 – 449    3           407.5      0.1034       25
450 – 533    4           491.5      0.1379       29
n = 29

14 + Pirate Baseball Activity: Due Given: Pittsburgh Pirates Home Run Data 1961 – 2009 Using this data, create the following: USING EIGHT CLASSES 1. Frequency Distribution (including ALL parts and rel./cum. freq) 2. Frequency Histogram 3. Frequency Polygon 4. Relative Frequency Histogram 5. Ogive Must include: Title, Axis Labels, equal class widths Evidence of ALL calculations (class widths, boundaries, midpoints) Straight lines Neatness Straight Edge Graph Paper Then, using your phone or an iPad look up homerun data for 2010, 2011, 2012, 2013, 2014, and 2015. Create a NEW Frequency Distribution Two New Charts Explain how this new data has changed the distribution (one paragraph) THIS WILL BE GRADED. Due: Only given TODAY and TOMORROW to work in class.

15 + Section 2.2: More Graphs and Displays

16 + Stem and Leaf Plot Display for quantitative data Give the feel of a histogram while retaining data values Easy way to sort data Stem – the entry’s leftmost digits Leaf – the entry’s rightmost digits Example 1 and 2 on Pages 55 – 56 Ordered/Unordered MUST ALWAYS INCLUDE A KEY!

17 + Dot Plot Each data entry is plotted, using a point, above a horizontal axis Can see how data is distributed, see specific data entries, and identify unusual data values Example 3 Pg. 57

18 + Graphing Qualitative Data Sets: Pie Charts A circle that is divided into sectors that represent categories Area of each sector is proportional to the category’s frequency KEY: To find central angle: MULTIPLY RELATIVE FREQUENCY BY 360°

19 + Pareto Chart A vertical bar graph where the height represents frequency or relative frequency BARS ARE POSITIONED IN ORDER OF HIGHEST TO LOWEST REMEMBER: Qualitative Data Example 5 Page 59

20 + Graphing Paired Data Sets: Scatter Plot Paired Data Sets: one data set corresponds to one entry in a second data set Scatter Plot: ordered pairs are graphed as points in a coordinate plane Use to SHOW THE RELATIONSHIP BETWEEN TWO QUANTITATIVE VARIABLES Example 6 Page 60

21 + Time Series Chart Used to graph a time series Time series – data set composed of quantitative entries taken at regular intervals over a period of time Example 7 Page 61 Scatter Plot: No Line Time Series Chart: Connected data points

22 + GRADED ASSIGNMENT: Individually, complete the following graphs from pages 64 – 65. #18, #20, #22, #24, #25, #29, #30 Must be handed in by the beginning of class on ________ (only ______ to work in class) Will be graded for correctness and neatness Use graph paper, ruler, protractor, and compass!

23 + Section 2.3- Measures of Central Tendency

24 + Measures of Central Tendency MEAN, MEDIAN, MODE Value that represents TYPICAL, or CENTRAL entry of the data

25 + Mean Population Mean μ = Σ x / N Sample Mean x̄ = Σ x / n N = number of entries in a population n = number of entries in a sample

26 + Example 1 Pg. 67 The prices (in dollars) for a sample of roundtrip flights from Chicago, Illinois to Cancun, Mexico are listed. What is the mean price of the flights? 872 432 397 427 388 782 397 WHEN CALCULATING GO ONE DECIMAL FURTHER THAN ORIGINAL DATA
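A short Python check of this example, assuming the seven prices are read as 872, 432, 397, 427, 388, 782, and 397:

```python
# Flight prices from the example, in dollars (a sample, so we use x-bar).
prices = [872, 432, 397, 427, 388, 782, 397]

n = len(prices)
sample_mean = sum(prices) / n

# The data are whole dollars, so report one decimal place further.
print(f"n = {n}, sum = {sum(prices)}, mean = {sample_mean:.1f}")
```

The sum is 3695, so the sample mean is 3695 / 7, or about 527.9 dollars.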

27 + Median Value that lies in the middle of the data when the data is ORDERED If data set has an even number of entries, the median is the mean of the two middle data entries Median divides a data set into TWO equal parts EX: 4 5 6 8 10 14
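The median rule above, including the even-count case, can be sketched as follows. The function name is illustrative.

```python
def median(values):
    """Middle entry of the ordered data; mean of the two middle entries if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


example = [4, 5, 6, 8, 10, 14]  # the EX data from the slide
print(median(example))
```

With six entries, the two middle values are 6 and 8, so the median is 7.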

28 + Mode Most frequently occurring data point If ALL occur only ONCE, then there is NO MODE If two data entries occur the same number of times, then BOTH are modes and we have a BIMODAL DISTRIBUTION If more than two modes, we have a MULTIMODAL DISTRIBUTION
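One way to express these mode rules in Python is sketched below. Returning an empty list for "no mode" is a design choice made for this illustration, not something the slide specifies.

```python
from collections import Counter


def modes(values):
    """Return every most-frequent entry, or [] when all entries occur once."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every entry occurs only once: no mode
    return sorted(x for x, c in counts.items() if c == highest)


print(modes([1, 2, 3]))         # no mode
print(modes([1, 1, 2, 2, 3]))   # two modes: bimodal
print(modes([4, 4, 4, 5, 6]))   # one mode
```

Python's standard `statistics.multimode` behaves differently in the first case: it returns all entries when each occurs once, so the custom function is used here to match the slide's "NO MODE" rule.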

29 + Note on Mode Mode is the only measure of central tendency that MUST be an actual data point.

30 + Outlier Data point that is far away from all of the other data points

31 + Assignment: Part 1 Section 2.3 Pg. 75 – 78 #18 - #34 even Finding mean, median, and mode. Label any outliers. Use correct notation for mean. (population mean vs. sample mean)

32 + Today’s Question: How can we describe the “middle” of unequal data? You have $200 for 17 days, $300 for 5 days, and $150 for 9 days out of a month. What was your average amount of money for the month?

33 + Weighted Mean A mean where each data point is not “worth” the same amount. Entries have varying “weights”. x̄ = Σ (x * w) / Σ w **Where w is the weight of each entry

34 + Example: Weighted Mean Vs. Regular Mean Tests are worth 50% of overall grade, quizzes 30% and homework 20%. You get 100 in HW, 90 on a quiz, and 80 on a test. Calculate regular and weighted mean. Why is one lower than the other?

35 + Example: Weighted Mean Vs. Regular Mean You have $200 for 17 days, $300 for 5 days, and $150 for 9 days out of a month. Calculate regular and weighted mean. Why is one lower than the other?
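Both weighted-mean examples can be worked with one small helper. This is a sketch; the function and variable names are illustrative, and the grade weights are entered as whole-number percentages.

```python
def weighted_mean(values, weights):
    """Sum of (x * w) divided by the sum of the weights."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)


# Grades: test 80 (50%), quiz 90 (30%), homework 100 (20%).
scores = [80, 90, 100]
grade_weights = [50, 30, 20]
grade_regular = sum(scores) / len(scores)
grade_weighted = weighted_mean(scores, grade_weights)

# Money: $200 for 17 days, $300 for 5 days, $150 for 9 days.
amounts = [200, 300, 150]
days = [17, 5, 9]
money_regular = sum(amounts) / len(amounts)
money_weighted = weighted_mean(amounts, days)

print(f"grades: regular {grade_regular:.1f}, weighted {grade_weighted:.1f}")
print(f"money:  regular {money_regular:.2f}, weighted {money_weighted:.2f}")
```

In both cases the weighted mean is lower because the largest weight sits on a below-average value: the test score of 80 carries half the grade, and $300 was held for only 5 of the 31 days.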

36 + Mean of a Frequency Distribution x̄ = Σ (x * f) / n Where n = Σ f, x is the class midpoint, and f is the frequency of each class

37 + Guidelines: Finding the Mean of a Frequency Distribution (Pg. 72) 1. Find the midpoint of each class. 2. Find the sum of the products of the midpoints and the frequencies. Σ (x * f) 3. Find the sum of the frequencies. n = Σ f 4. Find the mean of the frequency distribution. x̄ = Σ (x * f) / n
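To illustrate the steps, the sketch below applies them to the #27 newspaper reading times table from earlier in these slides (midpoints 3.5 through 35.5, frequencies 8, 8, 3, 3, 3). Using that table here is a choice made for this example, not the textbook's worked example.

```python
# Class midpoints and frequencies from the #27 table.
midpoints = [3.5, 11.5, 19.5, 27.5, 35.5]
frequencies = [8, 8, 3, 3, 3]

# Sum of (midpoint * frequency)
sum_xf = sum(x * f for x, f in zip(midpoints, frequencies))

# n is the sum of the frequencies
n = sum(frequencies)

# Mean of the frequency distribution
mean = sum_xf / n

print(f"sum of x*f = {sum_xf}, n = {n}, mean = {mean:.1f} minutes")
```

This is an approximation of the true mean, since every entry in a class is treated as if it sat at the class midpoint.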

38 + The Shape of Distributions (Pg. 73) Symmetric – can be folded in the middle Uniform – Rectangular, equal frequencies Multimodal – More than one peak Skewed – a “long tail” on one side Direction of the skew is the side the tail is on. Left skewed means the tail is on the left side Right skewed means the tail is on the right side

39 + EXAMPLES: Page 73 Mean describes data best when data is symmetric. Median describes data best when data is skewed or contains outliers. Mode describes data best when data is nominal level of measurement.

40 + Assignment: Part 2 Section 2.3 Pg. 77 – 78 #41-#44, #46 - #48, #52- #54 THIS IS A LENGTHY ASSIGNMENT, GET STARTED ON IT!!!

41 + Section 2.4: Measures of Variation

42 + Find the mean, median, and mode. SET A: 37, 38, 39, 41, 41,41, 42, 44, 45, 47 SET B: 23, 29, 32, 40, 41, 41, 48, 50, 52, 59

43 +

44 + Measures of Variation: Range, Deviation, Variance, Standard Deviation Range = (Maximum Data Entry) – (Minimum Data Entry) Range only uses two pieces of data Variance and Standard Deviation use ALL entries of a data set

45 + Deviation Deviation of an entry x in a POPULATION data set is the difference between the entry and the mean μ of the data set. Deviation of x = x – μ (POPULATION) Deviation of x = x – x̄ (SAMPLE) DISTANCE FROM MEAN!

46 + Calculate Deviations of Company A 37, 38, 39, 41, 41,41, 42, 44, 45, 47 Find the sum of the deviations.

47 + POPULATION VARIANCE For POPULATION DATA σ² = Σ (x – μ)² / N σ is the lowercase Greek letter Sigma

48 + Population Standard Deviation Square Root of Variance (only σ ) Average distance away from the mean Larger standard deviation means more spread out data.
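The deviation, population variance, and population standard deviation formulas can be checked on Set A. Treating Set A as a full population is an assumption made for this sketch.

```python
import math

# Set A from the slides, treated here as a population (assumption).
set_a = [37, 38, 39, 41, 41, 41, 42, 44, 45, 47]

N = len(set_a)
mu = sum(set_a) / N

# Deviation of each entry: x - mu. These always sum to zero.
deviations = [x - mu for x in set_a]
sum_of_deviations = sum(deviations)

# Population variance: sum of squared deviations divided by N.
population_variance = sum(d ** 2 for d in deviations) / N

# Population standard deviation: square root of the variance.
population_sd = math.sqrt(population_variance)

print(f"mu = {mu}, sum of deviations = {sum_of_deviations}")
print(f"variance = {population_variance:.2f}, sd = {population_sd:.1f}")
```

Because the deviations cancel out to zero, they are squared before averaging; that is why variance, not the plain sum of deviations, measures spread.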

49 + Sample Variance and Sample Standard Deviation. When using sample data use x̄ not μ Divide by n – 1 instead of N s² = Σ (x – x̄)² / (n – 1)

50 + Calculate sample variance and standard deviation for Company B. SET A: 37, 38, 39, 41, 41, 41, 42, 44, 45, 47 SET B: 23, 29, 32, 40, 41, 41, 48, 50, 52, 59
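For checking hand calculations, Python's standard `statistics` module provides `variance` and `stdev`, both of which divide by n – 1. The sketch below applies them to Set B and compares against the formula written out by hand.

```python
import statistics

# Set B from the slides, treated as a sample.
set_b = [23, 29, 32, 40, 41, 41, 48, 50, 52, 59]

n = len(set_b)
x_bar = sum(set_b) / n

# By hand: sum of squared deviations divided by n - 1.
sum_sq = sum((x - x_bar) ** 2 for x in set_b)
sample_variance = sum_sq / (n - 1)

# Library versions (both use the n - 1 divisor).
lib_variance = statistics.variance(set_b)
lib_sd = statistics.stdev(set_b)

print(f"x-bar = {x_bar}, sum of squares = {sum_sq}")
print(f"s^2 = {sample_variance:.1f}, s = {lib_sd:.1f}")
```

Set B has the same mean as Set A (41.5) but a much larger standard deviation, which reflects how much more spread out its entries are.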

51 +

52 + Assignment: Part 1 Section 2.4 Pg. 92 – 94 #1, 3, 13, 14, 19, 20

53 + How can we use standard deviation to make decisions about data? Standard deviation and variance tell us how spread out the data is

54 + Empirical Rule (68-95-99.7 Rule) In a BELL – SHAPED distribution, 1. ~68% of data is within 1 Standard Deviation of mean 2. ~95% of data is within 2 Standard Deviations of mean 3. ~99.7% of data is within 3 Standard Deviations of mean

55 +

56 + Example: If 65 men’s heights have a bell shaped distribution with mean of 68 in and standard deviation of 2.5 inches, what percent of the men are between 68 and 73 inches tall? How many men is that?
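One way to reason through this example in Python is sketched below. It assumes the distribution is symmetric about the mean, so half of the "within k standard deviations" percentage lies above the mean.

```python
mean = 68.0   # inches
sd = 2.5      # inches
n = 65        # number of men

# Empirical Rule percentages for a bell-shaped distribution.
within = {1: 0.68, 2: 0.95, 3: 0.997}

# How many standard deviations above the mean is 73 inches?
k = (73 - mean) / sd

# The interval from the mean up to mean + k*sd holds half of the
# "within k standard deviations" share (symmetry assumption).
fraction = within[int(k)] / 2
count = fraction * n

print(f"k = {k}, fraction = {fraction:.3f}, count = {count:.1f}")
```

Since 73 is exactly 2 standard deviations above 68, about 47.5% of the heights fall in that interval, which is roughly 31 of the 65 men.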

57 + Chebychev’s Theorem In ANY distribution, the percent of data within k standard deviations of the mean (k > 1) is AT LEAST 1 – (1/k^2) For k = 2: For k = 3:

58 + Example: A sample of 40 runners in a 1 mile race gave a mean of 7 minutes with a standard deviation of 1.25 minutes. What can we say about how many people ran a mile in between 4.5 and 9.5 minutes?
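Chebychev's bound is easy to evaluate directly. The sketch below computes it for k = 2 and k = 3, then applies it to the runners example; the function name is illustrative.

```python
def chebychev_lower_bound(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebychev's Theorem requires k > 1")
    return 1 - 1 / k ** 2


bound_2 = chebychev_lower_bound(2)
bound_3 = chebychev_lower_bound(3)

# Runners: mean 7 minutes, standard deviation 1.25 minutes, 40 runners.
mean, sd, runners = 7.0, 1.25, 40
k = (9.5 - mean) / sd  # 4.5 and 9.5 are both k standard deviations away
at_least = chebychev_lower_bound(k) * runners

print(f"k = 2: at least {bound_2:.0%}")
print(f"k = 3: at least {bound_3:.1%}")
print(f"runners between 4.5 and 9.5 minutes: at least {at_least:.0f}")
```

Unlike the Empirical Rule, this bound makes no assumption about shape, so it only guarantees a minimum: at least 30 of the 40 runners finished between 4.5 and 9.5 minutes.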

59 + Assignment: Part 2 Section 2.4 Pg. 95 – 97 #29 - #36 ONLY PART A Pg. 88 has nice picture of Empirical Rule and Bell-Shaped Distributions

60 + Section 2.5: Measures of Position

61 + Fractiles Numbers that partition, or divide, an ORDERED data set into equal parts Example: Median – Fractile that divides data set into two equal parts

62 + Quartiles Three Quartiles: Q1, Q2, and Q3 Divide an ordered data set into four equal parts Q1 – First Quartile – one quarter of data fall on or below Q1 Q2 – Second Quartile – half of the data fall on or below Q2 Q2 is MEDIAN of the data set Q3 – Third Quartile – ¾ of the data fall on or below Q3

63 + Interquartile Range Difference between the third and first quartiles IQR = Q3 – Q1

64 + Box-and-Whisker Plot Five Number Summary: Minimum, Q1, Median, Q3, Maximum Data: 5, 7, 9, 10, 11, 13, 14, 15, 16, 17, 18, 18, 20, 21, 37 What conclusion can we draw from the graph?
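The five-number summary for this data set can be computed as sketched below. The quartile convention used (Q1 and Q3 are the medians of the lower and upper halves, leaving out the overall median when n is odd) is an assumption; other software may use slightly different rules.

```python
def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Skip the middle entry itself when n is odd (assumed convention).
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])


data = [5, 7, 9, 10, 11, 13, 14, 15, 16, 17, 18, 18, 20, 21, 37]
summary = five_number_summary(data)
iqr = summary[3] - summary[1]  # IQR = Q3 - Q1

print(summary, "IQR =", iqr)
```

The right whisker (from Q3 = 18 out to 37) is much longer than the left one (from 5 to Q1 = 10), which is the kind of feature the box-and-whisker plot makes visible.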

65 +

66 + Assignment: Part 1 Section 2.5 Pg. 110 – 111 #17 - #20, #23, #26, #27, #28

67 + The Standard Score or Z-Score Measures a data value’s position in the data set The STANDARD SCORE or Z-SCORE represents the number of standard deviations a given value x falls from the mean μ. To find the z-score for a given value, use the following formula: z = (Value – Mean) / (Standard Deviation) = (x – μ) / σ

68 + Z-Score Can be POSITIVE, NEGATIVE, or ZERO If z is NEGATIVE, then the corresponding x value is BELOW the mean. If z is POSITIVE, then the corresponding x value is ABOVE the mean. If z is ZERO, then the corresponding x value is the MEAN.

69 + Z-Score Example Mean speed of vehicles is 56 MPH. Standard Deviation of 4 MPH. Car 1: 62 MPH Car 2: 47 MPH Car 3: 56 MPH Calculate the z-score for Cars 1, 2, and 3. Interpret this information.
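The three z-scores in this example can be computed with a one-line function. The sketch below also labels each car as above, below, or at the mean.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


mean_speed = 56  # MPH
sd_speed = 4     # MPH
cars = {"Car 1": 62, "Car 2": 47, "Car 3": 56}

z_scores = {name: z_score(speed, mean_speed, sd_speed)
            for name, speed in cars.items()}

for name, z in z_scores.items():
    position = "above" if z > 0 else "below" if z < 0 else "at"
    print(f"{name}: z = {z:+.2f} ({position} the mean)")
```

Car 1 is 1.5 standard deviations above the mean, Car 2 is 2.25 standard deviations below it, and Car 3 is exactly at the mean.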

70 +

71 + Z-Scores PLUS the Empirical Rule Empirical Rule: ~95% of data lies within 2 Standard Deviations of the mean Z-Score: ~95% of data has a z-score between -2 and 2; these are usual scores. A z-score less than -2 or greater than 2 we would consider unusual. A z-score less than -3 or greater than 3 we would consider VERY unusual. REMEMBER – BELL-Shaped for Empirical Rule

72 + Assignment: Part 2 Section 2.5 Pg. 111 - 112 #29 - #34

73 + Section 2.3 Part 1 (Mean, Median, Mode)
18. 6.2, 6, 5
20. 200.4, 186, none
22. 61.2, 55, 80 and 125
24. NP, NP, worse
26. NP, NP, domestic
28. 16.6, 15, none
30. 314.1, 374, none
32. 2.49, 2.35, 4.0
34. 213.4, 214, 217
Section 2.3 Part 2
41. 89
42. 36320
43. 612.73
44. 982.19
46. 84
47. 65
48. 69.7
52. Skewed Right
53. Symmetric
54. Uniform
Section 2.4 Part 1
1. R = 8, M = 7.9, V = 6.1, SD = 2.5
3. R = 12, M = 11.9, V = 17.1, SD = 4.1
19. LA: R = 17.6, V = 37.5, SD = 6.11 LB: R = 8.7, V = 8.71, SD = 2.95
20. Dallas: R = 18.1, V = 37.33, SD = 6.11 Houston: R = 13, V = 12.26, SD = 3.5
Section 2.4 Part 2
29. 68%
30. Between 1500 and 3300
31. a. 51, b. 17
32. a. 38, b. 19
33. 1000, 2000
34. 3325, 1490
35. 24
36. Sentences involving 54.97 and 59.17
Section 2.5 Part 1
17. None
18. SR
19. SL
20. S
23. Q1 = 2, Q2 = 4, Q3 = 5
26. Q1 = 15.125, Q2 = 15.8, Q3 = 17.65
27. a. 5, b. 50%, c. 25%
28. a. 17.65, b. 50%, c. 50%
Section 2.5 Part 2
31. Stats: 1.43, Bio: 0.77. Did better on Stats
32. Stats: -0.43, Bio: -0.77, Did better on Stats
33. Stats: 2.14, Bio: 1.54, Did better on Stats
34. Both 0, Both performed equally.

