Statistics Frequencies


Presentation on theme: "Statistics Frequencies"— Presentation transcript:

1 Statistics Frequencies

2 Histograms Many bar/column charts display count data The counts shown in each category are called “frequencies”

3 Histograms There are a lot of graphs that specialize in showing frequencies These are called “histograms” There are several popular types of histograms

4 Histograms Dot plot (automatic graph)

5 Histograms What month is your birthday?

6 Histograms Fancy dot plot using pictographs:

7 Histograms

8 Histograms Living Histogram

9 Histograms Stem and leaf plot Also changes the data into a bar graph For measurement data Lets you see the original data values

10 Histograms The stem is usually the leftmost digit(s); the leaf is the rightmost digit (the "ones")

11 Histograms It forms a sort of dot plot… But the data are still there!

12 Histograms More stem and leaf:

13 Histograms Comparative stem and leaf

14 FREQUENCIES IN-CLASS PROBLEM You are doing research on traffic offenses in Denver. Your first research objective is to find out if Franklin Street’s speed limit should be greater than 25 mph. You start by sampling 30 speeding tickets from this street and record the speed.

15 FREQUENCIES IN-CLASS PROBLEM What is the population? What is the variable? Is the variable Qualitative or Quantitative?

16 FREQUENCIES IN-CLASS PROBLEM Create a Stem and Leaf for the raw data:
48 92 50 29 40 129 43 108 39 42 57 104 83 45 81 123 38 67 32 65 46 80 100 98
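
One way to check a hand-drawn answer is a short script. The sketch below is only an illustration, assuming the 24 speeds listed on this slide and the slide-10 rule that the leaf is the ones digit and the stem is everything to its left. The names `stem_and_leaf` and `render` are made up for this example.

```python
from collections import defaultdict

# The 24 speeds listed on the slide
speeds = [48, 92, 50, 29, 40, 129, 43, 108, 39, 42, 57, 104,
          83, 45, 81, 123, 38, 67, 32, 65, 46, 80, 100, 98]

def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted leaves (ones digit)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

def render(plot):
    """Draw one row per stem, including stems that have no leaves."""
    rows = []
    for stem in range(min(plot), max(plot) + 1):
        leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
        rows.append(f"{stem:>2} | {leaves}".rstrip())
    return "\n".join(rows)

print(render(stem_and_leaf(speeds)))
```

Empty stems are kept as blank rows so that the row lengths still read like the bars of a histogram.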

17 Questions?

18 Frequencies Types of frequencies: Absolute frequency – the number of observations that fall in a certain category

19 Frequencies A table of absolute frequencies is called a frequency distribution

20 Frequencies
Data table: A B A B A C B B
Frequency distribution: A: 3, B: 4, C: 1
Histogram: [figure]
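
The same tally can be sketched in code. This is a minimal example using Python's standard `collections.Counter` on the eight values from this slide; the variable names are chosen only for illustration.

```python
from collections import Counter

# The eight observations from the slide's data table
data = ["A", "B", "A", "B", "A", "C", "B", "B"]

# Absolute frequency: how many observations fall in each category
frequency = Counter(data)

for category in sorted(frequency):
    print(f"{category}: {frequency[category]}")
```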

21 Questions?

22 Measurement Frequencies
So… what if your data are measurements rather than counts?

23 Measurement Frequencies
Often we change the measurements into counts These derived counts are also “frequencies”

24 Measurement Frequencies
We can change measured data to categories by splitting the continuum into named categories

25 Measurement Frequencies
Sale price (in thousand $): 8.0 – 11.0, 14.2 – 17.2, 17.3 – 20.3, 20.4 – 23.4, 23.5 – 26.5
Minutes Internet Usage: 1-10, 11-20, 21-30, 31-40, 41-50, 51-60, 60+
Years of experience: 1 - 2, 3 - 4, 5 - 6, 7 - 8, 9 - 10, 11+

26 Measurement Frequencies
The counts of observations falling in these user-manufactured categories are still called “frequencies”

27 Measurement Frequencies
A bar graph of frequencies in user-manufactured categories is still called a “histogram”

28 Measurement Frequencies
It is less confusing to viewers to keep the numerical categories the same width

29 Measurement Frequencies
Numerical categories should not overlap Numerical categories should not leave any blank spaces in the continuum

30 Measurement Frequencies
Numerical categories are also called “classes”

31 Measurement Frequencies
For numerical categories, the maximum and minimum values in each category are called the “class limits”

32 Measurement Frequencies
For numerical categories, the range of values included in each category is called the “width”

33 Measurement Frequencies
The middle of each numerical category is called the “midpoint” Add the maximum and minimum (class limits) and divide by 2
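
The class vocabulary from the last few slides can be sketched as two small helpers. This assumes whole-number classes whose limits are both included, so that the class "1-10" has a width of 10, as a later slide states. The function names are hypothetical.

```python
def class_width(lower, upper):
    # Assumption: whole-number classes with both limits included,
    # so "1-10" holds the ten values 1, 2, ..., 10
    return upper - lower + 1

def class_midpoint(lower, upper):
    # Add the class limits and divide by 2
    return (lower + upper) / 2

lower, upper = 1, 10  # class limits of the class "1-10"
print(class_width(lower, upper))     # width of the class
print(class_midpoint(lower, upper))  # midpoint of the class
```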

34 Measurement Frequencies
Rounding may move observed values into different numerical categories The actual maximum and minimum values that end up in a given numerical category are called the “class boundaries”

35 Measurement Frequencies
We want at least 5 categories This allows us to pretend the data is still “continuous” (one of those statistical things)

36 Measurement Frequencies
For psychological reasons, we usually limit the number of categories to a maximum of 8

37 Measurement Frequencies
For psychological reasons, we usually limit the number of categories to a maximum of 8 Typically the human brain can compare only 7-8 things before becoming overloaded

38 Measurement Frequencies
So you should aim for 5-8 classes with “kinda-nice” class limit values

39 FREQUENCIES IN-CLASS PROBLEM Which would be better?
First table (Minutes Internet Usage, Number of Users): 1-15, 16-30, 31-45, 46-60, 61-75, 76-90, 91-105, 121+
Second table (Minutes Internet Usage, Number of Users): 1-20, 21-40, 41-60, 61-80, 81-100, 121+

40 FREQUENCIES IN-CLASS PROBLEM Create a frequency distribution:
48 92 50 29 40 129 43 108 39 42 57 104 83 45 81 123 38 67 32 65 46 80 100 98
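
One possible way to work this problem in code is sketched below. The choice of classes is an assumption made for illustration (width 20, starting at 21, which yields six classes); other reasonable choices would give a different but equally valid table.

```python
# The 24 speeds listed on the slide
speeds = [48, 92, 50, 29, 40, 129, 43, 108, 39, 42, 57, 104,
          83, 45, 81, 123, 38, 67, 32, 65, 46, 80, 100, 98]

width = 20  # assumed class width
start = 21  # assumed lower limit of the first class

# Create every class up front so that empty classes would show a count of 0
frequency = {(lower, lower + width - 1): 0
             for lower in range(start, max(speeds) + 1, width)}

# Drop each speed into the class that contains it
for speed in speeds:
    lower = start + (speed - start) // width * width
    frequency[(lower, lower + width - 1)] += 1

for (lower, upper), count in frequency.items():
    print(f"{lower}-{upper}: {count}")
```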

41 Questions?

42 Measurement Frequencies
Open the data set on “InClass-Internet” Your assignment: create a chart for this data What could you do?

43 Measurement Frequencies
Bar chart? Yuck! Try an x-y plot!

44 Measurement Frequencies
Still yuck! Now what???

45 Measurement Frequencies
The numbers are not in any particular order – the dots don’t tell a story about the data (other than that it’s messy and disorganized…)

46 Measurement Frequencies
A better graph would group the numbers into a meaningful pattern that will answer an interesting question we might have about the data

47 Measurement Frequencies
Let’s sort the data! In Excel, first highlight just the numbers Then click on the “Data” tab

48 Measurement Frequencies
Click on “Sort” Use the “A->Z” sort to go from low to high

49 Measurement Frequencies
Poof! Little numbers on top, big numbers on the bottom Try a bar graph now…

50 Measurement Frequencies
Not as ugly… but still no story!

51 Measurement Frequencies
You need to consider what question you are trying to answer with the data

52 Measurement Frequencies
What you might want to show is: “Do people spend a lot of time on the Internet? How much?”

53 Measurement Frequencies
To show this, it would make sense to create categories from this quantitative data! (Believe it or not…)

54 Measurement Frequencies
How many minutes is “not many”? 5? 10? 15? Let’s say “10 or fewer” That becomes our first category: “1-10”

55 Measurement Frequencies
Type in: '1-10 (with a leading apostrophe), or Excel will change it into a date

56 Measurement Frequencies
Start a “Summary Table” of categories and how many observations fall into each one:
Minutes Internet Usage   Number of Users
1-10                     3

57 Measurement Frequencies
What would be the next category? It could be anything you decide… but…

58 Measurement Frequencies
It is less confusing to viewers to keep the categories the same width

59 Measurement Frequencies
In general: Categories should usually be the same width Categories should not overlap Categories should not leave any blank spaces in the continuum
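
These three rules are mechanical enough to check automatically. The sketch below assumes whole-number classes given as (lower limit, upper limit) pairs in ascending order. `check_classes` is a hypothetical helper, not part of Excel or any library.

```python
def check_classes(classes):
    """Return a list of rule violations for ascending (lower, upper) classes."""
    problems = []

    # Rule 1: categories should usually be the same width
    widths = {upper - lower + 1 for lower, upper in classes}
    if len(widths) > 1:
        problems.append("unequal widths")

    # Rules 2 and 3: neighbours must neither overlap nor leave a gap
    for (_, prev_upper), (next_lower, _) in zip(classes, classes[1:]):
        if next_lower <= prev_upper:
            problems.append(f"overlap at {next_lower}")
        elif next_lower > prev_upper + 1:
            problems.append(f"gap after {prev_upper}")
    return problems

print(check_classes([(1, 10), (11, 20), (21, 30)]))  # no violations
print(check_classes([(1, 10), (21, 30)]))            # leaves a gap
```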

60 Measurement Frequencies
Your previous category was “1-10 minutes” Its width is: 10 - 1 + 1 = 10

61 Measurement Frequencies
So, the next category should be right next to “1-10”, not overlap with “1-10” and be 10 wide: “11-20”

62 Measurement Frequencies
Continuing, our categories will be:
Minutes Internet Usage   Number of Users
1-10                     3
11-20
21-30
31-40
41-50
51-60
etc...

63 Measurement Frequencies
But… that’s still A LOT of categories! Our data goes up to 123 minutes!! The graph will be better than the original, but still cluttered!

64 Measurement Frequencies
Remember… the human brain can compare only 7-8 things before becoming overloaded

65 Measurement Frequencies
We need to redo our categories to have only about 7 or 8 of them

66 Measurement Frequencies
Our data goes from a minimum of 5 to a maximum of 123 If we had only one category, it would have a width: 123 - 5 + 1 = 119

67 Measurement Frequencies
If we split the 119 into 7 equal pieces: 119/7 = 17, which is not a very “nice” number for category splits 15 or 20 would be “evener”
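
The arithmetic behind this choice can be sketched as follows. The minimum of 5 and maximum of 123 come from the slides; starting the first class at 1 and the helper name `classes_needed` are assumptions made for illustration.

```python
import math

minimum, maximum = 5, 123

# Width of a single class covering all the data, both ends included
span = maximum - minimum + 1
print(span, span / 7)  # the span and the raw width for 7 equal classes

def classes_needed(start, maximum, width):
    # Number of equal-width classes, beginning at `start`, needed to reach `maximum`
    return math.ceil((maximum - start + 1) / width)

# Compare the two "nicer" widths, assuming the first class starts at 1
for width in (15, 20):
    print(width, classes_needed(1, maximum, width))
```

Under these assumptions a width of 20 needs 7 classes, which stays inside the 5-8 target, while a width of 15 needs 9.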

68 Measurement Frequencies
Which would be better?
First table (Minutes Internet Usage, Number of Users): 1-15, 16-30, 31-45, 46-60, 61-75, 76-90, 91-105, 121+
Second table (Minutes Internet Usage, Number of Users): 1-20, 21-40, 41-60, 61-80, 81-100, 121+

69 Measurement Frequencies
Now we have to get the number of users for each of these categories:
Minutes Internet Usage   Number of Users
1-20
21-40
41-60
61-80
81-100
121+

70 Measurement Frequencies
How many observations fall in this first category? Highlight the observations that are 20 or less

71 Measurement Frequencies
The number of observations is 9

72 Measurement Frequencies
The number of observations in that category is the frequency

73 Measurement Frequencies
Let’s graph these new categorical data:
Minutes Internet Usage   Number of Users
1-20                     9
21-40                    18
41-60                    15
61-80                    8
81-100                   1
121+

74 Measurement Frequencies
Much better! The graph now tells a story

75 Measurement Frequencies
You can also now see that the value “123” is an “outlier”

76 Measurement Frequencies
Outliers have one or more empty (zero count) categories between their category and the others
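
This rule of thumb can be sketched as a rough screening function. It is not a formal statistical test. The first five counts below are the ones shown on slide 73; the 0 for "101-120" and the 1 for the last class are assumptions, since the transcript does not show those two counts. `outlier_classes` is a hypothetical name.

```python
def outlier_classes(counts):
    """Return indexes of non-empty classes cut off from the main body by an empty class."""
    peak = counts.index(max(counts))  # treat the tallest class as the main body
    flagged = []
    for step in (1, -1):              # walk right, then left, away from the peak
        seen_gap = False
        i = peak + step
        while 0 <= i < len(counts):
            if counts[i] == 0:
                seen_gap = True       # found an empty (zero count) class
            elif seen_gap:
                flagged.append(i)     # non-empty class beyond a gap
            i += step
    return sorted(flagged)

# Counts for 1-20 ... 81-100 from slide 73; the last two values are ASSUMED
counts = [9, 18, 15, 8, 1, 0, 1]
print(outlier_classes(counts))
```

A flagged class only marks a value worth a closer look; deciding whether to keep or drop it is still a judgment call.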

77 Measurement Frequencies
Outliers can be a problem in statistical analysis You have to decide whether the value is truly an outlier and should be eliminated or a valid extension of the data

78 Measurement Frequencies
Another option:

79 Questions?

80 Frequencies A relative frequency is the fraction or percent of observations that fall in each category

81 Frequencies You first find the total sample size (n) by adding up all of the counts in each category

82 Frequencies Then divide each category count by n

83 Frequencies You can make these percentages by multiplying by 100 (or just clicking the % sign on the Excel ribbon)

84 Frequencies
Data table (n = 8): A B A B A C B B
Relative frequency distribution: A: 3/8, B: 4/8, C: 1/8
Histogram: [figure]
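
The three steps (total the counts, divide each count by n, optionally convert to percent) can be sketched in a few lines. This example uses only the Python standard library and the eight values from this slide. Note that `Fraction` reduces 4/8 to 1/2.

```python
from collections import Counter
from fractions import Fraction

data = ["A", "B", "A", "B", "A", "C", "B", "B"]

counts = Counter(data)
n = sum(counts.values())  # total sample size

# Relative frequency: each category count divided by n
relative = {category: Fraction(count, n) for category, count in counts.items()}

# The same values as percentages
percent = {category: 100 * count / n for category, count in counts.items()}

for category in sorted(counts):
    print(f"{category}: {relative[category]} = {percent[category]:.1f}%")
```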

85 Frequencies Notice the shapes of the absolute frequency and relative frequency graphs are the same

86 Frequencies Because we see % more easily in a pie chart, relative frequencies should be shown in this format

87 FREQUENCIES IN-CLASS PROBLEM Create a relative frequency distribution:
48 92 50 29 40 129 43 108 39 42 57 104 83 45 81 123 38 67 32 65 46 80 100 98

88 Questions?

89 Measurement Frequencies
Numerical categories are also called “classes”

90 Measurement Frequencies
For numerical categories, the maximum and minimum values in each category are called the “class limits”

91 Measurement Frequencies
What are the class limits for the Franklin St data?

92 Measurement Frequencies
For numerical categories, the range of values included in each category is called the “width”

93 FREQUENCIES IN-CLASS PROBLEM What is the class width for the Franklin St data?

94 Measurement Frequencies
The middle of each numerical category is called the “midpoint” Add the maximum and minimum (class limits) and divide by 2

95 FREQUENCIES IN-CLASS PROBLEM What is the midpoint for the first class in the Franklin St data?

96 Measurement Frequencies
Rounding may move observed values into different numerical categories The actual maximum and minimum values that end up in a given numerical category are called the “class boundaries”

97 FREQUENCIES IN-CLASS PROBLEM What are the class boundaries for the second category in the Franklin St data?

98 Questions?

99 FREQUENCIES IN-CLASS PROBLEM What graph? Which are frequency distributions?

100 Questions?

