Identifying key characteristics of a set of data

Presentation on theme: "Identifying key characteristics of a set of data"— Presentation transcript:

1 Identifying key characteristics of a set of data
Mr. Buckley gathered some information on his class and organized it in a table similar to the one below:

Student   Gender  ACT Score  Favorite Subject  GPA
James     M       34         Statistics        3.89
Jen       F       35         Biology           3.75
DeAnna            32         History           4.0
Jonathan          28         Literature        3.0
Doug              33         Algebra           2.89
Sharon            30         Spanish           3.25

-Who are the individuals in the data set?
-What variables were measured? Identify each as quantitative or categorical.
-Describe the distribution of ACT scores.
-Could we infer from this set that the students who prefer math & science perform better on the ACT?

3 -Who are the individuals in the data set?
The students in Mr. Buckley's class.
-What variables were measured? Identify each as quantitative or categorical.
Gender, ACT score, favorite subject, and GPA. Quantitative: ACT score, GPA. Categorical: gender, favorite subject.
-Describe the distribution of ACT scores.
The scores range from 28 to 35 and center around 32. Four scores (32 to 35) are bunched together; Jonathan's score of 28 doesn't quite fit with the rest.
-Could we infer from this set that the students who prefer math & science perform better on the ACT?
It appears that the students who prefer math & science had higher scores on the ACT; however, we cannot infer that there is a large difference between their scores and the scores of the students who do not prefer math & science.
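The answers above can be checked with a short Python sketch. The rows are copied from the class table (gender is left out because it is not shown for every student), and the helper name `is_quantitative` is only an illustrative assumption. Treating "every value is a number" as quantitative is a rough heuristic: numeric codes such as ZIP codes would fool it.

```python
from statistics import median

# (student, ACT score, favorite subject, GPA), copied from the class table.
students = [
    ("James", 34, "Statistics", 3.89),
    ("Jen", 35, "Biology", 3.75),
    ("DeAnna", 32, "History", 4.0),
    ("Jonathan", 28, "Literature", 3.0),
    ("Doug", 33, "Algebra", 2.89),
    ("Sharon", 30, "Spanish", 3.25),
]

def is_quantitative(values):
    """Heuristic (an assumption): a variable is quantitative if all values are numbers."""
    return all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values)

act_scores = [row[1] for row in students]
subjects = [row[2] for row in students]
gpas = [row[3] for row in students]

act_range = (min(act_scores), max(act_scores))  # smallest and largest score
act_median = median(act_scores)                 # middle of the sorted scores

print("ACT quantitative?", is_quantitative(act_scores))
print("Favorite subject quantitative?", is_quantitative(subjects))
print("ACT range:", act_range, "median:", act_median)
```

With only six students, these numbers describe this class; they are not evidence about students in general.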

4 Displaying Categorical Data
-Frequency table (relative frequency table): displays the counts (or percents) of individuals that take on each value of a variable.

7 Displaying Categorical Data
-Frequency table (relative frequency table): displays the counts (or percents) of individuals that take on each value of a variable.
Tables are sometimes difficult to read, and they don't always highlight important features of a distribution. Graphical displays of data are much easier to read and often reveal interesting patterns and departures from patterns in the distribution of data.
So we use pie charts and bar graphs to display the distribution.
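A frequency table and a rough bar graph can be sketched in a few lines of Python. The `responses` list is made up for illustration (it is not data from the slides), and `frequency_table` is a hypothetical helper name.

```python
from collections import Counter

def frequency_table(values):
    """Map each category to (count, percent of all individuals)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: (count, 100 * count / total) for value, count in counts.items()}

# Hypothetical survey answers, used only to show the idea.
responses = ["Math", "Science", "Math", "History", "Math",
             "Science", "Art", "History", "Math", "Science"]

table = frequency_table(responses)

# Print the table with a text bar per category, largest count first.
for value, (count, percent) in sorted(table.items(), key=lambda item: -item[1][0]):
    print(f"{value:<8} {count:>3} {percent:>5.1f}%  {'#' * count}")
```

The bars make the most common category visible at a glance, which is the point the slide makes about graphs versus tables.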

8 Displaying Quantitative Data With Graphs
-Construct & interpret a dotplot and stemplot
-Construct & interpret a histogram
-Describe the shape, center, and spread of a distribution
-Identify major departures from the pattern of a distribution
-Compare distributions of data

9 Vocabulary
dotplot, stemplot, histogram, shape, center, spread, outliers, symmetric, skewed right, skewed left, unimodal, bimodal

10 Like Categorical Data…..
We also use graphs to display quantitative data, in three different forms: dotplots, stemplots, and histograms.

11 DOT PLOT
Dot plots are less abstract than histograms or box plots and make it easy for students to recognize key features of a distribution of quantitative data.
Step 1. Draw a horizontal axis (a number line) and label it with the variable name.
[Axis label: Sodium]

12 DOT PLOT
Dot plots are less abstract than histograms or box plots and make it easy for students to recognize key features of a distribution of quantitative data.
Consumer Reports magazine rated frozen pizza in its January issue. Here are the amounts of sodium (in milligrams) in a single serving of 16 different brands of cheese pizza.
Step 2. Scale the axis. Look at the minimum and maximum, and mark the scale in equal increments.
[Axis label: Sodium]

13 DOT PLOT
Step 2. Scale the axis. Look at the minimum and maximum, and mark the scale in equal increments. Min = 570, Max = 870.
[Scaled axis: Sodium]

14 DOT PLOT
Step 3. Mark a dot above the location on the horizontal axis corresponding to each data value.
[Dot plot: Sodium]
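The three steps can be sketched as a text dot plot in Python. The sodium values below are made up for illustration; only the minimum (570) and maximum (870) come from the slides, and the 50 mg step and the `dot_plot` helper are assumptions.

```python
from collections import Counter

def dot_plot(values, step, label):
    """Return a text dot plot: one column per tick, one '*' per data value."""
    low, high = min(values), max(values)        # Step 2: look at min and max
    ticks = list(range(low, high + 1, step))    # equally spaced scale
    # Step 3: stack one dot above the tick nearest to each value.
    counts = Counter(min(ticks, key=lambda t: abs(t - v)) for v in values)
    width = len(str(high)) + 1                  # column width for each tick
    rows = []
    for level in range(max(counts.values()), 0, -1):
        cells = (("*" if counts[t] >= level else " ").rjust(width) for t in ticks)
        rows.append("".join(cells).rstrip())
    rows.append("".join(str(t).rjust(width) for t in ticks))  # Step 1: the axis
    rows.append(label)                                        # Step 1: axis label
    return "\n".join(rows)

# Hypothetical values, NOT the Consumer Reports data.
sample = [570, 620, 620, 670, 720, 720, 720, 870]
plot = dot_plot(sample, 50, "Sodium (mg)")
print(plot)
```

Each data value contributes exactly one dot, so the height of a stack is the count at that point on the axis.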

16 How to examine the distribution of a quantitative variable
Look at the overall pattern and for striking departures from the pattern.
Shape: Symmetric? Left or right skewed? Unimodal? Bimodal?

17 How to examine the distribution of a quantitative variable
Center: mean, median, mode (usually the easiest measure of center)

18 How to examine the distribution of a quantitative variable
Spread: range (max - min) or IQR

19 How to examine the distribution of a quantitative variable
Look at the overall pattern and for striking departures from the pattern.
Outliers
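The center and spread measures named above can be computed with Python's `statistics` module. This is a minimal sketch on made-up data. The 1.5 x IQR rule used to flag outliers is a common convention and an assumption here, since the slides do not state a rule, and quartile values depend on the method (this uses the module's default).

```python
from statistics import mean, median, multimode, quantiles

def summarize(values):
    """Center, spread, and possible outliers for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4)   # quartiles, default method
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr           # assumed 1.5 x IQR convention
    high_fence = q3 + 1.5 * iqr
    return {
        "mean": mean(values),
        "median": median(values),
        "modes": multimode(values),
        "range": max(values) - min(values),
        "iqr": iqr,
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }

# Hypothetical data with one value far from the rest.
sample = [12, 14, 14, 15, 16, 17, 18, 40]
summary = summarize(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Notice that the single large value pulls the mean above the median, while the median and IQR barely react to it.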

20 Frozen Pizza- Calories per serving for 16 brands of frozen cheese pizza, along with a dotplot of the data:

21 Frozen Pizza- Describe the shape, center, and spread of the distribution, and any outliers.
[Dot plot]

22 Frozen Pizza- Describe the shape, center, and spread of the distribution, and any outliers.
[Dot plot]
Solution:
Shape: 310 & 340 and a main cluster of values between 310 & 360 calories.
Center: The middle value is 330 calories (median).
Spread: The values vary from 260 calories to 380 calories.
Outliers: There is one pizza with an unusually small number of calories (260).

23 SOCS
Shape, Outliers, Center, Spread

24 SHAPE: Looking at symmetry & skewness
A distribution is roughly SYMMETRIC if the right and left sides of the graph are approximately mirror images of each other. (The mean and median are approximately equal.)

25 SHAPE: Looking at symmetry & skewness
A distribution is SKEWED TO THE RIGHT if the right side of the graph is much longer than the left side. (The mode is smallest, then the median, and the mean is closest to the tail.)

26 SHAPE: Looking at symmetry & skewness
A distribution is SKEWED TO THE LEFT if the left side of the graph is much longer than the right side. (The mean is smallest, then the median, and the mode is largest; the mean is closest to the tail.)
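The mean-versus-median comparison in these three definitions can be sketched as a rough check. The helper `describe_skew`, its `tolerance` cutoff, and the three small data sets are all illustrative assumptions; shape should still be judged from the graph, and this rule of thumb can mislead on unusual data.

```python
from statistics import mean, median

def describe_skew(values, tolerance=0.05):
    """Rule of thumb: the mean is pulled toward the long tail.

    `tolerance` is a hypothetical cutoff, as a fraction of the range,
    below which the mean and median count as approximately equal.
    """
    spread = max(values) - min(values)
    gap = mean(values) - median(values)
    if spread == 0 or abs(gap) <= tolerance * spread:
        return "roughly symmetric"
    return "skewed right" if gap > 0 else "skewed left"

# Made-up examples of each shape.
symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 1, 2, 2, 3, 10]
left_tail = [1, 8, 9, 9, 10, 10]

for name, data in [("symmetric", symmetric), ("right_tail", right_tail), ("left_tail", left_tail)]:
    print(name, "->", describe_skew(data))
```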

27 Stemplots
Stemplots give us a quick picture of the shape of a distribution while including the actual numerical values in the graph.

28 EPA-Measured MPG for 30 cars
Separate each observation into a stem (everything except the final digit) and a leaf (the final digit): 36 | 6, 32 | 7, 40 | 5, 36 | 2

29 STEM
In a vertical column, write the stems, smallest at the top. Do not skip any stems, even if there is no data for that particular stem.
31
32
33
34
35
36
37
38
39
40
41

30 ADD LEAVES
Write each leaf to the right of the stem.
31 | 8
32 | 7
33 | 6
34 | 2 5
35 | 1 8
36 |
37 |
38 | 5
39 |
40 | 3 5
41 | 0 0

31 ORDER LEAVES
Arrange leaves in increasing order out from the stem.

32 PROVIDE A KEY.
KEY: 31 | 8 represents a car that the EPA measured at 31.8 MPG, 1 out of the 30 cars tested.
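The stemplot steps (split into stem and leaf, list every stem, add the leaves, order them, give a key) can be sketched in Python. The four values are the example observations from the slides (36.6, 32.7, 40.5, 36.2); the `stemplot` helper is an illustrative assumption and expects values recorded to one decimal place.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows for values recorded to one decimal place."""
    leaves = defaultdict(list)
    for v in values:
        # Stem = everything except the final digit, leaf = the final digit.
        stem, leaf = divmod(round(v * 10), 10)   # 36.6 -> stem 36, leaf 6
        leaves[stem].append(leaf)
    rows = []
    # Write every stem from smallest to largest, even the empty ones.
    for stem in range(min(leaves), max(leaves) + 1):
        ordered = " ".join(str(leaf) for leaf in sorted(leaves[stem]))
        rows.append(f"{stem} | {ordered}".rstrip())
    return rows

mpg = [36.6, 32.7, 40.5, 36.2]   # example observations from the slides
rows = stemplot(mpg)
print("\n".join(rows))
print("KEY: 36 | 6 represents 36.6 MPG")
```

Empty stems still get a row, which keeps gaps in the data visible in the plot.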

33 KEY: 31 | 8 represents the EPA-measured MPG (31.8) for 1 out of 30 cars tested.
31 | 8
32 | 7
33 | 6
34 | 2 5
35 | 1 8
36 |
37 |
38 | 5
39 |
40 | 3 5
41 | 0 0
Distribution? Shape, Outliers, Center, Spread

34 Shape: Fairly Symmetric
Outliers: There do not appear to be any extreme values.
Center: about 37 mpg
Spread: ranges from 31.8 to 41.0 mpg

35 Constructing a Histogram: Histograms are used for larger sets of data.
Step 1) Divide the data into classes of equal width. Identify the min and max and determine the best scale.

36 Constructing a Histogram: Histograms are used for larger sets of data.
Step 1) Divide the data into classes of equal width. Identify the min and max and determine the best scale.
Min = 1.2, max = 27.2, so use the classes 0-5, 5-10, 10-15, 15-20, 20-25, 25-30, i.e., 0 to <5, 5 to <10, 10 to <15, 15 to <20, 20 to <25, 25 to <30.

37 Constructing a Histogram: Histograms are used for larger sets of data.
Step 2) Find the count (frequency) or percent (relative frequency) of individuals in each class. (Create a frequency or relative frequency table.)

Class       Count  Percent
0 to <5     20     40
5 to <10    13     26
10 to <15   9      18
15 to <20   5      10
20 to <25   2      4
25 to <30   1      2
Total       50     100
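Steps 1 and 2 can be sketched in Python. The class counts are taken from the slide's frequency table; the helper names and the short `raw` list (the slide's min 1.2 and max 27.2 plus two made-up boundary values) are assumptions for illustration.

```python
def class_counts(values, width, start, stop):
    """Count values in half-open classes: low <= value < low + width."""
    counts = {}
    for low in range(start, stop, width):
        high = low + width
        counts[(low, high)] = sum(1 for v in values if low <= v < high)
    return counts

def relative_frequencies(counts):
    """Convert class counts to percents of the total."""
    total = sum(counts.values())
    return {cls: 100 * n / total for cls, n in counts.items()}

# Counts from the slide's frequency table (50 individuals in all).
slide_counts = {(0, 5): 20, (5, 10): 13, (10, 15): 9,
                (15, 20): 5, (20, 25): 2, (25, 30): 1}
percents = relative_frequencies(slide_counts)

# A value of exactly 5.0 belongs to "5 to <10", not "0 to <5".
raw = [1.2, 4.9, 5.0, 27.2]          # 4.9 and 5.0 are hypothetical
binned = class_counts(raw, 5, 0, 30)

# Text histogram: one '#' per individual in the class.
for (low, high), n in slide_counts.items():
    print(f"{low:>2} to <{high:<2} {percents[(low, high)]:>5.1f}%  {'#' * n}")
```

Using "low to <high" classes means every value falls into exactly one class, including values that sit on a boundary.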

38 Constructing a Histogram: Histograms are used for larger sets of data.
Step 3) Label and scale the axes, and draw the histogram.
1-Choose classes that are all the same width.
2-Too few classes give a "skyscraper" effect, and too many give a "pancake" effect.

39 Constructing a Histogram: Histograms are used for larger sets of data.
Step 4) Describe the distribution shown in the histogram using SOCS: shape, outliers, center, spread.

