What is Statistics? Statistics is the science of collecting, analyzing, and drawing conclusions from data –Descriptive Statistics Organizing and summarizing.


1 What is Statistics?
Statistics is the science of collecting, analyzing, and drawing conclusions from data.
– Descriptive Statistics: organizing and summarizing data
– Inferential Statistics: generalizing from a sample to the population from which it was selected

2 Describing Data
What kind of data is there? How can it be graphed for visual comparison? How can it be described verbally? How can it be analyzed numerically?

3 Data: Types of Variables
– Categorical: group or category names with no order. Example: eye color (brown, blue, green)
– Quantitative: numerical values (in order, can be averaged, etc.). Example: weight (117 lbs, 170 lbs, 253 lbs)

4 Types of Quantitative (Numerical) Data
– Discrete: takes on only certain values. Examples: number of siblings, number of pockets in a pair of jeans, number of free throws made in a season, ...
– Continuous: takes on any of an infinite number of values. Examples: time, weight, height, ... Because of the limits of measurement accuracy, we often round to the nearest second, ounce, inch, ...

5 Describing Univariate Data
The distribution of a variable tells us what values the variable takes and how often it takes those values, and shows the pattern of variation.
Categorical
– Bar graph
– Segmented bar graph
– Pie chart
Quantitative
– Dotplot
– Stemplot (stem & leaf)
– Histogram (frequency distribution)
– Ogive: cumulative relative frequency plot
– Boxplot

6 Bar, Segmented Bar, & Pie Charts

7 What’s misleading about this graph?

8 Source: Marist Institute for Public Opinion How is this graph misleading?

9 Describing Data using Summary Features of Quantitative Variables
– Center: location in the middle of all the data
– Unusual features: outliers, gaps, clusters
– Spread: measure of variability, range
– Shape: distribution pattern (symmetric, skewed, uniform, bimodal, etc.)
Always CUSS in context!

10 Dotplot for Univariate Quantitative Data

11 1.11 Stemplot Answer
a) 1 | 9 represents $19 spent at the store.
b) Stemplots: stems and leaves in order; leave a stem blank if it has no leaf; split stems if there are too few stems.

Stemplot:
0 | 399
1 | 1345677889
2 | 000123455668888
3 | 25699
4 | 1345579
5 | 0359
6 | 1
7 | 0
8 | 366
9 | 3

Split-stem plot:
0 | 3
0 | 99
1 | 134
1 | 5677889
2 | 0001234
2 | 55668888
3 | 2
3 | 5699
4 | 134
4 | 5579
5 | 03
5 | 59
6 | 1
6 |
7 | 0
7 |
8 | 3
8 | 66
9 | 3

c) The distribution is skewed to the right. The spread is approximately 90 (3 to 93). The center of the distribution is at approximately $28. There are several moderate outliers visible in the split-stem plot; specifically, the five amounts of $70 or more. While most shoppers spent small to moderate amounts of money around $30, a "cluster" of shoppers spent larger amounts ranging from $70 to $93.
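The stemplot rules on this slide (leaves in order, blank stems kept, optional split stems) can be sketched in a few lines of Python. This is only an illustration: the function name `stemplot` and its `split` flag are made up here, and the `amounts` list is the data read back off the stemplot on this slide.

```python
from collections import defaultdict

# Amounts spent (in dollars), read back off the stemplot on this slide.
amounts = [3, 9, 9, 11, 13, 14, 15, 16, 17, 17, 18, 18, 19,
           20, 20, 20, 21, 22, 23, 24, 25, 25, 26, 26, 28, 28, 28, 28,
           32, 35, 36, 39, 39, 41, 43, 44, 45, 45, 47, 49,
           50, 53, 55, 59, 61, 70, 83, 86, 86, 93]

def stemplot(values, split=False):
    """Return stemplot rows as strings (hypothetical helper).

    The tens digit is the stem and the ones digit is the leaf.
    With split=True each stem gets two rows: leaves 0-4, then leaves 5-9.
    """
    leaves = defaultdict(list)
    for v in sorted(values):              # sorting keeps leaves in order
        stem, leaf = divmod(v, 10)
        half = 1 if (split and leaf >= 5) else 0
        leaves[(stem, half)].append(str(leaf))

    halves = (0, 1) if split else (0,)
    rows = []
    for stem in range(min(values) // 10, max(values) // 10 + 1):
        for half in halves:
            # Stems with no leaves are kept as blank rows, not skipped.
            row = f"{stem} | {''.join(leaves.get((stem, half), []))}"
            rows.append(row.rstrip())
    # Drop blank rows after the last leaf (e.g. an empty upper half of stem 9).
    while rows and rows[-1].endswith("|"):
        rows.pop()
    return rows

print("\n".join(stemplot(amounts)))
print()
print("\n".join(stemplot(amounts, split=True)))
```

Printing both versions side by side makes it easy to see why splitting stems helps when the plain plot has only a few crowded rows.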

12 Back-to-back Stemplot
Number of home runs in a season

          Babe Ruth |   | Roger Maris
                    | 0 | 8
                    | 1 | 3, 4, 6
               5, 2 | 2 | 3, 6, 8
               5, 4 | 3 | 3, 9
9, 7, 6, 6, 6, 1, 1 | 4 |
            9, 4, 4 | 5 |
                  0 | 6 | 1

When comparing data, use comparative language! (higher, more than, etc.)

13 Histogram of Discrete Data: Rolling a fair six-sided die 300 times
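A quick simulation shows what a histogram like this one is built from. The sketch below is an assumption-laden stand-in, not the data behind the slide: `roll_counts` is a hypothetical helper, and the seed is arbitrary and only there to make the run repeatable.

```python
import random
from collections import Counter

def roll_counts(n_rolls=300, seed=1):
    """Simulate rolling a fair six-sided die n_rolls times and count each face."""
    rng = random.Random(seed)   # arbitrary seed, chosen only for repeatability
    return Counter(rng.randint(1, 6) for _ in range(n_rolls))

counts = roll_counts()
for face in range(1, 7):
    # One '*' per two rolls gives a rough text histogram.
    print(f"{face} | {'*' * (counts[face] // 2)} ({counts[face]})")
```

With a fair die each face is expected about 50 times in 300 rolls, so the bars should come out roughly level (a uniform shape), with some run-to-run variation.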

14 1.14 Answer: Histogram of Continuous Data
The center is located at 350 ($350,000). There appears to be one outlier of $1,103,000. The distribution is skewed to the right with a peak in the $200,000s. The spread is approximately $1,082,000 ($21,000 to $1,103,000).
Which bars did the $200,000 and $300,000 salaries go in? Border values always go in the bar on the right! (The first bar is salaries of at least $0 and less than $100,000.)
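The border-value rule amounts to treating each bar as an interval that includes its left edge and excludes its right edge. A minimal sketch, assuming salaries are recorded in thousands of dollars and bars are 100 wide (the helper name `bin_for` is made up for illustration):

```python
def bin_for(salary, width=100):
    """Return the (low, high) bar a salary falls in, where the bar is low <= x < high.

    Salaries are in thousands of dollars; width=100 matches bars of $100,000.
    """
    low = (salary // width) * width   # floor to the nearest bar edge at or below
    return (low, low + width)

print(bin_for(200))   # a border value goes in the bar on the right: (200, 300)
print(bin_for(199))   # just under the border stays on the left: (100, 200)
```

Flooring to the bar edge is what sends a salary of exactly $200,000 into the 200 to <300 bar rather than the 100 to <200 bar.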

15 Histograms on the calculator
– Enter data into List1 by going to Stat, 1:Edit.
– Turn StatPlot on and choose the histogram option.
– Set Xlist to the list you used to enter the data.
– Choose 1 for Freq, or a second list if the data is stored in two lists (values in one, frequencies in another).
– Press Zoom, 9:ZoomStat to set the window to the data initially.
– Check the window and set reasonable, "pretty" values of min and max for both x (values) and y (frequency count). Xscl sets the width of the bins; make this a "pretty" number also!
– Press Graph to see the adjusted graph.
– Press Trace to see details of the graph.

16 Histogram of People’s Weights

17 Data from Histogram

Weight Class Interval   Frequency   Relative Frequency   Cumulative Relative Frequency
100 to < 120                3            0.038                 0.038
120 to < 140               21            0.266                 0.304
140 to < 160               24            0.304                 0.608
160 to < 180               19            0.241                 0.849
180 to < 200                5            0.063                 0.912
200 to < 220                3            0.038                 0.95
220 to < 240                4            0.051                 1.001
Total                      79
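The relative and cumulative relative frequency columns follow directly from the frequency column. The sketch below recomputes them from the seven class frequencies on this slide; the variable names are illustrative, and rounding to three decimals is an assumption made to match the table.

```python
from itertools import accumulate

# Frequencies for the weight classes 100 to <120, 120 to <140, ..., 220 to <240.
freqs = [3, 21, 24, 19, 5, 3, 4]
total = sum(freqs)                                  # 79 people

# Relative frequency = class frequency / total, rounded to 3 decimals.
rel = [round(f / total, 3) for f in freqs]

# Cumulative relative frequency as a running sum of the ROUNDED values.
cum_rounded = [round(c, 3) for c in accumulate(rel)]

# Running sum of the raw counts, divided at the end, avoids rounding drift.
cum_exact = [c / total for c in accumulate(freqs)]

print(rel)
print(cum_rounded)
print(cum_exact[-1])
```

Summing the rounded relative frequencies is what produces a final value of 1.001 instead of 1; accumulating the counts first and dividing afterward ends at exactly 1.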

18 Ogive: Cumulative Relative Frequency Graph (x-axis: Weight in pounds; y-axis: Cumulative Relative Frequency)

19 5 Number Summary
– Minimum
– Q1 (lower quartile): the 25th percentile of the ordered data, or the median of the lower half of the ordered data
– Median (Q2): the 50th percentile, or the middle number of the ordered data (average the two middle numbers if there is an even number of values)
– Q3 (upper quartile): the 75th percentile of the ordered data, or the median of the upper half of the ordered data
– Maximum
Range = Maximum - Minimum
IQR (Interquartile Range) = Q3 - Q1
Outlier Formula: Any point that falls below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR) is considered an outlier.

20 Boxplot: using the 5-number summary
Salaries from 1.14: enter in the calculator and press Stat, Calc, 1-Var Stats.
Min 21, Q1 250, Median 350, Q3 543, Max 1103
Check for outliers:
– IQR = Q3 - Q1 = 543 - 250 = 293
– Low boundary: Q1 - 1.5(IQR) = 250 - 1.5(293) = -189.5. No outliers on the low end, since no salaries are less than this.
– High boundary: Q3 + 1.5(IQR) = 543 + 1.5(293) = 982.5. One outlier on the high end (1103), since it is higher than 982.5.
On the boxplot, the upper whisker ends at the max value that's not an outlier.
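The boundary arithmetic for the salary data can be checked in a few lines. This sketch uses only the five-number summary values given on this slide (in thousands of dollars); the variable names are illustrative.

```python
# Five-number summary for the salaries, in thousands of dollars.
minimum, q1, med, q3, maximum = 21, 250, 350, 543, 1103

iqr = q3 - q1                     # interquartile range
low_boundary = q1 - 1.5 * iqr     # points below this are outliers
high_boundary = q3 + 1.5 * iqr    # points above this are outliers

print(iqr, low_boundary, high_boundary)
print("low-end outlier?", minimum < low_boundary)
print("high-end outlier?", maximum > high_boundary)
```

Since 1.5 x 293 = 439.5, the boundaries work out to 250 - 439.5 = -189.5 and 543 + 439.5 = 982.5, so only the maximum of 1103 falls outside them.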

21

22 Scatterplot—Bivariate quantitative data

