1
**Ways to Describe Data Sets**

Shape, Outliers, Center, Spread

3
Shape: We can use graphs to look at the shape of a quantitative variable's distribution. An example is the bell-shaped, or normal, distribution, which appears often in nature. It is symmetric, and its mean, median, and mode are roughly equal.

4
**Shape: Skewed Distributions**

Scores from an easy exam, skewed left: non-symmetric, Mean < Median < Mode.
Scores from a hard exam, skewed right: non-symmetric, Mean > Median > Mode.

5
Shape: Shape can also be described by the number of peaks (modes).

6
**What does this graph tell you?**

7
Outliers An outlier is an extreme value of the data (extremely high or extremely low). It is an observation value that is significantly different from the rest of the data. There may be more than one outlier in a set of data.

8
**Outliers: Possible Reasons for Outliers**

1. An error was made while taking the measurement or entering it into the computer.
2. The individual belongs to a different group than the bulk of individuals measured.
3. The outlier is a legitimate, though extreme, data value.

9
**Identifying Outliers (1.5*IQR)**

A data value is an outlier if it is less than Q1 – 1.5×IQR or greater than Q3 + 1.5×IQR.

10
Example Make a box and whisker plot of the data and identify any outliers. 10, 12, 11, 15, 11, 14, 13, 17, 12, 22, 14, 11
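The 1.5×IQR rule can be checked on this data set with a short Python sketch. It assumes the median-of-halves convention for quartiles (the middle value is left out of both halves when the count is odd); the slides do not say which convention is intended, and the function names `quartiles` and `find_outliers` are made up for this illustration.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention."""
    s = sorted(values)
    n = len(s)
    half = n // 2
    lower = s[:half]           # values below the median position
    upper = s[half + n % 2:]   # values above it (middle value skipped if n is odd)
    return median(lower), median(s), median(upper)

def find_outliers(values):
    """Return the lower fence, the upper fence, and the values outside them."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [x for x in values if x < low_fence or x > high_fence]
    return low_fence, high_fence, outliers

data = [10, 12, 11, 15, 11, 14, 13, 17, 12, 22, 14, 11]
q1, med, q3 = quartiles(data)
low_fence, high_fence, outliers = find_outliers(data)

print(q1, med, q3)            # 11.0 12.5 14.5
print(low_fence, high_fence)  # 5.75 19.75
print(outliers)               # [22]
```

Under this convention the only outlier is 22, so the upper whisker of the box plot would stop at 17 and 22 would be drawn as a separate point.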

11
**The prices of a gallon of gasoline (in dollars) for selected countries in 2003 are listed below.**

Australia: $2.20 Canada: $2.02 Germany: $4.58 Mexico: $2.09 United States: $1.59 Japan: $3.47 Taiwan: $2.16 Make a box and whisker plot for the gasoline prices. Which countries, if any, had gasoline prices that can be considered outliers?
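With only seven values, the answer to this question can depend on how the quartiles are computed. The sketch below applies the 1.5×IQR rule under two common textbook conventions: leaving the median out of both halves, or including it in both. The slides do not state which one is intended, and the function name `outlier_countries` and its `include_median` flag are hypothetical.

```python
from statistics import median

prices = {
    "Australia": 2.20, "Canada": 2.02, "Germany": 4.58, "Mexico": 2.09,
    "United States": 1.59, "Japan": 3.47, "Taiwan": 2.16,
}

def outlier_countries(prices, include_median):
    """Apply the 1.5*IQR rule; include_median picks the quartile convention."""
    s = sorted(prices.values())
    n = len(s)
    half = n // 2
    if include_median and n % 2 == 1:
        # The middle value belongs to both halves.
        lower, upper = s[:half + 1], s[half:]
    else:
        # The middle value (if any) belongs to neither half.
        lower, upper = s[:half], s[half + n % 2:]
    q1, q3 = median(lower), median(upper)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return sorted(c for c, p in prices.items()
                  if p < low_fence or p > high_fence)

excl = outlier_countries(prices, include_median=False)
incl = outlier_countries(prices, include_median=True)
print(excl)  # []
print(incl)  # ['Germany']
```

When the median is excluded, Q1 = 2.02 and Q3 = 3.47, so the upper fence is about 5.65 and no country is an outlier. When it is included, Q1 = 2.055 and Q3 = 2.835, the upper fence drops to about 4.0, and Germany is flagged.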

12
**Center**

Measures of center: mean, median, mode
Mean: the average of the data set.
Median: the middle value of a data set arranged from smallest to largest.
Mode: the data value that occurs most often; a common measure of center for categorical data.

13
Mean vs. Median

14
Center: When describing data, you must decide which number is the most appropriate description of the center. Use the mean on symmetric data and the median on skewed data or data with outliers.
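A small Python sketch shows why the median is preferred when outliers are present. It reuses the data set from the earlier box plot example and then assumes a hypothetical data-entry error in which 22 is typed as 220; that error is invented for illustration and is not from the slides.

```python
from statistics import mean, median

data = [10, 12, 11, 15, 11, 14, 13, 17, 12, 22, 14, 11]

# Hypothetical data-entry error: 22 recorded as 220.
with_error = [220 if x == 22 else x for x in data]

print(mean(data), median(data))              # 13.5 12.5
print(mean(with_error), median(with_error))  # 30 12.5
```

The single bad value more than doubles the mean, while the median does not move at all.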

15
**What does this graph tell you?**

16
**Spread**

Range: maximum value minus minimum value (spread of all the data).
Interquartile range (IQR): shows the middle 50% of the data; IQR = Q3 – Q1. Not affected as much by outliers. Use when the measure of center is the median.
Mean absolute deviation: the average distance between each data value and the mean. Use when the measure of center is the mean.
Variance and standard deviation: a measure of the “average” deviation of all observations from the mean.

17
5 number summary Complete a 5 number summary and box and whisker plot for the following data. Number of hours spent on internet per week: 12, 4, 16, 18, 1, 6, 10, 8
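One way to work this exercise is sketched below in Python. It assumes the median-of-halves convention for Q1 and Q3, which the slides do not specify; the dictionary name `summary` is just for illustration.

```python
from statistics import median

hours = [12, 4, 16, 18, 1, 6, 10, 8]

s = sorted(hours)          # [1, 4, 6, 8, 10, 12, 16, 18]
n = len(s)
half = n // 2

summary = {
    "min": s[0],
    "Q1": median(s[:half]),             # median of the lower half
    "median": median(s),
    "Q3": median(s[half + n % 2:]),     # median of the upper half
    "max": s[-1],
}
iqr = summary["Q3"] - summary["Q1"]

print(summary)  # {'min': 1, 'Q1': 5.0, 'median': 9.0, 'Q3': 14.0, 'max': 18}
print(iqr)      # 9.0
```

The box would run from 5 to 14 with a line at 9. The 1.5×IQR fences fall at –8.5 and 27.5, so there are no outliers and the whiskers extend to 1 and 18.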

18
**Mean Absolute Value Deviation**

To calculate Mean Absolute Value Deviation:
1. Calculate the mean for the data set.
2. Find the distance between each data value and the mean. That is, find the absolute value of the difference between each data value and the mean.
3. Find the average of those distances.

19
**Example: Find the mean absolute deviation of the following data set:**

52, 48, 60, 55, 59, 54, 58, 62
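The calculation can be followed step by step in a short Python sketch (the variable names are chosen here for illustration):

```python
data = [52, 48, 60, 55, 59, 54, 58, 62]

# Step 1: mean of the data set.
mean = sum(data) / len(data)

# Step 2: distance of each value from the mean (absolute value of the difference).
distances = [abs(x - mean) for x in data]

# Step 3: average of those distances.
mad = sum(distances) / len(distances)

print(mean)       # 56.0
print(distances)  # [4.0, 8.0, 4.0, 1.0, 3.0, 2.0, 2.0, 6.0]
print(mad)        # 3.75
```

On average, a value in this data set lies 3.75 units from the mean of 56.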

20
Standard Deviation: A measure of spread is the standard deviation, a measure of the “average” deviation of all observations from the mean. The symbol for standard deviation is σ (the Greek letter sigma); when it is computed from a sample by dividing by n – 1, it is usually written s.

21
**Standard Deviation: To calculate Standard Deviation:**

1. Calculate the mean (x̄) for the data set.
2. Determine each observation’s deviation by subtracting the mean from each data point: (x − x̄).
3. Square each deviation.
4. “Average” the squared deviations by totaling them and dividing the total by (n – 1). This quantity is the variance.
5. Take the square root of the variance to get the standard deviation.

22
**Example Calculate the standard deviation of the following test scores:**

15, 20, 21, 20, 36, 15, 25, 15
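The steps for the standard deviation can be traced in Python as a sketch. It divides by n – 1, as in the procedure above, and compares the result with `statistics.stdev` from the standard library, which uses the same n – 1 divisor.

```python
import math
import statistics

scores = [15, 20, 21, 20, 36, 15, 25, 15]
n = len(scores)

mean = sum(scores) / n                            # step 1
deviations = [x - mean for x in scores]           # step 2
squared = [d ** 2 for d in deviations]            # step 3
variance = sum(squared) / (n - 1)                 # step 4: divide by n - 1
sd = math.sqrt(variance)                          # step 5

print(mean)          # 20.875
print(variance)      # 50.125
print(round(sd, 2))  # 7.08

# Cross-check against the standard library's sample standard deviation.
print(math.isclose(sd, statistics.stdev(scores)))  # True
```

The score of 36 contributes about 229 of the 350.875 total squared deviation, which illustrates how strongly a single extreme value affects the standard deviation.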

23
**The most appropriate measure of variability depends on …**

The shape of the data’s distribution! If data are symmetric, with no serious outliers, use range and standard deviation. If data are skewed, and/or have serious outliers, use IQR.

24
**Comparing Data**

Quantitative data: through graphs
Categorical data: through two-way frequency tables

25
Graphs: multiple bar graphs, multiple box and whisker plots

26
**Two Way Frequency Tables**

These tables examine the relationship between two categorical variables. Each cell of a two-way frequency table counts the individuals that fall into one category of each variable.

27
**Two-Way Frequency Table**

28
**Two-Way Relative Frequency Table**

Relative frequency is the ratio of the value of a subtotal to the value of the total.
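As a sketch of how both kinds of table are built, the Python below tallies a small set of survey responses. The responses, category names, and variable names are hypothetical and do not come from the slides. Here each count is divided by the grand total; dividing by a row or column total instead would give conditional relative frequencies.

```python
from collections import Counter

# Hypothetical survey responses: (grade level, preferred pet).
responses = [
    ("Freshman", "Dog"), ("Freshman", "Cat"), ("Freshman", "Dog"),
    ("Senior", "Cat"), ("Senior", "Dog"), ("Senior", "Cat"),
    ("Senior", "Cat"), ("Freshman", "Dog"),
]

counts = Counter(responses)
rows = sorted({grade for grade, _ in responses})
cols = sorted({pet for _, pet in responses})
total = len(responses)

# Two-way frequency table: raw counts for each (row, column) pair.
frequency = {r: {c: counts[(r, c)] for c in cols} for r in rows}

# Two-way relative frequency table: each count divided by the grand total.
relative = {r: {c: counts[(r, c)] / total for c in cols} for r in rows}

for r in rows:
    print(r, frequency[r], relative[r])
```

For this made-up data, 3 of the 8 respondents are freshmen who prefer dogs, so that cell has a frequency of 3 and a relative frequency of 0.375.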

29
Example Create a two-way frequency table for the following problem.

