MAT 1000 Mathematics in Today's World. Last Time 1.Three keys to summarize a collection of data: shape, center, spread. 2.The distribution of a data set:

1 MAT 1000 Mathematics in Today's World

2 Last Time 1.Three keys to summarize a collection of data: shape, center, spread. 2.The distribution of a data set: which values occur, and how often they occur 3.Graph the distribution to describe its shape

3 Today Two useful ways to describe the center of a distribution: mean and median. How are they calculated? They can be either statistics or parameters. Why have two ways to find the center?

4 Describing Center There are different notions of the "center" of a distribution. The three most common are: Mean Median Mode

5 Mean The mean is just another word for the average. How do we calculate the mean of a list of numbers? If we have n numbers, we add them up and then divide by n.

6 Mean Example: 3, 1, 5, 7, 20 The mean (= average) in this case is: $\frac{3 + 1 + 5 + 7 + 20}{5} = \frac{36}{5} = 7.2$

7 Formula for the mean (I will give you this formula on tests.) Given n numbers $x_1, x_2, \ldots, x_n$, the mean of these numbers is: $\frac{x_1 + x_2 + \cdots + x_n}{n}$
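The rule "add them up, then divide by n" can be sketched in a few lines of Python. The function name `mean` and the variable `data` are only illustrative choices; the numbers are the ones from the earlier example.

```python
def mean(values):
    """Return the mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty list is undefined")
    return sum(values) / len(values)


data = [3, 1, 5, 7, 20]  # the list from the example slide
print(mean(data))  # 36 / 5 = 7.2
```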

8 Statistics and parameters Recall that a statistic is a number that describes a sample, a parameter is a number that describes a population. The mean of a set of numbers can be either one, depending on where those numbers come from.

9 Statistics and parameters Example Suppose I want to know the average height of all Wayne State students. I could measure every WSU student, add up their heights, and divide by the number of WSU students. This is a parameter.

10 Statistics and parameters Example On the other hand, I could use a sample of Wayne State students, say the students in this class. I could measure the height of all MAT 1000 students, add up those heights, and divide by the number of MAT 1000 students. This would be a statistic.
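A rough sketch of the same distinction in Python. Everything here is made up for illustration: the heights are simulated, the population size of 5,000 and the sample size of 40 are arbitrary, and the sample is drawn at random rather than taken from one class.

```python
import random

random.seed(0)  # fixed seed so the sketch gives the same numbers each run

# Hypothetical population: simulated heights (in inches) for 5,000 students.
population = [random.gauss(67, 4) for _ in range(5000)]

# Parameter: the mean computed from every member of the population.
population_mean = sum(population) / len(population)

# Statistic: the mean computed from a sample (here, 40 students chosen at random).
sample = random.sample(population, 40)
sample_mean = sum(sample) / len(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two numbers are calculated the same way; what differs is whether the list holds the whole population or only part of it.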

11 Statistics and parameters

13 Mean How can we estimate the mean from the graph of a distribution? The mean represents the “balance point” of the distribution.

14 Mean It’s easy to see where a symmetric distribution balances… …right in the middle.

15 Mean What about an asymmetric distribution? The midpoint of the distribution is clearly not the balance point. Here, the balance point is further to the right.

17 Mean In a right-skewed distribution, the mean will be to the left of the midpoint.
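A small check of this claim in Python, assuming "midpoint" means the value halfway between the smallest and largest data values. The data set is made up: most values are small, with one large value forming a tail to the right.

```python
right_skewed = [0, 1, 1, 2, 2, 3, 10]  # made-up data with a long right tail

# Midpoint of the distribution: halfway between the minimum and the maximum.
midpoint = (min(right_skewed) + max(right_skewed)) / 2

# Balance point: the mean.
balance_point = sum(right_skewed) / len(right_skewed)

print(midpoint)                  # (0 + 10) / 2 = 5.0
print(balance_point < midpoint)  # True: the mean sits to the left of the midpoint
```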

18 Mean Example: Suppose we look at 10 people’s savings accounts. Nine have $1 in their accounts, and the tenth has $1,000,000. The mean is (9 × $1 + $1,000,000) / 10 = $100,000.90. Does this represent the “typical” account size among these 10 people? The very large savings account is clearly an outlier in that data set, and it is also the cause of the large mean.
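The arithmetic for this example can be checked with a short Python sketch. The variable names are illustrative; the second calculation drops the $1,000,000 account to show how much that single value drives the mean.

```python
balances = [1] * 9 + [1_000_000]  # nine $1 accounts and one $1,000,000 account

mean_all = sum(balances) / len(balances)
print(mean_all)  # 1,000,009 / 10 = 100000.9

# Remove the outlier (the last value) and recompute.
without_outlier = balances[:-1]
mean_without = sum(without_outlier) / len(without_outlier)
print(mean_without)  # 9 / 9 = 1.0
```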

19 Mean As a measure of center, the mean is “susceptible to outliers.” This also means that if a distribution is strongly skewed, the long tail will tend to pull the mean in the same direction. Sometimes it is better to have a measure of center which is not susceptible to outliers.

20 Median The median of an ordered list of numbers is the number in the middle. The numbers must first be put in order from smallest to largest. If the number of data values is odd, there is a middle number, and this is the median. Example: 1 3 5 9 9. The median here is 5.

21 Median If the number of data values is even, there is no middle number. In that case, the median is the mean of the middle pair. Example: 1 3 5 9 9 10. The middle pair is 5 and 9, so here the median is (5 + 9) / 2 = 7. Notice that the median doesn’t need to be in the data set.
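Both cases (odd and even counts) can be sketched as one Python function. The name `median` is an illustrative choice; the two test lists are the examples from the slides.

```python
def median(values):
    """Return the middle value of the sorted data.

    With an even number of values there is no single middle value,
    so the mean of the middle pair is returned instead.
    """
    if not values:
        raise ValueError("the median of an empty list is undefined")
    ordered = sorted(values)  # the data must be in order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd count: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: mean of the middle pair


print(median([1, 3, 5, 9, 9]))      # 5
print(median([1, 3, 5, 9, 9, 10]))  # (5 + 9) / 2 = 7.0
```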

22 Median Just like with means, medians may either be parameters or statistics. There is no commonly used notation to distinguish a median which is a statistic and a median which is a parameter. We won’t worry about notation for medians.

23 Median Let’s revisit those ten people and their savings accounts. What is the median of this data set? 1 1 1 1 1 1 1 1 1 1,000,000 There are ten values in this set, so the median is the mean of the middle pair. Both middle values are 1, so the median is 1.
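As a quick check, Python's standard `statistics` module has a `median` function that follows the same rule. The sketch below computes the median of the ten accounts with and without the $1,000,000 account; the variable names are illustrative.

```python
from statistics import median

balances = [1] * 9 + [1_000_000]  # the ten savings accounts

median_all = median(balances)           # middle pair is (1, 1)
median_without = median(balances[:-1])  # the nine $1 accounts only

print(median_all, median_without)
print(median_all == median_without)  # True: the outlier does not move the median
```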

24 Median Estimating the median from the graph of a distribution is harder than estimating the mean. But we can use the fact that the median is less sensitive to outliers than the mean to get a general idea.

25 Median In a symmetric distribution the median (green) will be close to the mean (blue).

26 Median In a left-skewed distribution the mean (blue) is smaller than the median (green). The “long tail” pulls the mean to the left.

27 Median In a right-skewed distribution, the mean (blue) will be larger than the median (green). Here the tail pulls the mean to the right.
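The three cases (symmetric, left-skewed, right-skewed) can be illustrated with small made-up data sets, using `mean` and `median` from Python's standard `statistics` module. The lists below are invented for this sketch and are not taken from the slides' graphs.

```python
from statistics import mean, median

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]         # long tail to the right
left_skewed = [-20, -4, -3, -3, -3, -2, -2, -1]  # mirror image: long tail to the left

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    print(f"{name:12s} mean = {mean(data):6.2f}   median = {median(data):6.2f}")
```

In this sketch the mean and median agree for the symmetric list, the mean is above the median for the right-skewed list, and the mean is below the median for the left-skewed list.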

28 Comparing the mean and median Example How much money does the typical American earn? In 2004, the mean income was $60,528. The median was $43,389. Why the discrepancy? The distribution of incomes is skewed to the right: you can’t have an income less than $0, but there is no upper limit on income. The number of people earning very large incomes is relatively small, but those large incomes affect the mean.

29 Comparing the mean and median Example The famous biologist Stephen Jay Gould was diagnosed with a form of cancer that had a median survival time of 8 months. He lived another 20 years, dying of a different, unrelated cancer. The median tells us what happens about half the time. If 30-40% of people with a disease can be completely cured, the mortality distribution will be skewed to the right.

30 Comparing the mean and median Whether to use the mean or median depends on the shape of the distribution. For a symmetric distribution with few outliers, the mean is a good measure of center. If the distribution is asymmetric or has lots of outliers, the median is a better choice. How do you determine the shape of a distribution? Look at a histogram!
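One quick way to look at the shape without any plotting software is a text histogram. The sketch below uses made-up whole-number data and prints one row of stars per value; it is only meant to show the idea, not a general-purpose histogram.

```python
from collections import Counter

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]  # made-up values for illustration

counts = Counter(data)  # how often each value occurs

# One row per whole number from the smallest to the largest value.
for value in range(min(data), max(data) + 1):
    print(f"{value:2d} | {'*' * counts[value]}")
```

The rows for 6, 7, and 8 come out empty, which makes the lone value at 9 stand out as a possible outlier on the right.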

