1/23/2016Slide 1 We have seen that skewness affects the way we describe the central tendency and variability of a quantitative variable: if a distribution.


1 1/23/2016Slide 1 We have seen that skewness affects the way we describe the central tendency and variability of a quantitative variable: if a distribution is more skewed than the threshold of -1.0 to 1.0, we report the median and interquartile range rather than the mean and standard deviation. A major cause of skewed distributions is the presence of outliers – cases that have very small or very large scores relative to the other cases in the distribution. Outliers have a larger effect on the results of statistical analysis than other cases. One far outlier may change our view of central tendency and variability for the entire distribution.

2 1/23/2016Slide 2 Outliers pose a dilemma for us in terms of our justification for either omitting them or retaining them in the analysis. It is easy to remove outliers that were data entry errors. It is more difficult to defend removing outliers when the scores represent accurate data. One response to the dilemma is to run the analysis with and without the outliers, and describe the difference. Sometimes it makes little difference and we can ignore the presence of the outliers. Another response to the dilemma is to re-express or transform the variable and see if the outliers are eliminated. If there are no outliers using the re-expressed data, we can run the analysis with the re-expressed data and draw our conclusions based on the results for the re-expressed variables.

3 1/23/2016Slide 3 Two downsides to the strategy of re-expressing data are: the skepticism of audiences who already think we massage the numbers to produce the results we want, and the need to convert the results back to the original scale if we need to report numerical results. In this problem set, we will use a boxplot strategy for detecting outliers and examine the use of two of the possible transformations: the square and the logarithm. The Explore procedure in SPSS provides both the boxplot and the descriptive statistics needed to solve these problems. In the boxplot, two types of outliers are identified by symbols: circles for outliers, and stars for extreme (or far) outliers.

4 1/23/2016Slide 4 A case is identified as an outlier (circle) if its value is less than the first quartile minus 1.5 times the interquartile range, or greater than the third quartile plus 1.5 times the interquartile range. If the case has a value less than the first quartile minus 3 times the interquartile range, or greater than the third quartile plus 3 times the interquartile range, it is characterized as a far outlier (star). If outliers or far outliers are found for a variable, we will examine how they behave when the variable is re-expressed: if the variable is skewed to the right, we compute the logarithm of the values; if the variable is negatively skewed, we square the values.
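The boxplot rule can be sketched in a few lines of Python. This is only an illustration: the function name `classify` and the quartiles are hypothetical, and the comparisons are strict (less than, greater than), as in the worked problem later in the deck.

```python
def classify(value, q1, q3):
    """Label a score with the boxplot rule: 1.5 x IQR for outliers, 3 x IQR for far outliers."""
    iqr = q3 - q1
    # Check the wider fences first, because every far outlier is also an outlier.
    if value < q1 - 3 * iqr or value > q3 + 3 * iqr:
        return "far outlier"
    if value < q1 - 1.5 * iqr or value > q3 + 1.5 * iqr:
        return "outlier"
    return "not an outlier"

# Hypothetical quartiles chosen only for illustration.
q1, q3 = 10.0, 20.0  # IQR = 10, so the fences are -5 and 35 (outlier), -20 and 50 (far outlier)
for score in (12, 36, 51, -6):
    print(score, classify(score, q1, q3))
```

A score that sits exactly on a fence is not flagged in this sketch.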

5 Slide 5 The HistogramBoxplot script positions the boxplot above the histogram. In this chart, we see a number of circles at the right end of the distribution. These are outliers; there are no far outliers in this distribution. As we would expect, this distribution has a skewness problem: the skewness value at the bottom of the table of descriptive statistics is 1.47. NOTE: the horizontal axis for the boxplot approximates the axis for the histogram, but is not exact.

6 Slide 6 Some distributions will show both outliers and far outliers. Our problems will state the number of outliers, and the number of far outliers as a subset of the total number of outliers. Note that the chart shows the presence or absence of outliers, but does not necessarily provide an exact count since the outlier symbol might represent more than one case with the score.

7 Slide 7 The boxplots for some distributions will indicate that there are no outliers.

8 Slide 8 The boxplot for the distribution for this variable shows several outliers at the right end of the distribution. When the variable is positively skewed, the data values are re-expressed on the logarithmic scale. When we re-express the values for the variable on a logarithmic scale, the boxplot does not indicate that there are any outliers.

9 Slide 9 If we re-express the data values using the wrong transformation, we actually increase the problem of outliers. The distribution was positively skewed, and we squared the data values, rather than converting to a log scale, resulting in more outliers.

10 Slide 10 The boxplot for the distribution for this variable shows several outliers at the low end of the scale. Since this variable is skewed to the left, we will re-express the data values as squares. The boxplot of the squared values indicates that there are no outliers for the re-expressed data values.

11 Slide 11 If we re-express the data values using the wrong transformation, we actually increase the problem of outliers. The distribution was negatively skewed, and we applied a log transformation rather than the square transformation, resulting in more outliers.

12 Slide 12 Re-expressing the data values does not always remedy the problem of outliers. In the chart to the right, the logarithmic transformation appears to have only partially removed the outliers.

13 Slide 13 Some variables have outliers at both ends of the distribution. We can attempt to re-express the variable based on the presence of outliers and far outliers, but the re-expression may not be effective because each re-expression works on only one tail of the distribution.

14 1/23/2016Slide 14 Re-expression changes the measuring scale for the variable by altering the distance between the values. All of the lines below represent the numbers 1 to 10, on a decimal, logarithmic, and squared scale. On our familiar decimal measuring scale, the distance between numbers is the same for all numbers. On a logarithmic scale, the distance between the numbers decreases as the numbers get larger. On a square scale, the distance between the numbers decreases as the values get smaller. All of the dots represent the same sequence of values from 1 to 10 on different measuring scales.

15 1/23/2016Slide 15 The logarithmic transformation works by stretching the scale at the left end of the distribution and compressing the scale at the right end of the distribution. As shown in the diagram below, the numbers 1 to 5 (red dots) are converted to their log equivalents (blue dots). The distance between the log points decreases as the values increase. The distance between the log of 4 and the log of 5 is less than the distance between the log of 1 and log of 2.

16 1/23/2016Slide 16 Positive skewing is reduced because the distance between consecutive numbers on the log scale decreases as the size of the decimal number increases. For example, the difference between the log of 2 and the log of 3 is 0.176, larger than the difference between the log of 4 and log of 5, which is 0.097.
decimal scale    log scale    difference between consecutive values
 1               0.000
 2               0.301        0.301
 3               0.477        0.176
 4               0.602        0.125
 5               0.699        0.097
 6               0.778        0.079
 7               0.845        0.067
 8               0.903        0.058
 9               0.954        0.051
10               1.000        0.046

17 1/23/2016Slide 17 The square transformation works by compressing the scale at the left end of the distribution and stretching the scale at the right end of the distribution. As shown in the diagram below, the numbers 1 to 5 (red dots) are converted to their squared equivalents (blue dots). The distance between the squared points increases as the values increase. The distance between the square of 4 and the square of 5 is larger than the distance between the square of 1 and square of 2.

18 1/23/2016Slide 18 Negative skewing is reduced because the distance between consecutive numbers on the squared scale increases as the size of the decimal number increases. For example, the difference between the square of 2 and the square of 3 is 5.0, less than the difference between the square of 4 and square of 5, which is 9.0.
decimal scale    squared scale    difference between consecutive values
 1                 1.000
 2                 4.000           3.000
 3                 9.000           5.000
 4                16.000           7.000
 5                25.000           9.000
 6                36.000          11.000
 7                49.000          13.000
 8                64.000          15.000
 9                81.000          17.000
10               100.000          19.000
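The differences in the log and squared tables can be reproduced with a short Python sketch. The helper name `gaps` is ours, not part of the original materials; it simply measures the distance between consecutive values after a transformation is applied.

```python
import math

def gaps(transform, values):
    """Distances between consecutive values after applying transform, rounded to 3 decimals."""
    scaled = [transform(v) for v in values]
    return [round(b - a, 3) for a, b in zip(scaled, scaled[1:])]

values = range(1, 11)                          # the numbers 1 to 10
log_gaps = gaps(math.log10, values)            # base-10 logarithm
square_gaps = gaps(lambda v: v ** 2, values)   # square

print(log_gaps)     # gaps shrink as the values grow, pulling in a long right tail
print(square_gaps)  # gaps widen as the values grow, pulling in a long left tail
```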

19 1/23/2016Slide 19 As long as we can reverse the transformation and get back to the original values, the transformations are legitimate. To make certain we can get back to the original values, the numbers on all scales must be mathematically defined as real numbers. Not all are: the logarithm of 0 is undefined, as is the square root of a negative number. To avoid a transformation that we cannot reverse, we may need to add a constant to each number. If the smallest value is negative, we add its absolute value plus 1 to each number. If the smallest value in the distribution is 0, we add 1 to each score in the distribution. Since we are starting out with transformations, the problem statement will tell you if you need to add a numeric constant when doing the transformations.
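A minimal Python sketch of the round trip for the log transformation follows. The function names and scores are hypothetical, and the constant uses the rule given later in the deck: the absolute value of the minimum plus 1 whenever the minimum is less than or equal to 0.

```python
import math

def shift_constant(values):
    """Constant to add so that every value is positive before taking a logarithm."""
    smallest = min(values)
    return 0 if smallest > 0 else abs(smallest) + 1

def to_log(value, constant):
    """Re-express a score on the base-10 log scale after adding the constant."""
    return math.log10(value + constant)

def from_log(logged, constant):
    """Reverse the re-expression: undo the logarithm, then remove the constant."""
    return 10 ** logged - constant

scores = [-3, 0, 2, 10]        # hypothetical scores with a negative minimum
c = shift_constant(scores)     # |-3| + 1 = 4, so the smallest shifted value is 1
restored = [from_log(to_log(s, c), c) for s in scores]
print(c, restored)             # restored matches scores up to floating-point rounding
```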

20 Outliers and Re-expressing Data Homework Problems

21 1/23/2016Slide 21 Outliers and Re-expressing Data - 1 These problems include a series of narrative statements, but no table. APA guidelines suggest that a table not be used if it would contain information for only a single variable. The notes provide information about the data set to use (world2003.sav), the variable used in the problem, hivRate, and the formulas for re-expression if needed.

22 Slide 22 To compute the descriptive statistics and charts that we need to check for outliers, select the Descriptive Statistics > Explore command from the Analyze menu. Outliers and Re-expressing Data - 2

23 Slide 23 Move the variable for the analysis, hivRate, to the Dependent List list box. Click on the Statistics button to select optional statistics. Outliers and Re-expressing Data - 3

24 Slide 24 The check box for Descriptives is already marked by default. Mark the Percentiles check box. This will provide the upper and lower bounds for the interquartile range. While there is a check box for Outliers, it lists the five largest scores and the five smallest scores, but does not tell us whether or not they are really outliers. Click on the Continue button to close the dialog box. Outliers and Re-expressing Data - 4

25 Slide 25 Next, we click on the Plots button to obtain visual evidence of the presence of outliers in the distribution. Outliers and Re-expressing Data - 5

26 Slide 26 We accept the default for the Box plot, which provides us the output we need even though we are not using factor levels in this problem. We accept the default Stem-and-Leaf plot, and mark the check box for a Histogram as well. We click on the Continue button to close the Plots dialog. Outliers and Re-expressing Data - 6

27 Slide 27 After returning to the Explore dialog box, click on the OK button to produce the output. Outliers and Re-expressing Data - 7

28 Slide 28 The SPSS output includes three tables and three graphs. Outliers and Re-expressing Data - 8 The Case Processing Summary tells us the number of valid and missing cases. The table of Descriptives contains the different measures of central tendency, variability, and shape of the distribution. The table of Percentiles contains the statistics used to construct the box plot and evaluate outliers.

29 1/23/2016Slide 29 Outliers and Re-expressing Data - 9 The histogram shows us a plot of the class intervals for the variable. We can see that the variable is badly skewed to the right and probably contains outliers.

30 1/23/2016Slide 30 Outliers and Re-expressing Data - 10 The stem and leaf plot provides information similar to that supported by the histogram.

31 1/23/2016Slide 31 Outliers and Re-expressing Data - 11 The box plot shows a distribution that contains numerous outliers (circles) and far outliers (asterisks).

32 1/23/2016Slide 32 Outliers and Re-expressing Data - 12 The first paragraph asks for the total number of cases in the data set, the number of cases excluded because of missing data, and the number of cases used in the analysis.

33 Slide 33 The 'Case Processing Summary' in the SPSS output showed the total number of cases in the data set to be 192, with 166 valid cases and 26 cases that do not have a valid value for the variable. Outliers and Re-expressing Data - 13

34 1/23/2016Slide 34 Outliers and Re-expressing Data - 14 The total number of cases in the data set, the number of cases excluded because of missing data, and the number of cases used in the analysis are entered in the blanks.

35 1/23/2016Slide 35 Outliers and Re-expressing Data - 15 The second paragraph is focused on the descriptive statistics for the variable hivRate that are used to construct the boxplot, and thus provide a basis for detecting outliers and far outliers. The first blank asks us to enter the value of the median. The second blank asks us to enter the value of the interquartile range. The third and fourth blanks ask us to enter the quartiles used to compute the interquartile range. There are different ways to calculate the quartiles. The boxplot uses a method called “Tukey’s Hinges” which may produce a different value for the interquartile range than what SPSS presents in the table of Descriptive Statistics. We will use the table of “Percentiles” in these problems.
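To see how two quartile methods can disagree, here is a small Python sketch on hypothetical data. It assumes that the Percentiles table interpolates at position (n + 1) x p, which is what Python's `statistics.quantiles` does with `method="exclusive"`; treat that correspondence as an assumption and check it against your own SPSS output.

```python
import statistics

def tukey_hinges(values):
    """Lower and upper hinges: the median of each half, sharing the middle value when n is odd."""
    ordered = sorted(values)
    half = (len(ordered) + 1) // 2
    return statistics.median(ordered[:half]), statistics.median(ordered[-half:])

def percentile_quartiles(values):
    """First and third quartiles interpolated at positions (n + 1) * p."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return q1, q3

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # hypothetical data
print(tukey_hinges(scores))            # hinges 3 and 7, an interquartile range of 4
print(percentile_quartiles(scores))    # quartiles 2.5 and 7.5, an interquartile range of 5
```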

36 1/23/2016Slide 36 Outliers and Re-expressing Data - 16 The median is the value at the 50th percentile.

37 1/23/2016Slide 37 Outliers and Re-expressing Data - 17 The interquartile range is the difference between the third quartile (75th percentile) of 1.90 and the first quartile (25th percentile) of 0.10, which equals 1.80. The lower bound of the interquartile range is the first quartile, or 25th percentile, which is 0.10. The upper bound of the interquartile range is the third quartile, or 75th percentile, which is 1.90.

38 1/23/2016Slide 38 Outliers and Re-expressing Data - 18 The first sentence in the third paragraph defines the numeric criteria for outliers.

39 1/23/2016Slide 39 Outliers and Re-expressing Data - 19 Our first task is to compute the values that would let us determine whether or not a case is an outlier. The value for the first quartile (the 25th percentile) is 0.10. The value for the third quartile (the 75th percentile) is 1.90. The interquartile range is the difference between the two: 1.80 (1.90 – 0.10 = 1.80). To be characterized as an outlier in the distribution of "Percent of adults living with hiv/aids" [hivRate], a case would have to have: a value less than -2.60 [Q1 – (1.5 x IQR) = 0.10 – (1.5 x 1.80) = -2.60] or a value greater than 4.60 [Q3 + (1.5 x IQR) = 1.90 + (1.5 x 1.80) = 4.60]. NOTE: multiply IQR by 1.5 before adding to Q3 or subtracting from Q1.

40 1/23/2016Slide 40 Outliers and Re-expressing Data - 20 The lower bound for outliers (-2.60) and the upper bound for outliers (4.60) are entered into the statement. The second sentence in the third paragraph asks us how many outliers were in the distribution.

41 Slide 41 While there are a number of strategies that could be used to count outliers, we will create a new variable that will have a value of 1 if the score for the case is an outlier, and 0 if it is not an outlier. A frequency distribution on this new variable will tell us how many outliers are in the distribution. To compute the new variable, select the Compute command from the Transform menu. Outliers and Re-expressing Data - 21

42 Slide 42 We will name the new variable outlier, selecting a name which describes its contents. Type the formula as shown in the Numeric Expression text box. The formula will assign outlier a value of 1 if the score is less than -2.60 or greater than 4.60. If the value is not outside this range, outlier will be assigned a 0. Outliers and Re-expressing Data - 22 This works because SPSS assigns a 1 to a true statement and a 0 to a false statement.

43 Slide 43 To find the number of outliers, we create a frequency distribution for the variable outlier. To create the frequency distribution, select Descriptive Statistics > Frequencies from the Analyze menu. Outliers and Re-expressing Data - 23

44 Slide 44 First, move the variable outlier to the Variable(s) list box. Second, click on the OK button to produce the output. Outliers and Re-expressing Data - 24

45 Slide 45 A total of 24 of the cases have a score of 1 on the outlier variable. There are 24 outliers. Recall that an outlier will have a value of 1 if the case score is less than -2.60 or greater than 4.60. The number of outliers is equal to the number of cases where outlier is equal to 1. Outliers and Re-expressing Data - 25
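The flag-and-tally steps above can be sketched in Python. The bounds are the ones computed for hivRate; the scores are hypothetical and are not the values in world2003.sav. Leaving missing cases out of the tally is our choice for the sketch.

```python
from collections import Counter

LOWER, UPPER = -2.60, 4.60   # outlier bounds for hivRate from the problem

def flag_outlier(score):
    """Return 1 for an outlier and 0 otherwise, mirroring a true/false numeric expression."""
    return int(score < LOWER or score > UPPER)

# Hypothetical scores; None marks a case with a missing value.
scores = [0.1, 0.4, 1.9, 4.6, 5.0, 21.3, None, 38.8]
flags = [flag_outlier(s) for s in scores if s is not None]

# The frequency distribution of the flag gives the number of outliers.
frequencies = Counter(flags)
print(frequencies[1], "outliers among", len(flags), "valid cases")
```

Note that the score of 4.6 is not flagged, because the rule requires a value greater than the upper bound.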

46 1/23/2016Slide 46 Outliers and Re-expressing Data - 26 The first sentence in the fourth paragraph defines the numeric criteria for far outliers.

47 1/23/2016Slide 47 Outliers and Re-expressing Data - 27 Our next task is to compute the values that would let us determine whether or not a case is a far outlier. The value for the first quartile (the 25th percentile) is 0.10. The value for the third quartile (the 75th percentile) is 1.90. The interquartile range is the difference between the two: 1.80 (1.90 – 0.10 = 1.80). To be characterized as a far outlier in the distribution of "Percent of adults living with hiv/aids" [hivRate], a case would have to have: a value less than -5.30 [Q1 – (3.0 x IQR) = 0.10 – (3.0 x 1.80) = -5.30] or a value greater than 7.30 [Q3 + (3.0 x IQR) = 1.90 + (3.0 x 1.80) = 7.30]. NOTE: multiply IQR by 3.0 before adding to Q3 or subtracting from Q1.
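The bounds on this slide, and the outlier bounds computed earlier for hivRate, can be checked with a few lines of Python. The function name `fences` is ours; the quartiles are the ones reported in the Percentiles table.

```python
def fences(q1, q3, multiplier):
    """Lower and upper bounds: Q1 - multiplier * IQR and Q3 + multiplier * IQR."""
    iqr = q3 - q1
    # Multiply the IQR first, then subtract from Q1 or add to Q3.
    return round(q1 - multiplier * iqr, 2), round(q3 + multiplier * iqr, 2)

Q1, Q3 = 0.10, 1.90             # quartiles for hivRate
outlier_bounds = fences(Q1, Q3, 1.5)
far_outlier_bounds = fences(Q1, Q3, 3.0)
print(outlier_bounds)           # about -2.60 and 4.60
print(far_outlier_bounds)       # about -5.30 and 7.30
```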

48 1/23/2016Slide 48 Outliers and Re-expressing Data - 28 The lower bound for far outliers (-5.30) and the upper bound for far outliers (7.30) are entered into the statement. The second sentence in the fourth paragraph asks us how many far outliers were in the distribution.

49 Slide 49 We will create a new variable that will have a value of 1 if the score for the case is a far outlier, and 0 if it is not a far outlier. To compute the new variable, select the Compute command from the Transform menu. Outliers and Re-expressing Data - 29

50 Slide 50 We will name the new variable faroutlier, selecting a name which describes its contents. Type the formula as shown in the Numeric Expression text box. The formula will assign faroutlier a value of 1 if the score is less than -5.30 or greater than 7.30. If the value is not outside this range, faroutlier will be assigned a 0. Outliers and Re-expressing Data - 30

51 Slide 51 To find the number of far outliers, we create a frequency distribution for the variable faroutlier. To create the frequency distribution, select Descriptive Statistics > Frequencies from the Analyze menu. Outliers and Re-expressing Data - 31

52 Slide 52 First, move the variable faroutlier to the Variable(s) list box. Second, click on the OK button to produce the output. Outliers and Re-expressing Data - 32

53 Slide 53 A total of 13 of the cases have a score of 1 on the faroutlier variable. There are 13 far outliers. Recall that a far outlier will have a value of 1 if the case score is less than -5.30 or greater than 7.30. The number of far outliers is equal to the number of cases where faroutlier is equal to 1. Outliers and Re-expressing Data - 33

54 1/23/2016Slide 54 Outliers and Re-expressing Data - 34 The last paragraph focuses on re-expression of the variable if outliers or far outliers are found. If there are no outliers or far outliers in the distribution, we only answer the questions about direction, size of the skewness, and effectiveness of the re-expression. The other blanks are na. The number of far outliers is entered into the blank.

55 1/23/2016Slide 55 Outliers and Re-expressing Data - 35 The first sentence in the last paragraph asks for the direction and size of the skewness.

56 1/23/2016Slide 56 Outliers and Re-expressing Data - 36 To answer the blanks on skewness, we return to the table of Descriptive Statistics for the original variable. A skewness value of 4.02 tells us that the distribution is positively skewed (skewed to the right).

57 1/23/2016Slide 57 Outliers and Re-expressing Data - 37 For distributions that are positively skewed, the text book recommends a logarithmic transformation. For distributions that are negatively skewed, the text book recommends a square transformation.

58 1/23/2016Slide 58 Outliers and Re-expressing Data - 38 The second sentence asks for the number of outliers remaining after re-expression. To answer this question, we repeat the steps that we used above to determine the number of outliers for the original variable: create the transformed variable; compute the descriptive statistics to obtain the first and third quartile, and the interquartile range; compute the values for detecting outliers and far outliers; create a variable that measures whether or not a case was an outlier or a far outlier; and use a frequency distribution to tally the number of outliers. We use the formulas in note 2 when we need to re-express the variable.

59 1/23/2016Slide 59 Outliers and Re-expressing Data - 39 The formula for transforming the variable is based on the direction of the skewness and the minimum value for the variable. The skewness determines whether we try a log transformation or a square transformation. For either transformation, the argument in parentheses is: the name of the variable if the minimum value for the variable is greater than 0, or the name of the variable plus the absolute value of the minimum value plus 1 if the minimum value is less than or equal to 0.
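The choice of formula can be sketched as a small Python function. The name `reexpress` and the example values are hypothetical; a skewness of exactly 0 is not covered by the rule above and falls into the square branch here only because the sketch needs a default.

```python
import math

def reexpress(values, skewness):
    """Log10 for positive skew, square for negative skew, adding a constant when min <= 0."""
    smallest = min(values)
    # Argument in parentheses: the variable itself, or the variable + |minimum| + 1.
    constant = 0 if smallest > 0 else abs(smallest) + 1
    if skewness > 0:
        return [math.log10(v + constant) for v in values]
    return [(v + constant) ** 2 for v in values]

logged = reexpress([0.1, 1, 10], skewness=4.02)   # minimum > 0, so no constant is added
squared = reexpress([-2, 0, 3], skewness=-1.5)    # constant = |-2| + 1 = 3
print(logged)
print(squared)
```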

60 Slide 60 To compute the transformed variable, select the Compute Variable command from the Transform menu. Outliers and Re-expressing Data - 40

61 Slide 61 In the Compute Variable dialog box, we type the name for the new variable, LG_hivRate, in the Target Variable text box. Click on the Arithmetic function group so that the list of available functions appears in the Functions and Special Variables list box. Outliers and Re-expressing Data - 41 To make the relations between variables clear, I prepend the variable name with LG_ if the logarithmic transform is used and SQ_ if the square transformation is used.

62 Slide 62 First, in the list of Functions and Special Variables, highlight Lg10 which computes logarithmic values using a base of 10. Second, click on the up arrow button to paste the Lg10 function in the Numeric Expression text box. Outliers and Re-expressing Data - 42

63 Slide 63 Next, type the name of the variable to be transformed hivRate between the parentheses after the function name. Finally, click on the OK button to compute the transformed variable. Outliers and Re-expressing Data - 43 If a square transformation were required, I would have named the variable SQ_hivRate and used the formula (hivRate)**2 in the numeric expression text area.

64 Slide 64 Scroll the data editor window to the right to see the transformed variable, LG_hivRate. Note that I moved the hivRate variable to the right as well. It will not appear in this position in your data editor. Outliers and Re-expressing Data - 44

65 Slide 65 To calculate the descriptive statistics so we can identify outliers on the transformed variable, click on the Dialog Recall tool button. Outliers and Re-expressing Data - 45

66 Slide 66 In the pop-up menu for Dialog Recall, select the Explore item (the second to the last command we executed in SPSS). Outliers and Re-expressing Data - 46

67 Slide 67 Since we want the same statistics computed in the last Explore procedure, we only need to replace the variable hivRate with LG_hivRate. Click on the OK button to produce the output. Outliers and Re-expressing Data - 47

68 1/23/2016Slide 68 Outliers and Re-expressing Data - 48 The box plot for LG_hivRate shows no circles or asterisks, indicating that there are no outliers in this distribution.

69 1/23/2016Slide 69 Outliers and Re-expressing Data - 49 Similarly, the histogram displays a distribution that is more symmetric.

70 1/23/2016Slide 70 Outliers and Re-expressing Data - 50 Next, we compute the values that would let us determine whether or not a case is an outlier for LG_hivRate. The value for the first quartile (the 25th percentile) is -1.00. The value for the third quartile (the 75th percentile) is 0.28. The interquartile range is the difference between the two: 1.28 (0.28 – (-1.00) = 1.28). To be characterized as an outlier in the distribution of "Percent of adults living with hiv/aids" [LG_hivRate], a case would have to have: a value less than -2.92 (Q1 - 1.5 x IQR = -1.00 - 1.5 x 1.28 = -2.92) or a value greater than 2.20 (Q3 + 1.5 x IQR = 0.28 + 1.5 x 1.28 = 2.20).

71 Slide 71 We will create a new variable that will have a value of 1 if the score for the re-expressed variable is an outlier, and 0 if it is not an outlier. To compute the new variable, select the Compute command from the Transform menu. Outliers and Re-expressing Data - 51

72 Slide 72 We will name the new variable reexpressedoutlier, selecting a name which describes its contents. Type the formula as shown in the Numeric Expression text box. The formula will assign reexpressedoutlier a value of 1 if the score is less than -2.92 or greater than 2.20. If the value is not outside this range, reexpressedoutlier will be assigned a 0. Outliers and Re-expressing Data - 52

73 Slide 73 To find the number of outliers, we create a frequency distribution for the variable reexpressedoutlier. To create the frequency distribution, select Descriptive Statistics > Frequencies from the Analyze menu. Outliers and Re-expressing Data - 53

74 Slide 74 First, move the variable reexpressedoutlier to the Variable(s) list box. Second, click on the OK button to produce the output. Outliers and Re-expressing Data - 54

75 Slide 75 All 166 valid cases have a value of 0 as an outlier. There are no outliers for the re-expressed variable. Recall that an outlier will have a value of 1 if the log value is less than -2.92 or greater than 2.20. The number of outliers is equal to the number of cases where reexpressedoutlier is equal to 1. Outliers and Re-expressing Data - 55

76 1/23/2016Slide 76 Outliers and Re-expressing Data - 56 We enter 0 for the number of outliers in the distribution of the re-expressed variable. The next blank in the sentence wants the number of far outliers.

77 1/23/2016Slide 77 Outliers and Re-expressing Data - 57 Our next task is to compute the values that would let us determine whether or not a case is a far outlier for LG_hivRate. The value for the first quartile (the 25th percentile) is -1.00. The value for the third quartile (the 75th percentile) is 0.28. The interquartile range is the difference between the two: 1.28 (0.28 – (-1.00) = 1.28). To be characterized as a far outlier in the distribution of "Percent of adults living with hiv/aids" [LG_hivRate], a case would have to have a value less than -4.84 (Q1 - 3 x IQR = -1.00 - 3 x 1.28 = -4.84) or a value greater than 4.12 (Q3 + 3 x IQR = 0.28 + 3 x 1.28 = 4.12) The calculations may produce values that do not exist in the data set, e.g. -4.84. Since there can be no outliers at that value or smaller, it does not have any impact on our solution.
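One way to see what these log-scale bounds mean is to convert them back to percentages by reversing the base-10 logarithm. This is a sketch only; the helper name `to_percent` is ours, and it assumes no constant was added before taking the logarithm, as in this problem.

```python
def to_percent(log_value):
    """Reverse the base-10 logarithm to get back to the original percentage scale."""
    return 10 ** log_value

log_bounds = {
    "outlier": (-2.92, 2.20),
    "far outlier": (-4.84, 4.12),
}

for label, (low, high) in log_bounds.items():
    print(label, to_percent(low), to_percent(high))
```

Both upper bounds convert to more than 100 percent, which a percentage cannot reach, so no case can fall above them on the log scale.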

78 Slide 78 We will create a new variable that will have a value of 1 if the score for the re-expressed variable is a far outlier, and 0 if it is not a far outlier. To compute the new variable, select the Compute command from the Transform menu. Outliers and Re-expressing Data - 58

79 Slide 79 We will name the new variable reexpressedfaroutlier, selecting a name which describes its contents. Type the formula as shown in the Numeric Expression text box. The formula will assign reexpressedfaroutlier a value of 1 if the score is less than -4.84 or greater than 4.12. If the value is not outside this range, reexpressedfaroutlier will be assigned a 0. Outliers and Re-expressing Data - 59

80 Slide 80 To find the number of far outliers, we create a frequency distribution for the variable reexpressedfaroutlier. To create the frequency distribution, select Descriptive Statistics > Frequencies from the Analyze menu. Outliers and Re-expressing Data - 60

81 Slide 81 First, move the variable reexpressedfaroutlier to the Variable(s) list box. Second, click on the OK button to produce the output. Outliers and Re-expressing Data - 61

82 Slide 82 All 166 valid cases have a value of 0 on the reexpressedfaroutlier variable. There are no far outliers for the re-expressed variable. Recall that a far outlier will have a value of 1 if the log value is less than -4.84 or greater than 4.12. The number of far outliers is equal to the number of cases where reexpressedfaroutlier is equal to 1. Outliers and Re-expressing Data - 62

83 1/23/2016Slide 83 Outliers and Re-expressing Data - 63 The final sentence indicates the outcome of the re-expression on the presence of outliers in the distribution. If there are no outliers and no far outliers, the re-expression was successful at eliminating outliers. If any outliers or far outliers remain, the re-expression was not successful at eliminating outliers. We enter 0 for the number of far outliers in the distribution of the re-expressed variable.

84 1/23/2016Slide 84 Outliers and Re-expressing Data - 64 Since there were no outliers or far outliers after re-expressing the data, the re-expression was successful in eliminating outliers. Having completed all of the entries, we submit the problem for grading.
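The whole check, counting outliers before and after re-expression, can be sketched end to end in Python. The scores are hypothetical, not the hivRate data, and the quartiles come from `statistics.quantiles` with `method="exclusive"`, which is our stand-in for the Percentiles table rather than a confirmed match.

```python
import math
import statistics

def count_outliers(values, multiplier=1.5):
    """Count scores below Q1 - multiplier * IQR or above Q3 + multiplier * IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    low, high = q1 - multiplier * iqr, q3 + multiplier * iqr
    return sum(1 for v in values if v < low or v > high)

# Hypothetical, positively skewed scores; all are greater than 0, so no constant is needed.
scores = [1, 2, 2, 3, 3, 4, 5, 8, 20, 100]
logged = [math.log10(s) for s in scores]

before = count_outliers(scores)
after = count_outliers(logged)
# The re-expression is successful only if no outliers and no far outliers remain.
successful = after == 0 and count_outliers(logged, 3.0) == 0
print(before, after, successful)
```

For these made-up scores, the single outlier on the original scale (100) falls inside the bounds once the values are logged.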

85 1/23/2016Slide 85 Outliers and Re-expressing Data - 65 The green shading on the answers indicates that all were correct.

