1
Descriptive Statistics
Prepared By Masood Amjad Khan GCU, Lahore
2
Index
1. Index
2. Index
3. Statistics (Definitions)
4. Descriptive Statistics
5. Inferential Statistics
6. Examples of 4 and 5
7. Data, Level of measurements
8. Variable
9. Discrete variable
10. Continuous variable
11. Frequency Distribution
12. Constructing Freq. Distn
13. Example of
14. Displaying the Data
15. Bar Chart, Pie Chart
16. Stem Leaf Plot
17. Graph
18. Histogram
19. Frequency Polygon
20. Cumulative Freq. Polygon
21. Summary Measures
22. Goals
23. Arithmetic Mean
24. Characteristic of Mean
25. Examples of
26. Weighted Mean
27. Example weighted Mean
28. Geometric Mean
29. Example: Geometric Mean
30. Median
31. Example of Median
32. Properties of Median
33. Mode
34. Examples of Mode
35. Positions of mean, median and mode
36. Dispersion
37. Range and Mean Deviation
39. Example of Mean Deviation
40. Variance
3
Index
41. Examples of variance
42. Moments
43. Examples of Moments
44. Skewness
45. Types of Skewness
46. Coefficient of Skewness
47. Example of skewness
48. Empirical Rule
49. Exercise
4
STATISTICS
Numerical Facts (Common Usage)
1. No. of children born in a hospital in some specified time.
2. No. of students enrolled in GCU in 2007.
3. No. of road accidents on motor way.
4. Amount spent on Research and Development in GCU during
5. No. of shut downs of Computer Network on a particular day.
Field or Discipline of Study
Definition: The Science of Collection, Presentation, Analyzing and Interpretation of Data to make Decisions and Forecasts.
Branches: Descriptive Statistics, Inferential Statistics
Probability provides the transition between Descriptive and Inferential Statistics.
Examples of Descriptive and Inferential Statistics
5
Descriptive Statistics
Consists of methods for organizing, displaying, and describing data by using tables, graphs, and summary measures.
Data: A data set is a collection of observations on one or more variables.
Types of Data
6
Frequency Distribution
Organizing the Data: Construction of Frequency Distribution Tables
Frequency Table: A grouping of qualitative data into mutually exclusive classes showing the number of observations in each class.
Example: Preference of four types of beverage by 100 customers.
Beverage | Number
Cola-Plus |
Coca-Cola |
Pepsi |
7-UP |
Frequency Distribution: A grouping of quantitative data into mutually exclusive classes showing the number of observations in each class.
Example: Selling price of 80 vehicles.
Vehicle Selling Price | Number of Vehicles
15000 to |
24000 to |
33000 to |
7
Displaying the Data
Diagrams/Charts: Bar Chart, Pie Chart
Graph: Histogram, Frequency Polygon
Stem and Leaf Plot
8
Variable
A characteristic under study that assumes different values for different elements (e.g. height of persons, no. of students in GCU).
Quantitative variable: A variable that can be measured numerically. A quantitative variable is either a discrete variable or a continuous variable.
Qualitative or categorical variable: A variable that cannot assume a numerical value but can be classified into two or more non-numeric categories. Examples: educational achievements, marital status, brand of PC.
9
Continuous variable
A variable whose observations can assume any value within a specific range.
Examples: Amount of income tax paid. Weight of a student. Yearly rainfall in Murree. Time elapsed between successive network breakdowns.
10
Discrete variable
A variable that can assume only certain values, with gaps between the values.
Examples: Children in a family. Strokes on a golf hole. TV sets owned. Cars arriving at GCU in an hour. Students in each section of a statistics course.
11
Inferential Statistics
Consists of methods that use sample results to help make decisions or predictions about a population.
12
Sample
1. A portion of a population selected for study.
2. A subset of data selected from a population.
Related topics: Selecting a Sample; Estimation (Point, Interval); Testing of Hypothesis
13
Population
1. Consists of all individual items or objects whose characteristics are being studied.
2. Collection of data that describe some phenomenon of interest.
Examples
Finite Population: Length of fish in a particular lake. No. of students of Statistics course in BCS. No. of traffic violations on some specific holiday.
Infinite Population: Depth of a lake from any conceived position. Length of life of a certain brand of light bulb. Stars in the sky.
14
Descriptive and Inferential Statistics
Examples Inferential Descriptive At least 5% of all fires reported last year in Lahore were deliberately set. Next to colonial homes, more residents in specified locality prefer a contemporary design. As a result of recent poll, most Pakistanis are in favor of independent and powerful parliament. As a result of recent cutbacks by the oil-producing nations, we can expect the price of gasoline to double in the next year. 1
15
Types of Data: Level of measurement
Data can be classified according to level of measurement. The level of measurement dictates the calculations that can be done to summarize and present the data. It also determines the statistical tests that should be performed.
Level of measurement | Property | Examples
Nominal | Data may only be classified. | Jersey numbers of football players. Make of car.
Ordinal | Data are ranked; no meaningful difference between values. | Your rank in class. Team standings.
Interval | Meaningful difference between values. | Temperature. Dress size.
Ratio | Meaningful 0 point and ratio between values. | No. of patients seen. No. of sales calls made. Distance students travel to class.
16
Diagrams/Charts
Bar Chart: A graph in which the classes are reported on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are proportional to the heights of the bars.
Pie Chart: A chart that shows the proportion or percent that each class represents of the total number of frequencies.
Angle = (f/n) × 360, with n = 1300
Class | f | Angle
White | 130 | 36
Black | 104 | 29
Lime | 325 | 90
Orange | 455 | 126
Red | 286 | 79
Total | 1300 | 360
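The angle rule Angle = (f/n) × 360 can be checked with a few lines of Python. This is only a sketch: the dictionary name `frequencies` and the helper `pie_angles` are illustrative, and the counts are the five colour frequencies listed on the slide.

```python
# Class frequencies taken from the slide's colour table (n = 1300).
frequencies = {"White": 130, "Black": 104, "Lime": 325, "Orange": 455, "Red": 286}


def pie_angles(freqs):
    """Return the central angle in degrees, (f / n) * 360, for each class."""
    n = sum(freqs.values())
    return {label: f / n * 360 for label, f in freqs.items()}


angles = pie_angles(frequencies)
for label, angle in angles.items():
    print(f"{label:<7} f = {frequencies[label]:>4}  angle = {angle:6.1f}")
```

The angles always add up to 360 because the relative frequencies f/n add up to 1.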
17
Graphs
Histogram, Frequency Polygon, Cumulative Frequency Polygon
18
Describing the Data: Summary Measures
Goals
Measures of Location: Arithmetic Mean, Weighted Arithmetic Mean, Geometric Mean, Median, Mode
Measures of Dispersion: Range, Mean Deviation, Variance, Standard Deviation
Moments: Moments about Origin, Moments about Mean
Skewness
19
Summary Measures: Goals
Calculate the arithmetic mean, weighted mean, median, mode, and geometric mean.
Explain the characteristics, uses, advantages, and disadvantages of each measure of location.
Identify the position of the mean, median, and mode for both symmetric and skewed distributions.
Compute and interpret the range, mean deviation, variance, and standard deviation.
Understand the characteristics, uses, advantages, and disadvantages of each measure of dispersion.
Understand Chebyshev’s theorem and the Empirical Rule as they relate to a set of observations.
20
Characteristics of the Mean
The arithmetic mean is the most widely used measure of location. It requires at least interval-scale data. Its major characteristics are:
It is calculated by summing the values and dividing by the number of values.
Every set of interval-level and ratio-level data has a mean.
All the values are included in computing the mean.
A set of data has a unique mean.
The mean is affected by unusually large or small data values.
The sum of the deviations of each value from the mean is zero; the arithmetic mean is the only measure of central tendency with this property.
21
Use of Tables of Random Numbers
Selecting a Sample
Random numbers are randomly produced digits from 0 to 9. A table of random numbers contains rows and columns of these randomly produced digits.
In using the table:
Choose the starting point at random.
Read off the digits in groups containing either one, two, three, or more of the digits, in any predetermined direction (rows or columns).
Example: Choose a sample of size 7 from a group of 80 objects.
Label the objects 01, 02, 03, …, 80 in any order.
Arbitrarily enter the table on any line and read out the pairs of digits in any two consecutive columns.
Ignore numbers which recur and those greater than 80.
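The table-reading procedure can be imitated in Python. The sketch below is an illustration under assumptions: a seeded pseudo-random generator stands in for the printed table of random digits, the seed 2007 is arbitrary, and the function name `sample_from_digit_pairs` is made up for this example. It also skips the pair 00, which labels no object.

```python
import random


def sample_from_digit_pairs(population_size, sample_size, rng):
    """Read random two-digit numbers, skipping 00, repeats and values above population_size."""
    chosen = []
    while len(chosen) < sample_size:
        # Two consecutive random digits form one two-digit label (00 to 99).
        pair = rng.randrange(10) * 10 + rng.randrange(10)
        if 1 <= pair <= population_size and pair not in chosen:
            chosen.append(pair)
    return chosen


rng = random.Random(2007)  # arbitrary seed, standing in for a random entry point in the table
sample = sample_from_digit_pairs(80, 7, rng)
print(sample)
```

In practice `random.sample(range(1, 81), 7)` gives an equivalent simple random sample in one call; the loop above only mirrors the manual steps.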
22
Construction of Frequency Distribution
Step 1: Decide on the number of groups (classes). Use just enough classes to reveal the shape of the distribution. Let k be the desired no. of classes. k should be the smallest number such that \( 2^k > n \). If n = 80 and we choose k = 6, then \( 2^6 = 64 \), which is < 80, so k = 6 is not desirable. If we take k = 7, then \( 2^7 = 128 \), which is > 80, so the no. of classes should be 7.
Step 2: Determine the class interval (width). The class interval should be the same for all classes. The formula to determine class width is \( i \ge \frac{H - L}{k} \), where i is the class width, H is the highest observed value, L is the lowest observed value, and k is the number of classes.
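Steps 1 and 2 can be sketched in Python as follows. The function names are illustrative; the values 35925 and 15546 are the highest and lowest selling prices in the raw vehicle data used in the deck's example.

```python
def number_of_classes(n):
    """Smallest k such that 2**k > n."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


def class_width(highest, lowest, k):
    """Minimum class width (H - L) / k, before rounding up to a convenient value."""
    return (highest - lowest) / k


k = number_of_classes(80)
i = class_width(35925, 15546, k)
print(k, round(i))
```

The computed width is a lower bound; it is normally rounded up to a convenient multiple of 10 or 100 before the class limits are set.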
23
Construction of Frequency Distribution (continued)
Step 3: Set the individual class limits. Class limits should be very clear and should not overlap. Sometimes the class width is rounded, which may make the range covered by the classes larger than H − L. Make the lower limit of the first class a multiple of the class width.
Step 4: Make a tally of the observations falling in each class.
Step 5: Count the number of items in each class (the class frequency).
24
Construction of Frequency Distribution ( Example )
Raw Data ( Ungrouped Data ) 20445 19251 26613 22817 25449 24571 22374 21740 23613 17266 27443 35925 27896 26285 22845 20962 18890 29237 15546 32277 19331 21722 21442 20356 20633 20642 35851 23657 18263 15794 25799 24052 17968 32492 27453 24533 26661 25783 23765 20203 21981 19766 20818 28670 19688 20155 17357 20004 20895 17399 28337 23169 28034 25277 25251 19873 19889 29076 26651 24609 24324 24285 20047 15935 24296 21639 21558 19587 30872 28683 18021 17891 22442 30655 24220 23591 20454 23372 23197 Continued Back 1
25
Construction of Frequency Distribution ( Example Continued )
Following Step 1, with n = 80, k should be 7.
Following Step 2, the class width should be at least (35925 − 15546)/7 ≈ 2911. The width is usually rounded up to a multiple of 10 or 100, so it is taken as i = 3000.
Following Step 3, with i = 3000 and k = 7, the range covered is 7 × 3000 = 21000, whereas the actual range is H − L = 35925 − 15546 = 20379. The lower limit of the first class should be a multiple of the class width, so the lower limit of the starting class is taken as 15000.
Following Step 4 and Step 5:
Selling Price | Frequency
15000 up to 18000 | 8
18000 up to 21000 | 23
21000 up to 24000 | 17
24000 up to 27000 | 18
27000 up to 30000 | 8
30000 up to 33000 | 4
33000 up to 36000 | 2
Total | 80
26
Histogram
A graph in which the classes are marked on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are represented by the heights of the bars and the bars are drawn adjacent to each other.
Example 1, k = 6
Group | H | f | cf
1.6 – 2.2 | 2.1 | 2 | 2
2.2 – 2.8 | 2.7 | 4 | 6
2.8 – 3.4 | 3.3 | 13 | 19
3.4 – 4.0 | 3.9 | 13 | 32
4.0 – 4.6 | 4.5 | 6 | 38
4.6 – 5.2 | 5.1 | 2 | 40
[Figure: Histogram (Example 1); horizontal axis: Groups, 1.60 to 5.20 in steps of 0.60]
27
Histogram
Example 1, k = 7
H | f | cf
2.0 | 2 | 2
2.5 | 2 | 4
3.0 | 5 | 9
3.5 | 15 | 24
4.0 | 8 | 32
4.5 | 6 | 38
5.0 | 2 | 40
[Figure: Histogram (Example 1); horizontal axis: Groups, 1.5 to 5.0; vertical axis: Percent]
28
Frequency Polygon (Example 1)
A graph in which the points formed by the intersections of the class midpoints and the class frequencies are connected by line segments.
Mid point = (Li + Hi)/2
Example 1, k = 6
Group | Mid pt | f | cf
1.6 – 2.2 | 1.9 | 2 | 2
2.2 – 2.8 | 2.5 | 4 | 6
2.8 – 3.4 | 3.1 | 13 | 19
3.4 – 4.0 | 3.7 | 13 | 32
4.0 – 4.6 | 4.3 | 6 | 38
4.6 – 5.2 | 4.9 | 2 | 40
[Figure: Frequency Polygon (Example 1); horizontal axis: midpoints 1.90 to 4.90; vertical axis: Percent, 0.0 to 35.0]
29
Frequency Polygon Continued
Example 1, k = 7
Group | Mid pt | f | cf
1.5 – 2.0 | 1.75 | 2 | 2
2.0 – 2.5 | 2.25 | 1 | 3
2.5 – 3.0 | 2.75 | 4 | 7
3.0 – 3.5 | 3.25 | 15 | 22
3.5 – 4.0 | 3.75 | 10 | 32
4.0 – 4.5 | 4.25 | 5 | 37
4.5 – 5.0 | 4.75 | 3 | 40
[Figure: Frequency Polygon (Example 1); horizontal axis: midpoints 1.75 to 4.75; vertical axis: Percent, 0.0 to 40.0]
30
Cumulative Frequency Polygon
A graph in which the points formed by the intersections of the class midpoints and the class cumulative frequencies are connected by line segments. A cumulative frequency polygon portrays the number or percent of observations below a given value.
Example 1, k = 6
Group | Mid pt | f | cf
1.6 – 2.2 | 1.9 | 2 | 2
2.2 – 2.8 | 2.5 | 4 | 6
2.8 – 3.4 | 3.1 | 13 | 19
3.4 – 4.0 | 3.7 | 13 | 32
4.0 – 4.6 | 4.3 | 6 | 38
4.6 – 5.2 | 4.9 | 2 | 40
31
Cumulative Frequency Polygon Continued
Example 1, k = 7
Group | Mid pt | cf | f
1.5 – 2.0 | 1.75 | 2 | 2
2.0 – 2.5 | 2.25 | 3 | 1
2.5 – 3.0 | 2.75 | 7 | 4
3.0 – 3.5 | 3.25 | 22 | 15
3.5 – 4.0 | 3.75 | 32 | 10
4.0 – 4.5 | 4.25 | 37 | 5
4.5 – 5.0 | 4.75 | 40 | 3
32
What is a Stem and Leaf Plot Diagram?
A Stem and Leaf Plot is a type of graph that is similar to a histogram but shows more information.
It summarizes the shape of a set of data.
It provides extra detail regarding individual values.
The data are arranged by place value.
The digits in the largest place are referred to as the stem.
The digits in the smallest place are referred to as the leaf.
The leaves are always displayed to the right of the stem.
What Are They Used For?
Stem and Leaf Plots are great organizers for large amounts of information.
Series of scores on sports teams, series of temperatures or rainfall over a period of time, and series of classroom test scores are examples of when Stem and Leaf Plots could be used.
33
Constructing Stem and Leaf Plot
Make a Stem and Leaf Plot with the following temperatures for June.
Begin with the lowest temperature. The lowest temperature of the month was 50. Enter the 5 in the tens column and a 0 in the ones.
The next lowest is 57. Enter a 7 in the ones.
Next is 59; enter a 9 in the ones.
Find all of the temperatures that were in the 60's, 70's and 80's. Enter the rest of the temperatures sequentially until your Stem and Leaf Plot contains all of the data.
Temperature
Stem (Tens) | Leaf (Ones)
5 | 0 7 9
6 |
7 |
8 |
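The construction steps can be sketched in Python. The temperature list below is hypothetical: only 50, 57 and 59 come from the slide, and the remaining values are invented to fill the 60's, 70's and 80's. The function name `stem_and_leaf` is likewise illustrative.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group two-digit values by tens digit (stem) and list ones digits (leaves) in ascending order."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)


# Hypothetical June temperatures; only 50, 57 and 59 appear on the slide.
temperatures = [77, 80, 82, 68, 65, 59, 61, 57, 50, 62, 61, 70, 69, 64, 67, 70, 62, 65]

plot = stem_and_leaf(temperatures)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```

Sorting first means each row of leaves is already in ascending order, which matches entering the temperatures sequentially from the lowest.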
34
Stem and Leaf Example Make a Stem and Leaf Plot for the
following data. Freq Stem Leaf 6 14 1 17 2 8 3 4 3 6 5 3 9 50 1.7 1.2 2.1 2.5 1.9 2.0 0.2 6.3 5.3 5.9 1.1 3.9 1.4 3.5 2.8 0.4 2.7 1.8 2.6 1.3 2.4 4.3 1.5 2.3 3.4 0.9 4.6 0.3 3.1 3.2 3.7 2.9 1.6 0.7 1 Next Back
35
Stem and Leaf Plot Example
Following are the car battery life Data. Make a Stem and Leaf Plot. f S L 2 1 6 9 5 25 3 8 4 40 2.2 4.1 3.5 4.5 3.2 3.7 3 2.6 3.1 1.6 3.3 3.8 4.7 2.5 4.3 3.4 3.6 2.9 3.9 4.4 1.9 4.2 1 Next Back
36
Stem and Leaf Plot Example
Frequency Stem Leaf 2 1 6 9 4 15 3 10 5 5 7 7 40 Go to Stem and Leaf Plot 1 Back
37
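Measures of Location: Arithmetic Mean
Ungrouped Data
Population: For N observations X1, X2, …, XN in the population, the mean is defined as \( \mu = \frac{\sum_{i=1}^{N} X_i}{N} \).
Sample: For n observations X1, X2, …, Xn in the sample, the mean is defined as \( \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \).
Grouped Data
Population: Let Xi and fi be the mid point and frequency respectively of the ith group in the population. The mean is defined as \( \mu = \frac{\sum f_i X_i}{\sum f_i} \).
Sample: Let Xi and fi be the mid point and frequency respectively of the ith group in the sample. The mean is defined as \( \bar{X} = \frac{\sum f_i X_i}{\sum f_i} \).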
38
Numerical Examples Of Arithmetic Mean Ungrouped Data
Example of Sample Mean
Following is a random sample of 12 clients showing the number of minutes each client used on a particular cell phone last month. What is the mean number of minutes used?
90 110 89 113 91 94 100 112 77 92 119 83
Example of Population Mean
There are 12 automobile manufacturing companies in the U.S.A. Listed below is the no. of patents granted by the US Government to each company. Is this information a sample or population?
Company | Number of Patents Granted
General Motors | 511
Nissan | 385
DaimlerChrysler | 275
Toyota | 257
Honda | 249
Ford | 234
Mazda | 210
Chrysler | 97
Porsche | 50
Mitsubishi | 36
Volvo | 23
BMW | 13
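Both examples apply the same rule, the sum of the values divided by their count. A short Python sketch using the figures listed on the slide (the variable names are illustrative):

```python
from statistics import mean

# Minutes used by the 12 sampled clients.
minutes = [90, 110, 89, 113, 91, 94, 100, 112, 77, 92, 119, 83]

# Patents granted to each of the 12 companies listed on the slide.
patents = {
    "General Motors": 511, "Nissan": 385, "DaimlerChrysler": 275,
    "Toyota": 257, "Honda": 249, "Ford": 234, "Mazda": 210,
    "Chrysler": 97, "Porsche": 50, "Mitsubishi": 36, "Volvo": 23, "BMW": 13,
}

sample_mean = mean(minutes)                # sum of minutes / 12
population_mean = mean(patents.values())   # sum of patents / 12

print(sample_mean, population_mean)
```

The arithmetic is identical in both cases; whether the result is read as a sample mean or a population mean depends only on whether the data cover every member of the group of interest.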
39
Numerical Examples Of Arithmetic Mean Grouped Data
Following is the frequency distribution of selling prices of vehicles at Whitner Autoplex last month. Find the arithmetic mean.
Selling Price ($ thousands) | Frequency f | Midpoint X | fX
15 up to 18 | 8 | 16.5 | 132.0
18 up to 21 | 23 | 19.5 | 448.5
21 up to 24 | 17 | 22.5 | 382.5
24 up to 27 | 18 | 25.5 | 459.0
27 up to 30 | 8 | 28.5 | 228.0
30 up to 33 | 4 | 31.5 | 126.0
33 up to 36 | 2 | 34.5 | 69.0
Total | 80 | | 1845.0
The mean is 1845.0/80 ≈ 23.1, so the mean vehicle selling price is $23100.
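The grouped-data mean can be reproduced in Python from the midpoints and frequencies of the vehicle table. This is a minimal sketch; `grouped_mean` is an illustrative name, and the first-class frequency of 8 is the value implied by fX = 132.0 and the total of 80.

```python
# Class midpoints ($ thousands) and frequencies from the vehicle selling price table.
midpoints = [16.5, 19.5, 22.5, 25.5, 28.5, 31.5, 34.5]
frequencies = [8, 23, 17, 18, 8, 4, 2]


def grouped_mean(midpoints, frequencies):
    """Mean of grouped data: sum of f * X divided by sum of f."""
    total_fx = sum(f * x for x, f in zip(midpoints, frequencies))
    return total_fx / sum(frequencies)


mean_price = grouped_mean(midpoints, frequencies)
print(round(mean_price, 1))  # in $ thousands
```

Because every observation in a class is represented by the class midpoint, the result is an estimate of the mean of the raw data rather than its exact value.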
40
Point of Equilibrium
The mean is the balance point of the data: an object is balanced at \( \bar{X} \) when \( \sum (X_i - \bar{X}) = 0 \).
41
Weighted Mean
A special case of the arithmetic mean. It applies when the values of the variable are associated with a certain quality, e.g. the price of medium, large, and big soft drinks.
The weighted mean of a set of numbers X1, X2, ..., Xn, with corresponding weights w1, w2, ..., wn, is computed from the following formula: \( \bar{X}_w = \frac{w_1 X_1 + w_2 X_2 + \cdots + w_n X_n}{w_1 + w_2 + \cdots + w_n} \)
Soft Drink | Price | Weights
Medium | $0.90 |
Large | $1.25 | 4
Big | $1.50 | 3
42
EXAMPLE Weighted Mean
The Carter Construction Company pays its hourly employees $16.50, $19.00, or $25.00 per hour. There are 26 hourly employees, 14 of whom are paid at the $16.50 rate, 10 at the $19.00 rate, and 2 at the $25.00 rate. What is the mean hourly rate paid to the 26 employees?
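The Carter Construction question can be worked as a weighted mean, with the hourly rates as values and the employee counts as weights. A small Python sketch (the function name `weighted_mean` is illustrative):

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of w * X divided by sum of w."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


hourly_rates = [16.50, 19.00, 25.00]
employees = [14, 10, 2]  # number of employees paid at each rate

mean_rate = weighted_mean(hourly_rates, employees)
print(round(mean_rate, 2))
```

The unweighted average of the three rates would overstate the typical wage, because only 2 of the 26 employees earn the top rate.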
43
Geometric Mean
The geometric mean of a set of n positive numbers is defined as the nth root of the product of the n values. The formula for the geometric mean is written: \( GM = \sqrt[n]{X_1 X_2 \cdots X_n} \)
The geometric mean used as the average percent increase over n periods is calculated as: \( GM = \sqrt[n]{\frac{\text{value at end of period}}{\text{value at start of period}}} - 1 \)
It is useful in finding the average change of percentages, ratios, indexes, or growth rates over time. It has a wide application in business and economics because we are often interested in finding the percentage changes in sales, salaries, or economic figures, such as the GDP, which compound or build on each other.
The geometric mean will always be less than or equal to the arithmetic mean.
44
Example of Geometric Mean
The return on investment by a certain company for four successive years was 30%, 20%, -40%, and 200%. Find the geometric mean rate of return on investment.
Solution: \( GM = \sqrt[4]{(1.3)(1.2)(0.6)(3.0)} = \sqrt[4]{2.808} \approx 1.294 \). The 1.3 represents the 30 percent return on investment, i.e. the original investment of 1.0 plus the return of 0.3. This shows that the average return is 29.4 percent.
If you earned $30000 in 1997 and $50000 in 2007, what is your annual rate of increase over the period?
Solution: \( GM = \sqrt[10]{\frac{50000}{30000}} - 1 \approx 0.0524 \). The annual rate of increase is 5.24 percent.
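Both geometric-mean calculations can be verified with a short Python sketch. The function names are illustrative, and the number of periods in the second example is taken as 2007 − 1997 = 10.

```python
import math


def geometric_mean_rate(percent_returns):
    """Average rate of return: nth root of the product of (1 + r) factors, minus 1."""
    factors = [1 + r / 100 for r in percent_returns]
    return math.prod(factors) ** (1 / len(factors)) - 1


def annual_rate_of_increase(start_value, end_value, periods):
    """Average growth rate per period between a start value and an end value."""
    return (end_value / start_value) ** (1 / periods) - 1


investment_rate = geometric_mean_rate([30, 20, -40, 200])
salary_rate = annual_rate_of_increase(30000, 50000, 2007 - 1997)

print(f"{investment_rate:.1%}  {salary_rate:.2%}")
```

For comparison, the arithmetic mean of the four returns is (30 + 20 − 40 + 200)/4 = 52.5 percent, well above the geometric mean, which illustrates why the arithmetic mean overstates compounded growth.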
45
Median
Median is the midpoint of the values after they have been ordered from the smallest to the largest, or the largest to the smallest.
If the number of observations n is odd, the median is the (n+1)/2th observation. If n is even, the median is the average of the n/2th and (n/2+1)th observations.
Example: Determine the median for each set of data.
1) Arrange the set of data. n = 7, so the median is the 4th observation, that is 33.
2) n = 6, so the median is the average of the 3rd and 4th observations, that is (27+28)/2 = 27.5.
Median for Grouped Data
The median is obtained by using the formula: \( \text{Median} = L_m + \frac{n/2 - cf_{m-1}}{f_m} \times I_m \), where m is the group containing the n/2th observation; Lm, Im, and fm are the lowest value, class width, and frequency of the mth group, and cfm-1 is the cumulative frequency of the group before it.
46
Example (Median)
Find the median for the following data (Example 1).
L | H | f | cf
1.60 | < 2.20 | 2 | 2
2.20 | < 2.80 | 4 | 6
2.80 | < 3.40 | 13 | 19
3.40 | < 4.00 | 13 | 32
4.00 | < 4.60 | 6 | 38
4.60 | < 5.20 | 2 | 40
n/2 = 20, so the median group is 3.40 to 4.00, with Lm = 3.40, Im = 0.6, fm = 13, cfm-1 = 19.
Median = 3.40 + ((20 − 19)/13) × 0.6 ≈ 3.45
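The grouped-median calculation can be sketched in Python. The function name `grouped_median` is illustrative, and the class frequencies are the ones implied by the cumulative frequency column of Example 1 (2, 6, 19, 32, 38, 40).

```python
def grouped_median(lower_limits, frequencies, width):
    """Median of grouped data: L_m + (n/2 - cf_before) / f_m * width."""
    n = sum(frequencies)
    cumulative = 0  # cumulative frequency of the classes before the current one
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= n / 2:
            return lower + (n / 2 - cumulative) / f * width
        cumulative += f
    raise ValueError("frequency table is empty")


lower_limits = [1.60, 2.20, 2.80, 3.40, 4.00, 4.60]
frequencies = [2, 4, 13, 13, 6, 2]

median = grouped_median(lower_limits, frequencies, width=0.6)
print(round(median, 2))
```

The formula interpolates linearly inside the median class, which assumes the observations are spread evenly across that class.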
47
Properties of the Median
There is a unique median for each data set.
It is not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values occur.
It can be computed for ratio-level, interval-level, and ordinal-level data.
It can be computed for an open-ended frequency distribution if the median does not lie in an open-ended class.
48
Mode
The mode is the value of the observation that appears most frequently.
Region | No. of Seniors
New England | 524
Middle Atlantic | 818
E.N.Central | 815
W.N.Central | 367
S.Atlantic | 679
E.S.Central | 196
W.S.Central | 436
Mountain | 346
Pacific | 783
49
Mode (Example)
50
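Calculating Mode for Grouped Data
With Lm the lower limit, fm the frequency and Im the width of the modal group, and fm-1 and fm+1 the frequencies of the groups before and after it, the usual interpolation formula is \( \text{Mode} = L_m + \frac{f_m - f_{m-1}}{(f_m - f_{m-1}) + (f_m - f_{m+1})} \times I_m \).
Calculate the mode of the following distribution.
Group f: 2, 4, 14, 12, 6
Solution: For the modal group, fm = 14, fm-1 = 4, fm+1 = 12 and Im = 0.6.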
51
The Relative Positions of the Mean, Median and the Mode
52
Dispersion
Why Study Dispersion?
A measure of location, such as the mean or the median, only describes the center of the data. It is valuable from that standpoint, but it does not tell us anything about the spread of the data. For example, if your nature guide told you that the river ahead averaged 3 feet in depth, would you want to wade across on foot without additional information? Probably not. You would want to know something about the variation in the depth.
A second reason for studying the dispersion in a set of data is to compare the spread in two or more distributions.
Studying dispersion through display.
53
Range and Mean Deviation
Range = Largest value – Smallest value
Example: The number of cappuccinos sold at the Starbucks location in the Orange County Airport between 4 and 7 p.m. for a sample of 5 days last year were 20, 40, 50, 60, and 80. Determine the range and the mean deviation for the number of cappuccinos sold.
Range = Largest – Smallest value = 80 – 20 = 60
54
Mean Deviation Example
Example: The number of cappuccinos sold at the Starbucks location in the Orange County Airport between 4 and 7 p.m. for a sample of 5 days last year were 20, 40, 50, 60, and 80. Determine the mean deviation for the number of cappuccinos sold.
Solution: The mean is 250/5 = 50.
Number of Cappuccinos Sold Daily (X) | X − mean | Absolute Deviation
20 | 20 − 50 = -30 | 30
40 | 40 − 50 = -10 | 10
50 | 50 − 50 = 0 | 0
60 | 60 − 50 = 10 | 10
80 | 80 − 50 = 30 | 30
Total | | 80
\( MD = \frac{\sum |X - \bar{X}|}{n} = \frac{80}{5} = 16 \)
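The range and mean deviation for the cappuccino data can be checked with a few lines of Python. This is a minimal sketch; `mean_deviation` is an illustrative name.

```python
def mean_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


cappuccinos = [20, 40, 50, 60, 80]

value_range = max(cappuccinos) - min(cappuccinos)
md = mean_deviation(cappuccinos)
print(value_range, md)
```

The absolute value is essential: without it the deviations would cancel and always sum to zero.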
55
Mean Deviation (Grouped Data)
Mean Deviation for Grouped Data
Selling Price ($ thousands), with mean 23.1
Midpoint X | Frequency f | X − mean | f × absolute deviation
16.5 | 8 | -6.6 | 52.8
19.5 | 23 | -3.6 | 82.8
22.5 | 17 | -0.6 | 10.2
25.5 | 18 | 2.4 | 43.2
28.5 | 8 | 5.4 | 43.2
31.5 | 4 | 8.4 | 33.6
34.5 | 2 | 11.4 | 22.8
Total | 80 | | 288.6
Mean deviation = 288.6/80 ≈ 3.61 ($ thousands)
56
Variance and Standard Deviation
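Population variance and standard deviation. Let X1, X2, …, XN be N observations in the population with mean \( \mu \).
The variance is defined as: \( \sigma^2 = \frac{\sum (X_i - \mu)^2}{N} \)
The standard deviation is defined as: \( \sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} \)
Sample variance and standard deviation. Let X1, X2, …, Xn be n observations in the sample with mean \( \bar{X} \).
The variance is defined as: \( s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} \)
The standard deviation is defined as: \( s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} \)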
57
Example Variance and standard deviation
The number of traffic citations issued during the last five months in Beaufort County, South Carolina, is 38, 26, 13, 41, and 22. What is the population variance?
The hourly wages for a sample of part-time employees at Home Depot are: $12, $20, $16, $18, and $19. What is the sample variance?
Hourly Wage $ (X) | X − mean | (X − mean) squared
12 | -5 | 25
20 | 3 | 9
16 | -1 | 1
18 | 1 | 1
19 | 2 | 4
Total 85 | 0 | 40
The sample mean is 85/5 = 17 and the sample variance is 40/(5 − 1) = 10.
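The two questions differ only in the divisor: N for a population, n − 1 for a sample. A Python sketch under that convention (function names are illustrative):

```python
def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)


citations = [38, 26, 13, 41, 22]   # treated as a population
wages = [12, 20, 16, 18, 19]       # treated as a sample

print(population_variance(citations), sample_variance(wages))
```

The standard library functions `statistics.pvariance` and `statistics.variance` apply the same two divisors and can be used as a cross-check.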
58
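Example Grouped Data
The sample standard deviation is defined as: \( s = \sqrt{\frac{\sum f (X - \bar{X})^2}{n - 1}} \), where X is the class midpoint, f is the class frequency and \( n = \sum f \).
Example: For the following frequency distribution of prices of vehicles, compute the standard deviation of the prices.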
59
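Example (continued)
The alternate method of computing variance avoids the individual deviations by using the identity \( \sum f (X - \bar{X})^2 = \sum f X^2 - \frac{(\sum f X)^2}{n} \); the result is then divided by n − 1 for a sample (or by n for a population).
Example
Mid pt (X) | f | fX | fX2
1.75 | 2 | 3.5 | 6.125
2.25 | 2 | 4.5 | 10.13
2.75 | 5 | 13.75 | 37.81
3.25 | 15 | 48.75 | 158.4
3.75 | 8 | 30 | 112.5
4.25 | 6 | 25.5 | 108.4
4.75 | 2 | 9.5 | 45.13
Total | 40 | 135.5 | 478.5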
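The column totals and the shortcut can be checked in Python. This is a sketch under assumptions: the frequencies of 2 for the 2.25 and 4.75 groups are inferred from fX = 4.5 and fX = 9.5, and both divisors are shown because the slide text does not say which one the author used.

```python
midpoints = [1.75, 2.25, 2.75, 3.25, 3.75, 4.25, 4.75]
frequencies = [2, 2, 5, 15, 8, 6, 2]  # 2.25 and 4.75 frequencies inferred from the fX column

n = sum(frequencies)
sum_fx = sum(f * x for x, f in zip(midpoints, frequencies))
sum_fx2 = sum(f * x * x for x, f in zip(midpoints, frequencies))

# Direct method: squared deviations from the grouped mean.
mean = sum_fx / n
direct = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies))

# Shortcut: sum of f*X^2 minus (sum of f*X)^2 / n gives the same sum of squares.
shortcut = sum_fx2 - sum_fx ** 2 / n

print(sum_fx, sum_fx2, round(shortcut, 5))
print("divide by n - 1:", round(shortcut / (n - 1), 4))
print("divide by n:    ", round(shortcut / n, 4))
```

The shortcut is convenient for hand calculation, but subtracting two large, nearly equal totals can lose precision in floating point, so the direct method is the safer default in code.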
60
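Moments
Moments about Origin: The rth moment about origin ‘a’ is defined as: \( m'_r = \frac{\sum (X_i - a)^r}{n} \)
Moments about Mean: The rth moment about mean is \( m_r = \frac{\sum (X_i - \bar{X})^r}{n} \). The first moment about mean is zero.
Moments of Grouped Data
The rth moment about origin ‘a’ is defined as: \( m'_r = \frac{\sum f_i (X_i - a)^r}{\sum f_i} \)
The rth moment about mean is \( m_r = \frac{\sum f_i (X_i - \bar{X})^r}{\sum f_i} \). The first moment about mean is zero.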
61
Example of Moments
Moments about Mean (deviations taken from a mean of about 3.4).
Mid pt (X) | f | fX | f × (X − mean) squared | f × (X − mean) cubed
1.75 | 2 | 3.5 | 5.445 |
2.25 | 2 | 4.5 | 2.645 |
2.75 | 5 | 13.75 | 2.1125 |
3.25 | 15 | 48.75 | 0.3375 |
3.75 | 8 | 30 | 0.98 | 0.343
4.25 | 6 | 25.5 | 4.335 |
4.75 | 2 | 9.5 | 3.645 |
Total | 40 | 135.5 | 19.5 |
62
Example of Moments (Continued)
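Here d = X − mean, with mean = 118.4/50 = 2.368.
X | f | fX | d | fd | fd^2 | fd^3 | fd^4
0.4 | 5 | 2 | -1.97 | -9.84 | 19.37 | -38.11 | 75.00
1.2 | 9 | 10.8 | -1.17 | -10.51 | 12.28 | -14.34 | 16.75
2 | 15 | 30 | -0.37 | -5.52 | 2.03 | -0.75 | 0.28
2.8 | 10 | 28 | 0.43 | 4.32 | 1.87 | 0.81 | 0.35
3.6 | 6 | 21.6 | 1.23 | 7.39 | 9.11 | 11.22 | 13.82
4.4 | 2 | 8.8 | 2.03 | 4.06 | 8.26 | 16.78 | 34.10
5.2 | 1 | 5.2 | 2.83 | 2.83 | 8.02 | 22.71 | 64.32
6 | 2 | 12 | 3.63 | 7.26 | 26.38 | 95.82 | 348.03
Total | 50 | 118.4 | | | 87.31 | 94.14 | 552.65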
63
Skewness
Mean, median and mode are measures of central location for a set of observations, and measures of data dispersion are the range and the standard deviation. Another characteristic of a set of data is the shape. There are four shapes commonly observed: symmetric, positively skewed, negatively skewed, and bimodal.
The coefficient of skewness can range from -3 up to 3. A value near -3, such as -2.57, indicates considerable negative skewness. A value such as 1.63 indicates moderate positive skewness. A value of 0, which will occur when the mean and median are equal, indicates the distribution is symmetrical and that there is no skewness present.
64
Types of Skewness
65
Coefficient of Skewness
The Pearson coefficient of skewness is defined as: \( sk = \frac{3(\bar{X} - \text{Median})}{s} \)
Example: Following are the earnings per share for a sample of 15 software companies for the year. The earnings per share are arranged from smallest to largest. Compute the mean, median, and standard deviation. Find the coefficient of skewness using Pearson’s estimate. What is your conclusion regarding the shape of the distribution?
Solution: The shape is moderately positively skewed.
66
Example of Skewness (Continued)
Class 5 2 20.65 8 13 9.6 12.14 14 27 28 2.61 11 38 30.8 1.49 7 45 25.2 9.55 47 8.8 7.75 1 48 5.2 7.66 50 12 25.46 Total 118.4 87.31 The skewness can also be measured with moments as: m2 = 1.75, m3 = 62 b = 0.492 The shape is slightly positively skewed 2 Go to Skewness Back Next
67
Example Skewness
[Figure: relative positions of the Mode, Median and Mean]
68
Empirical Rule
For a symmetrical, bell-shaped frequency distribution:
Approximately 68% of the observations will lie within plus and minus one standard deviation of the mean (mean ± s.d.).
About 95% of the observations will lie within plus and minus two standard deviations of the mean (mean ± 2 s.d.).
Practically all (99.7%) will lie within plus and minus three standard deviations of the mean (mean ± 3 s.d.).
Let the mean of a symmetric distribution be 100 and the standard deviation be 10; then the empirical rule gives 68% of observations between 90 and 110, 95% between 80 and 120, and 99.7% between 70 and 130.
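The three intervals for the example with mean 100 and standard deviation 10 can be generated in Python. A minimal sketch; the function name `empirical_intervals` is illustrative.

```python
def empirical_intervals(mean, sd):
    """Return {k: (lower, upper, approximate percent)} for k = 1, 2, 3 standard deviations."""
    coverage = {1: 68.0, 2: 95.0, 3: 99.7}
    return {k: (mean - k * sd, mean + k * sd, pct) for k, pct in coverage.items()}


intervals = empirical_intervals(100, 10)
for k, (lower, upper, pct) in intervals.items():
    print(f"mean ± {k} s.d.: {lower} to {upper}  (about {pct}%)")
```

The percentages are approximations that hold for symmetrical, bell-shaped distributions; for other shapes the observed shares can differ noticeably.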
69
Example Empirical Rule
Mean = 3.25 sd = 0.77 Mean ± sd = ( 2.48 – 4.05) ( 67.5%) Mean ± 2sd = ( 1.71 – 4.79 ) ( 97.5%) Mean ± 3sd = ( 0.94 – 5.56 ) ( 100%) Consider the following distribution: Check the empirical rule. Mean = s.d = 0.75 Mean ± sd = ( 2.45 – 3.95 ) ( 67.5%) Mean ± 2sd = ( 1.7 – 4.7 ) ( 97.5%) Mean ± 3sd = ( 0.89 – 5.45 ) (100%) Group f X fX fX^2 2 1.75 3.5 6.13 5 2.25 11.3 25.3 8 2.75 22 60.5 10 3.25 32.5 106 3.75 30 113 4.25 21.3 90.3 4.75 9.5 45.1 40 130 446 1.6 2.5 3 3.4 3.8 1.8 2.6 3.2 3.5 4.1 2 3.6 2.3 4.2 2.8 3.3 4.3 3.7 2.4 2.9 4.5 4.6 Next Back 2
70
Exercise
1. The following is the distribution of wages per thousand employees in a certain factory. Calculate the modal and median wages. Why do the two differ?
Daily Wages: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44
No. of Employees: 3, 13, 43, 102, 175, 220, 204, 139, 69, 25, 6, 1
2. For the following data of examination marks find the Mean, Median, Mode, Mean Deviation and Variance. Also find the Skewness.
Marks: 30 – 39, 40 – 49, 50 – 59, 60 – 69, 70 – 79, 80 – 89
No. of students: 8, 87, 190, 304, 211, 85, 20