# Ka-fu Wong © 2003 Chap 2-1 Dr. Ka-fu Wong ECON1003 Analysis of Economic Data.

Ka-fu Wong © 2003 Chap 2-1 Dr. Ka-fu Wong ECON1003 Analysis of Economic Data

Ka-fu Wong © 2003 Chap 2-2 Chapter Two Describing Data: Frequency Distributions and Graphic Presentation

GOALS

1. Organize data into a frequency distribution.
2. Portray a frequency distribution in a histogram, frequency polygon, and cumulative frequency polygon.
3. Develop a stem-and-leaf display.
4. Present data using such graphic techniques as line charts, bar charts, and pie charts.

Ka-fu Wong © 2003 Chap 2-3 Frequency Distribution A frequency distribution is a grouping of data into mutually exclusive classes showing the number of observations in each class.

Ka-fu Wong © 2003 Chap 2-4 Construction of a Frequency Distribution

1. Question to be addressed.
2. Collect data (raw data).
3. Organize data (frequency distribution).
4. Present data.
5. Draw conclusions.

Ka-fu Wong © 2003 Chap 2-5 Terms Related to Frequency Distribution In constructing a frequency distribution, data are divided into exhaustive and mutually exclusive classes.

- Mid-point: A point that divides a class into two equal parts. This is the average of the upper and lower class limits.
- Class frequency: The number of observations in each class.
- Class interval: The class interval is obtained by subtracting the lower limit of a class from the lower limit of the next class.
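The mid-point and class-interval definitions above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the limits 10, 15, and 20 are illustrative values rather than part of the definitions.

```python
def midpoint(lower_limit, upper_limit):
    """Average of the lower and upper class limits."""
    return (lower_limit + upper_limit) / 2


def class_interval(lower_limit, next_lower_limit):
    """Lower limit of the next class minus the lower limit of this class."""
    return next_lower_limit - lower_limit


# Illustrative classes "10 up to 15" and "15 up to 20".
print(midpoint(10, 15))        # 12.5
print(class_interval(10, 15))  # 5
```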

Ka-fu Wong © 2003 Chap 2-6 Four steps to construct a frequency distribution

1. Decide on the number of classes.
2. Determine the class interval or width.
3. Set the individual class limits.
4. Tally the data into the classes and count the number of items in each.

For illustration, it is convenient to carry an example with us – Example 1.

Ka-fu Wong © 2003 Chap 2-7 EXAMPLE 1 Dr. Tillman is Dean of the School of Business at Socastee University. He wishes to prepare a report showing the number of hours per week students spend studying. He selects a random sample of 30 students and determines the number of hours each student studied last week. 15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7, 17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9, 10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6. Organize the data into a frequency distribution.

Ka-fu Wong © 2003 Chap 2-8 Step 1: Decide on the number of classes. The goal is to use just enough groupings or classes to reveal the shape of the distribution. “Just enough” recipe – the “2 to the k rule”: Select the smallest number (k) for the number of classes such that $2^k$ is greater than the number of observations (n).

Ka-fu Wong © 2003 Chap 2-9 2 to the k rule Select the smallest number (k) for the number of classes such that $2^k$ is greater than the number of observations (n).

Sample size (n) = 80: $2^1 = 2$; $2^2 = 4$; $2^3 = 8$; $2^4 = 16$; $2^5 = 32$; $2^6 = 64$; $2^7 = 128$; … The rule suggests 7 classes.

Sample size (n) = 1000: $2^1 = 2$; $2^2 = 4$; $2^3 = 8$; $2^4 = 16$; $2^5 = 32$; $2^6 = 64$; $2^7 = 128$; $2^8 = 256$; $2^9 = 512$; $2^{10} = 1024$; … The rule suggests 10 classes.

Ka-fu Wong © 2003 Chap 2-10 2 to the k rule Select the smallest number (k) for the number of classes such that $2^k$ is greater than the number of observations (n).

Sample size (n) = 10000: $2^1 = 2$; $2^2 = 4$; $2^3 = 8$; $2^4 = 16$; $2^5 = 32$; $2^6 = 64$; $2^7 = 128$; …; $2^{13} = 8192$; $2^{14} = 16384$. The rule suggests 14 classes.

Sample size (n) = 100000: $2^2 = 4$; $2^3 = 8$; $2^4 = 16$; $2^5 = 32$; $2^6 = 64$; $2^7 = 128$; …

Ka-fu Wong © 2003 Chap 2-11 2 to the k rule Select the smallest number (k) for the number of classes such that $2^k$ is greater than the number of observations (n). We want to find the smallest k such that $2^k > n$. Taking logs of both sides, this is the smallest k such that $k \log 2 > \log n$, that is, the smallest k such that $k > (\log n) / (\log 2)$. Example: If n = 10000, $(\log n) / (\log 2) \approx 13.29$. Hence the recipe suggests 14 classes. Note: The result is the same for base-10 logs and natural logs.
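As a rough sketch, the rule can be computed either by doubling until $2^k$ exceeds n or from the logarithm. The function names below are hypothetical, and `math.log2` is used in place of (log n) / (log 2) simply because it is exact when n is a power of two.

```python
import math


def classes_by_doubling(n):
    """Smallest k such that 2**k > n, found by trying k = 1, 2, 3, ..."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


def classes_by_log(n):
    """Smallest integer k with k > log2(n)."""
    return math.floor(math.log2(n)) + 1


for n in (30, 80, 1000, 10000):
    print(n, classes_by_doubling(n), classes_by_log(n))
# 30 5 5
# 80 7 7
# 1000 10 10
# 10000 14 14
```

Both versions use a strict inequality, so a sample of exactly 32 observations would get 6 classes, not 5.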

Ka-fu Wong © 2003 Chap 2-12 Step 1: Decide on the number of classes. 2 to the k rule Select the smallest number (k) for the number of classes such that $2^k$ is greater than the number of observations (n). Example 1 (continued): Sample size (n) = 30. $2^1 = 2$; $2^2 = 4$; $2^3 = 8$; $2^4 = 16$; $2^5 = 32$; $2^6 = 64$; $2^7 = 128$; … The rule suggests 5 classes. Alternatively, by computing $(\log 30) / (\log 2) \approx 4.91$, we get the same suggestion of 5 classes.

Ka-fu Wong © 2003 Chap 2-13 Step 2: Determine the class interval or width. Generally the class interval or width should be the same for all classes. The classes all taken together must cover at least the distance from the lowest value in the raw data to the highest value. The classes must be mutually exclusive and exhaustive. Class interval ≥ (Highest value – lowest value) / number of classes. Usually we will choose some convenient number as the class interval that satisfies the inequality.

Ka-fu Wong © 2003 Chap 2-14 Step 2: Determine the class interval or width. Class interval ≥ (Highest value – lowest value) / number of classes. Example 1 (continued): Highest value = 33.8 hours. Lowest value = 10.3 hours. k = 5. Hence, class interval ≥ (33.8 – 10.3) / 5 = 4.7. We choose the class interval to be 5, a convenient number.
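The arithmetic for Example 1 can be checked with a short sketch. Rounding up to the next whole number is an assumption made here as one way of picking a "convenient" width; the slides do not prescribe a specific rounding rule.

```python
import math

highest = 33.8  # highest value in Example 1 (hours)
lowest = 10.3   # lowest value in Example 1 (hours)
k = 5           # number of classes from the 2 to the k rule

# Smallest width that still covers the whole range with k classes.
min_width = (highest - lowest) / k
print(round(min_width, 2))  # 4.7

# Assumption: take the next whole number as the convenient width.
width = math.ceil(min_width)
print(width)  # 5
```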

Ka-fu Wong © 2003 Chap 2-15 Step 3: Set the individual class limits The class limits must be set so that the classes are mutually exclusive and exhaustive. Round to some convenient numbers.

Ka-fu Wong © 2003 Chap 2-16 Step 3: Set the individual class limits Example 1 (continued): Highest value = 33.8 hours. Lowest value = 10.3 hours. Range = highest – lowest = 23.5. k = 5; interval = 5. With k = 5 and interval = 5, the classes will cover a range of 25. Let's split the surplus equally between the lower and upper tails: (25 – 23.5) / 2 = 0.75. Hence, the lower limit of the first class should be around (10.3 – 0.75) = 9.55 and the upper limit of the last class should be around (33.8 + 0.75) = 34.55. The values 9.55 and 34.55 look odd. Some convenient and close numbers would be 10 and 35.

Ka-fu Wong © 2003 Chap 2-17 Step 3: Set the individual class limits Example 1 (continued): We have determined k = 5 and interval = 5. The lower limit of the first class = 10 and the upper limit of the last class = 35. “10 up to 15” means the interval from 10 to 15 that includes 10 but not 15.
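The class limits for Example 1 can be reproduced with the sketch below. Rounding the raw lower limit to the nearest multiple of the class width is an assumption standing in for "some convenient number"; other rounding choices are equally valid as long as the classes still cover all the data.

```python
lowest, highest = 10.3, 33.8  # extremes of the Example 1 data
k, width = 5, 5               # number of classes and class interval

# Split the surplus coverage equally between the two tails.
surplus = k * width - (highest - lowest)  # 25 - 23.5 = 1.5
pad = surplus / 2                         # 0.75
raw_lower = lowest - pad                  # about 9.55
raw_upper = highest + pad                 # about 34.55

# Assumption: snap the lower limit to the nearest multiple of the width.
first_lower = round(raw_lower / width) * width
last_upper = first_lower + k * width

# Each class is "lower up to upper": lower included, upper excluded.
classes = [(first_lower + i * width, first_lower + (i + 1) * width)
           for i in range(k)]
print(classes)  # [(10, 15), (15, 20), (20, 25), (25, 30), (30, 35)]
```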

Ka-fu Wong © 2003 Chap 2-18 Step 4: Tally the data into the classes and count the number of items in each

Raw data: 15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7, 17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9, 10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6.

| Hours studying | Frequency |
|---|---|
| 10 up to 15 | 7 |
| 15 up to 20 | 12 |
| 20 up to 25 | 7 |
| 25 up to 30 | 3 |
| 30 up to 35 | 1 |
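The tally can be reproduced with a small sketch like the one below. The variable names are illustrative; the class index relies on every class having the same width and on each class including its lower limit but not its upper limit.

```python
hours = [15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
         17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
         10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6]

first_lower, width, k = 10, 5, 5

counts = [0] * k
for h in hours:
    # Floor division places h in the class whose lower limit it has reached.
    index = int((h - first_lower) // width)
    counts[index] += 1

for i, count in enumerate(counts):
    lower = first_lower + i * width
    print(f"{lower} up to {lower + width}: {count}")
# 10 up to 15: 7
# 15 up to 20: 12
# 20 up to 25: 7
# 25 up to 30: 3
# 30 up to 35: 1
```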

Ka-fu Wong © 2003 Chap 2-19 EXAMPLE 1 (continued)

Ka-fu Wong © 2003 Chap 2-20 EXAMPLE 1 (continued) A relative frequency distribution shows the percent of observations in each class.
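A minimal sketch of the relative frequency calculation, using the class frequencies tallied for Example 1 (7, 12, 7, 3, 1). Rounding to one decimal place is a presentation choice made here, not something the slides specify.

```python
class_labels = ["10 up to 15", "15 up to 20", "20 up to 25",
                "25 up to 30", "30 up to 35"]
frequencies = [7, 12, 7, 3, 1]

total = sum(frequencies)  # 30 students

# Percent of all observations that fall in each class.
percents = [round(100 * f / total, 1) for f in frequencies]

for label, pct in zip(class_labels, percents):
    print(f"{label}: {pct}%")
# 10 up to 15: 23.3%
# 15 up to 20: 40.0%
# 20 up to 25: 23.3%
# 25 up to 30: 10.0%
# 30 up to 35: 3.3%
```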

Ka-fu Wong © 2003 Chap 2-21 Stem-and-leaf Displays Stem-and-leaf display: A statistical technique for displaying a set of data. Each numerical value is divided into two parts: the leading digits become the stem and the trailing digits the leaf. Note: An advantage of the stem-and-leaf display over a frequency distribution is that we do not lose the identity of each observation. A disadvantage is that it is not good for large data sets.

Ka-fu Wong © 2003 Chap 2-22 EXAMPLE 2 Colin achieved the following scores on his twelve accounting quizzes this semester: 86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85. Construct a stem-and-leaf chart.

Ka-fu Wong © 2003 Chap 2-23 EXAMPLE 2 (continued) 86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85.

| Stem | Leaf |
|---|---|
| 6 | 9 |
| 7 | 9 8 |
| 8 | 6 4 8 3 2 5 |
| 9 | 2 1 6 |

Ka-fu Wong © 2003 Chap 2-24 EXAMPLE 2 (continued) 86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85.

| Stem | Leaf |
|---|---|
| 6 | 9 |
| 7 | 8 9 |
| 8 | 2 3 4 5 6 8 |
| 9 | 1 2 6 |
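A stem-and-leaf display for two-digit scores like Colin's can be sketched as follows. The function name and the `sort_leaves` flag are hypothetical; the tens digit is taken as the stem and the units digit as the leaf.

```python
from collections import defaultdict

scores = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]


def stem_and_leaf(values, sort_leaves=True):
    """Group each value's units digit (leaf) under its tens digit (stem)."""
    stems = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)
        stems[stem].append(leaf)
    display = {}
    for stem in sorted(stems):
        leaves = stems[stem]
        display[stem] = sorted(leaves) if sort_leaves else leaves
    return display


for stem, leaves in stem_and_leaf(scores).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
# 6 | 9
# 7 | 8 9
# 8 | 2 3 4 5 6 8
# 9 | 1 2 6
```

Passing `sort_leaves=False` keeps the leaves in the order the scores were recorded, which is the usual intermediate step before ordering them.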

Ka-fu Wong © 2003 Chap 2-25 Graphic Presentation of a Frequency Distribution The three commonly used graphic forms are histograms, frequency polygons, and cumulative frequency distributions. A histogram is a graph in which the classes are marked on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are represented by the heights of the bars and the bars are drawn adjacent to each other.

Ka-fu Wong © 2003 Chap 2-26 Graphic Presentation of a Frequency Distribution A frequency polygon consists of line segments connecting the points formed by the class midpoint and the class frequency. A cumulative frequency distribution is used to determine how many or what proportion of the data values are below or above a certain value.

Ka-fu Wong © 2003 Chap 2-27 Histogram for Hours Spent Studying Example 1 (continued): A Histogram is a graph in which the classes are marked on the horizontal axis and the class frequencies on the vertical axis.
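As a rough text-only stand-in for the chart, the sketch below prints one bar per class from the Example 1 frequencies. Note that it is rotated relative to the definition above: the classes run down the page and the bar length, rather than the bar height, shows the class frequency. The function name is hypothetical.

```python
class_labels = ["10 up to 15", "15 up to 20", "20 up to 25",
                "25 up to 30", "30 up to 35"]
frequencies = [7, 12, 7, 3, 1]


def text_histogram(labels, freqs, symbol="#"):
    """Return one line per class; the bar length equals the class frequency."""
    label_width = max(len(label) for label in labels)
    return [f"{label:<{label_width}} | {symbol * freq} ({freq})"
            for label, freq in zip(labels, freqs)]


for line in text_histogram(class_labels, frequencies):
    print(line)
# 10 up to 15 | ####### (7)
# 15 up to 20 | ############ (12)
# 20 up to 25 | ####### (7)
# 25 up to 30 | ### (3)
# 30 up to 35 | # (1)
```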

Ka-fu Wong © 2003 Chap 2-28 Frequency Polygon for Hours Spent Studying Example 1 (continued): A frequency polygon consists of line segments connecting the points formed by the class midpoint and the class frequency.
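The points that a frequency polygon connects can be computed directly from the class limits and frequencies of Example 1. This sketch only produces the (midpoint, frequency) pairs; drawing the line segments is left to whatever plotting tool is at hand.

```python
classes = [(10, 15), (15, 20), (20, 25), (25, 30), (30, 35)]
frequencies = [7, 12, 7, 3, 1]

# Each vertex sits at the class midpoint, at a height equal to the frequency.
points = [((lower + upper) / 2, freq)
          for (lower, upper), freq in zip(classes, frequencies)]
print(points)
# [(12.5, 7), (17.5, 12), (22.5, 7), (27.5, 3), (32.5, 1)]
```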

Ka-fu Wong © 2003 Chap 2-29 Cumulative Frequency Distribution For Hours Studying Example 1 (continued): A cumulative frequency distribution is used to determine how many or what proportion of the data values are below or above a certain value.
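A minimal sketch of the "less than" cumulative frequencies for Example 1, built by running totals over the class frequencies. The percentages are rounded to one decimal place as a presentation choice.

```python
from itertools import accumulate

upper_limits = [15, 20, 25, 30, 35]
frequencies = [7, 12, 7, 3, 1]

# Running total: number of students who studied less than each upper limit.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]
cumulative_percent = [round(100 * c / total, 1) for c in cumulative]

for limit, count, pct in zip(upper_limits, cumulative, cumulative_percent):
    print(f"less than {limit} hours: {count} students ({pct}%)")
# less than 15 hours: 7 students (23.3%)
# less than 20 hours: 19 students (63.3%)
# less than 25 hours: 26 students (86.7%)
# less than 30 hours: 29 students (96.7%)
# less than 35 hours: 30 students (100.0%)
```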

Ka-fu Wong © 2003 Chap 2-30 Bar Chart A bar chart can be used to depict any of the levels of measurement (nominal, ordinal, interval, or ratio).

Ka-fu Wong © 2003 Chap 2-31 Example 3 Construct a bar chart for the number of unemployed per 100,000 population for selected cities during 2001.

Ka-fu Wong © 2003 Chap 2-32 Bar Chart for the Unemployment Data Example 3 (continued):

Ka-fu Wong © 2003 Chap 2-33 Pie Chart A pie chart is useful for displaying a relative frequency distribution. A circle is divided proportionally to the relative frequency and portions of the circle are allocated for the different groups.

Ka-fu Wong © 2003 Chap 2-34 EXAMPLE 4 A sample of 200 runners were asked to indicate their favorite type of running shoe. Draw a pie chart based on the following information.

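Ka-fu Wong © 2003 Chap 2-35 EXAMPLE 4 (continued) Compute the percentage and the degrees each type occupies out of 360°. Degrees occupied in a circle = percentage × 360°, with the percentage expressed as a fraction (for example, 25% corresponds to 0.25 × 360° = 90°).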
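The conversion from counts to percentages and degrees can be sketched as below. The survey table from the slide is not reproduced in this transcript, so the shoe labels and counts here are hypothetical placeholders chosen only to total 200 runners; they are not the figures from Example 4.

```python
# Hypothetical counts (NOT the Example 4 data); they sum to 200 runners.
shoe_counts = {"Type A": 90, "Type B": 50, "Type C": 30,
               "Type D": 20, "Other": 10}

total = sum(shoe_counts.values())

# Share of the sample, then the slice angle: fraction of 360 degrees.
percents = {shoe: 100 * n / total for shoe, n in shoe_counts.items()}
degrees = {shoe: 360 * n / total for shoe, n in shoe_counts.items()}

for shoe in shoe_counts:
    print(f"{shoe}: {percents[shoe]}% -> {degrees[shoe]} degrees")
# Type A: 45.0% -> 162.0 degrees
# Type B: 25.0% -> 90.0 degrees
# Type C: 15.0% -> 54.0 degrees
# Type D: 10.0% -> 36.0 degrees
# Other: 5.0% -> 18.0 degrees
```

The slice angles always add up to 360 degrees, which is a quick check on the arithmetic.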

Ka-fu Wong © 2003 Chap 2-36 Pie Chart for Running Shoes Example 4 (continued):

Ka-fu Wong © 2003 Chap 2-37 - END - Chapter Two Describing Data: Frequency Distributions and Graphic Presentation