FREQUENCY DISTRIBUTION
Listing a large set of data does not present much of a picture to the reader. Sometimes we want to condense the data into a more manageable form. This can be accomplished with the aid of a frequency distribution.
To demonstrate the concept of a frequency distribution, let’s use the following set of data:
[data set not preserved]
A frequency distribution is used to represent this set of data by listing the x values with their frequencies. For example, the value 1 occurs in the sample three times, so the frequency for x = 1 is 3.
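FREQUENCY DISTRIBUTION
The frequency f is the number of times the value x occurs in the sample.
[Table: ungrouped frequency distribution with columns x and f; the entries are only partly preserved (surviving digits, in extraction order: 1, 3, 2, 8, 5, 4)]
We say ungrouped because each value of x in the distribution stands alone.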
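As a small illustration of the definition above, the following sketch tallies an ungrouped frequency distribution. The sample values and the function name `ungrouped_frequency` are hypothetical, chosen only so that the value 1 occurs three times; they are not the data from the slide.

```python
from collections import Counter

# Hypothetical sample (not the slide's data); the value 1 occurs three times.
sample = [1, 2, 2, 1, 3, 2, 1, 4, 2, 3]

def ungrouped_frequency(data):
    """Return (x, f) pairs sorted by x, where f is how often x occurs."""
    counts = Counter(data)
    return sorted(counts.items())

table = ungrouped_frequency(sample)

print("x  f")
for x, f in table:
    print(f"{x}  {f}")
```

Each distinct value of x gets its own row, which is what makes the distribution ungrouped.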
CONSTRUCTION OF A FREQUENCY TABLE
Classes: When a large set of data has many different x values instead of a few repeated values, as in the previous example, we can group the data into a set of classes and construct a frequency table.
Number of classes: The number of classes can take a value between 8 and 15.
Lower and upper class limits: The lower class limit is the smallest value that could go into a class. The upper class limit is the largest value that could go into that class.
Class interval is the difference between a lower class limit and the next lower class limit.
Relative frequency is a proportional measure of the frequency of an occurrence: the class frequency divided by the total number of observations, often expressed as a percentage.
Class mark (class midpoint) is the numerical value that is exactly in the middle of each class.
Class boundaries (true class limits) are numbers that do not occur in the sample data but are halfway between the upper limit of one class and the lower limit of the next class.
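The definitions above can be sketched in a few lines of code. The class limits below are hypothetical integer limits chosen for illustration, and the function names are assumptions rather than part of the original material.

```python
# Hypothetical integer class limits (lower, upper); not taken from the slides.
classes = [(10, 19), (20, 29), (30, 39)]

def class_interval(classes):
    """Difference between a lower class limit and the next lower class limit."""
    return classes[1][0] - classes[0][0]

def class_marks(classes):
    """Value exactly in the middle of each class."""
    return [(lower + upper) / 2 for lower, upper in classes]

def class_boundaries(classes):
    """Values halfway between one class's upper limit and the next lower limit."""
    gap = classes[1][0] - classes[0][1]  # equals 1 for integer data
    return [(lower - gap / 2, upper + gap / 2) for lower, upper in classes]

print(class_interval(classes))    # 10
print(class_marks(classes))       # [14.5, 24.5, 34.5]
print(class_boundaries(classes))  # [(9.5, 19.5), (19.5, 29.5), (29.5, 39.5)]
```

Note that the upper boundary of one class coincides with the lower boundary of the next, so the boundaries leave no gaps between classes.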
The two basic guidelines that should be followed in constructing a grouped frequency distribution are:
1. Each class should be of the same width (there are some exceptions).
2. Classes should be set up so that they do not overlap and so that each piece of data belongs to exactly one class.
PROCEDURE OF CLASSIFICATION
n = 155 (there are 155 observations)
1. Rank the data.
2. Identify the lowest (L) and highest (H) scores and find the range (range = H - L).
3. Select the number of classes and find the class width.
L = 116, H = 315
Range = 315 - 116 = 199
Number of classes = 8
Class interval = 199/8 = 24.875 ≈ 25
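The procedure can be sketched as follows, using L = 116, H = 315 and 8 classes from the example. The function name `class_limits` is hypothetical, and the sketch assumes integer-valued data and a class width rounded up to the next whole number so that the classes cover every observation.

```python
def class_limits(lowest, highest, n_classes):
    """Return (range, class width, list of (lower, upper) class limits).

    Assumes integer data. The width is the smallest whole number larger
    than range / n_classes, so the classes reach the highest score.
    """
    data_range = highest - lowest
    width = data_range // n_classes + 1
    limits = []
    lower = lowest
    for _ in range(n_classes):
        limits.append((lower, lower + width - 1))
        lower += width
    return data_range, width, limits

data_range, width, limits = class_limits(116, 315, 8)
print(data_range, width)
for lower, upper in limits:
    print(f"{lower}-{upper}")
```

With these inputs the first class is 116-140 and the last is 291-315, so each score falls into exactly one class.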
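CONSTRUCTION OF THE FREQUENCY DISTRIBUTION TABLE
[Table not preserved; surviving labels: 115.5 and 315.5, which match the lowest and highest class boundaries for L = 116 and H = 315]
Relative frequency (%) = 100 × (8/155) ≈ 5.2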
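The relative frequency calculation can be sketched as below. Only the first class frequency (8) and the total n = 155 come from the slide; the remaining class frequencies are hypothetical values chosen to add up to 155, and the function name is an assumption.

```python
def relative_frequencies(frequencies):
    """Relative frequency of each class as a percentage, rounded to 1 decimal."""
    n = sum(frequencies)
    return [round(100 * f / n, 1) for f in frequencies]

# First value (8) and the total (155) are from the example;
# the other class frequencies are hypothetical.
class_frequencies = [8, 15, 27, 38, 30, 20, 12, 5]

percentages = relative_frequencies(class_frequencies)
print(percentages[0])  # 5.2, i.e. 100 * (8 / 155)
```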
Graphic Presentation of Data
We will learn how to present single-variable data by using graphical techniques.
There are several graphic ways to describe data.
The method used is determined by the type of data and the idea to be presented.
BAR GRAPH AND PIE GRAPH
Bar graphs and pie (circle) graphs are often used to summarize attribute data.
Data are represented by frequency or proportion.
In graphical presentation, proportion is usually more meaningful than frequency.
In a bar graph, the x axis represents the attribute, while the y axis (the bar’s height) represents the proportion or frequency of each attribute.
In a pie graph, each slice represents the proportion of one attribute.
Example
The marital status of a sample of women is given below:

Marital status   Freq.      %
Single              65   46.8
Married             32   23.0
Divorced            27   19.4
Widowed             10    7.2
Separated            5    3.6
Total              139  100.0
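The percentage column can be reproduced from the frequencies with a short sketch. The frequencies are those of the example; the function name and the text rendering of the bars (one `#` per two percentage points) are illustrative assumptions.

```python
# Frequencies from the marital status example.
frequencies = {
    "Single": 65,
    "Married": 32,
    "Divorced": 27,
    "Widowed": 10,
    "Separated": 5,
}

def to_percentages(freqs):
    """Convert category frequencies to percentages rounded to 1 decimal."""
    total = sum(freqs.values())
    return {name: round(100 * f / total, 1) for name, f in freqs.items()}

pct = to_percentages(frequencies)

# Rough text bar chart; the scale of one '#' per 2 points is arbitrary.
for status, p in pct.items():
    bar = "#" * round(p / 2)
    print(f"{status:<10}{p:>6.1f}  {bar}")
```

Plotting the percentages rather than the raw counts lets the bar heights be read directly as shares of the 139 women.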
[Figure: Bar chart of marital status of women. x axis: Marital Status (Single, Married, Divorced, Widowed, Separated); y axis: Percent (10 to 50)]
[Figure: Pie chart of marital status of women. Single 46.8%, Married 23.0%, Divorced 19.4%, Widowed 7.2%, Separated 3.6%]
STEM AND LEAF PLOT
This plot provides a convenient means of tallying the observations and can be used as a direct display of data or as a preliminary step in constructing a frequency table.
The stem is the leading digit(s) of the data, while the leaf is the trailing digit(s). For example, the numerical value 458 might be split into stem (45) and leaf (8).
Let’s construct a stem-and-leaf display of the following set of 20 test scores:
[test scores not preserved]
At a quick glance we see that there are scores in the 50s, 60s, 70s, 80s and 90s.
Let’s use the first digit of each score as the stem and the second digit as the leaf.
We will construct the display in a vertical position. Draw a vertical line and to the left of it locate the stems in order.
Next we place each leaf on its stem. This is accomplished by placing the trailing digit on the right side of the vertical line opposite its corresponding leading digit.
[Display: stems 5, 6, 7, 8, 9 to the left of the vertical line; the leaves are only partly preserved]
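The construction just described can be sketched in code. The scores below are hypothetical two-digit values, not the 20 test scores from the slide, and `stem_and_leaf` is an illustrative name.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Map each stem (tens digit) to its leaves (units digits), in data order."""
    display = defaultdict(list)
    for score in scores:
        stem, leaf = divmod(score, 10)  # e.g. 58 -> stem 5, leaf 8
        display[stem].append(leaf)
    return dict(sorted(display.items()))

# Hypothetical scores; the original data set is not available.
scores = [52, 58, 61, 67, 73, 75, 79, 84, 88, 95]

display = stem_and_leaf(scores)
for stem, leaves in display.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Each printed row has the stem to the left of the vertical bar and its leaves to the right, mirroring the hand-drawn display.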
All scores with the same tens digit are placed on the same branch. This may not always be desired. Suppose we construct the display again; this time, instead of grouping ten possible values on each stem, let’s group the values so that only five possible values could fall on each stem.
(50-54) 5
(55-59) 5
(60-64) 6
(65-69) 6
(70-74) 7
(75-79) 7
(80-84) 8
(85-89) 8
(90-94) 9
(95-99) 9
[The leaves are only partly preserved]
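A split-stem display of this kind might be sketched as follows. The scores are hypothetical, and the function name and the "low"/"high" labels are assumptions introduced for illustration.

```python
def split_stem_and_leaf(scores):
    """Stem-and-leaf rows with two rows per stem: leaves 0-4 and leaves 5-9."""
    rows = {}
    first_stem, last_stem = min(scores) // 10, max(scores) // 10
    for stem in range(first_stem, last_stem + 1):
        rows[(stem, "low")] = []   # leaves 0-4
        rows[(stem, "high")] = []  # leaves 5-9
    for score in scores:
        stem, leaf = divmod(score, 10)
        half = "low" if leaf <= 4 else "high"
        rows[(stem, half)].append(leaf)
    return rows

# Hypothetical scores; the original data set is not available.
scores = [52, 58, 61, 67, 73, 75, 79, 84, 88, 95]

rows = split_stem_and_leaf(scores)
for (stem, half), leaves in rows.items():
    start, end = (0, 4) if half == "low" else (5, 9)
    label = f"({stem}{start}-{stem}{end})"
    print(f"{label} {stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Rows with no leaves are kept so that gaps in the data stay visible in the display.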
HISTOGRAM
A histogram is a type of bar graph representing the frequency distribution of quantitative data.
A histogram is made up of the following components:
1. A title, which identifies the sample of concern.
2. A vertical scale, which identifies the frequencies (or relative frequencies) in the various classes.
3. A horizontal scale, which identifies the variable x (class midpoints, true class limits, or lower class limits).
[Figure: distribution shapes: symmetric distribution, right-skewed distribution, left-skewed distribution]
BOX PLOT (BOX AND WHISKER PLOT)
The median and the first and third quartiles of the distribution are used in constructing box plots.
The location of the midpoint, or median, of the distribution is indicated with a horizontal line in the box.
Straight lines, or whiskers, extend up to 1.5 times the interquartile range above the 75th percentile and below the 25th percentile when there are outliers or extreme observations. If there are none, the whiskers end at the minimum and maximum values.
One box length equals the interquartile range. Cases with values between 1.5 and 3 box lengths from the upper or lower edge of the box are called outliers. Cases with values more than 3 box lengths from the upper or lower edge of the box are called extreme points.
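The rules above can be sketched numerically. The data are hypothetical, the function name is illustrative, and two choices are assumptions rather than statements from the slide: the quartile method (`method="inclusive"`; statistical packages differ slightly) and drawing each whisker to the most extreme observation within 1.5 box lengths of the box.

```python
import statistics

def box_plot_summary(data):
    """Quartiles, whisker ends, outliers and extreme points for one variable."""
    # Quartile method is an assumption; other software may give other values.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    box_length = q3 - q1  # interquartile range; assumed to be greater than 0

    def box_lengths_from_box(x):
        """Distance of x from the nearer box edge, in box lengths."""
        if x > q3:
            return (x - q3) / box_length
        if x < q1:
            return (q1 - x) / box_length
        return 0.0

    inside = [x for x in data if box_lengths_from_box(x) <= 1.5]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whiskers": (min(inside), max(inside)),
        "outliers": [x for x in data if 1.5 < box_lengths_from_box(x) <= 3],
        "extremes": [x for x in data if box_lengths_from_box(x) > 3],
    }

# Hypothetical measurements.
values = [10, 12, 13, 14, 15, 16, 17, 18, 19, 30, 50]
summary = box_plot_summary(values)
print(summary)
```

For these values the box runs from 13.5 to 18.5, so 30 lies 2.3 box lengths above the box (an outlier) and 50 lies 6.3 box lengths above it (an extreme point).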
SCATTER PLOT WITH ONE VARIABLE
A scatter plot with one variable displays the value of each observation as a small circle on an invisible line parallel to the y axis, which shows the original measurement scale.
[Figure: one-variable scatter plot of BWT; y axis from 1000 to 6000]
LINE GRAPH
In a line graph, individual data points are connected by a line. Line plots provide a simple way to visually present a sequence of many values.
The distribution of measles cases among seasons in an area is as follows:

Season   Frequency
Spring          75
Summer          25
Fall            50
Winter         100

[Figure: line graph of measles cases by season. x axis: Seasons (Spring, Summer, Fall, Winter); y axis: Frequency (20 to 120)]
ERROR BARS
Error bars help you visualize distributions and dispersion by indicating the variability of the measure being displayed. The mean of a scale variable is plotted for a set of categories, and the length of the bar on either side of the mean indicates a chosen number of standard deviations. Error bars can extend in one direction or both directions from the mean.
Error bars are sometimes displayed in the same chart with other chart elements such as bars or lines.
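The quantities behind an error bar chart can be sketched as below. The categories and measurements are hypothetical, the function name is illustrative, and the sketch assumes bars of one sample standard deviation on either side of the mean.

```python
import statistics

def error_bars(groups, n_sd=1):
    """For each category, return (lower end, mean, upper end) of the error bar."""
    bars = {}
    for name, values in groups.items():
        mean = statistics.mean(values)
        sd = statistics.stdev(values)  # sample standard deviation
        bars[name] = (mean - n_sd * sd, mean, mean + n_sd * sd)
    return bars

# Hypothetical measurements of a scale variable in two categories.
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [10.0, 12.0, 14.0, 16.0],
}

bars = error_bars(groups)
for name, (low, mean, high) in bars.items():
    print(f"{name}: mean = {mean:.2f}, bar from {low:.2f} to {high:.2f}")
```

Passing a different `n_sd` (for example 2) lengthens the bars to that many standard deviations.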