Frequency Tables and Single variable Graphics

1 Frequency Tables and Single variable Graphics

Listing a large set of data does not present much of a picture to the reader. Sometimes we want to condense the data into a more manageable form. This can be accomplished with the aid of a frequency distribution.

3 To demonstrate the concept of a frequency distribution, let’s use the following set of data:
A frequency distribution represents this set of data by listing the x values with their frequencies. For example, the value 1 occurs in the sample three times, so the frequency for x = 1 is 3.

4 Frequency distribution
The frequency f is the number of times the value x occurs in the sample.
x f 1 3 2 8 5 4
Ungrouped frequency distribution: we say ungrouped because each value of x in the distribution stands alone.
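As a rough illustration of how an ungrouped frequency distribution can be tallied, the sketch below counts each distinct value in a small sample. The sample values are hypothetical (the original data set is not reproduced in this transcript); they are chosen only so that the value 1 occurs three times, as in the example above.

```python
from collections import Counter

# Hypothetical sample, NOT the data set from the slides.
sample = [2, 1, 3, 2, 2, 1, 4, 2, 3, 1, 2, 0]

# Counter maps each distinct x value to its frequency f.
frequency = Counter(sample)

# List each x value with its frequency, in ascending order of x.
print("x  f")
for x in sorted(frequency):
    print(f"{x}  {frequency[x]}")
```

The frequencies always add up to the sample size, which is a quick check that no observation was lost.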

Classes: When a large set of data has many different x values instead of a few repeated values, as in the previous example, we can group the data into a set of classes and construct a frequency table.
Number of classes: It can take a value between 8 and 15.
Lower and upper class limits: The lower class limit is the smallest piece of data that could go into each class. The upper class limit is the largest value fitting into each class.

6 Class interval is the difference between a lower class limit and the next lower class limit.
Relative frequency is a proportional measure of the frequency of an occurrence.
Class mark (class mid-point) is the numerical value that is exactly in the middle of each class.
Class boundaries (true class limits) are numbers that do not occur in the sample data but are halfway between the upper limit of one class and the lower limit of the next class.
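These definitions can be checked on two adjacent classes. The sketch below assumes, purely for illustration, the classes 116-140 and 141-165 with integer-valued data; the variable names are not from the slides.

```python
# Two adjacent classes, assumed for illustration: 116-140 and 141-165.
lower_1, upper_1 = 116, 140
lower_2, upper_2 = 141, 165

# Class interval: difference between a lower limit and the next lower limit.
class_interval = lower_2 - lower_1

# Class mark: the value exactly in the middle of the first class.
class_mark = (lower_1 + upper_1) / 2

# Class boundary: halfway between the upper limit of one class
# and the lower limit of the next.
boundary = (upper_1 + lower_2) / 2

print(class_interval, class_mark, boundary)
```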

7 The two basic guidelines that should be followed in constructing a grouped frequency distribution are:
Each class should be of the same width (there are some exceptions).
Classes should be set up so that they do not overlap and so that each piece of data belongs to exactly one class.

8 n = 155 (there are 155 observations)
PROCEDURE OF CLASSIFICATION
Rank the data.
Identify the lowest (L) and highest (H) scores and find the range (range = H - L).
Select the number of classes and find the class width.
L = 116, H = 315
Range = 196
Number of classes = 8
Class interval = 196/8 = 24.5 ≈ 25
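The classification steps can be sketched in a few lines. This is a minimal illustration assuming integer-valued data (so class boundaries fall half a unit below each lower limit) and assuming the width is rounded up to a whole number. Note that with L = 116 and H = 315 the difference H - L evaluates to 199, while the slide quotes 196; either figure rounds up to a class width of 25.

```python
import math

lowest, highest = 116, 315   # L and H as stated on the slide
num_classes = 8

data_range = highest - lowest                        # 199 with these values
class_width = math.ceil(data_range / num_classes)    # round up so every value fits

# Lower class limits start at L and step by the class width.
lower_limits = [lowest + i * class_width for i in range(num_classes)]
upper_limits = [low + class_width - 1 for low in lower_limits]

# Assumption: integer data, so boundaries sit 0.5 below each lower limit.
boundaries = [lowest - 0.5 + i * class_width for i in range(num_classes + 1)]

for low, high in zip(lower_limits, upper_limits):
    print(f"{low}-{high}")
print("boundaries:", boundaries[0], "to", boundaries[-1])
```

Under these assumptions the last class ends at 315, so the highest score still belongs to exactly one class.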

Relative frequency = 100 × (8/155) = 5.2%
The class boundaries run from 115.5 to 315.5.
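The relative frequency calculation is simple enough to express as a one-line function. The function name below is illustrative, not from the slides.

```python
def relative_frequency(class_frequency: int, n: int) -> float:
    """Return a class frequency as a percentage of all n observations."""
    return 100 * class_frequency / n

# A class holding 8 of the 155 observations.
print(round(relative_frequency(8, 155), 1))
```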

10 [Table: class marks (mi) and frequencies (fi), with L, A and C = 25 labelled]

11 Measures of central tendency
Mean or median; the mode is 203.

12 Measures of position What are the 25th, 75th percentiles and the median?

13 [Figure: interpolation within the class starting at 165.5 to find x, the 25th percentile]

14 P25 = Q1 = 179.88
25% of observations lie below 179.88.
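A percentile of grouped data is commonly found by linear interpolation inside the class that contains it. The sketch below shows that technique; the frequency table is not in this transcript, so the cumulative frequency (33) and class frequency (10) used here are assumed values chosen only to make the arithmetic concrete. The lower boundary 165.5, the width 25 and n = 155 come from the slides.

```python
def grouped_percentile(p, n, lower_boundary, cum_freq_below, class_freq, width):
    """Interpolate the p-th quantile (0 < p < 1) within its class.

    lower_boundary : lower class boundary of the class containing the quantile
    cum_freq_below : number of observations below that boundary
    class_freq     : number of observations inside the class
    width          : class width
    """
    target = p * n  # position of the quantile among the ranked observations
    return lower_boundary + (target - cum_freq_below) / class_freq * width

# cum_freq_below=33 and class_freq=10 are ASSUMED, not taken from the slides.
q1 = grouped_percentile(0.25, 155, 165.5, 33, 10, 25)
print(q1)
```

The same function gives the median with p = 0.5 and the 75th percentile with p = 0.75, provided the inputs describe the class that contains the requested position.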

15 Standard deviation or Coefficient of variation

16 Graphic Presentation of Data
We will learn how to present single-variable data using graphical techniques. There are several graphical ways to describe data. The method used is determined by the type of data and the idea to be presented.

17 BAR GRAPH AND PIE GRAPH
Bar graphs and pie (circle) graphs are often used to summarize attribute data. Data are represented by frequency or proportion. In graphical presentation, proportion is more meaningful than frequency. In a bar graph, the x axis represents the attribute, while the y axis (the bar's height) represents the proportion or frequency of each attribute. In a pie graph, each piece represents the proportion of an attribute.

18 Example
The marital status of women is given below:
Marital status   Freq.   %
Single           65      46.8
Married          32      23.0
Divorced         27      19.4
Widowed          10      7.2
Separated        5       3.6
Total            139     100.0
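The percentage column follows directly from the frequencies. The sketch below recomputes it from the counts in the table and prints a crude text bar for each category; the bar scaling (one mark per two percentage points) is an arbitrary choice for display.

```python
# Frequencies from the marital status table.
frequencies = {
    "Single": 65,
    "Married": 32,
    "Divorced": 27,
    "Widowed": 10,
    "Separated": 5,
}

total = sum(frequencies.values())

# Percentage of each category, rounded to one decimal place as in the table.
percentages = {
    status: round(100 * count / total, 1)
    for status, count in frequencies.items()
}

for status, pct in percentages.items():
    bar = "#" * round(pct / 2)   # one '#' per two percentage points
    print(f"{status:<10}{pct:>5.1f}% {bar}")
```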

19 Bar chart of marital status of women
[Figure: bar chart; x axis: Marital Status (Single, Married, Divorced, Widowed, Separated); y axis: Percent, 0 to 50]

20 Pie chart of marital status of women
[Figure: pie chart; Single 46.8%, Married 23.0%, Divorced 19.4%, Widowed 7.2%, Separated 3.6%]

21 STEM AND LEAF PLOT
This plot provides a convenient means of tallying the observations and can be used as a direct display of data or as a preliminary step in constructing a frequency table. The stem is the leading digit(s) of the data, while the leaf is the trailing digit(s). For example, the numerical value 458 might be split into stem (45) and leaf (8).

22 Let’s construct a stem-and-leaf display of the following set of 20 test scores:
At a quick glance we see that there are scores in the 50s, 60s, 70s, 80s and 90s. Let’s use the first digit of each score as the stem and the second digit as the leaf.

23 We will construct the display in a vertical position. Draw a vertical line and to the left of it locate the stems in order. Next we place each leaf on its stem. This is accomplished by placing the trailing digit on the right side of the vertical line opposite its corresponding leading digit.
[Stem-and-leaf display with stems 5, 6, 7, 8 and 9]
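The construction just described can be sketched as a short function. The scores below are hypothetical (the slide's 20 test scores are not in this transcript), and the function name is illustrative.

```python
from collections import defaultdict

# Hypothetical test scores, NOT the 20 scores from the slide.
scores = [52, 58, 62, 66, 68, 72, 74, 76, 78, 82, 86, 96]

def stem_and_leaf(values):
    """Group each value's trailing digit (leaf) under its leading digit(s) (stem)."""
    display = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)   # e.g. 458 -> stem 45, leaf 8
        display[stem].append(leaf)
    return dict(display)

# Stems to the left of the vertical line, leaves to the right.
for stem, leaves in stem_and_leaf(scores).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Sorting the values first keeps both the stems and the leaves on each stem in ascending order.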

24 All scores with the same tens digit are placed on the same branch. This may not always be desired. Suppose we construct the display again; this time, instead of grouping ten possible values on each stem, let’s group the values so that only five possible values can fall on each stem.
[Stem-and-leaf display with two rows per stem: (50-54) 5, (55-59) 5, (60-64) 6, (65-69) 6, (70-74) 7, (75-79) 7, (80-84) 8, (85-89) 8, (90-94) 9, (95-99) 9]
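Splitting each stem into two rows only needs one extra key: whether the leaf is in 0-4 or 5-9. The sketch below is one way to do it, again with hypothetical scores and illustrative names.

```python
# Hypothetical test scores, NOT the scores from the slide.
scores = [52, 58, 62, 66, 68, 72, 74, 76, 78, 82, 86, 96]

def split_stem_and_leaf(values):
    """Return {(stem, half): leaves}, where half 0 holds leaves 0-4 and half 1 holds 5-9."""
    rows = {}
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        half = 0 if leaf < 5 else 1
        rows.setdefault((stem, half), []).append(leaf)
    return rows

for (stem, half), leaves in split_stem_and_leaf(scores).items():
    low = stem * 10 + 5 * half           # smallest value this row can hold
    label = f"({low}-{low + 4})"
    print(f"{label} {stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Rows with no observations are simply omitted here; a fuller display might print them as empty stems so that gaps in the data remain visible.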

25 HISTOGRAM
A histogram is a type of bar graph representing the frequency distribution of quantitative data. A histogram is made up of the following components:
A title, which identifies the sample of concern.
A vertical scale, which identifies the frequencies (or relative frequencies) in the various classes.
A horizontal scale, which identifies the variable x (class mid-points, true class limits or lower class limits).
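To make the components concrete, the sketch below tallies class frequencies and prints a text histogram with a title, a horizontal scale of class limits and bars whose length is the frequency. The weights and the class layout (start 2000, width 500, five classes) are hypothetical, not the birthweight data from the slides.

```python
# Hypothetical birthweights in grams, NOT the 60 values from the slides.
weights = [2100, 2450, 2800, 2950, 3100, 3150, 3300, 3400, 3650, 4200]

# Assumed class layout: five classes of width 500 starting at 2000.
start, width, num_classes = 2000, 500, 5

# Count how many observations fall in each class [low, low + width).
counts = [0] * num_classes
for w in weights:
    index = (w - start) // width
    counts[index] += 1

print("Histogram of hypothetical birthweights")
for i, count in enumerate(counts):
    low = start + i * width
    percent = 100 * count / len(weights)
    print(f"{low}-{low + width - 1}: {'#' * count} ({count}, {percent:.0f}%)")
```

Using relative frequencies instead of counts changes only the vertical scale, not the shape of the histogram.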

26 Birthweights of 60 infants are given below:

27 [Figure: histogram of bwt; y axis: Count (4, 8, 12); x axis labels: 1800.5, 2226.5, 2652.5, 3078.5, 3504.5, 3930.5, 4356.5, 4756.5]

28 [Figure: histogram of bwt; y axis: Percent (0% to 20%); same x axis labels]

29 [Figure: the count and percent histograms of bwt shown together]

30 [Figure: three distribution shapes: symmetric, right-skewed and left-skewed]

The median and the first and third quartiles of the distribution are used in constructing box plots. The location of the midpoint, or median, of the distribution is indicated with a horizontal line in the box. When there are outliers or extreme observations, the straight lines, or whiskers, extend 1.5 times the interquartile range above the 75th percentile and below the 25th percentile. If there are none, the lines represent the minimum and maximum values. Cases with values between 1.5 and 3 box lengths from the upper or lower edge of the box are called outliers. Cases with values more than 3 box lengths from the upper or lower edge of the box are called extreme points.

32 [Figure: box plot; y axis from 1000 to 6000]

33 BWT
Maximum                  4990
Percentiles 75           3677.5
Percentiles 50 (Median)  3110.0
Percentiles 25           2553.5
Minimum                  1588
Range                    3402
Since there are no outliers, the whiskers represent the minimum and maximum values.
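The box-plot rules from this presentation can be applied to the BWT summary values to confirm that neither the minimum nor the maximum is an outlier. The quartiles, minimum and maximum below are taken from the slide; the function name and the two extra probe values (5400 and 7100) are illustrative assumptions.

```python
q1, q3 = 2553.5, 3677.5          # 25th and 75th percentiles from the slide
minimum, maximum = 1588, 4990    # from the slide

iqr = q3 - q1                    # box length (interquartile range)
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

def classify(value):
    """Label a value by its distance from the box, measured in box lengths."""
    if value > q3:
        distance = (value - q3) / iqr
    elif value < q1:
        distance = (q1 - value) / iqr
    else:
        distance = 0.0
    if distance > 3:
        return "extreme"
    if distance > 1.5:
        return "outlier"
    return "regular"

print("fences:", lower_fence, upper_fence)
print("minimum:", classify(minimum), "maximum:", classify(maximum))
```

Both the minimum and the maximum fall inside the fences, which agrees with the statement that this sample has no outliers.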

34 [Figure: positions of the mode, median and mean in left-skewed, right-skewed and symmetric distributions]

A scatter plot displays the value of each observation as a small circle placed on an invisible line parallel to the y axis, which shows the original measurement.
[Figure: scatter plot of BWT; y axis from 1000 to 6000]

36 LINE GRAPH
In a line graph, individual data points are connected by a line. Line plots provide a simple way to visually present a sequence of many values.

37 The distribution of measles cases among seasons in an area is as follows:
Season   Frequency
Spring   75
Summer   25
Fall     50
Winter   100
[Figure: line graph; x axis: Seasons; y axis: Frequency, 20 to 120]

38 ERROR BARS
Error bars help you visualize distributions and dispersion by indicating the variability of the measure being displayed. The mean of a scale variable is plotted for a set of categories, and the length of the bar on either side of the mean value indicates standard deviations. Error bars can extend in one direction or both directions from the mean. Error bars are sometimes displayed in the same chart with other chart elements such as bars or lines.
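The end points of an error bar are just the mean plus and minus a multiple of the standard deviation. The sketch below assumes the sample standard deviation (n - 1 in the denominator), which the slides do not specify, and uses hypothetical measurements for two made-up categories.

```python
import statistics

def error_bar(values, k=1):
    """Return (lower, mean, upper) for a bar extending k standard deviations each way.

    Assumption: the sample standard deviation is used.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return mean - k * sd, mean, mean + k * sd

# Hypothetical measurements for two categories, NOT data from the slides.
groups = {
    "Group A": [2900, 3100, 3300, 3500, 3200],
    "Group B": [2500, 3400, 3000, 3900, 2700],
}

for name, values in groups.items():
    for k in (1, 2):
        lower, mean, upper = error_bar(values, k)
        print(f"{name}: mean {mean:.0f}, mean ± {k} SD = [{lower:.0f}, {upper:.0f}]")
```

Doubling k doubles the length of the bar on each side of the mean, which is the difference between the mean ± 1 SD and mean ± 2 SD displays.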

39 [Figure: error bar charts of BWT showing mean ± 1 SD and mean ± 2 SD]
