Chapter 3: Organizing Data. Raw data is useless to us unless we can meaningfully organize and summarize it (descriptive statistics). Organization techniques.


1 Chapter 3: Organizing Data

2 Raw data is useless to us unless we can meaningfully organize and summarize it (descriptive statistics). Organization techniques include:
– Tables such as frequency distributions
– Graphs such as histograms, bar graphs, line graphs, pie charts, stem-and-leaf plots, and scatterplots

3 Frequency Distribution A frequency distribution is a table that lists all the categories or values of a variable as well as the corresponding number of occurrences or responses for each category or value of the variable (its frequency, or how often the category occurs). Frequency distributions can be used for both categorical variables (nominal or ordinal) and numerical variables (interval or ratio).

4 To create a frequency distribution for categorical data: First create a list of all the categories or values of the variable and then count the number of times each different category or value occurred in the data. Then find the percentage of respondents for each category. Set up your basic table to have 3 columns: (1) the list of categories or the values of the variable of interest, (2) the frequency count, and (3) the percentage. When dealing with ordinal variables, make sure the categories are ranked (lowest to highest or highest to lowest).
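The three-column table described above can be sketched in Python. This is a minimal illustration, not the textbook's method: the function name `frequency_table`, its `order` parameter, and the survey responses are all invented for the example.

```python
from collections import Counter

def frequency_table(responses, order=None):
    """Return (category, frequency, percent) rows for a categorical variable.

    `order` is a hypothetical parameter: pass the ranked categories for an
    ordinal variable; otherwise categories are listed alphabetically.
    """
    counts = Counter(responses)
    total = len(responses)
    categories = order if order is not None else sorted(counts)
    # Counter returns 0 for a category that never occurs in the data.
    return [(c, counts[c], round(100 * counts[c] / total, 1)) for c in categories]

# Hypothetical nominal data (mode of transport), invented for illustration.
responses = ["bus", "car", "car", "walk", "car", "bus", "walk", "car", "bike", "car"]

for category, freq, pct in frequency_table(responses):
    print(f"{category:<6}{freq:>4}{pct:>8.1f}%")
```

For an ordinal variable, passing something like `order=["low", "medium", "high"]` keeps the rows ranked rather than alphabetical.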

5 Examples of frequency distributions for nominal variables:

6 Examples of frequency distributions for ordinal variables:

7 Statistical Software Many statistical software packages exist. Most of the output looks the same, so reading the output will be similar across programs. The textbook shows output from SPSS (Statistical Package for the Social Sciences). Some other useful programs are StatCrunch, SAS (Statistical Analysis System), and even Excel, among many others. Let’s look at SPSS output for a frequency distribution.

8 Notice this is similar to our frequency distributions, but has a few extra columns. Percent is calculated the same way as before: the frequency divided by the total number of cases, including missing ones. Valid percent accounts for the missing data: it divides the frequency by the total minus the missing data (1485 for this example). Cumulative percent is a running total (based on valid %).
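The relationship between the three percent columns can be sketched as follows. This is an illustration only: it does not reproduce SPSS itself, the function name `percent_columns` is hypothetical, the responses are invented, and `None` is used here to mark a missing response.

```python
from collections import Counter

def percent_columns(values, order):
    """Rows of (category, frequency, percent, valid percent, cumulative percent).

    None marks a missing response (an assumption of this sketch).
    """
    total = len(values)                      # all cases, including missing
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    rows, running = [], 0.0
    for cat in order:
        n = counts[cat]
        percent = 100 * n / total            # divides by all cases
        valid_percent = 100 * n / len(valid) # divides by non-missing cases only
        running += valid_percent             # running total of valid percent
        rows.append((cat, n, round(percent, 1), round(valid_percent, 1), round(running, 1)))
    return rows

# Hypothetical data: 8 valid responses and 2 missing.
answers = ["agree"] * 4 + ["neutral"] * 2 + ["disagree"] * 2 + [None] * 2

for row in percent_columns(answers, ["agree", "neutral", "disagree"]):
    print(row)
```

With 2 of the 10 cases missing, "agree" is 40% of all cases but 50% of valid cases, which is why the valid percent column is the one to read when answering percentage questions.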

9 It is important to be able to use the frequency distribution to interpret the data and answer questions. Use the valid percent column when answering percentage questions.

10 The two common ways to represent frequency distributions of categorical data are bar graphs and pie charts. For a bar graph, place the categories on the horizontal axis and either the frequency or the percent on the vertical axis.

11 For a pie chart, make sure each sector is labeled, appropriately sized, and contains the percent.

12 Simple Frequency Distribution for Numerical Data Frequency distributions for numerical data are either simple (the individual values are displayed with their frequencies) or grouped (the values are combined into equal-sized classes and the frequency of each class is listed). Grouped frequency distributions are used when we have a large number of observations.

13 Examples:

14 To construct a simple frequency distribution:
1) Find the lowest and highest numbers.
2) In column form, write in ascending order all the consecutive numbers from the lowest to highest.
3) Count the frequency.

15 Example: Construct a simple frequency distribution.
7 9 5 7 9 7 5 7 9 6
10 7 6 5 7 8 10 6 9 6
8 12 8 8 7 5 7 6 8 7
5 6 9 7 6 8 7 5 5 6
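The three construction steps can be sketched in Python. The scores below assume the slide's run-together digits separate into four rows of ten values each (with 10 and 12 as the only two-digit scores); that reading is an assumption, and the function name is hypothetical.

```python
from collections import Counter

# Assumed reading of the slide's data: four rows of ten scores each.
scores = [
    7, 9, 5, 7, 9, 7, 5, 7, 9, 6,
    10, 7, 6, 5, 7, 8, 10, 6, 9, 6,
    8, 12, 8, 8, 7, 5, 7, 6, 8, 7,
    5, 6, 9, 7, 6, 8, 7, 5, 5, 6,
]

def simple_frequency_distribution(data):
    """List every consecutive value from lowest to highest with its frequency."""
    counts = Counter(data)
    low, high = min(data), max(data)          # step 1: lowest and highest
    # step 2: all consecutive values in ascending order; step 3: count each one
    return [(value, counts[value]) for value in range(low, high + 1)]

for value, freq in simple_frequency_distribution(scores):
    print(f"{value:>3}{freq:>4}")
```

Because step 2 lists all consecutive numbers, a value that never occurs (11 under this reading) still appears in the table with a frequency of 0.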

16 In a grouped frequency, the numbers are usually grouped into equal-sized ranges called class intervals. Each class interval contains a lower class limit and an upper class limit. The class width is how wide each interval is, and should usually be equal amongst intervals. If the data contains an extremely small or large value, it might not be possible to have intervals of equal width. In this case, use an open-end class interval.

17 Example: Identify the class width for the following class intervals.

18 The class mid-point is the average of the lower and upper class limits. The mid-point of a class interval 10-15 would be (10 + 15) / 2 = 12.5. What is the class mid-point for a class interval of 20-40?
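The mid-point calculation is simple enough to check with a one-line function. The name `class_midpoint` is chosen for this sketch only.

```python
def class_midpoint(lower_limit, upper_limit):
    """Average of the lower and upper class limits."""
    return (lower_limit + upper_limit) / 2

print(class_midpoint(10, 15))  # 12.5
print(class_midpoint(20, 40))  # 30.0
```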

19 Steps to Construct a Grouped Frequency Table
1) Find the highest and lowest values in the dataset and subtract to find the range.
2) Decide on the desired number of classes (it should be between 5 and 20), then compute the class width by dividing the range by the desired number of classes and rounding up. Note: There is no clear right or wrong answer for the number of classes.
3) Select a starting point (lower class limit). Use either the lowest number or a convenient number slightly lower than the lowest score.
4) Add the class width to the starting point to get the second lower limit, and keep adding the class width to get the remaining lower limits. List the lower limits in a vertical column and enter the corresponding upper limits. Then fill in the frequency for each class.

20 Example: The daily high temperature in degrees F for the month of July in Carucciville was as follows:
85 83 96 101 97 90 106 102 82
106 104 72 89 85 97 85 94 100
92 96 104 102 75 99 92 79 94
102 76 99
Construct a grouped frequency distribution for the data.

21 85 83 96 101 97 90 106 102 82
106 104 72 89 85 97 85 94 100
92 96 104 102 75 99 92 79 94
102 76 99
Note: There are 8 classes instead of the desired 7 due to rounding.
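The grouped-table construction can be sketched in Python for the temperatures above. The slide's finished table is not part of this transcript, so the starting point of 70 is an assumption, as are the function name and its parameters. The sketch also assumes whole-number data, so each upper limit is the next lower limit minus 1.

```python
import math

# The 30 temperatures listed in the example.
temps = [
    85, 83, 96, 101, 97, 90, 106, 102, 82,
    106, 104, 72, 89, 85, 97, 85, 94, 100,
    92, 96, 104, 102, 75, 99, 92, 79, 94,
    102, 76, 99,
]

def grouped_frequency_table(data, desired_classes, start=None):
    """Return (lower limit, upper limit, frequency) rows for integer data."""
    low, high = min(data), max(data)
    data_range = high - low                                  # step 1: the range
    width = max(1, math.ceil(data_range / desired_classes))  # step 2: round up
    lower = low if start is None else start                  # step 3: starting point
    rows = []
    while lower <= high:                                     # step 4: build classes
        upper = lower + width - 1
        freq = sum(lower <= x <= upper for x in data)
        rows.append((lower, upper, freq))
        lower += width
    return rows

# Assumed starting point: 70, a convenient number just below the lowest value (72).
for lower, upper, freq in grouped_frequency_table(temps, desired_classes=7, start=70):
    print(f"{lower}-{upper}: {freq}")
```

Here the range is 106 - 72 = 34, and 34 / 7 rounds up to a width of 5. Starting at 70, the classes run from 70-74 through 105-109, which gives 8 classes rather than 7, in line with the note on the slide.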

22 We can also find the relative frequencies for each class.
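A relative frequency is a class frequency divided by the total number of observations. A minimal sketch, using hypothetical class frequencies invented for illustration:

```python
def relative_frequencies(frequencies):
    """Divide each class frequency by the total number of observations."""
    total = sum(frequencies)
    return [round(f / total, 3) for f in frequencies]

# Hypothetical class frequencies (20 observations in total).
class_frequencies = [2, 5, 8, 4, 1]
print(relative_frequencies(class_frequencies))
```

The relative frequencies of all classes sum to 1 (or 100% when expressed as percentages), apart from small rounding differences.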

23 Sometimes, if a variable is continuous, the values recorded in the study may be rounded off. In these instances, we want to know the real limits or class boundaries. The real limits of a continuous variable are usually the values that are above or below the recorded value by one-half of the place value to which the numbers were rounded. Example: Say we are examining height and rounding to the nearest centimeter. If 130 cm is recorded, the real limits are 129.5-130.5 cm, because anything in those limits would result in us recording 130 cm.
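The half-unit rule for real limits can be sketched as follows. The function names and the `unit` parameter (the place value to which the data were rounded) are hypothetical, and the 20-29 interval is an invented example.

```python
def real_limits(recorded_value, unit=1):
    """Real limits of a single recorded value rounded to the nearest `unit`."""
    half = unit / 2
    return (recorded_value - half, recorded_value + half)

def class_boundaries(lower_limit, upper_limit, unit=1):
    """Real limits of a class interval: extend each class limit by half a unit."""
    half = unit / 2
    return (lower_limit - half, upper_limit + half)

print(real_limits(130))          # height recorded to the nearest centimeter
print(class_boundaries(20, 29))  # hypothetical class interval 20-29
```

For the height example this gives 129.5 and 130.5; for the hypothetical interval 20-29 the boundaries are 19.5 and 29.5, so adjacent classes meet with no gap between them.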

24 Example: Find the real class limits or the class boundaries for the following class intervals:

25 A histogram is a graph in which vertical bars represent the frequency of occurrence of each value or class interval in a distribution of scores; the area of each bar is proportional to its frequency.

26 A relative frequency histogram has the same shape as a histogram, but the vertical scale is marked with relative frequencies instead of actual frequencies.

