# Sexual Activity and the Lifespan of Male Fruitflies

Sexual Activity and the Lifespan of Male Fruitflies
Exploring, Organizing, and Describing, Quantitative Data

Essentials: Quantitative Data
Know this stuff (stuff: a useful filler term in stats).

- Characteristics of quantitative variables.
- Building a quantitative frequency table.
- From within a quantitative frequency table, be able to identify: classes, class widths, class midpoints, class limits, and class boundaries (cutpoints).
- Identify and construct appropriate charts/graphs for quantitative data.

EXPLORING, ORGANIZING, DESCRIBING, AND COMPARING DATA
Before beginning to analyze data, it is important to know three things:

1. Did the data come from a sample or a population?
2. Is the data qualitative or quantitative?
3. In what measurement scale is the data reported?

Important Characteristics of a Data Set
- Center: an “average” value that indicates where the middle of the data is located.
- Variation: a measure of the amount that the values vary among themselves.
- Distribution: the “shape” of the distribution of data.
- Outliers: values that are far away from the majority of values.
- Time: changing characteristics of data over time.

FREQUENCY DISTRIBUTIONS
A Frequency Distribution represents the range over which a variable’s values occur. A Frequency Table lists classes (or categories) of values, along with the frequencies (counts) of the number of values that fall into each class. In addition, a frequency table may show cumulative frequencies, relative frequencies, and cumulative relative frequencies. Frequency Tables are derived from RAW DATA and a TALLY process.
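As a minimal sketch of the tally process described above, the following Python builds a single-value frequency table with cumulative, relative, and cumulative relative frequencies. The function name `frequency_table` and the raw data are hypothetical and are not taken from the slides.

```python
from collections import Counter

def frequency_table(values):
    """Tally raw values and return rows of
    (value, f, cumulative f, relative f, cumulative relative f)."""
    tally = Counter(values)          # the TALLY step: count each distinct value
    n = len(values)
    rows = []
    cumulative = 0
    for value in sorted(tally):
        f = tally[value]
        cumulative += f
        rows.append((value, f, cumulative, f / n, cumulative / n))
    return rows

# Hypothetical raw data (an assumption, not from the slides):
# number of school-age children in 10 households.
raw = [0, 1, 1, 2, 0, 3, 1, 2, 2, 1]
table = frequency_table(raw)

print("value   f  cf     rf    crf")
for value, f, cf, rf, crf in table:
    print(f"{value:>5} {f:>3} {cf:>3} {rf:>6.2f} {crf:>6.2f}")
```

The last row's cumulative frequency equals the number of observations, and its cumulative relative frequency equals 1, which is a quick check that no value was missed in the tally.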

Quantitative Frequency Table Terms
- Class: a grouping of data values.
- Lower Class Limits: the smallest number belonging to each class.
- Upper Class Limits: the largest number belonging to each class.
- Class Boundaries: numbers used to separate classes without the gaps created by class limits. (Also referred to as Cutpoints.)
- Class Midpoints: the midpoints of the classes.
- Class Width: the difference between two consecutive lower class limits or lower class boundaries.

Using the Old Faithful frequency table:

- Lower Class Limits: 40, 50, 60, 70, 80, 90, 100
- Upper Class Limits: 49, 59, 69, 79, 89, 99, 109

To obtain class boundaries:

1. Find the size of the gap between the upper class limit of one class and the lower class limit of the next class.
2. Add half that amount to each upper class limit to find the upper class boundaries.
3. Subtract half that amount from each lower class limit to find the lower class boundaries.

For the first class: 50 - 49 = 1, half of 1 is 0.5, 40 - 0.5 = 39.5, and 49 + 0.5 = 49.5. The class boundaries are 39.5, 49.5, 59.5, 69.5, 79.5, 89.5, 99.5, 109.5.

Class Midpoints: add the lower class limit to the upper class limit and divide the sum by 2. (40 + 49)/2 = 44.5, the midpoint of the first class.

Class Width: using lower class limits, 50 - 40 = 10; using lower class boundaries, 49.5 - 39.5 = 10.
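The three-step boundary procedure can be sketched in Python as follows, using the Old Faithful class limits given above. The helper names `class_boundaries` and `class_midpoints` are hypothetical, and the sketch assumes equal gaps between all classes.

```python
def class_boundaries(lower_limits, upper_limits):
    """Return the class boundaries (cutpoints) implied by the class limits."""
    # Step 1: gap between one class's upper limit and the next class's lower limit.
    gap = lower_limits[1] - upper_limits[0]
    half = gap / 2
    # Step 3 applied to the first class gives the lowest boundary;
    # step 2 applied to every class gives the remaining boundaries.
    return [lower_limits[0] - half] + [u + half for u in upper_limits]

def class_midpoints(lower_limits, upper_limits):
    """Midpoint of each class: (lower limit + upper limit) / 2."""
    return [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]

lower = [40, 50, 60, 70, 80, 90, 100]
upper = [49, 59, 69, 79, 89, 99, 109]

boundaries = class_boundaries(lower, upper)
midpoints = class_midpoints(lower, upper)
width = lower[1] - lower[0]   # difference between consecutive lower class limits

print(boundaries)
print(midpoints)
print(width)
```

There is always one more boundary than there are classes, because each interior boundary is shared by two adjacent classes.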

Quantitative Frequency Distributions
- Ungrouped (or Single-Value) Frequency Distributions contain a class (grouping) for each value of the variable. Generally, a small number of discrete values are presented.
- Grouped Frequency Distributions combine a series of consecutive values into a class (grouping). Discrete or continuous data may be presented.

Ungrouped or Single-Value Data Example Data for: Number of School-Age Children

Single-Value Grouped Data Table
Each value is represented as a class. Note that each class has only one value, as opposed to an interval of values.

Frequency Table of a Quantitative Variable (Grouped Data Example)
Old Faithful (length of time, in minutes, between eruptions for 200 observations). Table columns: TIME, f (frequency), rf (relative frequency), with a Totals row. Classes represent ranges of discrete or continuous values. Here the class values represent the Lower and Upper Class Limits.

Histograms A way to graphically represent quantitative continuous data
Horizontal scale represents classes. Vertical scale represents frequencies or relative frequencies. Heights of the bars correspond to the frequencies (or relative frequencies). Bars are adjacent to each other; that is, there are no gaps between bars (as there are in bar charts). Let’s look at a histogram of the Old Faithful data.

Histogram for Single Value Data
Note that each discrete value is represented by a bar equaling the value’s frequency or relative frequency and that the bars touch. Here the class values are the midpoints of the bars.

Histogram of Old Faithful Data
(Continuous Data) Time between Eruptions of Old Faithful Geyser. Here the midpoints of the classes are presented.

Anatomy of a Histogram

- Title.
- Note that there are no spaces between bars (continuous data).
- Number of observations.
- Height of each bar represents the frequency in each class.
- Number of occurrences (frequencies) are shown on the vertical axis.
- Empty Class: no data were recorded between 75 and 80.
- The numbers shown on the horizontal axis are the boundaries of each class (also known as cutpoints).
- Each bar represents a class. The number of classes is usually between 5 and 20. Here, there are 17 classes.
- The width of each class is determined by dividing the range of the data set by the number of classes, and rounding up. In this data set, the range is 82. 82/17 = 4.8, rounded up to 5.
- This class goes from 5 to 10.
- Label both horizontal and vertical axes.

NOTE: Sometimes the numbers shown on the horizontal axis are the midpoints of each class. (A class midpoint is also referred to as the mark of the class.)

Dotplot
Each dot represents one observation. Here all 200 time periods are represented; in larger data sets each dot may represent multiple occurrences of a value. For example, on two occasions in this sample, the length of time between eruptions was 90 minutes. (Units: minutes.) SPSS does not do dotplots.
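Since some packages do not draw dotplots, a rough text version is easy to produce by hand. The sketch below prints one dot per observation next to each distinct value. It runs sideways (values down the page) rather than stacking dots over a horizontal axis, and both the function name `text_dotplot` and the sample times are hypothetical, not the Old Faithful data.

```python
from collections import Counter

def text_dotplot(values):
    """Return one line per distinct value, with one dot per observation."""
    tally = Counter(values)
    return [f"{value:>4} | {'.' * tally[value]}" for value in sorted(tally)]

# Hypothetical times between eruptions, in minutes (an assumption for illustration).
times = [88, 90, 75, 90, 62, 88, 88]
lines = text_dotplot(times)
print("\n".join(lines))
```

In this made-up sample the value 90 gets two dots, mirroring the slide's example of two eruption intervals of 90 minutes.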

Stem-and-Leaf Plot (single stem)
Stem-and-leaf of C, N = 200, Leaf Unit = 1.0 (surviving row: 10 | 1)

Stem-and-Leaf (double stem)
Stem-and-leaf of C, N = 200, Leaf Unit = 1.0 (surviving rows: 4 | 122, 7 | 012, 7 | 569, 10 | 1)

Old Faithful data are used for both stem-and-leaf plots.
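The difference between single and double stems can be sketched in a few lines of Python. This is an illustrative sketch, not the statistical package's algorithm: it assumes non-negative whole numbers with a leaf unit of 1, omits empty stems, and uses a hypothetical function name `stem_and_leaf`. The data values are made up, chosen only so that the output resembles the rows shown on the slide.

```python
def stem_and_leaf(values, double_stem=False):
    """Return ordered stem-and-leaf rows for non-negative integers (leaf unit 1)."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # tens digit(s) form the stem, units digit the leaf
        # With double stems, leaves 0-4 go on the first line of a stem
        # and leaves 5-9 go on the second line.
        key = (stem, leaf >= 5) if double_stem else (stem, False)
        rows.setdefault(key, []).append(str(leaf))
    return [f"{stem:>3} | {''.join(leaves)}"
            for (stem, _), leaves in sorted(rows.items())]

# Hypothetical values (an assumption, not the 200 Old Faithful observations).
data = [41, 42, 42, 70, 71, 72, 75, 76, 79, 101]

single = stem_and_leaf(data)
double = stem_and_leaf(data, double_stem=True)
print("\n".join(single))
print()
print("\n".join(double))
```

Because the values are sorted before the leaves are attached, the result is an ordered stem-and-leaf plot. A double stem spreads a crowded stem such as 7 over two lines, which shows the shape of the distribution in finer detail.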

Percent of Day Spent Sleeping by Male Fruitflies
1 8 12 17 20 23 28 35 66 2 9 13 24 36 71 4 29 73 14 21 37 81 25 83 5 10 22 30 38 6 18 26 31 40 15 27 32 42 43 33 49 34 7 16 19 50 62

Create a frequency table containing 9 classes, with the first lower class limit at 0.
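One way to approach this exercise is sketched below in Python. It assumes classes of width 10 written in cutpoint form (0<10, 10<20, and so on up to 80<90), which matches the nine-class fruitfly histogram and the 30<40 class discussed elsewhere in the deck. The function name `grouped_table` is hypothetical, and the short sample is only a handful of the listed values, so its counts are not the answer to the exercise.

```python
def grouped_table(values, first_lower, width, num_classes):
    """Count values into classes of the form lower <= x < upper.

    Returns rows of (class label, frequency, midpoint)."""
    counts = [0] * num_classes
    for x in values:
        index = int((x - first_lower) // width)
        if not 0 <= index < num_classes:
            raise ValueError(f"{x} falls outside the classes")
        counts[index] += 1
    rows = []
    for i, f in enumerate(counts):
        lower = first_lower + i * width   # lower cutpoint
        upper = lower + width             # upper cutpoint (lower cutpoint of next class)
        rows.append((f"{lower}<{upper}", f, (lower + upper) / 2))
    return rows

# A small illustrative sample of percentages (an assumption, not the full data set).
sample = [1, 8, 12, 17, 20, 23, 28, 35, 66, 71, 83]
rows = grouped_table(sample, first_lower=0, width=10, num_classes=9)

for label, f, midpoint in rows:
    print(f"{label:>6} {f:>3} {midpoint:>5}")
```

Classes with a frequency of 0 are kept in the table on purpose: an empty class is still part of the distribution and shows up as a gap in the histogram.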

Dotplot - each dot represents one observation (here, one fruitfly)
For example, 2 of the 125 fruitflies spent 40% of the day sleeping.

Single Stem-and-Leaf
Ordered versus Non-ordered; Depth

Quantitative Presentations
The following data represent the ages of 30 students in a statistics class. Construct a frequency distribution that has five classes. Graphically present these data as a histogram, a stem-and-leaf plot, and a dot plot.

Ages of Students: 18 20 21 27 29 19 30 32 34 24 37 38 22 39 44 33 46 54 49 51
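A possible starting point for this exercise is sketched below in Python. It uses only the 20 ages that appear in this transcript, so the counts cover those values rather than all 30 students. Starting the first class at the smallest age is an assumption; a more convenient lower limit would also be acceptable. The class width follows the rule of dividing the range by the number of classes and rounding up.

```python
import math

# Ages as listed in the transcript (20 values).
ages = [18, 20, 21, 27, 29, 19, 30, 32, 34, 24,
        37, 38, 22, 39, 44, 33, 46, 54, 49, 51]

num_classes = 5
data_range = max(ages) - min(ages)            # 54 - 18 = 36
width = math.ceil(data_range / num_classes)   # 36 / 5 = 7.2, rounded up to 8

table = []
for i in range(num_classes):
    lower = min(ages) + i * width   # lower class limit (assumed to start at the minimum)
    upper = lower + width - 1       # upper class limit for whole-number ages
    f = sum(lower <= a <= upper for a in ages)
    table.append((lower, upper, f))

# A rough text histogram: one star per student in each class.
for lower, upper, f in table:
    print(f"{lower}-{upper}: {'*' * f}")
```

The frequencies sum to the number of ages used, which confirms that every listed value landed in exactly one class.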

End of Slides

Histogram: MALE FRUITFLIES (percent of the day spent sleeping), n=125
Vertical axis: Frequency (ticks from 5 to 40). Horizontal axis: Percent (nine classes labeled in lower<upper form, ending at 90).

- Number of occurrences (frequencies) are shown on the vertical axis. If this were a relative frequency histogram, relative frequencies would be shown on the vertical axis.
- Classes are labeled along the horizontal axis.
- Each bar represents a class.
- Label horizontal and vertical axes.
- Height of each bar represents frequency in each class.
- Note there are no spaces between bars.

Population Distribution Sample Distribution
Population Distribution: the distribution of population data. Sample Distribution: the distribution of sample data. Note that sample distributions vary from sample to sample, but there is only one population distribution, which is the distribution of the variable under consideration. The distribution of the sample approximates the population distribution.

Organizing Data Recall: a variable is a characteristic that varies from one person or thing to another.

Frequency Distribution (fruitfly data)
Table columns: Lower Cutpoint, Upper Cutpoint, Midpoint, Width. The chart here is a frequency distribution table. It is also referred to by Weiss as a grouped data table. Each observation must belong to one and only one class. All classes should be of the same width.

- Lower Cutpoint: the smallest value that can go in a class. In the 30<40 class, 30 is the lower cutpoint.
- Upper Cutpoint: the smallest value that can go in the next higher class. The upper cutpoint of a class is the same as the lower cutpoint of the next higher class. In the 30<40 class, 40 is the upper cutpoint.
- Midpoint: the middle of a class, obtained by adding the lower and upper cutpoints together and then dividing by 2 (the average). In the 30<40 class, (30+40)/2 = 35.
- Width: the difference between the upper and lower cutpoints of a class. Using the 30<40 class, 40 - 30 = 10.