Presentation on theme: "Sexual Activity and the Lifespan of Male Fruitflies"— Presentation transcript:
1 Sexual Activity and the Lifespan of Male Fruitflies: Exploring, Organizing, and Describing Quantitative Data
2 Essentials: Quantitative Data
Know this stuff (stuff: a useful filler term in stats):
Characteristics of quantitative variables.
Building a quantitative frequency table.
From within a quantitative frequency table, be able to identify: classes, class widths, class midpoints, class limits, boundaries (cutpoints).
Identify and construct appropriate charts/graphs for quantitative data.
3 EXPLORING, ORGANIZING, DESCRIBING, AND COMPARING DATA
Before beginning to analyze data, it is important to know three things:
1. Did the data come from a sample or a population?
2. Is the data qualitative or quantitative?
3. In what measurement scale is the data reported?
4 Important Characteristics of a Data Set
Center – an “average” value that indicates where the middle of the data is located.
Variation – a measure of the amount that the values vary among themselves.
Distribution – the “shape” of the distribution of data.
Outliers – values that are far away from the majority of values.
Time – changing characteristics of data over time.
5 FREQUENCY DISTRIBUTIONS
A Frequency Distribution represents the range over which a variable’s values occur.
A Frequency Table lists classes (or categories) of values, along with the frequencies (counts) of the number of values that fall into each class. In addition, a frequency table may show cumulative frequencies, relative frequencies, and cumulative relative frequencies.
Frequency Tables are derived from RAW DATA and a TALLY process.
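The tally process described on this slide can be sketched in a few lines of Python. This is only an illustration: the raw data and the function name `frequency_table` are made up for the example and do not come from the slides.

```python
from collections import Counter
from itertools import accumulate


def frequency_table(raw_values):
    """Return rows of (value, f, cumulative f, rf, cumulative rf)."""
    tally = Counter(raw_values)              # the TALLY step: count each value
    n = len(raw_values)                      # total number of observations
    values = sorted(tally)                   # one class per distinct value
    freqs = [tally[v] for v in values]
    cum_freqs = list(accumulate(freqs))      # running total of frequencies
    return [
        (v, f, cf, f / n, cf / n)
        for v, f, cf in zip(values, freqs, cum_freqs)
    ]


# Hypothetical raw data (NOT from the slides)
raw = [0, 1, 1, 2, 0, 3, 1, 2, 2, 1]
table = frequency_table(raw)

print("value   f  cum f    rf  cum rf")
for value, f, cf, rf, crf in table:
    print(f"{value:>5} {f:>3} {cf:>6} {rf:>5.2f} {crf:>7.2f}")
```

The last row's cumulative frequency should equal the number of observations, and its cumulative relative frequency should equal 1, which is a quick check that nothing was dropped during the tally.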
6 Quantitative Frequency Table Terms
Class – a grouping of data values.
Lower Class Limits – the smallest number belonging to a class.
Upper Class Limits – the largest number belonging to a class.
Class Boundaries – numbers used to separate classes without the gaps created by class limits. (Also referred to as Cutpoints.)
Class Midpoints – the midpoints of the classes.
Class Width – the difference between two consecutive lower class limits or lower class boundaries.
Using the Old Faithful frequency table:
Lower Class Limits – 40, 50, 60, 70, 80, 90, 100
Upper Class Limits – 49, 59, 69, 79, 89, 99, 109
To obtain class boundaries:
1. Find the size of the gap between the upper class limit of one class and the lower class limit of the next class.
2. Add half that amount to each upper class limit to find the upper class boundaries.
3. Subtract half that amount from each lower class limit to find the lower class boundaries.
For the first class: 50 - 49 = 1, half of 1 is 0.5, 40 - 0.5 = 39.5, 49 + 0.5 = 49.5.
Class Boundaries are – 39.5, 49.5, 59.5, 69.5, 79.5, 89.5, 99.5, 109.5
Class Midpoints – add the lower class limit to the upper class limit and divide the sum by 2. (40 + 49)/2 = 44.5, the midpoint of the first class.
Class Width – using lower class limits: 50 - 40 = 10; using lower class boundaries: 49.5 - 39.5 = 10.
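As a rough check of the arithmetic on this slide, the three-step boundary procedure can be written out in Python. The class limits are the Old Faithful values listed on the slide; the variable names are chosen only for this sketch.

```python
# Old Faithful class limits, as listed on the slide
lower_limits = [40, 50, 60, 70, 80, 90, 100]
upper_limits = [49, 59, 69, 79, 89, 99, 109]

# Step 1: gap between one class's upper limit and the next class's lower limit
gap = lower_limits[1] - upper_limits[0]          # 50 - 49 = 1
half_gap = gap / 2                               # 0.5

# Steps 2 and 3: widen each class by half the gap on both sides
upper_boundaries = [hi + half_gap for hi in upper_limits]
lower_boundaries = [lo - half_gap for lo in lower_limits]

# The full list of boundaries (cutpoints): every lower boundary plus the last upper one
boundaries = lower_boundaries + [upper_boundaries[-1]]

# Midpoint of each class: average of its lower and upper limits
midpoints = [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]

# Class width from consecutive lower limits
class_width = lower_limits[1] - lower_limits[0]

print("Boundaries:", boundaries)
print("Midpoints: ", midpoints)
print("Width:     ", class_width)
```

The width computed from consecutive lower boundaries (49.5 - 39.5) agrees with the width computed from lower class limits, as the slide states.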
7 Quantitative Frequency Distributions
Ungrouped (or Single-Value) Frequency Distributions contain a Class (grouping) for each value of the variable. Generally, a small number of Discrete values are presented.
Grouped Frequency Distributions combine a series of consecutive values into a Class (grouping). Discrete or Continuous data may be presented.
8 Ungrouped or Single-Value Data Example Data for: Number of School-Age Children
9 Single-Value Grouped Data Table
Each value is represented as a class. Note that each class has only one value, as opposed to an interval of values.
10 Frequency Table of a Quantitative Variable (Grouped Data Example)
Old Faithful (length of time, in minutes, between eruptions for 200 observations)
[Table: columns TIME, f, rf, with a Totals row; cell values not recovered.]
Classes represent ranges of discrete or continuous values. Here the class values represent the Lower and Upper Class Limits.
11 Histograms
A way to graphically represent quantitative continuous data.
Horizontal scale represents classes.
Vertical scale represents frequencies or relative frequencies.
Heights of the bars correspond to the frequencies (or relative frequencies).
Bars are adjacent to each other. That is, there are no gaps between bars (as occurs with bar charts).
Let’s look at a histogram of the Old Faithful data.
12 Histogram for Single Value Data
Note that each discrete value is represented by a bar equaling the value’s frequency or relative frequency and that the bars touch. Here the class values are the midpoints of the bars.
13 Histogram of Old Faithful Data (Continuous Data)
Time between Eruptions of Old Faithful Geyser
Here the midpoints of the classes are presented.
14 Anatomy of a Histogram
Title.
Label both horizontal and vertical axes.
Number of observations.
Number of occurrences (frequencies) are shown on the vertical axis.
Height of each bar represents the frequency in each class.
Note that there are no spaces between bars (continuous data).
Empty Class: No data were recorded between 75 and 80.
The numbers shown on the horizontal axis are the boundaries of each class. (Also known as cutpoints.)
Each bar represents a class. The number of classes is usually between 5 and 20. Here, there are 17 classes.
The width of each class is determined by dividing the range of the data set by the number of classes, and rounding up. In this data set, the range is 82. 82/17 = 4.8, rounded up to 5.
This class goes from 5 to 10.
NOTE: Sometimes the numbers shown on the horizontal axis are the midpoints of each class. (A class midpoint is also referred to as the mark of the class.)
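The class-width rule on this slide (range divided by number of classes, rounded up) is a one-line calculation. The sketch below uses the slide's own numbers; the function name `class_width` is chosen only for this example.

```python
import math


def class_width(data_range, num_classes):
    """Divide the range by the number of classes and round up to a whole number."""
    return math.ceil(data_range / num_classes)


# Values from the slide: range 82, 17 classes
width = class_width(82, 17)      # 82 / 17 is about 4.8, which rounds up to 5
print(width)
```

Note that `math.ceil` leaves an exact quotient unchanged (for example, 80/16 stays 5), so if the rule is meant to always move to the next larger width, that case would need separate handling.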
15 Dotplot
Here all 200 time periods are represented. In larger data sets each dot may represent multiple occurrences of a value.
Dotplot - each dot represents one observation. For example, on two occasions in this sample, the length of time between eruptions was 90 minutes.
SPSS does not do Dotplots.
(Axis units: Minutes)
16 Stem-and-Leaf Plot (single stem)
Stem-and-leaf of C, N = 200, Leaf Unit = 1.0
[Single-stem plot rows not recovered.]
Stem-and-Leaf (double stem)
Stem-and-leaf of C, N = 200, Leaf Unit = 1.0
[Double-stem plot rows not recovered.]
Old Faithful data used for both stem-and-leaf plots.
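To make the difference between single-stem and double-stem plots concrete, here is a minimal Python sketch. It assumes non-negative whole-number data with a leaf unit of 1; the sample values are hypothetical and are not the Old Faithful observations, and the function name `stem_and_leaf` is chosen only for this example.

```python
from collections import defaultdict


def stem_and_leaf(values, double_stem=False):
    """Return the rows of an ordered stem-and-leaf plot (leaf unit = 1).

    With double_stem=True each stem gets two rows:
    leaves 0-4 on the first row, leaves 5-9 on the second.
    """
    ordered = sorted(values)
    parts = 2 if double_stem else 1
    rows = defaultdict(list)
    for v in ordered:
        stem, leaf = divmod(v, 10)               # tens digit(s) and units digit
        part = leaf // 5 if double_stem else 0   # 0 = low half, 1 = high half
        rows[(stem, part)].append(leaf)

    lines = []
    # Include every stem between the smallest and largest, even if empty
    for stem in range(ordered[0] // 10, ordered[-1] // 10 + 1):
        for part in range(parts):
            leaves = "".join(str(leaf) for leaf in rows[(stem, part)])
            lines.append(f"{stem:>2} | {leaves}".rstrip())
    return lines


# Hypothetical data (NOT the Old Faithful sample)
sample = [43, 47, 51, 55, 58, 62, 62, 69, 71, 90]

print("\n".join(stem_and_leaf(sample)))
print()
print("\n".join(stem_and_leaf(sample, double_stem=True)))
```

Empty stems are printed on purpose, so that gaps in the data remain visible in the shape of the plot.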
17 Percent of Day Spent Sleeping by Male Fruitflies
18121720232835662913243671429731421378125835102230386182631401527324243334934716195062
Create a frequency table containing 9 classes, with the first lower class limit at 0.
18 Dotplot - each dot represents one observation (here, one fruitfly). For example, 2 of the 125 fruitflies spent 40% of the day sleeping.
19 Single Stem-and-Leaf - Ordered versus Non-ordered
Depth -
20 Quantitative Presentations
The following data represent the ages of 30 students in a statistics class. Construct a frequency distribution that has five classes. Graphically present these data as a histogram, stem-and-leaf plot, and dot plot.
Ages of Students: 18 20 21 27 29 19 30 32 34 24 37 38 22 39 44 33 46 54 49 51
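One possible way to work this exercise is sketched below in Python. Two assumptions are made: the run of digits on the slide is read as two-digit ages, which yields only 20 values even though the slide mentions 30 students, and the first class is started at the smallest age, which is a choice made for this sketch rather than something the slide specifies.

```python
import math

# Ages as read from the slide, assuming two digits per age
# (only 20 values survive in the transcript)
ages = [18, 20, 21, 27, 29, 19, 30, 32, 34, 24,
        37, 38, 22, 39, 44, 33, 46, 54, 49, 51]

num_classes = 5
data_range = max(ages) - min(ages)                 # 54 - 18 = 36
width = math.ceil(data_range / num_classes)        # 36 / 5 = 7.2, rounded up to 8
start = min(ages)                                  # assumed first lower class limit

# Build (lower limit, upper limit, frequency) for each class
classes = []
for i in range(num_classes):
    lower = start + i * width
    upper = lower + width - 1                      # limits are whole-number ages
    f = sum(lower <= age <= upper for age in ages)
    classes.append((lower, upper, f))

# Print the frequency distribution with a rough text histogram
for lower, upper, f in classes:
    print(f"{lower}-{upper}  {f:>2}  {'*' * f}")
```

The frequencies should sum to the number of ages used, which confirms that every observation falls in exactly one class.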
22 MALE FRUITFLIES (percent of the day spent sleeping), n = 125
[Histogram: vertical axis "Frequency", ticks from 5 to 40; horizontal axis "Percent", classes labeled in "<" form up to < 90.]
Number of occurrences (frequencies) are shown on the vertical axis. If this were a relative frequency histogram, relative frequencies would be shown on the vertical axis.
Classes are labeled along the horizontal axis.
Each bar represents a class.
Label horizontal and vertical axes.
Height of each bar represents frequency in each class.
Note there are no spaces between bars.
23 Population Distribution and Sample Distribution
Population Distribution - distribution of population data.
Sample Distribution - distribution of sample data.
Note that sample distributions vary from sample to sample, but there is only one population distribution, which is the distribution of the variable under consideration.
The distribution of the sample approximates the population distribution.
24 Stem-and-Leaf Plot (single stem)
Stem-and-leaf of C, N = 200, Leaf Unit = 1.0
[Single-stem plot rows not recovered.]
25 Stem-and-Leaf (double stem)
Stem-and-leaf of C, N = 200, Leaf Unit = 1.0
[Double-stem plot rows not recovered.]
26 Organizing Data
Recall: a variable is a characteristic that varies from one person or thing to another.
27 Frequency Distribution (fruitfly data)
[Table callouts: Lower Cutpoint, Upper Cutpoint, Midpoint, Width]
The chart here is a frequency distribution table. Weiss also refers to it as a grouped data table.
Each observation must belong to one and only one class.
All classes should be of the same width.
Lower Cutpoint - the smallest value that can go in a class. In the 30<40 class, 30 is the lower cutpoint.
Upper Cutpoint - the smallest value that can go in the next higher class. The upper cutpoint of a class is the same as the lower cutpoint of the next higher class. In the 30<40 class, 40 is the upper cutpoint.
Midpoint - the middle of a class, obtained by adding the lower and upper cutpoints together and then dividing by 2 (the average). In the 30<40 class, (30 + 40)/2 = 35.
Width - the difference between the upper and lower cutpoints of a class. Using the 30<40 class, 40 - 30 = 10.
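The cutpoint definitions above can be illustrated with a short Python sketch. It uses nine classes of width 10 starting at 0, which matches the 30<40 class on this slide and the earlier instruction to use 9 classes with a first lower limit of 0. The sample percentages are hypothetical, not the fruitfly data, and the function name `cutpoint_table` is chosen only for this example.

```python
def cutpoint_table(values, first_lower=0, width=10, num_classes=9):
    """Group values into classes of the form lower<upper using cutpoints.

    A value v belongs to a class when lower <= v < upper, so a value equal
    to an upper cutpoint falls into the next higher class.
    """
    rows = []
    for i in range(num_classes):
        lower = first_lower + i * width
        upper = lower + width                   # also the next class's lower cutpoint
        f = sum(lower <= v < upper for v in values)
        rows.append({
            "class": f"{lower}<{upper}",
            "lower": lower,
            "upper": upper,
            "midpoint": (lower + upper) / 2,    # average of the two cutpoints
            "width": upper - lower,
            "f": f,
        })
    return rows


# Hypothetical percentages (NOT the fruitfly data)
sample = [5, 12, 30, 35, 39.9, 40, 62, 89]
rows = cutpoint_table(sample)

for row in rows:
    print(f"{row['class']:>6}  midpoint={row['midpoint']:>4}  f={row['f']}")
```

In this sample the value 40 is counted in the 40<50 class rather than in 30<40, which is how each observation ends up in one and only one class.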