1
Descriptive Statistics
Tabular and Graphical Displays
Frequency Distribution - List of intervals of values for a variable, and the number of occurrences per interval
Relative Frequency - Proportion (often reported as a percentage) of observations falling in the interval
Histogram/Bar Chart - Graphical representation of a relative frequency distribution
Stem and Leaf Plot - Horizontal tabular display of data, based on 2 digits (stem/leaf)
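A frequency distribution can be built by hand in a few lines. The sketch below is only an illustration: the scores are made-up data, and the function name `frequency_table`, the interval width of 10, and the starting point of 50 are all assumptions chosen for the example. Intervals with no observations are simply skipped.

```python
from collections import Counter

def frequency_table(values, width, start):
    """Count observations in intervals [start + k*width, start + (k+1)*width)."""
    counts = Counter((v - start) // width for v in values)
    n = len(values)
    table = []
    for k in sorted(counts):
        lower = start + k * width
        # (lower bound, upper bound, frequency, relative frequency)
        table.append((lower, lower + width, counts[k], counts[k] / n))
    return table

scores = [52, 57, 61, 64, 68, 70, 71, 75, 78, 83, 85, 91]  # made-up data
table = frequency_table(scores, width=10, start=50)

# Print the table with a crude text histogram (one star per observation).
for lower, upper, freq, rel in table:
    print(f"[{lower}, {upper}): {freq:2d}  {rel:6.1%}  {'*' * freq}")
```

The relative frequencies always sum to 1, which is why they are convenient for comparing groups of different sizes.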
2
Comparing Groups
Side-by-side bar charts
3-dimensional histograms
Back-to-back stem and leaf plots
Goal: Compare 2 (or more) groups with respect to the variable(s) being measured. Do measurements tend to differ among groups?
3
Sample & Population Distributions
Distributions of Samples and Populations - As samples get larger, the sample distribution gets smoother and looks more like the population distribution
U-shaped - Measurements tend to be large or small, with fewer in the middle range of values
Bell-shaped - Measurements tend to cluster around the middle with few extremes (symmetric)
Skewed Right - Few extreme large values
Skewed Left - Few extreme small values
4
Measures of Central Tendency
Mean - Sum of all measurements divided by the number of observations (even distribution of outcomes among cases). Can be highly influenced by extreme values.
Notation: Sample measurements labeled Y1,...,Yn
\[ \bar{Y} = \frac{Y_1 + \cdots + Y_n}{n} = \frac{\sum_{i=1}^{n} Y_i}{n} \]
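To see how strongly a single extreme value can pull the mean, here is a minimal sketch. The income figures are hypothetical and chosen only for illustration.

```python
def mean(ys):
    """Sum of all measurements divided by the number of observations."""
    return sum(ys) / len(ys)

incomes = [30, 32, 35, 38, 40]      # hypothetical incomes, in $1000s
with_outlier = incomes + [400]      # add one extreme value

print(mean(incomes))        # 35.0
print(mean(with_outlier))   # larger than every one of the original five values
```

With the extreme value included, the mean no longer describes a typical case in the data.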
5
Median, Percentiles, Mode
Median - Middle measurement after data have been ordered from smallest to largest. Appropriate for interval and ordinal scales
Pth percentile - Value where P% of measurements fall below and (100-P)% lie above. Lower quartile (25th), Median (50th), Upper quartile (75th) often reported
Mode - Most frequently occurring outcome. Typically reported for ordinal and nominal data.
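These summaries are available in Python's standard `statistics` module. The sketch below uses made-up data. Note that several conventions exist for interpolating percentiles; `method="inclusive"` is one of them and is an assumption here, so other software may report slightly different quartiles.

```python
import statistics

data = [2, 3, 3, 5, 7, 8, 9, 12, 15]   # made-up data, n = 9

median = statistics.median(data)
# Three cut points that split the ordered data into four quarters.
lower_q, middle, upper_q = statistics.quantiles(data, n=4, method="inclusive")
modes = statistics.multimode(data)     # list, since ties are possible

print("median:", median)
print("quartiles:", lower_q, middle, upper_q)
print("mode(s):", modes)
```

The 50th percentile and the median coincide, as the definitions require.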
6
Measures of Variation
Measures of how similar or different individuals’ measurements are
Range - Largest minus smallest observation
Deviation - Difference between the ith individual’s outcome and the sample mean \( \bar{Y} \): \( Y_i - \bar{Y} \)
Variance - For n observations Y1,...,Yn, the “average” squared deviation: \( s^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n-1} \)
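A short worked example of the range, the deviations, and the variance, using made-up data. The deviations from the sample mean always sum to zero, which is why they are squared before averaging. The n-1 divisor follows the usual sample variance convention.

```python
ys = [3, 7, 8, 10, 12]                 # made-up measurements
n = len(ys)
ybar = sum(ys) / n                     # sample mean

data_range = max(ys) - min(ys)         # largest minus smallest
deviations = [y - ybar for y in ys]    # Y_i minus the sample mean
variance = sum(d ** 2 for d in deviations) / (n - 1)

print("range:", data_range)            # 9
print("deviations:", deviations)       # [-5.0, -1.0, 0.0, 2.0, 4.0]
print("sum of deviations:", sum(deviations))
print("variance:", variance)           # 11.5
```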
7
Measures of Variation
Standard Deviation - Positive square root of the variance (measured in original units): \( s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n-1}} \)
Properties of the standard deviation:
s ≥ 0, and s = 0 only if all observations are equal
s increases with the amount of variation around the mean
Division by n-1 (not n) is due to technical reasons (later)
s depends on the units of the data (e.g. $1000s vs $)
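The listed properties can be checked numerically. This is a minimal sketch with hypothetical income data; the helper name `sample_sd` is an assumption for the example.

```python
import math

def sample_sd(ys):
    """Sample standard deviation with the n-1 divisor."""
    n = len(ys)
    ybar = sum(ys) / n
    return math.sqrt(sum((y - ybar) ** 2 for y in ys) / (n - 1))

in_thousands = [30, 32, 35, 38, 40]            # hypothetical incomes, in $1000s
in_dollars = [1000 * y for y in in_thousands]  # same incomes, in $
more_spread = [20, 26, 35, 44, 50]             # same mean, more variation

print(sample_sd([7, 7, 7, 7]))    # 0.0, since all observations are equal
print(sample_sd(in_thousands))
print(sample_sd(more_spread))     # larger: more variation around the mean
print(sample_sd(in_dollars))      # 1000 times the value in $1000s
```

Changing the units rescales s by the same factor, so a standard deviation is only meaningful together with its units.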
8
Empirical Rule
If the histogram of the data is approximately bell-shaped, then:
Approximately 68% of measurements lie within 1 standard deviation of the mean.
Approximately 95% of measurements lie within 2 standard deviations of the mean.
Virtually all of the measurements lie within 3 standard deviations of the mean.
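The percentages in the rule match the areas under an exactly normal curve, which is used here as a stand-in for "approximately bell-shaped" (an assumption; real data will only come close). A small sketch using the standard library:

```python
from statistics import NormalDist

z = NormalDist()   # standard normal: mean 0, standard deviation 1

# Proportion of the distribution within k standard deviations of the mean.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"within {k} SD: {p:.1%}")
```

The three proportions come out to roughly 68%, 95%, and 99.7%.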
9
Other Measures and Plots
Interquartile Range (IQR) - 75th percentile minus 25th percentile (measures the spread in the middle 50% of data)
Box Plots - Display a box containing the middle 50% of measurements, with a line at the median and lines extending from the box. Breaks data into four quarters
Outliers - Observations falling more than 1.5 × IQR above the upper quartile or below the lower quartile
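The 1.5 × IQR outlier rule can be sketched as follows. The data are made up, and the quartiles use the `"inclusive"` interpolation method from the `statistics` module, which is one convention among several (an assumption for this example).

```python
import statistics

data = [2, 3, 3, 5, 7, 8, 9, 12, 40]   # made-up data with one extreme value

lower_q, median, upper_q = statistics.quantiles(data, n=4, method="inclusive")
iqr = upper_q - lower_q

# Fences: anything beyond these counts as an outlier under the 1.5 x IQR rule.
low_fence = lower_q - 1.5 * iqr
high_fence = upper_q + 1.5 * iqr
outliers = [y for y in data if y < low_fence or y > high_fence]

print("IQR:", iqr)
print("fences:", low_fence, high_fence)
print("outliers:", outliers)
```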
10
Sample Statistics/Population Parameters
The sample mean and standard deviation are the most commonly reported summaries of sample data. They are random variables, since they change from one sample to another. The population mean (μ) and standard deviation (σ), computed from a population of measurements, are fixed (unknown in practice) values called parameters.
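A small simulation illustrates the distinction. Everything here is an assumption made for the sketch: the population is simulated (10,000 values from a bell-shaped distribution), the sample size is 25, and the seed is fixed only so the run is repeatable.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the sketch is repeatable

# Simulated population; in practice the full population is rarely observed.
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = statistics.fmean(population)       # population mean: a fixed parameter
sigma = statistics.pstdev(population)   # population SD (divides by N)

# Draw several samples; each yields a different sample mean and SD.
sample_stats = []
for _ in range(5):
    sample = rng.sample(population, 25)
    sample_stats.append((statistics.fmean(sample), statistics.stdev(sample)))

print(f"population: mean={mu:.2f}, sd={sigma:.2f}")
for ybar, s in sample_stats:
    print(f"sample:     mean={ybar:.2f}, sd={s:.2f}")
```

The population line stays the same no matter how many samples are drawn, while each sample line differs.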