Descriptive Statistics Tabular and Graphical Displays –Frequency Distribution - List of intervals of values for a variable, and the number of occurrences.

1 Descriptive Statistics
Tabular and Graphical Displays
– Frequency Distribution - List of intervals of values for a variable, and the number of occurrences per interval
– Relative Frequency - Proportion (often reported as a percentage) of observations falling in the interval
– Histogram/Bar Chart - Graphical representation of a Relative Frequency distribution
– Stem and Leaf Plot - Horizontal tabular display of data, based on 2 digits (stem/leaf)
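As a rough illustration of the first two displays, the sketch below builds a frequency and relative frequency table in Python. The data values, the interval width of 10, and the helper name `frequency_distribution` are assumptions made for the example, not part of the slides.

```python
from collections import Counter

def frequency_distribution(values, width):
    """Count observations per interval [lo, lo + width)."""
    # Map each value to the lower bound of its interval, then count.
    counts = Counter((v // width) * width for v in values)
    n = len(values)
    # Each row: (interval lower bound, frequency, relative frequency)
    return [(lo, c, c / n) for lo, c in sorted(counts.items())]

data = [12, 15, 21, 22, 23, 27, 31, 34, 35, 48]  # hypothetical measurements
table = frequency_distribution(data, 10)

for lo, count, rel in table:
    # Integer data, so the interval [lo, lo + 10) is shown as lo to lo + 9.
    print(f"{lo}-{lo + 9}: frequency {count}, relative frequency {rel:.0%}")
```

Plotting the relative frequencies as bar heights over the intervals would give the corresponding histogram.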

2 Comparing Groups
– Side-by-side bar charts
– 3-dimensional histograms
– Back-to-back stem and leaf plots
Goal: Compare 2 (or more) groups with respect to the variable(s) being measured. Do measurements tend to differ among groups?

3 Sample & Population Distributions
Distributions of Samples and Populations - As samples get larger, the sample distribution gets smoother and looks more like the population distribution
– U-shaped - Measurements tend to be large or small, fewer in middle range of values
– Bell-shaped - Measurements tend to cluster around the middle with few extremes (symmetric)
– Skewed Right - Few extreme large values
– Skewed Left - Few extreme small values

4 Measures of Central Tendency
Mean - Sum of all measurements divided by the number of observations (even distribution of outcomes among cases). Can be highly influenced by extreme values.
Notation: Sample measurements labeled $Y_1, \ldots, Y_n$, with sample mean $\bar{Y} = \frac{\sum_{i=1}^{n} Y_i}{n}$
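A small sketch of how one extreme value can pull the mean, using made-up incomes in $1000s (the numbers and the helper name `mean` are assumptions for illustration only):

```python
def mean(ys):
    """Sum of all measurements divided by the number of observations."""
    return sum(ys) / len(ys)

incomes = [30, 32, 35, 38, 40]   # hypothetical incomes, in $1000s
with_extreme = incomes + [500]   # same data plus one extreme value

print(mean(incomes))       # 35.0
print(mean(with_extreme))  # 112.5
```

Five of the six values in the second list are at most 40, yet the mean is 112.5, which describes none of them well.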

5 Median, Percentiles, Mode
Median - Middle measurement after data have been ordered from smallest to largest. Appropriate for interval and ordinal scales.
Pth percentile - Value where P% of measurements fall below and (100-P)% lie above. Lower quartile (25th), Median (50th), Upper quartile (75th) often reported.
Mode - Most frequently occurring outcome. Typically reported for ordinal and nominal data.
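These summaries can be sketched with Python's standard `statistics` module (Python 3.8 or later). The data are hypothetical. Note that `statistics.quantiles` uses one of several common percentile conventions, so other software may report slightly different quartiles for the same data.

```python
import statistics

ys = [2, 4, 4, 5, 7, 9, 10, 12, 15]  # hypothetical measurements

median = statistics.median(ys)
# n=4 splits the data into quarters and returns the three cut points:
# lower quartile (25th), median (50th), upper quartile (75th).
lower_q, middle, upper_q = statistics.quantiles(ys, n=4)
mode = statistics.mode(ys)

print("median:", median)
print("quartiles:", lower_q, middle, upper_q)
print("mode:", mode)
```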

6 Measures of Variation
Measures of how similar or different individuals' measurements are
– Range - Largest observation minus smallest observation
– Deviation - Difference between the ith individual's outcome and the sample mean: $Y_i - \bar{Y}$
– Variance of n observations $Y_1, \ldots, Y_n$ is the "average" squared deviation: $s^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n-1}$

7 Measures of Variation
Standard Deviation - Positive square root of the variance (measured in original units): $s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n-1}}$
Properties of the standard deviation:
– $s \ge 0$, and only equals 0 if all observations are equal
– s increases with the amount of variation around the mean
– Division by n-1 (not n) is due to technical reasons (later)
– s depends on the units of the data (e.g. $1000s vs $)
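A minimal sketch of the sample variance and standard deviation with division by n-1, checked against two of the listed properties. The function names and data are assumptions for illustration.

```python
import math

def sample_variance(ys):
    """'Average' squared deviation from the mean, dividing by n - 1."""
    n = len(ys)
    ybar = sum(ys) / n
    return sum((y - ybar) ** 2 for y in ys) / (n - 1)

def sample_sd(ys):
    """Positive square root of the sample variance."""
    return math.sqrt(sample_variance(ys))

ys = [2, 4, 4, 4, 5, 5, 7, 9]            # hypothetical data, in $1000s
in_dollars = [1000 * y for y in ys]      # same data expressed in $

print(sample_variance(ys))               # 32/7, about 4.57
print(sample_sd(ys))                     # about 2.14
print(sample_sd(in_dollars))             # about 2138: s depends on the units
print(sample_sd([3, 3, 3]))              # 0.0 when all observations are equal
```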

8 Empirical Rule
If the histogram of the data is approximately bell-shaped, then:
– Approximately 68% of measurements lie within 1 standard deviation of the mean.
– Approximately 95% of measurements lie within 2 standard deviations of the mean.
– Virtually all of the measurements lie within 3 standard deviations of the mean.
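One way to see the rule in action is to simulate bell-shaped data and count how many values fall within 1, 2, and 3 standard deviations of the mean. This is only a sketch: the normal distribution with mean 100 and standard deviation 15, the sample size, and the seed are all arbitrary choices for the example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable
ys = [random.gauss(100, 15) for _ in range(10_000)]  # simulated bell-shaped data

ybar = statistics.mean(ys)
s = statistics.stdev(ys)  # sample standard deviation (divides by n - 1)

def fraction_within(k):
    """Proportion of measurements within k standard deviations of the mean."""
    return sum(abs(y - ybar) <= k * s for y in ys) / len(ys)

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.1%}")
```

The printed proportions should land close to 68%, 95%, and nearly 100%. For clearly skewed or U-shaped data they can be quite different.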

9 Other Measures and Plots
Interquartile Range (IQR) - 75th percentile minus 25th percentile (measures the spread in the middle 50% of data)
Box Plots - Display a box containing middle 50% of measurements with line at median and lines extending from box. Breaks data into four quartiles.
Outliers - Observations falling more than 1.5 × IQR above (below) upper (lower) quartile
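The 1.5 × IQR rule can be sketched as follows. The data are hypothetical, and the quartiles come from `statistics.quantiles` (Python 3.8 or later), whose convention may differ slightly from the one used by a given textbook or package.

```python
import statistics

ys = [2, 4, 4, 5, 7, 9, 10, 12, 15, 40]  # hypothetical measurements

lower_q, _, upper_q = statistics.quantiles(ys, n=4)
iqr = upper_q - lower_q

# Fences: 1.5 * IQR below the lower quartile and above the upper quartile.
low_fence = lower_q - 1.5 * iqr
high_fence = upper_q + 1.5 * iqr

outliers = [y for y in ys if y < low_fence or y > high_fence]
print("IQR:", iqr)
print("outliers:", outliers)
```

Here the value 40 lies above the upper fence and is flagged. In a box plot it would be drawn as a separate point beyond the line extending from the box.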

10 Dependent and Independent Variables
Dependent variables are outcomes of interest to investigators. Also referred to as Responses or Endpoints.
Independent variables are Factors that are often hypothesized to affect the outcomes (levels of dependent variables). Also referred to as Predictor or Explanatory Variables.
Research question: Does I.V. → D.V.?

11 Example - Clinical Trials of Cialis
Clinical trials conducted worldwide to study efficacy and safety of Cialis (Tadalafil) for ED
Patients randomized to Placebo, 10mg, and 20mg
Co-Primary outcomes:
– Change from baseline in erectile function domain of the International Index of Erectile Function (Numeric)
– Response to: “Were you able to insert your P… into your partner’s V…?” (Nominal: Yes/No)
– Response to: “Did your erection last long enough for you to have successful intercourse?” (Nominal: Yes/No)
Source: Carson, et al. (2004).

12 Example - Clinical Trials of Cialis
Population: All adult males suffering from erectile dysfunction
Sample: 2102 men with mild-to-severe ED in 11 randomized clinical trials
Dependent Variable(s): Co-primary outcomes listed on previous slide
Independent Variable: Cialis Dose (0, 10, 20 mg)
Research Question: Does use of Cialis improve erectile function?

13 Sample Statistics/Population Parameters
Sample means and standard deviations are the most commonly reported summaries of sample data. They are random variables since they will change from one sample to another.
Population Mean ($\mu$) and Standard Deviation ($\sigma$) computed from a population of measurements are fixed (unknown in practice) values called parameters.
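The distinction can be illustrated with a small simulation: a fixed (here, artificially known) population has one mean and one standard deviation, while repeated samples from it give different sample means. The population, the sample size of 25, and the seed are assumptions made purely for the sketch.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# A simulated "population" of measurements; in practice it is not observed in full.
population = [random.gauss(50, 10) for _ in range(10_000)]
mu = statistics.mean(population)       # population mean: a fixed parameter
sigma = statistics.pstdev(population)  # population SD (divides by N)

# Draw five independent samples of 25 and compute each sample mean.
samples = [random.sample(population, 25) for _ in range(5)]
sample_means = [statistics.mean(sample) for sample in samples]

print(f"mu = {mu:.2f}, sigma = {sigma:.2f}")
print("sample means:", [round(m, 2) for m in sample_means])
```

The parameters print once and never change, while each of the five sample means differs from the others and from the population mean.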

