Statistics
Review of Statistics: Levels of Measurement; Descriptive and Inferential Statistics
Levels of Measurement
The nature of a variable determines the rules applied to its measurement.
Qualitative data
• Nominal
• Ordinal
Quantitative data
• Interval
• Ratio
Nominal Measurement
• Lowest level of measurement
• Sorts observations into categories
• Numbers are merely symbols and have no quantitative significance
• Can only establish equivalence or nonequivalence
Examples: gender, marital status, etc.
Examples of nominal coding: male/female, smoker/nonsmoker, alive/dead, each pair coded 1/2.
Rules of the Nominal System
• All members of one category are assigned the same number
• No two categories are assigned the same number (mutual exclusivity)
• The numbers cannot be treated mathematically
• The mode is the only measure of central tendency
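The rules above can be sketched in a few lines of Python. The smoking-status data and the code values 1 and 2 are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical nominal data: smoking status for eight patients.
statuses = ["smoker", "nonsmoker", "nonsmoker", "smoker",
            "nonsmoker", "nonsmoker", "smoker", "nonsmoker"]

# Each category gets exactly one code, and no two categories share a code.
codes = {"smoker": 1, "nonsmoker": 2}
coded = [codes[s] for s in statuses]

# Counting members and finding the mode are meaningful for nominal data.
counts = Counter(coded)
mode_code = counts.most_common(1)[0][0]

# A mean of the codes can be computed, but it has no meaning:
# it would change if the arbitrary labels 1 and 2 were swapped.
meaningless_mean = sum(coded) / len(coded)

print(counts)
print(mode_code)
```

Swapping the two labels leaves the counts and the modal category unchanged but changes the "mean", which is why the mean is not used at this level.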
The Ordinal Scale
• Sorts observations on the basis of their relative standing to each other
• Attributes are ordered according to some criterion (e.g., best to worst)
• Intervals between ranks are not necessarily equal
• Should not be treated mathematically; frequencies and modes are acceptable (the median is also defined for ranked data)
Ordinal scale
Interval Scale
• The researcher can specify both the rank ordering of values and the distance between them
• Intervals are equal, but there is no rational zero point (examples: IQ scale, Fahrenheit scale)
• Data can be treated mathematically; most statistical tests are possible
Ratio Scale
• Highest level of measurement
• Rational, meaningful zero point
• Gives the absolute magnitude of the variable (e.g., mg/ml of glucose in urine)
• Ideal for all statistical tests
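One way to see why the zero point matters is to compare two temperatures on the Fahrenheit scale (interval) and on the Kelvin scale (ratio). This is a minimal sketch; the two temperatures are arbitrary illustrative values.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert Fahrenheit (interval scale) to Kelvin (ratio scale)."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15

cool_f, warm_f = 40.0, 80.0  # illustrative values only

# Differences are meaningful on an interval scale.
diff_f = warm_f - cool_f

# Ratios are not: 80 F is not "twice as hot" as 40 F,
# because 0 F is an arbitrary zero point.
ratio_f = warm_f / cool_f

# On a scale with a true zero the ratio is meaningful.
ratio_k = fahrenheit_to_kelvin(warm_f) / fahrenheit_to_kelvin(cool_f)

print(diff_f, ratio_f, round(ratio_k, 3))
```

The Fahrenheit ratio comes out as 2.0, while the Kelvin ratio is only about 1.08, so the apparent "doubling" is an artifact of where the Fahrenheit zero happens to sit.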
Descriptive Statistics
Used to describe data:
• Frequency distributions, histograms, polygons
• Measures of central tendency
• Dispersion
• Position within a sample
Frequency Distributions
A frequency distribution imposes order on a mass of numerical data by arranging the values systematically from lowest to highest, with a count of the number of times each value was obtained. It is most frequently represented as a frequency polygon.
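As a rough illustration, a frequency distribution can be built with Python's standard library. The scores below are hypothetical pain ratings, not data from this review.

```python
from collections import Counter

# Hypothetical pain ratings from ten patients.
scores = [3, 5, 4, 3, 2, 4, 4, 5, 3, 4]

# Count how many times each value was obtained,
# then arrange the values from lowest to highest.
freq = Counter(scores)
table = sorted(freq.items())

# Print a simple text table with a bar of asterisks per value.
for value, count in table:
    print(f"{value:>5}  {count:>3}  {'*' * count}")
```

Plotting the counts against the values and joining the points with straight lines would give the frequency polygon.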
Frequency distribution
Shapes of Distributions
• Symmetry
• Modality
• Kurtosis
Symmetry
• The normal curve is symmetrical
• A nonsymmetrical distribution is skewed (the peak is off center)
  – positively skewed (long tail toward the high values)
  – negatively skewed (long tail toward the low values)
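Skew can also be expressed as a number. The sketch below uses the moment coefficient of skewness (third central moment divided by the 1.5 power of the second), which is one common definition among several; the data sets are hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 3, 4, 4, 12]    # one extreme high score
left_tailed = [13 - x for x in right_tailed]   # mirror image of the above

print(skewness(symmetric))     # zero for a symmetrical set
print(skewness(right_tailed))  # positive: positively skewed
print(skewness(left_tailed))   # negative: negatively skewed
```

The sign of the coefficient matches the direction of the long tail.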
Positive skew
Negative skew
Modality
• Describes how many peaks are in the distribution
  – unimodal
  – bimodal
  – multimodal
unimodal
bimodal
multimodal
Kurtosis
• Peakedness of the distribution
  – platykurtic (flat)
  – mesokurtic (moderate, like the normal curve)
  – leptokurtic (sharply peaked)
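Kurtosis is often summarized as excess kurtosis, the fourth standardized moment minus 3, so that a normal curve scores about zero. The sketch below assumes that population-moment definition and uses two small hypothetical data sets.

```python
def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0

# Evenly spread scores: a flat shape.
flat = list(range(1, 11))

# Most scores piled on one value with two far-out scores: a sharp peak.
peaked = [5, 5, 5, 5, 5, 5, 5, 5, 1, 9]

print(excess_kurtosis(flat))    # negative: platykurtic
print(excess_kurtosis(peaked))  # positive: leptokurtic
```

Values near zero would correspond to a mesokurtic shape; how near is "near" is a judgment call that this sketch does not settle.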
Mesokurtic
Platykurtic
Leptokurtic
Measures of Central Tendency
An overall summary of a group’s characteristics:
• “What is the average level of pain described by post-hysterectomy patients?”
• “How much information does the typical teen have about STDs?”
Mean
• Arithmetic average
• Most widely reported measure of central tendency
• Not trustworthy on skewed distributions
Median
• The point on a distribution above which 50% of observations fall (and below which the other 50% fall)
• Shows how central the mean really is, since the median divides the sample in half
• Does not take into account the quantitative values of individual scores
• Preferred in a skewed distribution
Mode
• The most frequently occurring score or value within a distribution
• Not affected by extreme values
• Shows where scores cluster
• There may be more than one mode in a distribution
• Arrived at through inspection
• Of limited usefulness in computations
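The three measures can be compared on a small skewed sample using Python's standard `statistics` module. The scores are hypothetical; note the single extreme value of 10.

```python
import statistics

# Hypothetical pain scores with one extreme high value.
scores = [2, 3, 3, 3, 4, 4, 5, 10]

mean = statistics.mean(scores)        # pulled upward by the extreme score
median = statistics.median(scores)    # middle of the ordered scores
modes = statistics.multimode(scores)  # list, since there may be several modes

print(mean, median, modes)
```

Here the mean (4.25) sits above the median (3.5) and the mode (3), which is the pattern expected in a positively skewed distribution and the reason the median is preferred there.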
Which measure of central tendency is represented by each of these lines?
Variability or Dispersion Measures
• Percentile: the point below which a given percentage of scores fall (that percentage is the score's percentile rank)
• Range: highest score minus lowest score
• Standard deviation: the master measure of variability; roughly the average distance of scores from the mean; allows one to interpret a score in relation to others in the distribution
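A minimal sketch of these measures follows. The scores are hypothetical, the standard deviation is the population form (`statistics.pstdev`), and percentile rank is taken here as the percentage of scores strictly below a given score, which is one of several conventions in use.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Population standard deviation (treats the scores as the whole group).
sd = statistics.pstdev(scores)

def percentile_rank(values, score):
    """Percentage of values strictly below the given score."""
    below = sum(1 for v in values if v < score)
    return 100.0 * below / len(values)

def z_score(values, score):
    """Distance of a score from the mean, in standard deviation units."""
    return (score - statistics.mean(values)) / statistics.pstdev(values)

print(score_range, sd, percentile_rank(scores, 7), z_score(scores, 9))
```

The z score shows how the standard deviation lets a single score be interpreted relative to the rest: a score of 9 lies two standard deviations above the mean of 5.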
Normal (Gaussian) Distribution
• Mathematical ideal
  – 68.3% of scores within +/- 1 SD
  – 95.4% of scores within +/- 2 SD
  – 99.7% of scores within +/- 3 SD
• Unimodal, mesokurtic, symmetrical
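These percentages can be checked numerically. For a normal distribution, the proportion of scores within k standard deviations of the mean equals erf(k / sqrt(2)), which Python's `math` module can evaluate directly.

```python
import math

def within_k_sd(k):
    """Proportion of a normal distribution within +/- k SD of the mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"+/- {k} SD: {100 * within_k_sd(k):.1f}%")
```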
Normal curve (figure; area labels from left to right): 1%, 13.5%, 34%, 34%, 13.5%, 1%
Inferential Statistics
Used to make inferences about an entire population from data collected from a sample.
Two classifications, based on their underlying assumptions:
• Parametric
• Nonparametric
Parametric
• Based on population parameters
• Carry a number of assumptions (requirements)
• Level of measurement must be interval or ratio
  – t-test
  – Pearson product-moment correlation (r)
  – ANOVA
  – Multiple regression analysis
Parametric
• Preferable because they are more powerful: better able to detect a significant result if one exists
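As an illustration of one parametric test, the sketch below computes the independent-samples t statistic with a pooled variance (which assumes roughly equal variances in the two groups). The pain scores are hypothetical, and the critical value 2.306 is the standard two-tailed table value for 8 degrees of freedom at the .05 level.

```python
import math
import statistics

def pooled_t(sample_a, sample_b):
    """Independent-samples t statistic with pooled variance; returns (t, df)."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_err
    return t, df

# Hypothetical pain scores after two treatments.
method_a = [4, 5, 6, 5, 4]
method_b = [6, 7, 8, 7, 7]

t_stat, df = pooled_t(method_a, method_b)

# Two-tailed critical value for df = 8 at the .05 level (from a t table).
CRITICAL_T_DF8 = 2.306
significant = abs(t_stat) > CRITICAL_T_DF8

print(round(t_stat, 3), df, significant)
```

The sketch stops at the t statistic and a table lookup because the standard library has no t distribution; a statistics package would normally report the p value directly.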
Nonparametric
• Not as powerful
• Have fewer assumptions
• Level of measurement is nominal or ordinal
  – Chi-squared
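A chi-squared test of independence on a 2 x 2 table of counts can be sketched as follows. The counts are hypothetical, no continuity correction is applied, and the p value formula used here is valid only for 1 degree of freedom.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-squared for a 2 x 2 table of counts; returns (chi2, p)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    # Upper-tail probability of chi-squared with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: rows = treatment / control,
# columns = improved / not improved.
observed = [[30, 20],
            [15, 35]]

chi2, p_value = chi_square_2x2(observed)
print(round(chi2, 3), round(p_value, 4))
```

Only counts of category membership enter the calculation, which is why the test suits nominal data.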
Some examples of Statistical tests and their use
Hypothesis Testing
• Research hypothesis (Hr): statement of the researcher’s prediction
• Alternate hypothesis (Ha): competing explanation of the results
• Null hypothesis (H0): negative statement of the hypothesis, tested by statistical tests
Research Hypotheses
• Method A is more effective than Method B in reducing pain (directional)
• Method A will differ from Method B in pain-reducing effectiveness (nondirectional)
Null Hypothesis
• Method A equals Method B in pain reduction effectiveness (any difference is due to chance alone)
• This must be statistically tested before one can say that something other than chance is creating a difference in the results
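One direct way to test "any difference is due to chance alone" is an exact permutation test, sketched below. This technique is not named in these notes; it is used here because it makes the logic of the null hypothesis visible. If the method labels do not matter, every way of splitting the pooled scores into two groups is equally likely. The pain scores are hypothetical.

```python
from itertools import combinations

def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p value for a difference in means."""
    pooled = group_a + group_b
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def mean_diff(sum_a):
        return sum_a / n_a - (total - sum_a) / n_b

    observed = abs(mean_diff(sum(group_a)))
    extreme = 0
    splits = 0
    # Try every way of assigning n_a of the pooled scores to "Method A".
    for idx in combinations(range(len(pooled)), n_a):
        splits += 1
        sum_a = sum(pooled[i] for i in idx)
        # Small tolerance guards against floating-point rounding.
        if abs(mean_diff(sum_a)) >= observed - 1e-12:
            extreme += 1
    return extreme / splits

# Hypothetical pain scores after each method.
method_a = [4, 5, 6, 5, 4]
method_b = [6, 7, 8, 7, 7]

p_value = exact_permutation_p(method_a, method_b)
print(round(p_value, 4))
```

Only 4 of the 252 possible splits give a difference as large as the one observed, so a difference this large would be unusual if chance alone were operating.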
Type I and Type II Errors
• Type I: a decision to reject the null hypothesis when it is true. The researcher concludes that a relationship exists when it does not.
• Type II: a decision to accept (fail to reject) the null hypothesis when it is false. The researcher concludes that no relationship exists when it does.
Level of Significance
• The degree of risk of making a Type I error (saying a treatment works when it doesn’t, or that a relationship exists when there is none)
• The p value is the probability of obtaining results at least as extreme as those observed if the null hypothesis were true, that is, if chance alone were operating
• p = .05 means that results this extreme would occur about 5% of the time by chance alone if the null hypothesis were true
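The link between the significance level and the Type I error rate can be checked by simulation. The sketch below repeatedly draws samples from a population in which the null hypothesis is true (mean 0, standard deviation 1, both assumed known) and applies a two-sided z test. The sample size, number of trials, and random seed are arbitrary choices.

```python
import math
import random

def two_sided_p_from_z(z):
    """Two-sided p value for a z statistic under the normal curve."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def simulate_type_i_rate(alpha=0.05, trials=4000, n=25, seed=0):
    """Share of true-null samples that are (wrongly) declared significant."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # The null hypothesis is true here: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) / (1.0 / math.sqrt(n))
        if two_sided_p_from_z(z) < alpha:
            rejections += 1  # a Type I error
    return rejections / trials

rate = simulate_type_i_rate()
print(rate)
```

With the level of significance set at .05, close to 5% of the samples lead to rejecting a null hypothesis that is in fact true.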