Hypothesis Testing (Chapter 13)
Hypothesis Testing Decision-making process Statistics used as a tool to assist with decision-making Scientific hypothesis is a statement of the predicted relationship amongst the variables Null hypothesis is a statement of no relationship amongst the variables
Null Hypothesis Not Rejected [Figure: one total population containing two samples, one reared in an enriched environment and one reared in a sterile environment]
Null Hypothesis Rejected [Figure: two separate populations of rats, one reared in a sterile environment and one reared in an enriched environment, each containing the sample used in the study]
Hypothesis Testing In Experimental Studies Your research design determines the kind of statistical test you will use. Experimental studies test hypotheses while quasi-experimental studies tend to focus more on generating hypotheses.
Research Designs/Approaches
Type | Purpose | Time frame | Degree of control | Example
Experimental | Test for cause/effect relationships | Current | High | Comparing two types of treatments for anxiety
Quasi-experimental | Test for cause/effect relationships without full control | Current or past | Moderate to high | Gender differences in visual/spatial abilities
Research Designs/Approaches
Type | Purpose | Time frame | Degree of control | Example
Non-experimental: correlational | Examine the relationship between two variables | Current (cross-sectional) or past | Low to medium | Relationship between studying style and grade point average
Ex post facto | Examine the effect of a past event on current functioning | Past and current | Low to medium | Relationship between history of child abuse and depression
Research Designs/Approaches
Type | Purpose | Time frame | Degree of control | Example
Non-experimental: correlational | Examine the relationship between two variables where one is measured later | Future (predictive) | Low to moderate | Relationship between history of depression and development of cancer
Cohort-sequential | Examine change in a variable over time in overlapping groups | Future | Low to moderate | How mother-child negativity changed over adolescence
Research Designs/Approaches
Type | Purpose | Time frame | Degree of control | Example
Survey | Assess opinions or characteristics that exist at a given time | Current | None or low | Voting preferences before an election
Qualitative | Discover potential relationships; descriptive | Past or current | None or low | People’s experiences of quitting smoking
Tests of Significance
The Question | Null Hypothesis | Statistical Test
Group differences: between the means of 2 different groups | H0: μ_g1 = μ_g2 | Independent-samples t-test
Group differences: between 2 means of related groups | H0: μ_g1a = μ_g1b | Dependent-samples t-test
Group differences: between the means of 3 groups | H0: μ_g1 = μ_g2 = μ_g3 | ANOVA
Group relationships: between 2 variables | H0: ρ_xy = 0 | t-test for the significance of a correlation
Group relationships: between 2 correlations | H0: ρ_ab = ρ_cd | t-test for the significance of the difference between 2 correlations
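The t-test for the significance of a correlation can be sketched as follows. This is a minimal illustration that assumes the standard formula t = r·sqrt(n − 2) / sqrt(1 − r²) with n − 2 degrees of freedom; the study-hours and GPA values are hypothetical and are not from the presentation.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_for_correlation(r, n):
    """t statistic for H0: rho_xy = 0, evaluated with df = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical data: weekly study hours and grade point average
hours = [1, 2, 3, 4, 5, 6, 7, 8]
gpa = [2.0, 2.4, 2.1, 2.9, 3.0, 3.2, 3.1, 3.8]

r = pearson_r(hours, gpa)
t = t_for_correlation(r, len(hours))
print(f"r = {r:.3f}, t = {t:.3f}, df = {len(hours) - 2}")
```

The resulting t is compared against the critical value of t for the chosen alpha and df.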
Experimental Designs Examines differences between experimentally manipulated groups or variables (e.g., one group gets a certain drug and the other gets a placebo). At minimum, experimental (independent) variable has two levels (e.g., drug vs. placebo). – Advantage is that you can determine causality. – Disadvantage is cost and many variables cannot be experimentally manipulated (e.g., smoke exposure over time).
Null Hypothesis Significance Testing Null hypothesis – Results are due to “chance” (H0) Alternative (scientific) hypothesis – Results are due to a true “effect” (H1) Assess – Assuming H0 is true, what is the probability or “chance” of obtaining the data we did? Decide – If the chance is small enough, reject H0 and infer the “effect” is real.
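The assess-and-decide logic can be illustrated with a permutation test, which is one way (not the method the slides prescribe) to estimate the probability of the observed data when H0 is true. The scores, the function name, and the alpha of .05 are assumptions made for this sketch.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_shuffles=2000, seed=0):
    """Estimate how often a mean difference at least as large as the
    observed one arises when group labels are assigned by chance (H0)."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # relabel scores at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_shuffles

# Hypothetical maze-error scores for rats reared in two environments
enriched = [12, 10, 11, 9, 13, 10, 8, 11]
sterile = [15, 14, 16, 13, 17, 15, 14, 16]

p = permutation_p_value(enriched, sterile)  # Assess
alpha = 0.05
decision = "reject H0" if p < alpha else "fail to reject H0"  # Decide
print(p, decision)
```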
Parametric Vs. Non-Parametric Statistics: Two-Sample Cases
Level of measurement | Related samples | Independent samples
Nominal | McNemar test | Fisher exact test; χ² test
Ordinal | Sign test; Wilcoxon matched-pairs signed-ranks test | Median test; Mann-Whitney U test
Interval | t-test for matched pairs | t-test for independent samples
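As one example from the ordinal, related-samples cell, the sign test can be sketched with an exact binomial calculation. This is an illustrative two-tailed version that drops tied pairs; the before and after ratings are hypothetical.

```python
from math import comb

def sign_test_p_value(before, after):
    """Two-tailed exact sign test for paired scores (ties are dropped)."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    n_positive = sum(1 for d in diffs if d > 0)
    k = min(n_positive, n - n_positive)
    # Under H0, positive and negative differences are equally likely (p = .5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical ordinal ratings for 10 participants
before = [3, 4, 2, 5, 4, 3, 4, 2, 3, 4]
after = [5, 6, 4, 6, 5, 5, 3, 4, 5, 6]

p = sign_test_p_value(before, after)
print(p)  # 9 increases and 1 decrease out of 10 pairs
```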
Parametric Vs. Non-Parametric Statistics: > 2-Sample Cases
Level of measurement | Related samples | Independent samples
Nominal | Cochran Q test | χ² test
Ordinal | Friedman two-way ANOVA | Kruskal-Wallis one-way ANOVA
Interval | Repeated measures ANOVA | ANOVA
Parametric Vs. Non-Parametric Statistics: Correlation
Level of measurement | Correlation
Nominal | Contingency coefficient
Ordinal | Spearman rank correlation; Kendall rank correlation, etc.
Interval | Pearson’s correlation coefficient
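For ordinal data, the Spearman rank correlation can be sketched as below. The sketch assumes there are no tied scores, so the shortcut formula rho = 1 − 6Σd² / (n(n² − 1)) applies; the data are hypothetical.

```python
def ranks(values):
    """Rank scores from 1 (smallest) to n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman rank correlation using the no-ties shortcut formula."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical scores on two ordinal measures
x = [10, 20, 30, 40, 50]
y = [1, 3, 2, 5, 4]
rho = spearman_rho(x, y)
print(round(rho, 3))
```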
Sampling Distribution of Mean Difference Scores [Figure: distribution centred on 0, with regions marking 95% of all cases and 99% of all cases]
Critical Values of t – Need to determine the degrees of freedom (df = N - 2, where N is the total number of scores in the two groups). – Need to determine the alpha level, i.e., the p value required for rejecting the null hypothesis. – Need to determine whether this is a 1-tailed or 2-tailed level of significance.
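A worked sketch of these steps for an independent-samples t-test follows. It assumes equal-variance (pooled) groups and hypothetical scores; the critical value 2.228 is the standard table entry for df = 10 at a 2-tailed alpha of .05.

```python
import math
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Pooled-variance t statistic and its degrees of freedom (N - 2)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# Hypothetical scores for two independent groups of 6
treatment = [5, 7, 6, 9, 8, 7]
control = [3, 4, 5, 4, 6, 2]

t, df = independent_t(treatment, control)
critical_t = 2.228  # table value: df = 10, alpha = .05, 2-tailed
print(f"t({df}) = {t:.3f}")
print("reject H0" if abs(t) > critical_t else "fail to reject H0")
```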
What is one of the major criticisms of employing statistical tests of the null hypothesis to determine if effects are true?
Limitations of Statistical Tests of the Null Hypothesis Does not take into account the size of the difference between means (effect size)
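To see why this matters, consider a small sketch using Cohen's d, one common effect-size index (the slides do not name a specific one). For two equal-sized independent groups, t = d·sqrt(n/2), so the same tiny effect can become "significant" simply by increasing n. The means, standard deviation, and group sizes are hypothetical.

```python
import math

def cohens_d(mean_a, mean_b, pooled_sd):
    """Standardized difference between two group means."""
    return (mean_a - mean_b) / pooled_sd

def t_from_d(d, n_per_group):
    """t statistic implied by effect size d with two equal-sized groups."""
    return d * math.sqrt(n_per_group / 2)

d = cohens_d(101.5, 100.0, 15.0)   # a very small effect (d = 0.1)
t_small_n = t_from_d(d, 20)        # 20 participants per group
t_large_n = t_from_d(d, 2000)      # 2000 participants per group

print(f"d = {d:.2f}")
print(f"n = 20 per group:   t = {t_small_n:.2f}")
print(f"n = 2000 per group: t = {t_large_n:.2f}")
```

With 2000 per group the t value exceeds the large-sample 2-tailed critical value of about 1.96, even though the effect size is unchanged.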
Analysis of Variance (ANOVA) F-ratio = MS between / MS within. Essentially, this is the between-group variance divided by the within-group variance. If the groups come from similar populations, the variance between the groups will be similar to the variance within groups (null hypothesis is not rejected).
ANOVA Between-group variance consists of: – Variability due to the effect of the independent variable (treatment effect) – Variability due to chance factors. Within-group variance consists of: – Variability in the data within the treatment groups. This is attributed to chance because, if the treatment effect were consistent, all subjects within a treatment group would experience a similar magnitude of effect.
Analysis of Variance (ANOVA) F-ratio = MS between / MS within. MS refers to the mean square, which is the sum of squares divided by the appropriate degrees of freedom. Df for MS between is the number of groups minus 1. Df for MS within is the total number of scores in the experiment minus the number of groups.
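The mean squares and degrees of freedom can be computed by hand, as in this minimal sketch of a one-way ANOVA. The three groups of scores are hypothetical and the function name is chosen only for illustration.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Sum of squares between: group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Sum of squares within: scores around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1          # number of groups minus 1
    df_within = n_total - k     # total scores minus number of groups
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within

# Hypothetical scores for three treatment groups
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_ratio, df_b, df_w = one_way_anova(groups)
print(f"F({df_b},{df_w}) = {f_ratio:.2f}")
```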
ANOVA MS between = treatment effect + chance variability. MS within = chance variability. The ratio will be close to 1 if there is no treatment effect. Example of a reported result: F(2,144) = 5.56, p < 0.05.
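The two numbers in parentheses of a reported F are the between-group and within-group degrees of freedom, so the design can be read back from them. A small sketch, assuming a one-way design (the helper name is hypothetical):

```python
def design_from_f_report(df_between, df_within):
    """Recover number of groups and total scores from one-way ANOVA dfs."""
    n_groups = df_between + 1        # df_between = groups - 1
    n_scores = df_within + n_groups  # df_within = scores - groups
    return n_groups, n_scores

n_groups, n_scores = design_from_f_report(2, 144)
print(n_groups, n_scores)  # 3 groups, 147 scores
```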
Two-Way ANOVA Where you have 2 independent variables, each having at least 2 levels. For example, – Drug dose (none vs. 5 mg) – Delivery mode (intravenous vs. oral). Factorial design, so you can test both main effects and interaction effects.
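What main effects and an interaction look like in a 2 x 2 design can be sketched descriptively from the cell means. This example only computes the mean differences, not the F tests, and the scores are hypothetical.

```python
from statistics import mean

# Hypothetical scores for each dose x delivery-mode cell
cells = {
    ("none", "intravenous"): [10, 12, 11],
    ("none", "oral"): [10, 11, 12],
    ("5 mg", "intravenous"): [18, 20, 19],
    ("5 mg", "oral"): [14, 15, 16],
}
m = {cell: mean(scores) for cell, scores in cells.items()}

# Main effect of dose: 5 mg vs. none, averaged over delivery mode
dose_effect = (mean([m[("5 mg", "intravenous")], m[("5 mg", "oral")]])
               - mean([m[("none", "intravenous")], m[("none", "oral")]]))

# Main effect of mode: intravenous vs. oral, averaged over dose
mode_effect = (mean([m[("none", "intravenous")], m[("5 mg", "intravenous")]])
               - mean([m[("none", "oral")], m[("5 mg", "oral")]]))

# Interaction: does the dose effect differ between delivery modes?
interaction = ((m[("5 mg", "intravenous")] - m[("none", "intravenous")])
               - (m[("5 mg", "oral")] - m[("none", "oral")]))

print(dose_effect, mode_effect, interaction)
```

A non-zero interaction term here means the drug's effect depends on how it is delivered.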
Mixed Model: 2 Between-Subject Factors, 1 Within-Subject Factor Where you have 2 independent variables, each having at least 2 levels. For example, – Drug dose (none vs. 5 mg) – Delivery mode (intravenous vs. oral). One within-subject factor with, for example, 3 levels – Pre-treatment, 3-month and 6-month follow-up. Factorial design, so you can test main effects and interaction effects (including 3-way interaction effects).
Rejecting the Null Hypothesis The null hypothesis can be rejected but not accepted. Arguments have been made for allowing some flexibility in concluding that the null hypothesis is true when: – No other studies of the phenomenon have rejected the null hypothesis – The p value for the test of the null hypothesis is large (e.g., > .20 or .40) – The research design is sufficiently powerful
Errors in Statistical Decision-Making Type I error – falsely rejecting the null hypothesis (rejecting it when it is true) – With alpha = .05, there is a 5% chance (5 in 100) of rejecting the null hypothesis when it is actually true. Type II error – failing to reject the null hypothesis when it is false
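The Type I error rate can be checked with a small simulation in which the null hypothesis is true by construction. This sketch assumes a z-test with a known population standard deviation of 1 and a 2-tailed alpha of .05 (critical z = 1.96); the trial count, sample size, and seed are arbitrary.

```python
import math
import random

def type_i_error_rate(n_trials=4000, n=25, z_crit=1.96, seed=1):
    """Proportion of true-null experiments that still reject H0."""
    rng = random.Random(seed)
    false_rejections = 0
    for _ in range(n_trials):
        # Sample from a population whose true mean is 0, so H0 is true
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * math.sqrt(n)  # known sigma = 1
        if abs(z) > z_crit:
            false_rejections += 1
    return false_rejections / n_trials

rate = type_i_error_rate()
print(f"Proportion of false rejections: {rate:.3f}")  # expected near 0.05
```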
Goals of Psychology Research The goal is to understand the underlying laws governing the behaviour of organisms. The more the results of your study inform us about these underlying laws, the more valuable the findings. The importance of the findings is limited by their internal and external validity.
External Validity Extent to which the results of the study can be generalized across different persons, settings, and times. We typically think of generalizing to specific populations (e.g., North American elementary school students) rather than to the world at large. The best safeguard is random selection, but this is not usually feasible.
Threats to External Validity Lack of population validity Lack of ecological validity Lack of time validity
Population Validity Generalizing to the defined population (i.e., target population) from which the sample was drawn. The sample is drawn from the experimentally accessible population.
Population Validity [Figure: the target population contains the experimentally accessible population, which contains the sample]
Population Validity Threatened by a selection by treatment interaction: – Treatment results may not be exactly reproducible in the target population. Even willingness to volunteer for studies has been shown to result in a selection by treatment interaction effect.
Ecological Validity Extent to which the results can be generalized across settings or environmental conditions. – E.g., Would the treatment effect observed in patients recruited from a first-class medical centre be the same as the treatment effect observed in patients recruited from a local community hospital?
Ecological Validity Multiple-Treatment Interference – Sequencing effect whereby exposure to one treatment influences responses to another treatment; or – Exposure to one experiment influences response in another experiment (e.g., sophisticated participants).
Ecological Validity Hawthorne Effect – Knowing one is in a study can affect one’s behaviour – Participant bias effects (e.g., social acceptability, compliance) Novelty or Disruption Effect – Effects are simply due to novelty and wear off once novelty diminishes.
Ecological Validity Experimenter Effect – An enthusiastic experimenter/clinician may get different effects than a clinician who is implementing the treatment in routine care. Pre-testing Effect – Administering a pre-test may sensitize the participant in such a way that he/she responds differently to the experiment than he/she would have without a pre-test.
Temporal Validity Extent to which the results would generalize to other times – Results might vary depending on the time elapsed between presentation of the independent variable and the measurement of the dependent variable.
Temporal Validity Seasonal Variation – Variation that appears regularly over time (e.g., change in traffic accident rates between daylight savings time and non-daylight savings time). – Fixed-time variation – variation at specific, predictable time points – Variable-time variation – don’t know when variation will occur but when it occurs, there are predictable responses.
Temporal Validity Cyclical Variation – Predictable variation within people or other organisms Personological Variation – Variation in the characteristics of the individual over time
Internal Vs. External Validity There tends to be an inverse relationship – As internal validity increases, external validity decreases. In testing for between-group differences, you want to minimize within-group variability and maximize between-group differences. To do so, you want to ensure high control over factors that could confound the results, but this often results in increasingly artificial experimental conditions.
When Is External Validity Less Important When you don’t need to demonstrate that “X” will happen but rather that “X” can happen. Sometimes the main goal is to test a theory, and the extent to which it reflects “real life” is less important.