For a mean, the standard deviation changes to the standard error now that we are dealing with the sampling distribution of the mean: the standard error of the mean is σ_X̄ = σ / √n.

In most situations we do not know the population variance. The sample standard deviation has properties that make it a good estimate of the population value, so we can use it to estimate the population standard deviation.

Which leads to the one-sample t statistic: t = (X̄ − μ) / s_X̄, where s_X̄ = s / √n is the estimated standard error of the mean.
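A minimal sketch in Python of this computation, by hand and with SciPy (the scores and hypothesized mean are made up for illustration; scipy.stats.ttest_1samp is a real SciPy function):

    import numpy as np
    from scipy import stats

    x = np.array([101., 94, 107, 99, 103, 96, 105, 98])  # hypothetical sample scores
    mu0 = 100                                             # hypothesized population mean

    se = x.std(ddof=1) / np.sqrt(len(x))    # sample SD estimates sigma, giving the SE
    t_manual = (x.mean() - mu0) / se

    res = stats.ttest_1samp(x, popmean=mu0)
    print(t_manual, res.statistic, res.pvalue)   # the two t values agree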

Previously we compared a sample mean to a known population mean; now we want to compare two samples. The null hypothesis is that the mean of the population from which one set of scores is drawn equals the mean of the population for the second set: H₀: μ₁ = μ₂, or equivalently μ₁ − μ₂ = 0.

Usually we do not know the population variance (or standard deviation), so again we use the samples to estimate it. The result is distributed as t (rather than z).

All of which leads to the independent-samples t statistic: t = (X̄₁ − X̄₂) / √(s_p²/n₁ + s_p²/n₂), where s_p² is the pooled variance estimate, s_p² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²] / (n₁ + n₂ − 2).
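A minimal sketch of the pooled-variance computation alongside SciPy's built-in test (the group scores are made up; scipy.stats.ttest_ind uses the pooled, equal-variance form by default):

    import numpy as np
    from scipy import stats

    g1 = np.array([5.1, 4.8, 6.0, 5.5, 4.9])   # hypothetical scores, group 1
    g2 = np.array([4.2, 4.5, 3.9, 4.8, 4.1])   # hypothetical scores, group 2

    n1, n2 = len(g1), len(g2)
    sp2 = ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)
    t_manual = (g1.mean() - g2.mean()) / np.sqrt(sp2 / n1 + sp2 / n2)

    res = stats.ttest_ind(g1, g2)               # pooled (equal-variance) t-test
    print(t_manual, res.statistic, res.pvalue)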

Paired sample t-test: comparing two distributions that are likely to be correlated in some way. Transform the paired scores into a single set of scores by finding the difference between X₁ and X₂ for each case. If the null hypothesis is true, then on average there should be no difference; thus the mean difference is 0 (μ_D = 0). Now we have a single set of scores and a known population mean, so we use the one-sample t-test concept.

t = D̄ / (s_D / √n), where D̄ = mean of the difference scores, s_D = standard deviation of the difference scores, and n = number of difference scores (i.e. number of pairs).
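A short sketch showing that the paired test really is a one-sample t on the difference scores (the pre/post scores are made up; ttest_rel and ttest_1samp are real SciPy functions):

    import numpy as np
    from scipy import stats

    pre  = np.array([12., 15, 11, 14, 13, 16])   # hypothetical paired scores
    post = np.array([14., 18, 12, 15, 16, 17])

    d = post - pre                                 # the single set of difference scores
    one_sample = stats.ttest_1samp(d, popmean=0)   # one-sample t on the differences
    paired     = stats.ttest_rel(post, pre)        # built-in paired test: same result
    print(one_sample.statistic, paired.statistic)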

What if your question regards equivalence rather than a difference between scores? Example: you want to know whether two counseling approaches are similarly efficacious. Just do a t-test, and if it is non-significant conclude they are the same? Wrong! That would be logically incorrect, since failing to reject the null is not evidence that the null is true. Can you think of why?

Recall confidence intervals. The mean, or the difference between means, is a point estimate. We should also desire an interval estimate that reflects the variability and uncertainty of our measurement seen in the data.

Drawing a sample gives us a mean, X̄, which is our best guess at µ. For most samples X̄ will be close to µ. But X̄ is a point estimate; we would also like a range, or interval estimate.

CI₉₅ = X̄ ± t.95 × s_X̄, where X̄ = sample mean, t.95 = the critical t value, and s_X̄ = the standard error of the mean.
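A minimal sketch of computing this interval in Python (the data are made up; scipy.stats.sem and scipy.stats.t.ppf are real SciPy functions):

    import numpy as np
    from scipy import stats

    x = np.array([23.1, 25.4, 22.8, 24.9, 26.2, 23.7, 25.0])  # hypothetical sample
    m = x.mean()
    se = stats.sem(x)                            # standard error of the mean
    t_crit = stats.t.ppf(0.975, df=len(x) - 1)   # two-tailed 95% critical value
    print(m - t_crit * se, m + t_crit * se)      # the 95% confidence interval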

A 95% confidence interval means that 95% of the confidence intervals calculated on repeated sampling of the same population will contain µ. Note that the population value does not vary; i.e., it is not a 95% chance that µ falls in that specific interval per se. There are an infinite number of 95% CIs that are consistent with a fixed population parameter. In other words, the CI attempts to capture the mean, and the one you have is just one of many that would do so.

As suggested by many leading quantitative psychologists, using confidence intervals in lieu of emphasizing p-values may be a better approach for evaluating group differences. Confidence intervals can provide all the relevant NHST information as well as a means of graphically displaying statistically significant differences. One approach is to simply see whether our interval for the statistic of interest contains the H₀ value; if not, reject H₀.

Confidence intervals are an important component of statistical analysis and should always be reported. Non-significance on a test of difference does not allow us to assume equivalence. Methods exist to test for group equivalence, and they should be implemented whenever that is the true goal of the research question. More on that later.
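The slides do not name a method, but one standard option is the two one-sided tests (TOST) procedure. A minimal sketch, assuming made-up data and an equivalence bound of ±0.5 raw units chosen in advance:

    import numpy as np
    from scipy import stats

    g1 = np.array([5.1, 4.9, 5.3, 5.0, 5.2, 4.8])   # hypothetical group scores
    g2 = np.array([5.0, 5.2, 4.9, 5.1, 4.7, 5.3])
    low, high = -0.5, 0.5                   # pre-specified equivalence bounds

    n1, n2 = len(g1), len(g2)
    diff = g1.mean() - g2.mean()
    sp2 = ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)
    se = np.sqrt(sp2 / n1 + sp2 / n2)
    df = n1 + n2 - 2

    # reject BOTH one-sided nulls to conclude the groups are equivalent
    p_lower = 1 - stats.t.cdf((diff - low) / se, df)   # H0: diff <= low
    p_upper = stats.t.cdf((diff - high) / se, df)      # H0: diff >= high
    print(max(p_lower, p_upper))            # TOST p-value; equivalence if < alpha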

ANOVA is probably the most popular analysis in psychology. Why? Ease of implementation; it allows for analysis of several groups at once; and it allows analysis of interactions of multiple independent variables.

The total variability partitions into treatment and error components: SS_Total = SS_Treat + SS_error.
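A small worked demonstration of this partition (the three groups of scores are made up):

    import numpy as np

    # three made-up groups of three scores each
    groups = [np.array([3., 5, 4]), np.array([6., 8, 7]), np.array([9., 11, 10])]
    allx = np.concatenate(groups)
    grand = allx.mean()

    ss_total = ((allx - grand) ** 2).sum()
    ss_treat = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
    ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)
    print(ss_total, ss_treat + ss_error)   # equal: SS_Total = SS_Treat + SS_error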

If the assumptions of the test are not met, we may have problems in its interpretation. The assumptions: independence of observations; normal distribution of the measured variable; homogeneity of variance.
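A quick sketch of checking the last two assumptions in Python (the simulated scores are illustrative; scipy.stats.shapiro and scipy.stats.levene are real SciPy functions; independence is a design issue and has no simple test):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    g1 = rng.normal(5, 1, 20)   # hypothetical group scores
    g2 = rng.normal(6, 1, 20)
    g3 = rng.normal(7, 1, 20)

    print(stats.shapiro(g1))         # normality check within a group
    print(stats.levene(g1, g2, g3))  # homogeneity of variance across groups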

Approach one-way ANOVA much as you would a t-test: the same assumptions and interpretation, taken to 3 or more groups, and one would report similar information: effect sizes, confidence intervals, graphs of means, etc. With ANOVA one must run planned comparisons or post hoc analyses to get to the good stuff as far as interpretation goes. Turn to nonparametric or robust options in the face of yucky data and/or violations of assumptions.

One-way ANOVA: what does it tell us? That the means are different. How do they differ? We don't know. What if I want to know the specifics? Multiple comparisons.

We can do multiple comparisons with a correction for Type I error, or planned contrasts to test the differences we expect based on theory. A priori vs. post hoc: before or after the fact. A priori: do you have an expectation of the results based on theory? A priori comparisons are few and more statistically powerful. Post hoc: look at all comparisons while maintaining the Type I error rate.

Do we need an omnibus F test first? Current thinking is no: most multiple comparison procedures maintain Type I error rates without regard to the omnibus F result. Consider the multivariate analogy: we run a multivariate analysis and then interpret it in terms of univariate ANOVAs, which begs the question of why the multivariate approach was taken in the first place.

Some procedures are better than others, though which are better may depend on the situation. Try alternatives, but if one is suited specifically to your situation, use it. Some suggestions: assumptions met: Tukey's or REGWQ among the traditional options, FDR for more power; unequal n: Gabriel's or Hochberg's (the latter if differences are large); unequal variances: Games-Howell. More later.
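A minimal sketch of one of these options, Tukey's HSD, in Python (the scores and group labels are made up; pairwise_tukeyhsd is a real statsmodels function):

    import numpy as np
    from statsmodels.stats.multicomp import pairwise_tukeyhsd

    scores = np.array([3., 5, 4, 6, 8, 7, 9, 11, 10])      # made-up scores
    group  = np.array(['a'] * 3 + ['b'] * 3 + ['c'] * 3)   # group labels

    print(pairwise_tukeyhsd(scores, group, alpha=0.05).summary())  # all pairs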

The point of these types of analyses is that you had some particular comparison in mind before even collecting data. Why wouldn't one do a priori comparisons all the time? Though we may have some idea, it might not be all that strong theoretically, and we might miss out on other interesting comparisons.

Let theory guide which comparisons you look at, and have a priori contrasts in mind whenever possible. Test only comparisons truly of interest. Use more recent methods for post hocs for more statistical power.

Factorial ANOVA; repeated measures; mixed designs.

With factorial ANOVA we have more than one independent variable. The terms 2-way, 3-way, etc. refer to how many IVs there are in the analysis. The analysis of interactions constitutes the focal point of factorial design.

Total variability comes from two sources: differences between groups and differences within groups.

With factorial designs we have additional sources of variability to consider. Main effects: mean differences among the levels of a particular factor. Interaction: differences among cell means not attributable to main effects, i.e. when the effect of one factor on the DV is influenced by the levels of another. Interactions in this sense are residual effects (whatever is left over after main effects are accounted for); note how this appears in the GLM for a factorial ANOVA.

Total variability partitions into between-treatments and within-treatments variability; the between-treatments part further partitions into Factor A variability, Factor B variability, and interaction variability.
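A minimal sketch of a two-way (factorial) ANOVA in Python using the statsmodels formula API; the data frame, its column names, and its values are made up for illustration:

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    # made-up 2 x 2 factorial data in long format
    df = pd.DataFrame({
        'y': [3, 4, 5, 6, 7, 8, 2, 3, 4, 9, 10, 11],
        'a': ['a1'] * 6 + ['a2'] * 6,
        'b': (['b1'] * 3 + ['b2'] * 3) * 2,
    })

    model = ols('y ~ C(a) * C(b)', data=df).fit()   # main effects plus interaction
    print(sm.stats.anova_lm(model, typ=2))          # SS and F for A, B, and A:B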

What are we looking for? Do the lines behave similarly (are parallel) or not? Does the effect of one factor depend on the level of the other factor? (Figure: two example plots, one labeled "No interaction" with parallel lines, one labeled "Interaction" with non-parallel lines.)

Note that with a significant interaction, the main effects are understood only in terms of that interaction. In other words, they cannot stand alone as an explanation and must be qualified by the interaction's interpretation.

However, interpretation depends on common sense and should adhere to theoretical considerations. Plot your results in different ways. If main effects are meaningful, then it makes sense to at least talk about them, whether or not an interaction is statistically significant. To help you interpret results, test simple effects: is the simple effect of A significant within specific levels of B? Is the simple effect of B significant within specific levels of A?

A simple effect is an analysis of the effect of one factor at one level of the other factor. Example: levels 1–3 of A at B1, and levels 1–3 of A at B2. Note that the simple effects represent a partitioning of the SS for the main effect and the SS for the interaction, not just a breakdown of the interaction!
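A sketch of testing the simple effect of A at each level of B, reusing the made-up factorial data from the sketch above. Note this runs separate one-way tests per level, whereas a textbook simple-effects test would pool the omnibus error term:

    import pandas as pd
    from scipy import stats

    # the same made-up 2 x 2 data as in the factorial sketch above
    df = pd.DataFrame({
        'y': [3, 4, 5, 6, 7, 8, 2, 3, 4, 9, 10, 11],
        'a': ['a1'] * 6 + ['a2'] * 6,
        'b': (['b1'] * 3 + ['b2'] * 3) * 2,
    })

    # simple effect of A at each level of B, as separate one-way tests
    for level, sub in df.groupby('b'):
        groups = [v['y'].to_numpy() for _, v in sub.groupby('a')]
        print(level, stats.f_oneway(*groups))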

What if there was no significant interaction; can I still test for simple effects? Technically yes. A significant simple effect suggests that at least one of the slopes across levels is significantly different from zero. However, one would not conclude that the interaction is "close enough" just because there was a significant simple effect: the non-significant interaction suggests that the slope seen is not significantly different from the other(s) under consideration.

Instead of having one score per subject, experiments are frequently conducted in which multiple scores are gathered for each case: a repeated measures, or within-subjects, design. Advantages: Design: nonsystematic variance (i.e. error, that which is not under experimental control) is reduced, because we take out variance due to individual differences. Efficiency: fewer subjects are required. Sensitivity: more power.

Measuring performance on the same variable over time: for example, looking at changes in performance during training, or before and after a specific treatment. The same subject measured multiple times under different conditions: for example, performance when taking Drug A and performance when taking Drug B. The same subjects providing measures/ratings on different characteristics: for example, the desirability of red cars, green cars, and blue cars. Note how we could run some repeated measures questions as regular between-subjects designs, e.g. randomly assigning subjects to Drug A or B.

Analysis of variance as discussed previously assumes cells are independent, but here we have a case in which that is unlikely: for example, those subjects who perform best in one condition are likely to perform best in the other conditions.

SS_total partitions into SS_between-subjects and SS_within-subjects; the within-subjects part further partitions into SS_treatment and SS_error.

As with a regular one-way ANOVA, the omnibus repeated-measures analysis tells us that there is some difference among the treatments (drugs). Often this is not a very interesting outcome, or at least not where we want to stop in our analysis.
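A minimal sketch of the omnibus repeated-measures ANOVA in Python; AnovaRM is a real statsmodels class, while the subjects, drug conditions, and scores are made up:

    import pandas as pd
    from statsmodels.stats.anova import AnovaRM

    # made-up long-format data: 5 subjects, each measured under 3 drugs
    df = pd.DataFrame({
        'subject': list(range(5)) * 3,
        'drug':    ['A'] * 5 + ['B'] * 5 + ['C'] * 5,
        'score':   [4, 5, 3, 6, 5, 6, 7, 5, 8, 7, 5, 6, 4, 7, 6],
    })

    res = AnovaRM(df, depvar='score', subject='subject', within=['drug']).fit()
    print(res.anova_table)   # omnibus F test for the drug factor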

If you had some particular relationship in mind that you want to test for theoretical reasons (e.g. a linear trend over time), you could test it with contrast analyses (e.g. via the contrast option in SPSS before clicking OK). The contrasts available in SPSS: Deviation: compares the mean of one level to the mean of all levels (grand mean). Simple: compares each mean to some reference mean (either the first or last category, e.g. a control group). Difference (reverse Helmert): compares level 2 to level 1, then level 3 to the mean of the previous two, etc. Helmert: compares level 1 with all later levels, level 2 with the mean of all later levels, etc. Repeated: compares level 1 to level 2, level 2 to level 3, 3 to 4, and so on. Polynomial: tests for trends (e.g. linear) across levels.

Observations may covary, but the degree of covariance must remain the same. If covariances are heterogeneous, the error term will generally be an underestimate and F tests will be positively biased. Such circumstances may arise due to carry-over effects, practice effects, fatigue, and sensitization.

Suppose the factor TIME had 3 levels: before, after, and follow-up. RM ANOVA assumes that the 3 correlations, r(before, after), r(before, follow-up), and r(after, follow-up), are all about the same in size.
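A quick way to eyeball this assumption is to compute the pairwise correlations directly (the scores below are made up; a formal check would be Mauchly's test of sphericity):

    import numpy as np

    # hypothetical scores for the same 6 subjects at each time point
    before   = np.array([10., 12, 9, 14, 11, 13])
    after    = np.array([12., 14, 10, 15, 13, 15])
    followup = np.array([11., 13, 10, 15, 12, 14])

    r = np.corrcoef([before, after, followup])   # 3 x 3 correlation matrix
    print(r[0, 1], r[0, 2], r[1, 2])             # the three pairwise correlations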

A × (B × S): at least one between-subjects factor and one within-subjects factor. Each level of factor A contains a different group of randomly assigned subjects; on the other hand, each level of factor B, at any given level of factor A, contains the same subjects.

Partitioning the variance is done as for a standard ANOVA, and a within-subjects error term is used for the repeated measures. But what error term do we use for the interaction of the two factors?

Again we adopt the basic principle we have followed previously in looking for effects: we want to separate treatment effects from error. One part is due to the manipulation of a variable (the treatment effects); a second part is due to all other uncontrolled sources of variability (error). The deviation associated with the error can be divided into two components. Between-subjects error estimates the extent to which chance factors are responsible for any differences among the levels of the between-subjects factor. Within-subjects error estimates the extent to which chance factors are responsible for any differences observed within the same subject.

SS_total partitions into SS_between-subjects and SS_within-subjects. Between subjects: SS_A (df = a − 1) and SS_subjects-within-A (df = a(s − 1)). Within subjects: SS_B (df = b − 1), SS_A×B (df = (a − 1)(b − 1)), and SS_error (B × subjects within A, df = a(b − 1)(s − 1)).

Note that the between-groups outcome is the same in the mixed and between-groups designs. The same is true for the within-groups design, except that in the mixed design the subjects are nested within factor A. Sources of variance by design: between-groups design: SS_A, SS_A/S; within-groups design: SS_S, SS_B, SS_B×S; mixed design: SS_A, SS_A/S, SS_B, SS_A×B, SS_B×S.

The SS between subjects reflects the deviation of subjects from the grand mean, while the SS within subjects reflects their deviation from their own mean. The mixed design is the conjunction of a randomized single-factor experiment and a single-factor experiment with repeated measures, though this can be extended to more than one between-groups or repeated factor.