# ARE OBSERVATIONS OBTAINED DIFFERENT?

ARE OBSERVATIONS OBTAINED DIFFERENT? You use different statistical tests for different problems. We will examine some basic tests (χ², t-test, regression, ANOVA, ANCOVA). We expect you to use these basic tests in your research. Your research project should not be so complicated that more advanced tests are required. Always state your hypothesis: what you are testing.

BASIC PREMISE OF STATISTICAL TESTING:

Toss a coin 100 times. A fair coin: mean = 50 heads, sd = 5 heads (√(½ × ½ × 100)).

Null hypothesis: the coin is fair.

You observe 60 heads. Is the coin fair? Distance from the mean = (60 – 50)/5 = 2 sd. A result 2 sd from the mean has a 5% chance, but we are looking in one direction only, so a 2.5% chance (5%/2).

What if you set the probability to claim it to be unfair to be 5%? What if you set the probability to claim it to be unfair to be 25%?

| NULL HYPOTHESIS | ACCEPTED | REJECTED |
|---|---|---|
| TRUE | CORRECT | TYPE I ERROR |
| FALSE | TYPE II ERROR | CORRECT |

[Figure: frequency distribution of the proportion of heads.]
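The coin example can be checked with a few lines of Python. This is only a sketch: it assumes a normal approximation to the binomial and an upper-tail (one-direction) test, and the name `rejection_threshold` is made up for illustration. The exact normal tail beyond 2 sd is about 2.3%, which the slide rounds to 2.5%.

```python
import math
from statistics import NormalDist

n_tosses = 100
p_heads = 0.5          # null hypothesis: the coin is fair
observed_heads = 60

# Mean and standard deviation of the number of heads for a fair coin
mean_heads = n_tosses * p_heads
sd_heads = math.sqrt(p_heads * (1 - p_heads) * n_tosses)

# How many standard deviations the observation lies from the mean
z = (observed_heads - mean_heads) / sd_heads

# Assumption: normal approximation, upper tail only
one_tail_p = 1 - NormalDist().cdf(z)


def rejection_threshold(alpha):
    """Heads count above which the coin would be called unfair at level alpha."""
    return mean_heads + NormalDist().inv_cdf(1 - alpha) * sd_heads


print(f"mean = {mean_heads}, sd = {sd_heads}, z = {z}")
print(f"one-tailed probability = {one_tail_p:.4f}")
for alpha in (0.05, 0.25):
    print(f"alpha = {alpha}: call the coin unfair above {rejection_threshold(alpha):.1f} heads")
```

Raising the cutoff from 5% to 25% lowers the number of heads needed to reject the null hypothesis. That makes a Type I error more likely and a Type II error less likely.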

NONPARAMETRIC TESTS (data do not have to be normally distributed)

χ² CONTINGENCY TABLE:

Proportion red by strata: #1 = 8/10; #2 = 1/11; #3 = 1/13.

Observed counts, with expected counts in parentheses:

| SPECIES | STRATA #1 | STRATA #2 | STRATA #3 | TOTAL |
|---|---|---|---|---|
| RED | 8 (2.94) | 1 (3.24) | 1 (3.82) | 10 |
| NOT RED | 2 (7.06) | 10 (7.76) | 12 (9.18) | 24 |
| TOTAL | 10 | 11 | 13 | 34 |

Expected for each cell = (R × C)/TOTAL, where R and C are that cell's row and column totals.

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 8.71 + 3.63 + 1.55 + 0.65 + 2.08 + 0.87 = 17.49$$

P < 0.001; df = (r − 1)(c − 1) = 2.

Data must be counts and you test proportional distribution of counts. Null hypothesis: no difference in proportion of red among strata.
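As a rough check of the contingency-table arithmetic, the sketch below recomputes the expected counts and χ² from the observed counts using only the standard library. Variable names are illustrative. The closed-form tail probability used here holds only for df = 2. Without intermediate rounding the statistic comes out near 17.47 rather than the slide's 17.49, which sums rounded terms.

```python
import math

# Observed counts: rows are RED / NOT RED, columns are strata #1, #2, #3
observed = [
    [8, 1, 1],
    [2, 10, 12],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell = (row total x column total) / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over cells of (O - E)^2 / E
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 2 only, the upper-tail probability has the closed form exp(-chi2 / 2)
p_value = math.exp(-chi2 / 2) if df == 2 else None

print([[round(e, 2) for e in row] for row in expected])
print(f"chi-square = {chi2:.2f}, df = {df}, p = {p_value:.5f}")
```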

χ² CONTINGENCY TABLE: Make a spreadsheet with table categories and counts in each, and then have MYSTAT use the counts as frequencies (Data … Case weighting … By frequencies). Depending on the table, use One-way frequency tables (one category, e.g., tree type) or Tables (more than one category, e.g., tree type and strata) in Analyze in MYSTAT.

PARAMETRIC TESTS (data are normally distributed)

Proportion red by strata (mean ± sd): #1 = 0.79 ± 0.25; #2 = 0.08 ± 0.17; #3 = 0.08 ± 0.17. [Figure: frequency distribution of proportion red.]

T-TEST:

$$t = \frac{(\bar{x}_1 - \bar{x}_2)\sqrt{n_1 n_2/(n_1 + n_2)}}{\sqrt{[(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2]/(n_1 + n_2 - 2)}}$$

Null hypothesis: no difference in proportion of red between strata #1 and #2.

t = [(0.71)(1.41)]/0.214 = 4.68; P < 0.005; degrees of freedom = 6.

Data do not have to be counts. Easier to see differences (more powerful) than nonparametric statistics.

T-TEST: Use Hypothesis testing in Analyze in MYSTAT for means
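For readers who want to see what the t-test computes, here is a minimal sketch of the pooled two-sample t statistic from summary statistics. The function name `pooled_t` is hypothetical, and the inputs are the rounded means and standard deviations for strata #1 and #2 with n = 4 plots each. Because of that rounding, the result is close to the 4.68 obtained for these strata but will not match it to every digit.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with a pooled variance; returns (t, degrees of freedom)."""
    # Pooled variance weights each sample variance by its degrees of freedom
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    # Difference in means, scaled by the sample sizes and the pooled sd
    t = (mean1 - mean2) * math.sqrt(n1 * n2 / (n1 + n2)) / math.sqrt(pooled_var)
    return t, n1 + n2 - 2


# Proportion red: strata #1 = 0.79 +/- 0.25, strata #2 = 0.08 +/- 0.17, n = 4 each
t_stat, df = pooled_t(0.79, 0.25, 4, 0.08, 0.17, 4)
print(f"t = {t_stat:.2f}, df = {df}")
```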

PARAMETRIC TESTS (data are normally distributed): EVEN MORE POWERFUL IF THERE IS AN A PRIORI BASIS TO PAIR OBSERVATIONS.

Proportion red by strata (mean ± sd): #1 = 0.79 ± 0.25; #2 = 0.08 ± 0.17; #3 = 0.08 ± 0.17.

PAIRED T-TEST:

Null hypothesis: no difference in relative abundance of red between strata #1 and #2 for matched plots based on similarity.

Pairs (#1 minus #2): 0.5 – 0 = 0.5; 1.0 – 0 = 1.0; 1.0 – 0.33 = 0.67; 0.67 – 0 = 0.67; mean = 0.71, sd = 0.21.

t = 0.71/(0.21/√4) = 6.76; P < 0.01; degrees of freedom = n – 1 = 3.

Data do not have to be counts. Easier to see differences (more powerful) than nonparametric statistics.
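The paired calculation can be reproduced from the four matched plots. This sketch assumes the pairing order listed in the slide (first plot in strata #1 with first plot in strata #2, and so on). Using the unrounded standard deviation gives t a little above the slide's 6.76.

```python
import math
from statistics import mean, stdev

# Proportion red in matched plots, in the order listed in the slide
stratum1 = [0.5, 1.0, 1.0, 0.67]
stratum2 = [0.0, 0.0, 0.33, 0.0]

# A paired t-test works on the within-pair differences
differences = [a - b for a, b in zip(stratum1, stratum2)]

mean_diff = mean(differences)
sd_diff = stdev(differences)      # sample standard deviation (n - 1 denominator)
n_pairs = len(differences)

# t = mean difference divided by its standard error
t_stat = mean_diff / (sd_diff / math.sqrt(n_pairs))
df = n_pairs - 1

print(f"mean = {mean_diff:.2f}, sd = {sd_diff:.2f}, t = {t_stat:.2f}, df = {df}")
```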

PARAMETRIC TESTS (data are normally distributed)

T-TEST: Now use numbers, not proportions.

Null hypothesis: no difference in absolute abundance of red between strata #1 and #2.

Strata #1: mean = 2.0, sd = 0.82, n = 4. Strata #2: mean = 0.25, sd = 0.5, n = 4.

t = [(2 – 0.25)(1.41)]/0.68 = 3.63; P < 0.01; degrees of freedom = 6.

Data do not have to be counts. Easier to see differences (more powerful) than nonparametric statistics.

STATISTICAL TESTS

REGRESSION ANALYSIS:

Null hypothesis: there is no relationship between red vs. blue + green in plots.

RED = 2.33 – 0.75(BLUE or GREEN)

r² = 0.75, r = −0.88; degrees of freedom = 12 – 2 = 10; P < 0.001.

REGRESSION ANALYSIS: Use Regression … Linear … Least squares in Analyze in MYSTAT Select dependent (y) and independent (x) variables
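To connect the regression output to its significance test, the sketch below uses the reported fit for red vs. blue or green (intercept 2.33, slope −0.75, r = −0.88, 12 plots). It converts r into a t statistic with n − 2 degrees of freedom using the standard relation t = r√(n − 2)/√(1 − r²). The helper `predict_red` is a hypothetical name, and the raw plot data are not available here, so the line is not refitted.

```python
import math

n_plots = 12
r = -0.88                      # reported correlation coefficient
intercept, slope = 2.33, -0.75  # reported regression line


def predict_red(blue_or_green):
    """Predicted number of red for a given blue-or-green value, from the reported line."""
    return intercept + slope * blue_or_green


# Test of the null hypothesis of no relationship (slope = 0)
df = n_plots - 2
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

print(f"predicted red at x = 2: {predict_red(2):.2f}")
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

The resulting t can then be compared against a t table with 10 degrees of freedom to obtain the p value.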

PARAMETRIC TESTS (data are normally distributed)

WHAT IF THERE ARE MULTIPLE COMPARISONS OF A CATEGORY? (ANOVA)

Proportion red by strata (mean ± sd): #1 = 0.79 ± 0.25; #2 = 0.08 ± 0.17; #3 = 0.08 ± 0.17.

Three possible t-test comparisons: #1 vs. #2; #1 vs. #3; #2 vs. #3.

PROBLEM: As the number of comparisons increases, the likelihood of finding at least one significant difference by chance increases. ANOVA takes this into account to compare differences in mean values.

1-WAY ANOVA:

Null hypothesis: no difference in relative abundance of red among all strata.

F = 19.75; df = 2, 9 (strata – 1, samples – strata); p < 0.001.

ANOVA: Use Analysis of variance … Estimate model in Analyze in MYSTAT Select continuous dependent (y) variable and categorical independent (x) variables

MULTIPLE COMPARISONS (ANOVA): (Which specific differences are significant?)

Proportion red by strata (mean ± sd): #1 = 0.79 ± 0.25; #2 = 0.08 ± 0.17; #3 = 0.08 ± 0.17.

Post-hoc analysis: Must compensate for the number of comparisons and the fact that a difference is already known to be significant.

Bonferroni test (t-test adjusted for the number of comparisons): #1 vs. #2, p < 0.001; #1 vs. #3, p < 0.001; #2 vs. #3, p < 1.0.

ANOVA – POST HOC: (cannot do with MYSTAT, but will with SYSTAT) Use Analysis of variance … Estimate model … Hypothesis test in Analyze in SYSTAT
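A short calculation shows why multiple comparisons need an adjustment. The sketch below assumes a 5% significance level and, for simplicity, treats the three pairwise comparisons as independent, which real comparisons sharing the same data are not. It also shows the Bonferroni idea in its simplest form: divide the significance level by the number of comparisons.

```python
from itertools import combinations

strata = ["#1", "#2", "#3"]
pairs = list(combinations(strata, 2))   # every pairwise t-test comparison
k = len(pairs)

alpha = 0.05   # assumed per-comparison significance level

# Chance of at least one false positive across k comparisons
# (assumption: the comparisons are independent)
familywise_error = 1 - (1 - alpha) ** k

# Bonferroni: require each comparison to pass a stricter level
bonferroni_alpha = alpha / k

print(f"comparisons: {pairs}")
print(f"chance of at least one false positive: {familywise_error:.3f}")
print(f"Bonferroni-adjusted level per comparison: {bonferroni_alpha:.4f}")
```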

MULTIPLE COMPARISONS (ANOVA): (several independent categorical variables)

Factors: strata (#1, #2, #3) and distance from edge (near, far).

TWO-WAY ANOVA:

Null hypothesis: no difference in relative abundance of red between strata and with distance into the woods.

| Term | F | df | p |
|---|---|---|---|
| Strata | 15.65 | 2, 6 | < 0.005 |
| Distance | 0.12 | 1, 6 | < 0.74 |
| Strata × Distance interaction | 0.51 | 2, 6 | < 0.63 |

COULD HAVE N-WAY ANOVA; YOUR PROJECT SHOULD NOT EXCEED A 2-WAY.

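THE INTERACTION TERM'S MEANING

| SEASON | LOCATION A | LOCATION B | LOCATION C | mean |
|---|---|---|---|---|
| I | 1 | 2 | 3 | 2 |
| II | 2 | 2 | 2 | 2 |
| III | 3 | 2 | 1 | 2 |
| mean | 2 | 2 | 2 | 2 |

NO MAIN EFFECTS (SEASON or LOCATION: no differences). INTERACTION IS SIGNIFICANT (greatest at A:III and C:I). (no variety)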

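THE INTERACTION TERM'S MEANING

| SEASON | LOCATION A | LOCATION B | LOCATION C | mean |
|---|---|---|---|---|
| I | 1 | 2 | 3 | 2 |
| II | 4 | 5 | 6 | 5 |
| III | 7 | 8 | 9 | 8 |
| mean | 4 | 5 | 6 | 5 |

MAIN EFFECTS (SEASON or LOCATION: differences). NO INTERACTION (highest always in C and III). (wider variety)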
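One way to see what the interaction term measures is to ask what is left in each cell after removing the row and column effects: cell − row mean − column mean + grand mean. The sketch below applies this to the two season-by-location tables. The function and variable names are made up for illustration, and with one value per cell this shows the pattern only. It is not a significance test.

```python
def interaction_residuals(table):
    """Cell value minus row mean minus column mean plus grand mean, for each cell."""
    n_rows, n_cols = len(table), len(table[0])
    grand_mean = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(col) / n_rows for col in zip(*table)]
    return [
        [table[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Rows are seasons I, II, III; columns are locations A, B, C
no_main_effects = [[1, 2, 3], [2, 2, 2], [3, 2, 1]]
main_effects_only = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

print(interaction_residuals(no_main_effects))
print(interaction_residuals(main_effects_only))
```

In the first table every row and column mean is 2, so all of the variation shows up as non-zero residuals (the interaction). In the second table the residuals are all zero: season and location effects simply add.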

MULTIPLE COMPARISONS (ANCOVA): (several independent variables: one categorical and one continuous)

Variables: blue + green (continuous covariate) and distance from edge (near, far).

ANCOVA:

Null hypothesis: no difference in relative abundance of red with blue + green and distance into the woods (assume equal slopes).

| Term | F | df | p |
|---|---|---|---|
| Blue + Green | 36.10 | 1, 9 | < 0.0002 |
| Distance | 0.78 | 1, 9 | < 0.40 |
| Interaction (slope) | 0.08 | 1, 8 | ≈ 0.78 |

COULD HAVE N-WAY ANCOVA.

ANCOVA: Use Analysis of variance … Estimate model in Analyze in MYSTAT. In SYSTAT, use General linear model … Estimate model in Analyze. Select the continuous dependent (y) variable, the categorical independent (x₁) variable and the covariate (x₂). In SYSTAT, create an interaction term to test the slope.

DATA TRANSFORMATIONS (can normalize data or make it continuous so parametric statistics can be used, or make data linear for regression)

Data are not always normally distributed, but a transformation (e.g., log) may make them normal. If the data cannot be normalized, then you must use non-parametric statistics (less powerful).

Data are not always continuous. Percentages and proportions are not continuous because they cannot be less than 0 or greater than 100 (or 1). To make them continuous from 0 to infinity or from –infinity to +infinity, you can use transforms: arcsine transform = arcsin √proportion; logarithmic transform = log(proportion)*; logit transform = log(proportion/(1 – proportion))*. This stretches both tails and compresses the peak to approximate a continuous normal distribution.

\* If some proportions = 0 or 1, then add a small constant to all values (e.g., 0.001).

Data for regression are not always linear. Various transformations, especially log x, log y or both, can transform a curve into a straight line. What do logarithmic transforms imply about the linear function?

DATA TRANSFORMATIONS Use Data … Transform … Let in MYSTAT.
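Outside MYSTAT, the three proportion transforms can be sketched in a few lines of Python. Several choices here are assumptions rather than anything the slides specify: natural logarithms are used, the function names are invented, and the logit clamps values into [0.001, 0.999] instead of adding a constant, because adding 0.001 to a proportion of exactly 1 would make the logit undefined.

```python
import math

SMALL_CONSTANT = 0.001   # the small constant suggested for proportions of 0 or 1


def arcsine_transform(p):
    """Arcsine square-root transform of a proportion in [0, 1]."""
    return math.asin(math.sqrt(p))


def log_transform(p, constant=SMALL_CONSTANT):
    """Natural log of the proportion after adding a small constant (avoids log(0))."""
    return math.log(p + constant)


def logit_transform(p, eps=SMALL_CONSTANT):
    """Logit of the proportion; values are clamped to [eps, 1 - eps] (an assumption)."""
    clamped = min(max(p, eps), 1 - eps)
    return math.log(clamped / (1 - clamped))


# Illustrative proportions, including the awkward values 0 and 1
proportions = [0.0, 0.33, 0.5, 0.67, 1.0]
for p in proportions:
    print(
        f"p = {p:.2f}: arcsine = {arcsine_transform(p):.3f}, "
        f"log = {log_transform(p):.3f}, logit = {logit_transform(p):.3f}"
    )
```

If the same constant is added to every value, as the footnote suggests, apply it consistently to the whole column before transforming rather than only to the zeros.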

ARE OBSERVATIONS OBTAINED DIFFERENT? Different statistical tests for different problems. You will use these basic tests in your research (χ², t-test, regression, ANOVA, ANCOVA). Your research project should not be so complicated that more advanced tests are required. Always graph your data and state your hypothesis.

USE MYSTAT WITH DATA FILES CREATED LAST WEEK (be sure to set 6 decimal places so p values are exact: Edit … Options … Output in MYSTAT). Species: meadow vole (Microtus pennsylvanicus) and yellow-bellied marmot (Marmota flaviventris). Site: UNDERC-WEST (National Bison Range).

WITH MYSTAT ANSWER THESE QUESTIONS: (you will use χ², regression, t-test, 2-way ANOVA, ANCOVA)

1. Does snap-trapping lead to a sex bias in Microtus?
2. What is the relationship between length and mass for Microtus? (hint: need to use Data … Transform … Let)
3. Do Microtus and Marmota exhibit similar length and mass growth relationships? (hint: think about question above)
4. Does Marmota mass vary with month? Explain ecologically what you see.
5. Does reproductive status of female Microtus differ with mass? Why do you observe this? (hint: need to use Data … Select cases)
6. Does the reproductive status of male and female Microtus with mass differ?

Due in two weeks!
