
1 ANNOUNCEMENTS Ecology job fair: March 1st (tomorrow!), 10:00-2:00, Birge Hall Atrium. FOR TODAY: Grab all 4 handouts in front. Get a computer and download the “stats examples” worksheet from the website. I will show stats examples in Excel 2007, so only use your own computer if you have this program.

2 Week 6: Making use of the Badger Mill Creek Field trip Data Analysis, Your Research Questions, and Writing Zoo 511 Spring 2012

3 Outline Field trip review Badger Mill research questions/hypotheses Writing a scientific paper Statistics and data analysis (with examples in Excel) Lab: Enter data

4 Today’s goals Provide a basic background on how to use and interpret common statistical tests Prepare you to generate questions for your paper, and to analyze data to answer these questions Get all data entered!

5 Part 1: Your Questions Read the handout!!!!

6 Your questions should be specific and answerable. RIGHT: Does sculpin CPUE differ among geomorphic units? Is brown trout density related to flow velocity? WRONG: In what kind of stream are brown trout most likely to be found? What habitat do fish prefer?

7 Example Questions Does sculpin CPUE differ among geomorphic units? [Figure: sculpin CPUE by geomorphic unit (pool, run, riffle)] Is brown trout density related to flow velocity? [Figure: brown trout/m² vs. current velocity (m/s)]

8 Other data sources Previous years’ data: all of the same information was collected from the same place, around the same time of year. Replication! USGS: http://waterdata.usgs.gov/nwis/uv?05435943 Think about these data sources as you generate your questions.

9 Two questions with a supporting paragraph for each are due Sunday 3/4 by 5:00 pm via email. Name your file: Classday_Lastname_Questions.doc (e.g., Wednesday_Latzka_Questions.doc)

10 Part 2: Writing Read the handout!!!!

11 Why Write? Gain experience articulating thoughts Writing is a learning experience It is the currency of communication (in science, law, business, etc…)

12 Order of a scientific paper (see handout!) 1. Title 2. Abstract 3. Introduction – set up your study 4. Methods – study site, data analyses 5. Results – analyses, reference tables and figures here 6. Discussion – interpret results 7. Literature Cited 8. Tables and figures This is the order in which a paper is presented; it should not be the order in which you write it.

13 Think before you write Analysis → results: figures & numbers Search the literature → context

14 Outline Start with basic parts Add subsections Add topic sentences This will take some time, but will make your paper much easier to write and of much higher quality!!

15 Writing Start with what you know – Results Report the findings What did your analyses reveal? FIGURES SHOULD STAND ALONE!!!! – Methods: two parts Sampling: site description and sampling techniques relevant to your hypothesis Statistical analysis Only what is relevant! – This depends on what you put in your results!

16 Note on results Make ecology the subject of your sentences, not statistics. Statistics help you tell your story; they are not your story in themselves. WRONG: Linear regression showed that there was a significant positive relationship with a p-value of 0.04 and an R² of 0.81 between brown trout abundance and flow velocity. RIGHT: Brown trout abundance increased with increasing flow velocity (R²=0.81, p=0.04).

17 Intro and discussion: Why does it matter? What does it mean? Introduction – What is the context of the study (past research) – Set up the experiment Discussion – What do the results mean? – Was your hypothesis correct? – What is interesting/exciting about your findings? – Future research directions

18 Writing The last steps Abstract: – for most, the hardest part of writing a scientific paper – Short summary of the important points of the paper Title – Short, sweet, descriptive Literature Cited

19 In summary: WWAD? (what would Alex do?) 1) Think! – this means literature exploration relevant to your question to get a feel for what studies have been done 2) Explore your data – make lots of figures and run enough stats that you start to get a feel for what your “story” is going to be 3) Narrow down your figures and results to those relevant to your story 4) Write results (referencing your figures!) 5) Write methods 6) Write discussion 7) Write intro 8) Abstract 9) Title! 10) Literature Cited

20 Peer Review Criticism is important... “constructive criticism” is best! Two types: Internal and External. The point of internal review is to make external review go well. Reviews need to be taken seriously.

21 Part 3: Statistics

22 Why use statistics? Are there more green sunfish in pools or runs? [Figure: counts per site. Run: 5, 4, 1 (total 10). Pool: 2, 7, 3 (total 12).]

23 Why use statistics? Are there more orange spotted sunfish in pools or runs? [Figure: counts per site. Run: 5, 1 (total 6). Pool: 2, 3 (total 5).] Statistics help us find patterns in the face of variation, and draw inferences beyond our sample sites. Statistics help us tell our story; they are not the story in themselves!

24 Important note “Data” is the plural form of datum. WRONG: Data was analyzed using Microsoft Excel. RIGHT: Data were analyzed using Microsoft Excel. When in doubt, substitute the word “apples” for “data”, and ask if your sentence makes sense.

25 A Few Necessary Terms Categorical Variable: Discrete groups, such as Type of Reach (Riffle, Run, Pool) Continuous Variable: Measurements along a continuum, such as Flow Velocity What type of variable is “Mottled Sculpin/m²”? What type of variable is “Substrate Type”?

26 A Few Necessary Terms Explanatory/Predictor Variable: Independent variable. On x-axis. The variable you use to predict another variable. Response Variable: Dependent variable. On y-axis. The variable that is hypothesized to depend on/be predicted by the explanatory variable.

27 A Few Necessary Terms Mean: The average of a random variable or set of observations; if data are normally distributed, it is also the most likely value. Variance: A measure of how far the observed values differ from the expected value, i.e., the mean (standard deviation is the square root of variance). Normal distribution: a symmetrical probability distribution described by a mean and variance. An assumption of many standard statistical tests. [Figure: normal curves N(μ₁, σ₁), N(μ₁, σ₂), and N(μ₂, σ₂)]
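The terms above can be checked on a small sample. The following is a minimal Python sketch using only the standard library; the `densities` values are hypothetical numbers made up for illustration, not Badger Mill Creek data.

```python
import math
import statistics

# Hypothetical sculpin densities (fish per m^2) from five reaches; not real data.
densities = [0.2, 0.4, 0.4, 0.6, 0.9]

mean = statistics.mean(densities)           # the average
variance = statistics.variance(densities)   # sample variance (divides by n - 1)
std_dev = math.sqrt(variance)               # standard deviation = sqrt(variance)

print(f"mean = {mean:.3f}")
print(f"variance = {variance:.3f}")
print(f"standard deviation = {std_dev:.3f}")
```

Note that `statistics.variance` is the sample variance; this is assumed to correspond to Excel's `VAR` rather than `VARP`.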

28 Statistical Tests Hypothesis Testing: In statistics, we are always testing a Null Hypothesis (H₀) against an alternative hypothesis (Hₐ). p-value: The probability of observing our data or more extreme data, assuming the null hypothesis is correct. Statistical Significance: We reject the null hypothesis if the p-value is below a set value (α), usually 0.05.

29 Statistical Tests: Appropriate Use For our data, the response variable will always be continuous. T-test: A categorical explanatory variable with only 2 options. ANOVA: A categorical explanatory variable with >2 options. Regression: A continuous explanatory variable

30 Student’s T-Test Tests the statistical significance of the difference between means from two independent samples. Null hypothesis: No difference between means.

31 [Figure: Mottled Sculpin/m² at two sites, Cross Plains and Salmo Pond] Compares the mean of a continuous response between the 2 groups of a categorical variable

32 Precautions and Limitations Meet Assumptions: observations come from data with a normal distribution (check with a histogram); samples are independent; equal variance is assumed (this assumption can be relaxed); no other sample biases. Interpreting the p-value

33 Walk through t-test
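The in-class walkthrough uses Excel. As a rough equivalent, here is a minimal Python sketch of an equal-variance (pooled) Student's t-test using only the standard library. It assumes the Run (5, 4, 1) and Pool (2, 7, 3) numbers from the green sunfish slide are per-site counts. The function names (`t_pdf`, `two_sided_p`, `students_t_test`) are made up for this example, and the p-value is obtained by numerically integrating the t density with Simpson's rule rather than by any library routine.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the area under the t density on [-|t|, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = 2 * a / steps  # steps must be even for Simpson's rule
    total = t_pdf(-a, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(-a + i * h, df)
    return 1 - total * h / 3

def students_t_test(x, y):
    """Pooled-variance two-sample t-test. Returns (t, df, p)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / df
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    t = (statistics.mean(x) - statistics.mean(y)) / se
    return t, df, two_sided_p(t, df)

# Assumed per-site green sunfish counts from the "Why use statistics?" slide.
run = [5, 4, 1]
pool = [2, 7, 3]

t, df, p = students_t_test(run, pool)
print(f"t = {t:.3f}, df = {df}, p = {p:.3f}")
```

With only three sites per group and this much variation, the p-value is far above 0.05, so these counts alone would not let us reject the null hypothesis of no difference between means.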

34 Analysis of Variance (ANOVA) Tests the statistical significance of the difference between means from two or more independent groups. Null hypothesis: No difference between means. [Figure: Mottled Sculpin/m² in riffle, pool, and run]

35 Precautions and Limitations Meet Assumptions: samples are independent and identically distributed (iid); equal variance is assumed among groups; residuals are normally distributed; groups are classified correctly; no other sample biases. Interpreting the p-value

36 Walk through ANOVA
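To show what a one-way ANOVA computes, here is a minimal Python sketch of the F statistic: the variation between group means divided by the variation within groups. The function name `one_way_anova_f` and the riffle/run/pool values are hypothetical, chosen for illustration only. Turning F into a p-value requires the F distribution (Excel reports it directly), so this sketch stops at F and its degrees of freedom.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical sculpin densities by geomorphic unit; not real data.
riffle = [4.0, 5.0, 6.0]
run = [2.0, 3.0, 4.0]
pool = [1.0, 2.0, 3.0]

f_stat, df_between, df_within = one_way_anova_f([riffle, run, pool])
print(f"F = {f_stat:.2f} on ({df_between}, {df_within}) degrees of freedom")
```

A large F means the group means differ more than the within-group scatter would suggest.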

37 Simple Linear Regression Analyzes relationship between two continuous variables: predictor and response Null hypothesis: there is no relationship (slope=0)

38 Residuals Least squares line (regression line: y=mx+b)

39 Residuals Residuals are the vertical distances from observed points to the best-fit line. Residuals always sum to zero (for a least-squares line with an intercept). Regression chooses the best-fit line that minimizes the sum of squared residuals. It is called the Least Squares Line.
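The least squares line can be computed by hand from the sample means. The following is a minimal Python sketch; the function name `least_squares` and the velocity/darter values are hypothetical and only illustrate the calculation of the slope, intercept, residuals, and R-squared.

```python
def least_squares(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]  # observed minus fitted

    ss_res = sum(r ** 2 for r in residuals)       # the quantity being minimized
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, residuals, r_squared

# Hypothetical data: flow velocity (m/s) and darters per m^2; not real data.
velocity = [1.0, 2.0, 3.0, 4.0]
darters = [0.10, 0.25, 0.30, 0.45]

slope, intercept, residuals, r_squared = least_squares(velocity, darters)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"sum of residuals = {sum(residuals):.6f}")
print(f"R-squared = {r_squared:.3f}")
```

The printed sum of residuals is zero up to floating-point rounding, which is the property stated on the slide.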

40 Precautions and Limitations Meet Assumptions: relationship is linear (not exponential, quadratic, etc.); X is measured without error; for any given value of X, sampled Y’s are independent; residual errors are normally distributed. Interpret the p-value and R-squared value.

41 P-value: the probability of observing your data (or more extreme data) if no relationship existed. It indicates the strength of the evidence that your slope (i.e., the relationship) is non-zero (i.e., real); it does not measure how strong the relationship is. R-Squared indicates how much of the variance in the response variable is explained by the explanatory variable. If this is low, other variables likely play a role. If this is high, it DOES NOT by itself INDICATE A SIGNIFICANT RELATIONSHIP!

42 R-Squared and P-value High R-Squared Low p-value (significant relationship)

43 R-Squared and P-value Low R-Squared Low p-value (significant relationship)

44 R-Squared and P-value High R-Squared High p-value (NO significant relationship)

45 R-Squared and P-value Low R-Squared High p-value (No significant relationship)

46 Walk through Regression 1

47 Residual vs. Fitted Value Plots Model Values (Line) Observed Values (Points)

48 Residual Plots Can Help Test Assumptions [Figure: three residual plots, each centered on 0] “Normal” scatter. Fan shape: unequal variance. Curve: linearity violated.

49 Have we violated any assumptions?

50 If assumptions are violated: Try transforming data (log transformation, square root transformation) Most of these tests are fairly robust to violations of assumptions of normality and equal variance (only be concerned if obvious problems exist) Diagnostics (residual plots, histograms) should NOT be reported in your paper. Rather, a statement that diagnostic tests were performed to assure that assumptions of a linear regression were not violated is sufficient.

51 Walk through regression 2, with residual plots

52 Statistical significance [Figure: two plots of Darters/m² vs. Flow Velocity, both with fitted line Y=0.02+0.1X. First plot: R²=0.85, p=0.045. Second plot: R²=0.6, p=0.055.] Take home message: using 0.05 as a cutoff for significance is ARBITRARY! Use your p-values as one of multiple tools for interpreting your results (especially because you will likely have small sample sizes).

53 Statistical vs. biological significance [Figure: Darters/m² vs. Flow Velocity; R²=0.85, p=0.045, Y=0.02+0.1X] For each increase in flow of 1 m/s, you would expect an increase of 0.1 fish per m². If your reach contained 100 m² of habitat, you would expect a difference of 10 fish. Take home message: there is no magic number to determine biological significance. YOU need to think about what your results mean, and interpret them in an ecological context.
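The arithmetic on this slide can be reproduced from the fitted line Y=0.02+0.1X. This is a minimal Python sketch; the function name `predicted_density` and the two example velocities (1 and 2 m/s) are assumptions chosen for illustration.

```python
# Coefficients from the slide's fitted line: Y = 0.02 + 0.1 X
intercept = 0.02   # darters per m^2 at zero flow
slope = 0.1        # change in darters per m^2 for each 1 m/s of flow

def predicted_density(velocity_m_per_s):
    """Predicted darters per m^2 at a given flow velocity."""
    return intercept + slope * velocity_m_per_s

habitat_area_m2 = 100  # reach size used on the slide

# Compare two reaches whose flow differs by 1 m/s (example velocities).
density_difference = predicted_density(2.0) - predicted_density(1.0)
fish_difference = density_difference * habitat_area_m2

print(f"difference in density: {density_difference:.2f} fish per m^2")
print(f"difference over {habitat_area_m2} m^2: {fish_difference:.0f} fish")
```

Whether a difference of that size matters ecologically is a judgment the statistics cannot make for you.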

54 Last notes Pivot tables Using R

55 Enter Badger Mill Creek Data (individual ID and diet column, only for trout) Fish abbreviations Double substrate Reach names: w01, w02, etc

