BRIEF REVIEW OF STATISTICAL CONCEPTS AND METHODS.

Mathematical expectation
The mean (x̄) of a random variable x is x̄ = (Σ xᵢ)/n, where n is the number of observations. The variance (s^2) is s^2 = Σ(xᵢ - x̄)^2 / (n - 1).

The standard deviation (s) is s = √(s^2). The coefficient of variation is CV = s / x̄ (often expressed as a percentage).
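The formulas above can be sketched in a few lines of Python, using the n - 1 sample-variance denominator shown above:

```python
import math

def mean(xs):
    # x-bar = (sum of observations) / n
    return sum(xs) / len(xs)

def variance(xs):
    # s^2 = sum((x_i - x-bar)^2) / (n - 1), the sample variance
    xbar = mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

def std_dev(xs):
    # s = sqrt(s^2)
    return math.sqrt(variance(xs))

def coef_variation(xs):
    # CV = s / x-bar (often reported as a percentage)
    return std_dev(xs) / mean(xs)
```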

Precision, bias, and accuracy

Basic probability
The probability of an event occurring is expressed as P(event). The probability of the event not occurring is 1 - P(event), or P(~event). If events are independent, the probability of events A and B both occurring is P(A) × P(B).

Probability example
The probability of catching a fish during a single sampling occasion is p(capture). The probability of catching it on all three sampling occasions is p(capture) × p(capture) × p(capture) = p(capture)^3; the probability of not catching it during any of the 3 occasions is (1 - p(capture)) × (1 - p(capture)) × (1 - p(capture)) = (1 - p(capture))^3; and the probability of catching it on at least 1 occasion is the complement of never catching it: 1 - (1 - p(capture))^3.
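The capture-probability calculations above can be sketched directly; the value of p used in the check is illustrative:

```python
def p_all(p, k=3):
    # probability of capture on every one of k independent occasions
    return p ** k

def p_none(p, k=3):
    # probability of no capture on any of k occasions
    return (1 - p) ** k

def p_at_least_one(p, k=3):
    # complement of never being captured
    return 1 - (1 - p) ** k
```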

Models and fisheries management
Fundamental assumption: there is no "true" model that generates biological data. Truth in the biological sciences has essentially infinite dimension; hence, full reality cannot be revealed with finite samples. Biological systems are complex, with many small effects, interactions, individual heterogeneity, and environmental covariates. Thus all models are approximations of reality, and greater amounts of data are required to model smaller effects.

Models and hypotheses
Hypotheses are unproven theories: suppositions that are tentatively accepted to explain facts or to serve as the basis for further investigation. Models are tools for evaluating hypotheses; they are very explicit representations of hypotheses. Note that several models can represent a single hypothesis.

Models and hypotheses: example
Hypothesis: shoal bass reproductive success is greater when there are more reproductively active adults.
Y = aN: the number of young is proportional to the number of adults.
Y = aN/(1 + bN): the number of young increases with the number of adults until nesting areas are saturated.
Y = aN·e^(-bN): the number of young increases until the carrying capacity of nesting and rearing areas is reached, then declines.

[Figure: number of YOY plotted against number of shoal bass for the three candidate models Y = aN, Y = aN/(1 + bN), and Y = aN·e^(-bN)]
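The three candidate curves can be written as simple functions; the parameter values used in the checks below are hypothetical, chosen only to illustrate the qualitative shapes:

```python
import math

def proportional(N, a):
    # Y = aN: young proportional to adults
    return a * N

def beverton_holt(N, a, b):
    # Y = aN / (1 + bN): saturates toward a/b as N grows
    return a * N / (1 + b * N)

def ricker(N, a, b):
    # Y = aN * exp(-bN): rises, peaks, then declines at high N
    return a * N * math.exp(-b * N)
```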

Tapering effect sizes
In biological systems there are often large, important effects, followed by smaller effects, and then yet smaller effects. These effects may be sequentially revealed as sample size increases, because information content increases. Rare events (e.g., fire, flood, volcanism) are yet more difficult to study.

Model selection
Goals: determine the best explanation given the data, and determine the best model for predicting the response. Two approaches are common in fisheries/ecology: null hypothesis testing and information-theoretic approaches.

Null hypothesis testing
1. Develop an a priori hypothesis
2. Deduce testable predictions (i.e., models)
3. Carry out a suitable test (experiment)
4. Compare test results with predictions
5. Retain or reject the hypothesis

Hypothesis testing example: density independence in lake sturgeon populations
Hypothesis: lake sturgeon reproduction is density independent.
Prediction: there is no relation between adult density and age-0 density.
Test: measure age-0 density at various adult densities over time.
Compare: linear regression between age-0 and adult sturgeon densities, P value =
Result: using a critical α-level of 0.05, we conclude there is no significant relationship and retain the hypothesis that lake sturgeon reproduction is density independent. Model: Y = β0.

Model selection based on p-values
There is no theoretical basis for model selection via p-values. P-values reflect the precision of an estimate and are strongly dependent on sample size. They give P(the data (or more extreme data) | model), not L(model | the data). JUST SAY NO TO STATISTICAL SIGNIFICANCE TESTING.

If you really need a p-value…
MARK implements likelihood ratio tests (nested models only). E.g., full model: S = f(temperature, flow); nested model: S = f(flow). H0: survival is related to flow. Ha: survival is related to temperature and flow.
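A minimal sketch of the likelihood ratio test behind this comparison: the statistic is -2 times the difference in maximized log likelihoods, compared against a chi-square distribution with df equal to the difference in parameter counts (shown here only for df = 1, which covers the one-extra-parameter example above). The log-likelihood values in the check are made up:

```python
import math

def lrt_statistic(loglik_full, loglik_nested):
    # -2 * (lnL_nested - lnL_full); large values favor the fuller model
    return -2.0 * (loglik_nested - loglik_full)

def chi2_sf_df1(x):
    # survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x/2))
    return math.erfc(math.sqrt(x / 2.0))
```

For the example above, the test would have 1 df because the full model adds one parameter (temperature) to the nested model.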

Information theory
If full reality cannot be included in a model, how do we tell how close we are to truth? Entropy is synonymous with uncertainty. The Kullback-Leibler (K-L) distance, based on information theory, measures how much information about truth is accounted for by a model.

K-L distance (information) is represented by I(truth | model). AIC is based on the concept of minimizing K-L distance: it represents the information lost when the candidate model is used to approximate truth, so SMALL values mean better fit. Akaike noticed that the maximum log likelihood, log(L(model or parameter estimate | the data)), is related to K-L distance.

The sum of squares in regression is also a measure of the relative fit of a model: SSE = Σ deviations^2 = Σ(yᵢ - ŷᵢ)^2. What is a maximum likelihood estimate? It is the set of parameter values that maximizes the value of the likelihood, given the data.

The maximum log likelihood (and SSE) is a biased estimate of K-L distance. Akaike's contribution was showing that a corrected estimator is
AIC = -2 ln(L(model | the data)) + 2K
where the first term measures model lack of fit and the second term (with K the number of parameters) is a penalty for increasing model size that enforces parsimony. Heuristic interpretation: as the number of parameters goes from few to many, squared bias decreases while variance increases; AIC balances the two.

AIC: small-sample bias adjustment
If the ratio n/K is < 40, then use AICc:
AICc = -2 ln(L(model | the data)) + 2K + (2K(K+1))/(n - K - 1)
As n gets big, the correction term (2K(K+1))/(n - K - 1) goes to 0, so AICc converges to AIC.
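The two formulas above translate directly to code; the log-likelihood values in the checks are made up:

```python
def aic(loglik, k):
    # AIC = -2 ln(L) + 2K
    return -2.0 * loglik + 2 * k

def aicc(loglik, k, n):
    # small-sample correction; converges to AIC as n grows
    return aic(loglik, k) + (2.0 * k * (k + 1)) / (n - k - 1)
```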

Model selection with AIC
An AIC value by itself is relatively meaningless. Recall that we find the best model by comparing various models and examining their relative distances to "truth." We do this by calculating the difference between the best-fitting model (lowest AIC) and each of the other models. Model selection also carries uncertainty: which model is the best, and what if you collected data at the same spot next year, next week, or next door? AIC weights address this (with a long-run interpretation vs. a Bayesian one), and a confidence set of models is analogous to a confidence interval.

Where do we get AIC? From the fitted-model output, which reports K (the number of parameters) and -2 ln(L(model | the data)).

Interpreting AIC
The best model is the one with the lowest AICc. The difference between each model's AIC and the lowest AIC is that model's relative distance from truth.

Interpreting AIC
AICc weights range from 0 to 1, with 1 = best model. A weight is interpreted as the relative likelihood that the model is the best, given the data and the other models in the set.

Interpreting AIC
The ratio of two weights is interpreted as the strength of evidence for one model over another. Here the ratio of the top two weights is 6.64, so the best model is 6.64 times more likely than the runner-up to be the best model for estimating striped bass population size.
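The ΔAIC, weight, and evidence-ratio calculations described above can be sketched as follows; the AIC values in the checks are made up:

```python
import math

def akaike_weights(aics):
    # delta_i = AIC_i - min(AIC); w_i = exp(-delta_i / 2) / sum_j exp(-delta_j / 2)
    best = min(aics)
    rel = [math.exp(-(a - best) / 2.0) for a in aics]
    total = sum(rel)
    return [r / total for r in rel]

def evidence_ratio(w_i, w_j):
    # strength of evidence for model i over model j
    return w_i / w_j
```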

Confidence model set
Using a 1/8 (0.125) rule for weight of evidence, the confidence set includes the top two models (both model likelihoods > 0.125). This is analogous to a confidence interval for a parameter estimate.

Linear models review
Y: response variable (dependent variable); X: predictor variable (independent variable).
Y = β0 + β1·X + ε
β0 is the intercept, β1 is the slope (the parameter associated with X), and ε is the residual error.
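A minimal sketch of how β0 and β1 are estimated by least squares for this model (the standard closed-form estimates, not code from the lecture):

```python
def fit_simple_linear(xs, ys):
    # least-squares estimates: b1 = Sxy / Sxx, b0 = ybar - b1 * xbar
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1
```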

Linear models review
When Y is a probability it is bounded by 0 and 1, but Y = β0 + β1·X can produce values below 0 or above 1, so we need to transform Y or use a link function. For probabilities, the logit link is the most useful.

Logit link
ℓ = ln(p / (1 - p))
where ℓ is the log odds and p is the probability of an event.

Log linear models (logistic regression)
ℓ = β0 + β1·X
where ℓ is the log odds, β0 is the intercept, and β1 is the slope (the parameter associated with X). The betas are on a logit scale, so the log odds must be back-transformed.

Back transformation: inverse logit link
p = 1 / (1 + exp(-ℓ))
where ℓ is the log odds and p is the probability of an event.
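The logit and inverse-logit links above are two lines each in Python, and applying one after the other recovers the original probability:

```python
import math

def logit(p):
    # log odds: ln(p / (1 - p)), defined for 0 < p < 1
    return math.log(p / (1 - p))

def inv_logit(log_odds):
    # back-transform a log odds to a probability in (0, 1)
    return 1.0 / (1.0 + math.exp(-log_odds))
```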

Back transformation example
ℓ = β0 + β1·X, with β0 = β1 = 0.5 and X = 2.

Back transformation example
ℓ = 0.5 + 0.5 × 2 = 1.5
p = 1 / (1 + exp(-1.5)) = 0.82, or 82%.

Interpreting beta estimates
The betas are on a logit scale; to interpret them, calculate odds ratios using the exponential function. For β1 = 0.5, exp(0.5) = 1.65. Interpretation: for each 1-unit increase in X, the event is 1.65 times more likely to occur (in odds). For example, for each 1-inch increase in length, a fish is 1.65 times more likely to be caught.
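The odds-ratio interpretation can be verified numerically: the odds at X and X + 1 from a logistic model always differ by a factor of exp(β1). The coefficients used in the check are hypothetical, not the lecture's fitted values:

```python
import math

def predicted_p(b0, b1, x):
    # probability from a logistic model with (hypothetical) coefficients
    eta = b0 + b1 * x
    return 1.0 / (1.0 + math.exp(-eta))

def odds_ratio(b1):
    # exp(beta1): multiplicative change in odds per 1-unit increase in X
    return math.exp(b1)
```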

Goodness-of-fit: ĉ

Goodness-of-fit MARK output

Overdispersion
Extra variability arises from missing covariates and from heterogeneity in S, p, etc. Possible solutions: include additional covariates; fit heterogeneity models; or apply the ĉ adjustment in MARK, which yields quasi-AIC (QAIC) and inflated variances and confidence intervals.

Goodness of fit
Bootstrap GOF
Median ĉ
Residual plots