Why Model? Make predictions or forecasts where we don’t have data.

Presentation transcript:

Why Model? Make predictions or forecasts where we don’t have data

Linear Regression (image source: Wikipedia)

Modeling Process: Observe; Select Model; Define Theory/Type of Model; Estimate Parameters; Design Experiment; Evaluate the Model; Collect Data; Publish Results; Qualify Data

Definitions
Independent variable (horizontal axis): used to create the prediction
- Also called: predictor variable, covariate, explanatory variable, control variable
- Typically a raster
- Examples: temperature, aspect, SST, precipitation
Dependent variable (vertical axis): what we are trying to predict
- Also called: response variable, measured value, explained variable, outcome
- Typically an attribute of points
- Examples: height, abundance, percent, diversity, …

Definitions
- The model: the specific algorithm that predicts our dependent variable values
- Parameters: the values in the model that we estimate (e.g. a and b, or m and b, for linear regression); also known as coefficients
- Performance measures: show how well the model fits the data; also known as descriptive stats

Parameter Estimation
- Excel spreadsheet
- X, Y columns
- Add “trend line”
- Number of samples
- Max height
- Minimum height
- Height of a bounce

Linear Regression: Assumptions
- Predictors are error free
- Linearity of response to predictors
- Constant variance within and for all predictors (homoscedasticity)
- Independence of errors
- Lack of multicollinearity
Also:
- All points are equally important
- Residuals are normally distributed (or close)

Multiple Linear Regression
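
The equation for this slide is not present in the transcript. As a sketch only, the usual textbook form of a multiple linear regression model with p predictors is given here; the symbols (β for the coefficients, ε for the error term) are assumed notation and are not taken from the slide.

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i,
\qquad i = 1, \dots, n
```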

Normal Distribution: the tails extend to negative infinity and to positive infinity

Linear Data Fitted with a Linear Model: the plot should be a diagonal line for normally distributed data

Non-Linear Data Fitted with a Linear Model: this shows the residuals are not normally distributed
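
The two slides above describe a plot that is a diagonal line when the data are normally distributed. Assuming that plot is a normal quantile-quantile (Q-Q) plot of the residuals, the check can be sketched with the Python standard library. The function names, the plotting positions, and the simulated residuals are illustrative assumptions, not part of the presentation.

```python
import random
from statistics import NormalDist, mean

def qq_points(residuals):
    """Pair each sorted residual with a standard-normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    # Plotting positions (i + 0.5) / n avoid the infinite quantiles at 0 and 1.
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

def qq_correlation(residuals):
    """Pearson correlation of the Q-Q points; near 1 means a near-diagonal plot."""
    pairs = qq_points(residuals)
    t_mean = mean([t for t, _ in pairs])
    r_mean = mean([r for _, r in pairs])
    s_tr = sum((t - t_mean) * (r - r_mean) for t, r in pairs)
    s_tt = sum((t - t_mean) ** 2 for t, _ in pairs)
    s_rr = sum((r - r_mean) ** 2 for _, r in pairs)
    return s_tr / (s_tt * s_rr) ** 0.5

# Simulated residuals (made up for illustration): one normal set, one skewed set.
rng = random.Random(42)
normal_like = [rng.gauss(0.0, 1.0) for _ in range(200)]
skewed = [rng.expovariate(1.0) for _ in range(200)]

print("normal-like residuals:", round(qq_correlation(normal_like), 3))
print("skewed residuals:     ", round(qq_correlation(skewed), 3))
```

A correlation close to 1 corresponds to points lying along the diagonal; the skewed residuals give a visibly lower value because their Q-Q points curve away from the line.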

Homoscedasticity: residuals have the same normal distribution throughout the range of the data

Ordinary Least Squares
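
The formula for this slide is not present in the transcript. As a sketch in assumed notation (β0 for the intercept, β1 for the slope, e_i for a residual), ordinary least squares with one predictor chooses the line that minimizes the sum of squared residuals:

```latex
\mathrm{SSE}(\beta_0, \beta_1) = \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right)^2,
\qquad
(\hat{\beta}_0, \hat{\beta}_1) = \arg\min_{\beta_0, \beta_1} \mathrm{SSE}(\beta_0, \beta_1),
\qquad
e_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i
```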

Linear Regression (figure: fitted line with a residual labeled)

Parameter Estimation
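
A minimal sketch of parameter estimation for a straight line, assuming the standard closed-form least-squares estimates (the slide's own formulas are not present in the transcript). The function name `fit_line` and the sample points are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return intercept, slope

# Made-up points that lie exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

intercept, slope = fit_line(xs, ys)
# A residual is the observed value minus the value predicted by the line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(intercept, slope, residuals)
```

This is the same calculation a spreadsheet "trend line" performs for a linear fit; with real, noisy data the residuals would not be zero.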

Evaluate the Model

“Goodness of fit”
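
The content of this slide is not present in the transcript. Two commonly used goodness-of-fit measures are the coefficient of determination (R²) and the root-mean-square error (RMSE); whether the slide showed these is an assumption. A minimal sketch, with made-up observations and predictions:

```python
import math

def r_squared(ys, predictions):
    """1 minus (residual sum of squares / total sum of squares)."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def rmse(ys, predictions):
    """Root-mean-square error, in the same units as the dependent variable."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys))

# Made-up observed values and model predictions.
observed = [1.0, 3.0, 5.0, 7.0]
predicted = [2.0, 2.0, 6.0, 6.0]

print("R^2: ", r_squared(observed, predicted))
print("RMSE:", rmse(observed, predicted))
```

R² is unitless and compares the model against simply predicting the mean, while RMSE reports the typical size of an error in the units of the measurement.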


Good Model? What is the model's “predictive power”? Anscombe's quartet: four data sets with nearly identical descriptive statistics
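
The point about Anscombe's quartet can be checked numerically. The sketch below uses the published quartet values and fits a line to each data set; the helper name `summarize` is an illustrative assumption. The fitted line and the correlation come out nearly the same for all four sets, even though plots of the data look completely different.

```python
def summarize(xs, ys):
    """Return (slope, intercept, correlation) for one data set."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_yy = sum((y - y_mean) ** 2 for y in ys)
    slope = s_xy / s_xx
    return slope, y_mean - slope * x_mean, s_xy / (s_xx * s_yy) ** 0.5

x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

results = {name: summarize(xs, ys) for name, (xs, ys) in quartet.items()}
for name, (slope, intercept, r) in results.items():
    print(name, round(slope, 2), round(intercept, 2), round(r, 2))
```

Because the summary numbers agree so closely, they cannot by themselves tell a good model from a poor one; the data also need to be plotted.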

Two Approaches
Hypothesis testing
- Is a hypothesis supported or not?
- What is the chance that what we are seeing is random?
- Assumes the hypothesis is true (implied)
- Model may or may not support the hypothesis
Data mining
- Which is the best model?
- Discouraged in spatial modeling
- Can lead to erroneous conclusions

Significance (p-value)
- H0: null hypothesis (flat line)
- Hypothesis: the regression line is not flat
- The smaller the p-value, the more evidence we have against H0, which supports our hypothesis
- It is a measure of how likely we are to get a certain sample result, or a result “more extreme,” assuming H0 is true
- Often described loosely as the chance the relationship is random
- The problem with “disproving the null hypothesis” is that it is commonly misunderstood
- The problem with “p” values is that they are overused, especially for applied research
http://www.childrensmercy.org/stats/definitions/pvalue.htm
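
One way to make "a result this extreme, assuming H0 is true" concrete is a permutation test: shuffle the y values so that any relationship with x is broken, and count how often the shuffled slope is at least as steep as the observed one. This is a swapped-in illustration rather than the test used in the presentation (a regression p-value normally comes from a t-test); the function names and data are made up.

```python
import random

def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    return s_xy / s_xx

def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Estimate how often shuffled data give a slope as steep as the observed one."""
    rng = random.Random(seed)
    observed = abs(slope(xs, ys))
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # under H0 the pairing of x and y is arbitrary
        if abs(slope(xs, shuffled)) >= observed:
            extreme += 1
    # Adding one to both counts keeps the estimate from being exactly zero.
    return (extreme + 1) / (n_permutations + 1)

# Made-up data: a clear trend, and a symmetric curve whose fitted line is flat.
offsets = [0.3, -0.4, 0.1, 0.5, -0.2, -0.3, 0.4, 0.0, -0.5, 0.2, -0.1]
xs_trend = [float(i) for i in range(11)]
ys_trend = [2.0 * x + d for x, d in zip(xs_trend, offsets)]
xs_flat = [float(i) for i in range(-5, 6)]
ys_flat = [x * x for x in xs_flat]

p_trend = permutation_p_value(xs_trend, ys_trend)
p_flat = permutation_p_value(xs_flat, ys_flat)
print(p_trend, p_flat)
```

The second data set is a reminder of the slide's caution: a large p-value for a flat line does not mean x and y are unrelated, only that a straight line does not capture the relationship.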

Confidence Intervals
- A 95% confidence interval is constructed so that, across repeated samples, about 95 percent of such intervals contain the true value
- Methods: moments (mean, variance); likelihood; significance tests (p-values); bootstrapping
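
Of the methods listed, bootstrapping is the easiest to sketch in a few lines. The example below estimates a percentile bootstrap interval for a regression slope by resampling the (x, y) pairs with replacement. The function names, the number of resamples, and the data are illustrative assumptions.

```python
import random

def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    return s_xy / s_xx

def bootstrap_slope_interval(xs, ys, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the slope."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    while len(slopes) < n_resamples:
        picks = [rng.randrange(n) for _ in range(n)]  # sample pairs with replacement
        bx = [xs[i] for i in picks]
        by = [ys[i] for i in picks]
        if len(set(bx)) < 2:
            continue  # the slope is undefined when every resampled x is the same
        slopes.append(slope(bx, by))
    slopes.sort()
    low_index = int((1.0 - level) / 2.0 * n_resamples)
    high_index = int((1.0 + level) / 2.0 * n_resamples) - 1
    return slopes[low_index], slopes[high_index]

# Made-up data scattered around y = 2x.
offsets = [0.3, -0.4, 0.1, 0.5, -0.2, -0.3, 0.4, 0.0, -0.5, 0.2, -0.1]
xs = [float(i) for i in range(11)]
ys = [2.0 * x + d for x, d in zip(xs, offsets)]

estimate = slope(xs, ys)
low, high = bootstrap_slope_interval(xs, ys)
print(round(low, 3), round(estimate, 3), round(high, 3))
```

The percentile method is the simplest bootstrap interval; with only a handful of points it tends to be somewhat too narrow, so it is best read as a rough guide.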

Model Evaluation
- Parameter sensitivity
- Ground truthing
- Uncertainty in data AND predictors: spatial, temporal, attributes/measurements
- Alternative models
- Alternative parameters

Model Evaluation?

Robust models
- Domain/scope is well defined
- Data is well understood
- Uncertainty is documented
- Model can be tied to phenomenon
- Model validated against other data
- Sensitivity testing completed
- Conclusions are within the domain/scope or are “possibilities”
See: https://www.youtube.com/watch?v=HuyMQ-S9jGs

Modeling Process II: Investigate; Select Model; Estimate Parameters; Evaluate the Model; Find Data; Publish Results; Qualify Data

Three Model Components
- Trend (correlation): we have just been talking about these
- Random: “noise” that is truly random, or an effect on our data that we do not understand (or are ignoring)
- Auto-correlated: values that are correlated with themselves in space and/or time

First Law of Geography
"Everything is related to everything else, but near things are more related than distant things." Geographer Waldo Tobler (1930-2018)
In our data, we may see patterns of spatial autocorrelation.

Measures of Auto-Correlation
- Moran’s I: the most common measure
- 1 = perfect positive correlation
- 0 = zero correlation
- -1 = perfect negative correlation
https://docs.aurin.org.au
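
A minimal sketch of Moran's I, assuming the standard definition with a spatial weights matrix. To keep it short, the sites are arranged in a single row and only adjacent sites are treated as neighbors; the function names and the values are made up for illustration.

```python
def morans_i(values, weights):
    """Moran's I for values at n sites, given an n-by-n spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    # Weighted sum of products of deviations over every pair of sites.
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / w_total) * cross / sum(d * d for d in dev)

def chain_weights(n):
    """Binary weights for n sites in a row: 1 for adjacent sites, otherwise 0."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]

w = chain_weights(6)
gradient = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]         # neighbors have similar values
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # neighbors have opposite values

i_gradient = morans_i(gradient, w)
i_alternating = morans_i(alternating, w)
print(round(i_gradient, 3), round(i_alternating, 3))
```

The smooth gradient gives a positive value and the alternating pattern gives a value at the negative end of the scale. Real spatial data would use a weights matrix built from distances or shared borders rather than a simple chain.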

Patches of Aspen (image source: http://www.shutterstock.com/)

Process of Correlation Modeling
1. Find the trends that can be correlated with a known data set. Model and remove them.
2. Find any auto-correlation. Model and remove it?
3. What is left is the residuals (i.e. noise, error, random effect). Characterize them.
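
The first two steps can be sketched for a simplified one-dimensional case: fit and subtract a linear trend, then check whether the residuals are still correlated with their neighbors. The lag-1 autocorrelation used here is a stand-in for a spatial measure such as Moran's I, and the function names and data are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    slope = s_xy / s_xx
    return y_mean - slope * x_mean, slope

def lag1_autocorrelation(values):
    """Correlation between each value and the next one in the sequence."""
    mean = sum(values) / len(values)
    dev = [v - mean for v in values]
    return sum(a * b for a, b in zip(dev, dev[1:])) / sum(d * d for d in dev)

# Made-up series: a linear trend plus a pattern that alternates up and down.
positions = [float(i) for i in range(10)]
observed = [2.0 + 0.5 * p + (1.0 if i % 2 == 0 else -1.0)
            for i, p in enumerate(positions)]

# Step 1: model the trend and remove it.
intercept, slope = fit_line(positions, observed)
residuals = [y - (intercept + slope * p) for p, y in zip(positions, observed)]

# Step 2: look for auto-correlation in what is left.
r1 = lag1_autocorrelation(residuals)
print(round(slope, 3), round(r1, 3))
```

A value of `r1` far from zero means the residuals are not yet plain noise, so there is still structure to model before characterizing what remains.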

Research Papers
Sections: Introduction, Methods, Results, Discussion, Conclusion
- Introduction: background, goal
- Methods: area of interest, data “sources”, modeling approaches, evaluation methods
- Results: figures, tables, summary results
- Discussion: What did you find? Broader impacts, related results
- Conclusion: next steps
- Acknowledgements: Who helped?
- References: include long URLs