Presentation on theme: "A Short Introduction to Curve Fitting and Regression by Brad Morantz"— Presentation transcript:
1 A Short Introduction to Curve Fitting and Regression by Brad Morantz
2 Overview
What can regression do for me?
Example
Model building
Error metrics
OLS regression
Robust regression
Correlation
Regression table
Limitations
Another method: maximum likelihood
3 Example
You have the total mass, payload mass, and maximum distance of a number of missiles.
You need an equation for maximum distance based upon the mass of the missile and payload.
Regression will let you build a model, based upon these observations, of the form:
Max distance = B1 * total mass + B2 * payload mass + constant
With this equation, you can answer questions.
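As a sketch of the missile example, the model above can be fit with NumPy's least-squares solver. The numbers below are made up purely for illustration; real coefficients would come from actual test data.

```python
import numpy as np

# Hypothetical observations (illustrative only): total mass (kg),
# payload mass (kg), and observed maximum distance (km).
total_mass = np.array([1000.0, 1200.0, 1500.0, 1800.0, 2000.0])
payload    = np.array([100.0, 150.0, 200.0, 250.0, 300.0])
distance   = np.array([520.0, 580.0, 660.0, 730.0, 780.0])

# Design matrix: one column per variate, plus a column of ones
# for the constant term.
X = np.column_stack([total_mass, payload, np.ones_like(total_mass)])

# Least-squares fit: distance ≈ B1*total_mass + B2*payload + constant
coeffs, *_ = np.linalg.lstsq(X, distance, rcond=None)
b1, b2, const = coeffs

# With the fitted equation we can answer questions, e.g. predict the
# max distance of a new missile design:
pred = b1 * 1600 + b2 * 220 + const
```

Once the coefficients are in hand, any combination of total mass and payload within the range of the data can be plugged in.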
4 Model Building
Why build a model?
It can help us to forecast.
It can assist in design.
Two ways to build one:
From causal factors
Data driven
Regression is the latter: a model based upon observations.
5 MSE
Residual: the difference between model and actual, e = Y - Ŷ
If we summed the residuals, the negative and positive values would cancel out.
If we took the absolute value, the result would not be differentiable at the origin.
So we take the sum of squares; MSE is the mean of the sum of squares.
In a plot of the values, the line is the model and the dots are the actuals.
Wait a second: the mean of squared errors sounds a lot like the mean of squared differences.
MSE is an estimator of the variance, if we assume that the model line is the estimate of the central value.
MSE is a good way to see how well a line fits a set of points: a good metric for quality of fit.
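The cancellation point above is easy to demonstrate. In this sketch (the numbers are made up), the raw residuals nearly sum to zero even though the fit is imperfect, while squaring prevents the cancellation:

```python
import numpy as np

# Actual values and model predictions (made-up numbers for illustration).
y     = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
y_hat = np.array([2.1, 4.0, 6.0, 8.0, 10.0])

e = y - y_hat          # residuals: positive and negative
raw_sum = e.sum()      # positives and negatives largely cancel
mse = np.mean(e ** 2)  # squaring removes the cancellation
```

Here `raw_sum` is close to zero despite five nonzero residuals, while `mse` honestly reflects the total misfit.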
6 RMSE
RMSE = sqrt(MSE)
If we accept that MSE is an estimator of the variance, then RMSE is an estimator of the standard deviation.
It is also a metric of how much error there is, or how well a model fits a set of data.
MSE and RMSE are often used as cost functions in many mathematical operations.
A cost function is the function that a program tries to optimize in solving the problem. In this case, it tries to minimize error as the goal of the operation, building the model that has the smallest MSE.
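A minimal helper makes the MSE/RMSE relationship concrete; because RMSE is in the same units as the data, it is often easier to interpret:

```python
import numpy as np

def rmse(y, y_hat):
    """Root-mean-square error: the square root of the mean squared residual."""
    e = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    return np.sqrt(np.mean(e ** 2))
```

For example, `rmse([0, 0], [3, 4])` is the square root of (9 + 16) / 2.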
7 OLS Regression
Ordinary Least Squares:
Ŷ = B0 + B1X1 + … + BnXn + e
B0 is the Y intercept
Bn is the coefficient for each variate
Xn is the variate
e is the error
The program calculates the coefficients that produce the line/model with the least amount of squared error (minimum MSE).
Depending on the results, some experts transform a particular variate, perhaps taking its square, log, or square root, and use that as the variate. The results are then harder to interpret, because the reverse transformation has to be done.
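One classical way the coefficients are computed is by solving the normal equations, B = (XᵀX)⁻¹Xᵀy. A sketch with made-up data (y roughly 2x + 1 with a little noise):

```python
import numpy as np

# Illustrative data: y is approximately 2x + 1 plus noise.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

# Design matrix [1, x] so beta = [B0, B1] (intercept first).
X = np.column_stack([np.ones_like(x), x])

# Solve the normal equations (X^T X) beta = X^T y.
beta = np.linalg.solve(X.T @ X, X.T @ y)
b0, b1 = beta
```

The solver returns the unique coefficients that minimize the MSE for this data.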
8 Robust Regression
Based on work by Kaufman & Rousseeuw
Uses the median instead of the mean
Not standard practice; no automatic routines to do it
Less affected by outliers
The big advantage is that it is not affected by outliers/extreme values. It may sometimes create a better model, especially with smaller data sets, where an outlier tends to 'pull' the mean.
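The core idea (median resists outliers, mean does not) can be shown with a toy sample. The numbers are invented for illustration:

```python
import numpy as np

# Five well-behaved observations plus one extreme outlier.
data = np.array([10.0, 11.0, 9.5, 10.5, 10.0, 95.0])

mean_est   = data.mean()      # pulled strongly toward the outlier
median_est = np.median(data)  # barely moves
```

The mean lands far above the bulk of the data, while the median stays near 10; this is exactly the resistance that a median-based regression exploits.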
9 Simple Test of Model
Plot out the residuals and look at the plot.
Are the residuals approximately constant?
Or do you see a trend of growing or attenuating residuals (heteroscedasticity)?
In the latter situation, the causal factors are changing and the model will need to be modified.
If the residuals are approximately constant over the range that you will be working with, then it is OK.
You should also check that there is about as much error on one side of the line as on the other; otherwise it is a biased model.
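Both checks can be roughly quantified. This is a crude sketch (not a formal heteroscedasticity test such as Breusch-Pagan): it reports the fraction of residuals above zero and whether the residual magnitude trends with the predictor.

```python
import numpy as np

def residual_checks(x, e):
    """Crude residual diagnostics.

    Returns (balance, spread_trend):
      balance      - fraction of residuals above zero (near 0.5 suggests
                     no bias to one side of the line)
      spread_trend - correlation of |e| with x (near 0 suggests roughly
                     constant spread; a strong value hints at
                     growing or attenuating residuals)
    """
    x = np.asarray(x, dtype=float)
    e = np.asarray(e, dtype=float)
    balance = np.mean(e > 0)
    spread_trend = np.corrcoef(x, np.abs(e))[0, 1]
    return balance, spread_trend
```

A plot is still the best first look; numbers like these only back up what the eye sees.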
10 Linear Correlation Coefficient
Shows the relationship between two variables.
Positive means that they travel in the same direction; negative means that they go in opposite directions.
Like covariance, but it has no scale, so it can be used for comparisons.
Range is from -1 to +1; zero means no linear relationship.
For standardized data, the correlation matrix is R = XᵀX, which is easy to implement in a matrix language like Fortran or Matlab.
Because correlation is dimensionless, it can be used to compare different things. Covariance carries the scale (units) of the data, so it cannot be used to compare different models.
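The "no scale" point follows from how r is computed: standardize each variable (subtract the mean, divide by the standard deviation) and average the products, so the units cancel. A minimal sketch:

```python
import numpy as np

def pearson_r(x, y):
    """Correlation as an average product of standardized variables."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    zx = (x - x.mean()) / x.std()  # dimensionless z-scores
    zy = (y - y.mean()) / y.std()
    return np.mean(zx * zy)        # always in [-1, +1]
```

Perfectly aligned data gives +1, perfectly opposed data gives -1, regardless of the units either variable was measured in.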
11 Correlation and Dependence
If two items are correlated (r ≠ 0), they are not independent.
If they are independent, then r = 0 and P(a) = P(a|b): b has no effect on a.
You need some theory to support a claimed relationship.
The old standard example in regression class is the seeming correlation between skirt length and the stock market, yet there is NO theory to connect the two.
The correlation matrix is (should be) symmetric about the diagonal, as the correlation between X1 and Y1 should be the same as between Y1 and X1.
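The symmetry claim is easy to verify numerically with NumPy's built-in correlation matrix; the diagonal is all ones because every variable is perfectly correlated with itself:

```python
import numpy as np

# Three made-up variables with 100 observations each.
rng = np.random.default_rng(0)
data = rng.normal(size=(3, 100))

R = np.corrcoef(data)  # 3x3 correlation matrix

# Symmetric about the diagonal: corr(Xi, Xj) == corr(Xj, Xi).
sym = np.allclose(R, R.T)
# Each variable correlates perfectly with itself.
diag_ones = np.allclose(np.diag(R), 1.0)
```

A correlation matrix that is not symmetric, or whose diagonal is not all ones, signals a computation error.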
12 R²
The square of r, the Pearson correlation coefficient.
The fraction of the variability in the system that is explained by this model.
Real-world values are usually 10% to 70%.
Adjusted R²:
Use only for model building.
Penalizes for additional causal variables.
R² tells us how well the model explains the data and how good the model is.
The adjusted R² exists because typically the more factors put into the model, the better the fit. That can sometimes cause problems, so the adjusted version penalizes additional variables and can be used while adding variables, as in step-wise regression.
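Both quantities have short formulas: R² is one minus the ratio of residual to total variability, and the adjusted version applies the standard penalty for the number of predictors k given n observations. A sketch:

```python
import numpy as np

def r_squared(y, y_hat):
    """Fraction of the variability in y explained by the model."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)        # unexplained variability
    ss_tot = np.sum((y - y.mean()) ** 2)     # total variability
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n, k):
    """Penalize R^2 for the number of predictors k, given n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
```

Adding a predictor always nudges plain R² up, but adjusted R² only rises if the new variable earns its keep, which is why it is the one to watch during step-wise model building.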
13 Reading a Regression Table
This residual plot looks acceptable: about as much error on top as on the bottom.
The R Square is very good, and so is the adjusted R Square; this model is very explanatory.
This regression had but one independent variable, but could have had more.
The table also shows the confidence interval for each coefficient. Notice that the intercept has a low t statistic and its interval INCLUDES zero. That tells us that we need to rerun the regression with the intercept turned off, forcing the line through the origin.
The model is good, as the F value is >> 3 and the significance is << 0.05.
X is a good coefficient, as it is statistically significant, but the intercept is not, with a P of 0.68 and a t < 3.
14 Limitations of OLS
Built on the assumption of a linear relationship; error grows when the relationship is non-linear.
Variable transformations can be used, but they are harder to interpret.
Limited to one dependent variable.
Limited to the range of the data; it cannot accurately extrapolate outside of it.
15 Maximum Likelihood
Can calculate regression coefficients if the probability distribution of the error terms is available.
Calculates the minimum-variance unbiased estimators.
Calculates the best way to fit a mathematical model to the data.
Choose the estimate of the unknown population parameter that would maximize the probability of getting the data we actually observed.
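As a sketch of the idea, assume the errors are normally distributed (an assumption, not something the slides specify) and numerically minimize the negative log-likelihood. With normal errors, the maximum-likelihood line coincides with the OLS line; the data below are made up:

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative data: y is approximately 2x + 1 plus noise.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

def neg_log_likelihood(params):
    """Negative log-likelihood of y under y = b0 + b1*x + N(0, sigma^2),
    with the constant term 0.5*n*log(2*pi) dropped (it does not affect
    the location of the minimum)."""
    b0, b1, log_sigma = params
    sigma = np.exp(log_sigma)  # parameterize by log(sigma) to keep sigma > 0
    resid = y - (b0 + b1 * x)
    return 0.5 * np.sum(resid ** 2) / sigma ** 2 + len(y) * np.log(sigma)

fit = minimize(neg_log_likelihood, x0=[0.0, 1.0, 0.0])
b0_mle, b1_mle, _ = fit.x
```

The optimizer recovers essentially the same intercept and slope that least squares would, illustrating why OLS can be viewed as maximum likelihood under normal errors.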