1 Nonlinear Regression Probability and Statistics Boris Gervits

2 Topics of Discussion
Definition of NLR
Picking a Regression Model
- Linear versus Nonlinear Models
Techniques
- Loss Functions
- Function Minimization Algorithms
- Regression Methods
Example

3 General Form of the NLR Model
NLR is a popular statistical tool for fitting data to a model and characterizing the relationship between independent and dependent variables.
General form of the NLR model: Yi = F(xi, θ) + ei, where i = 1, …, n indexes the measurements, Yi are the responses, xi is the vector (xi1, …, xik) of measurements of k independent variables, θ is the parameter vector (θ1, …, θp), and the ei are random errors, usually assumed to have mean 0 and constant variance.

4 Linear versus Nonlinear Models
Linear Regression
y = a + b1*x1 + b2*x2 + … + bn*xn
Polynomial Regression
y = a + b1*x + b2*x^2
The nonlinearity of this model is expressed in the term x^2. However, the nature of the model is linear: we are interested in finding the best fit for the parameters, and during estimation we simply square each measurement of x and treat it as another predictor (a short sketch below makes this concrete).
Making nonlinear models linear
Since linear regression is simpler and more straightforward, it may be preferable to nonlinear regression. When can a nonlinear model be converted to a linear one?
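To make the "linear in the parameters" point concrete, here is a minimal sketch (NumPy only; the data values are invented for illustration) that fits the polynomial model with ordinary linear least squares by treating x and x^2 as two separate predictors:

```python
import numpy as np

# Made-up sample data for illustration
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 2.9, 7.2, 12.8, 21.1, 31.0])

# Design matrix with columns [1, x, x^2]: the model is linear in (a, b1, b2)
X = np.column_stack([np.ones_like(x), x, x**2])

# Ordinary linear least squares solves for the parameters directly,
# with no iteration needed
(a, b1, b2), *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"y = {a:.3f} + {b1:.3f}*x + {b2:.3f}*x^2")
```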

5 Example
Consider the relationship between a human's age from birth (the x variable) and his or her growth rate (the y variable):
Growth = exp(-b*Age)
We can easily transform this into a linear model by taking logs:
log(Growth) = -b*Age
Can we really do this without affecting the quality of the fit?

6 When we transformed the nonlinear model into a linear one, we forgot about the random error in the dependent variable. The growth rate is, of course, affected by many variables other than age, so we should expect considerable fluctuation.
Additive Error
Growth = exp(-b*Age) + error
Here, we assume that the error is independent of age; that is, the error distribution is the same at any age. Because the error in the original nonlinear equation is additive, we cannot linearize this model by taking the log of both sides.
Multiplicative Error
However, the error variability is not likely to be constant at all ages. There are greater fluctuations of the growth rate during the earlier ages than the later ages, when growth eventually stops anyway. Here is a more realistic model including error:
Growth = exp(-b*Age) * error
Now, if we take the log, the residual error becomes an additive term in the linear equation:
log(Growth) = -b*Age + error
A simulation contrasting the two error models appears below.
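This sketch uses invented values (b = 0.15 and arbitrary noise levels): with multiplicative error the log transform produces a clean linear fit, while with additive error it distorts the estimate:

```python
import numpy as np

rng = np.random.default_rng(0)
age = np.linspace(1, 20, 200)
b = 0.15  # assumed decay parameter, for illustration only

# Multiplicative error: log(growth) = -b*age + log(error) is a clean linear model
growth_mult = np.exp(-b * age) * rng.lognormal(0.0, 0.2, age.size)
slope_mult = np.polyfit(age, np.log(growth_mult), 1)[0]

# Additive error: taking logs distorts the noise, biasing the fitted slope
growth_add = np.exp(-b * age) + rng.normal(0.0, 0.02, age.size)
growth_add = np.clip(growth_add, 1e-6, None)  # log needs positive values
slope_add = np.polyfit(age, np.log(growth_add), 1)[0]

print(f"true -b = {-b}, log-fit slope (multiplicative error) = {slope_mult:.3f}")
print(f"log-fit slope (additive error, distorted)            = {slope_add:.3f}")
```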

7 NLR Techniques: Loss Functions
A residual is the deviation of a particular point (observed response) from the regression line (predicted response). Residuals signify some loss in the accuracy of the prediction. The goal of NLR is to minimize a loss function.
Least squares is the most common loss function: it minimizes the sum of squared deviations of the observed values of the dependent variable from those predicted by the model.
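Written out as code, with an invented one-parameter exponential model and made-up data, the least squares loss is simply:

```python
import numpy as np

def least_squares_loss(theta, x, y, model):
    """Sum of squared residuals: observed minus predicted."""
    residuals = y - model(x, theta)
    return np.sum(residuals**2)

# Example: invented exponential model with a single parameter b
model = lambda x, b: np.exp(-b * x)
x = np.array([1.0, 2.0, 4.0, 8.0])
y = np.array([0.82, 0.70, 0.52, 0.28])  # made-up observations
print(least_squares_loss(0.15, x, y, model))
```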

8 Other Loss Functions
Absolute deviations
This can be useful for de-emphasizing outliers: when the least squares function is used, a single large residual, once squared, affects the regression coefficients much more strongly.
Weighted least squares
The ordinary least squares technique assumes that the residual variance is the same across all values of the independent variables; that is, the error variance does not depend on the measurements. This often fails to be the case. Example: the relationship between the projected cost of a construction project and its actual cost. Here the absolute magnitude of the error is proportional to the size of the project, so it is appropriate to use the weighted least squares technique. The loss function would be:
Loss = (Obs - Pred)^2 * (1/y^2)
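A sketch of this weighting with made-up data: scipy.optimize.curve_fit divides each residual by its sigma argument, so passing sigma=y reproduces the (Obs - Pred)^2 * (1/y^2) loss above (the model here is invented for illustration):

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    return a * np.exp(b * x)  # invented model for illustration

# Made-up data where the error grows with the size of the response
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.6, 7.1, 21.0, 52.0, 148.0])

# Ordinary least squares: minimizes sum of (obs - pred)^2
p_ols, _ = curve_fit(model, x, y, p0=[1.0, 1.0])

# Weighted least squares: passing sigma=y weights each residual by 1/y,
# i.e. minimizes sum of (obs - pred)^2 * (1/y^2), matching the loss above
p_wls, _ = curve_fit(model, x, y, p0=[1.0, 1.0], sigma=y)

print("OLS params:", p_ols)
print("WLS params:", p_wls)
```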

9 Function Minimization Algorithms
1. Start with an initial estimated value for each parameter in the equation; set a step size and convergence criteria.
2. Generate the regression curve (get a list of predicted response values).
3. Calculate the loss function.
4. Adjust the parameters to minimize the loss function (more on this below).
5. Repeat steps 2-4 until the adjustments satisfy the convergence criteria.
6. Report the best-fit result.
A sketch of this loop, using a library solver, follows the list.
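In practice a library routine carries out steps 2-5 internally; here is that sketch, using scipy.optimize.least_squares with an invented exponential model and made-up data:

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(theta, x, y):
    """Steps 2-3: predicted responses and the deviations fed to the loss."""
    a, b = theta
    return y - a * np.exp(-b * x)  # invented exponential model

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 1.2, 0.75, 0.45, 0.28])  # made-up data

# Step 1: initial estimates; ftol/xtol play the role of convergence criteria.
# Steps 2-5 (iterate until converged) happen inside the solver.
result = least_squares(residuals, x0=[1.0, 0.1], args=(x, y),
                       ftol=1e-10, xtol=1e-10)

print("best-fit parameters:", result.x)           # step 6: report the result
print("loss (sum of squares):", 2 * result.cost)  # cost is 0.5 * sum(res^2)
```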

10 Nonlinear Regression Methods (step 4 from above)
When fitting just one parameter, no special algorithm is needed. You can always use the brute force method:
- Calculate the loss function for the initial value.
- Move a step (assuming you picked the direction correctly), calculate the loss function again, and compare with the previous result.
- If it's better, discard the previous result and keep moving in the same direction.
- If it's worse, we've gone too far: check the convergence criteria, and if they aren't satisfied, reduce the step and move in the opposite direction.
A sketch of this search appears after the list.
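This sketch of the brute-force search uses an invented loss function and made-up data:

```python
import numpy as np

def brute_force_fit(loss, b0, step=0.1, tol=1e-6, max_iter=100000):
    """One-parameter step search following the recipe above (illustrative only)."""
    b, current = b0, loss(b0)
    direction = 1.0
    for _ in range(max_iter):
        if step <= tol:              # convergence criterion: step has shrunk enough
            break
        trial = loss(b + direction * step)
        if trial < current:          # better: discard the old point, keep moving
            b, current = b + direction * step, trial
        else:                        # worse: we've gone too far
            direction = -direction   # reverse direction...
            step /= 2.0              # ...and reduce the step
    return b, current

# Usage with an invented one-parameter exponential model and made-up data
x = np.array([1.0, 2.0, 4.0, 8.0])
y = np.array([0.82, 0.70, 0.52, 0.28])
loss = lambda b: np.sum((y - np.exp(-b * x))**2)
print(brute_force_fit(loss, b0=0.5))
```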

11 If there is more than one parameter to fit, the number of possible combinations of values is infinite, so we need an efficient way to find the best fit. Some methods calculate first- and/or second-order derivatives to identify the slope, and the slope of the slope, of the function, determining how fast and in which direction the slope is changing at any point. Others recalculate how much the sum of squares changes when the parameter values are changed slightly. Mathematica uses the following methods:
- Gradient (steepest descent) method
- Newton method
- Quasi-Newton
- Levenberg-Marquardt
The simplex method takes n + 1 initial values, where n is the number of parameters to fit. It is less likely to be trapped in a local minimum, but does not compute standard errors or confidence intervals.
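The methods named above are Mathematica's built-ins; as a sketch of the same ideas in Python, scipy.optimize.minimize offers comparable choices (BFGS is a quasi-Newton method, Nelder-Mead is a simplex method; the model and data are invented):

```python
import numpy as np
from scipy.optimize import minimize

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 1.2, 0.75, 0.45, 0.28])  # made-up data

def sse(theta):
    """Sum-of-squares loss for an invented two-parameter exponential model."""
    a, b = theta
    return np.sum((y - a * np.exp(-b * x))**2)

# Quasi-Newton: uses gradient information (approximated numerically here)
qn = minimize(sse, x0=[1.0, 0.1], method="BFGS")

# Simplex (Nelder-Mead): derivative-free; needs n + 1 = 3 points for 2
# parameters, which the implementation builds from the single starting guess
sx = minimize(sse, x0=[1.0, 0.1], method="Nelder-Mead")

print("quasi-Newton:", qn.x)
print("simplex:     ", sx.x)
```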

12 Example: Fitting Dose-Response Curves
Dose-response curves can describe the results of many kinds of experiments. The x axis plots the concentration of a drug or hormone; the y axis plots the response. Examples of what y can mean:
- change in heart rate
- contraction of a muscle
- secretion of a hormone
- membrane potential and ion channels

13 Extract from lecture 2 (figure not reproduced in the transcript)

14 Equation for a Dose-Response Curve
Y = Bottom + (Top - Bottom) / (1 + (EC50/[Drug])^HillSlope)
Bottom - baseline response
Top - maximum response
EC50 - drug concentration that provokes a response halfway between Bottom and Top
HillSlope - the slope of the curve
[Drug] - drug dosage or concentration (independent variable)
Note: the EC50 may not be the same as the concentration that provokes a 50% response.
A more commonly used equation works with X = log[Drug]:
Y = Bottom + (Top - Bottom) / (1 + 10^((LogEC50 - X)*HillSlope))
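A sketch of fitting the log-concentration form with scipy.optimize.curve_fit (the dose and response values below are invented for illustration):

```python
import numpy as np
from scipy.optimize import curve_fit

def dose_response(log_dose, bottom, top, log_ec50, hill_slope):
    """Sigmoidal dose-response with variable slope; X is log10 concentration."""
    return bottom + (top - bottom) / (1 + 10**((log_ec50 - log_dose) * hill_slope))

# Made-up dose-response data: log10(concentration) and measured response
log_dose = np.array([-9.0, -8.0, -7.0, -6.0, -5.0, -4.0])
response = np.array([3.0, 8.0, 27.0, 70.0, 92.0, 98.0])

# Initial values for Bottom, Top, LogEC50, HillSlope
params, _ = curve_fit(dose_response, log_dose, response,
                      p0=[0.0, 100.0, -6.5, 1.0])
bottom, top, log_ec50, hill = params
print(f"Bottom={bottom:.1f}, Top={top:.1f}, "
      f"EC50={10**log_ec50:.2e}, HillSlope={hill:.2f}")
```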

15 Fitting Data with Nonlinear Regression
Choose your model.
Decide which parameters to fit and which to constrain.
Choose a weighting scheme, if appropriate.
Choose initial values.
Perform the curve fit and interpret the best-fit parameter values:
- Does the curve go near your data?
- Are the best-fit parameter values acceptable? What are the standard errors and 95% confidence intervals?
- Check the parameters' correlation matrix.
- Could the fit be a local minimum?
A sketch of the standard-error and correlation checks appears after this list.
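These checks can be derived from the covariance matrix returned by scipy.optimize.curve_fit; this sketch uses an invented model and made-up data:

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    return a * np.exp(-b * x)  # invented model for illustration

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 1.2, 0.75, 0.45, 0.28])  # made-up data

params, cov = curve_fit(model, x, y, p0=[1.0, 0.1])

stderr = np.sqrt(np.diag(cov))          # standard error of each parameter
ci95 = 1.96 * stderr                    # rough 95% confidence half-widths
corr = cov / np.outer(stderr, stderr)   # parameter correlation matrix

print("standard errors:", stderr)
print("approx. 95% CI half-widths:", ci95)
print("correlation matrix:\n", corr)
```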

16 Troubleshooting Bad Fits
Poorly defined parameters
Problem: The response was measured at 5 concentrations between the bottom and the top, and the data were normalized to run from 0% to 100%. NLR shows very wide confidence intervals.
Diagnosis: Not enough data; Top and Bottom cannot be defined precisely.
Solution: Since the data were normalized to run from 0 to 100, constrain these parameters so NLR does not have to calculate them (see the sketch below).
Bad initial values
Problem: We feed a bunch of data to an NLR model and get back the error message "Does not converge". The data and the equation look valid. What could be wrong?
Diagnosis: Plot the curve generated by the initial values together with the data. If it's clearly off, no wonder the fit can't converge.
Solution: Change the initial values.
Redundant parameters
Problem: Suppose you came up with the model Y = b1*e^(-(b2 + b3)*x). Possible error message: "Bad model". However, the model seems to be correct.
Diagnosis: The model is ambiguous: b2 and b3 appear only as a sum, so they cannot be separately determined by the data.
Solution: Determine the value of either b2 or b3 from another experiment.
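For the first problem above, constraining instead of fitting can be sketched as follows (a hypothetical continuation of the dose-response example, with invented data): since the data are normalized to run from 0 to 100, Bottom and Top are fixed and only EC50 and HillSlope remain to be fitted:

```python
import numpy as np
from scipy.optimize import curve_fit

def normalized_response(log_dose, log_ec50, hill_slope):
    """Bottom and Top constrained to 0 and 100; only two parameters remain."""
    return 100.0 / (1 + 10**((log_ec50 - log_dose) * hill_slope))

# Made-up normalized responses at 5 concentrations
log_dose = np.array([-8.0, -7.0, -6.0, -5.0, -4.0])
response = np.array([7.0, 25.0, 55.0, 85.0, 96.0])

params, cov = curve_fit(normalized_response, log_dose, response, p0=[-6.0, 1.0])
stderr = np.sqrt(np.diag(cov))  # standard errors of the remaining parameters
print("LogEC50, HillSlope:", params, "SE:", stderr)
```

With fewer parameters to estimate from the same five points, the confidence intervals on EC50 and HillSlope become much tighter.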

