
1 Basis Expansions and Generalized Additive Models: basis expansion, piecewise polynomials, splines, generalized additive models, MARS

2 Basis expansion  f(X) = E(Y | X) is often nonlinear and non-additive in X.  However, linear models are easy to fit and interpret.  By augmenting the data with transformed inputs, we can use linear models to achieve nonlinear regression/classification.

3 Basis expansion Some widely used transformations:
 h_m(X) = X_m, m = 1, ..., p: the original linear model.
 h_m(X) = X_j^2, h_m(X) = X_j X_k, or higher-order polynomials: augment the inputs with polynomial terms. The number of basis functions grows quickly with the degree: O(p^d) terms for a degree-d polynomial.
 h_m(X) = log(X_j), ...: other nonlinear transformations.
 h_m(X) = I(L_m ≤ X_k < U_m): breaking the range of X_k into non-overlapping regions gives a piecewise-constant fit.
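The key point above is that the model stays linear in the coefficients even when the features are nonlinear in X, so ordinary least squares still applies. A minimal sketch with a polynomial expansion of a single input (toy data, illustrative only):

```python
import numpy as np

# Toy data: y depends nonlinearly on a single input x.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(50)

def poly_basis(x, degree):
    # Basis expansion h_m(x) = x^m: columns 1, x, x^2, ..., x^degree.
    return np.vstack([x**m for m in range(degree + 1)]).T

# The model is linear in beta, so ordinary least squares fits it directly.
H = poly_basis(x, 5)
beta, *_ = np.linalg.lstsq(H, y, rcond=None)
fitted = H @ beta
```

The degree-5 expansion captures the sine shape that a straight-line fit cannot.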

4 Basis expansion More often, we use basis expansions as a device to achieve more flexible representations of f(X). Polynomials are global: tweaking the functional form to suit one region can make the function flap about madly in remote regions. Red: degree-6 polynomial. Blue: degree-7 polynomial.

5 Basis expansion Piecewise polynomials and splines allow for local polynomial representations. Problem: the number of basis functions can grow too large to fit with limited data. Solutions:  Restriction methods: limit the class of functions in advance. Example: additive models.

6 Basis expansion  Selection methods: allow a large dictionary of basis functions, but adaptively scan it and include only those h_m() that contribute significantly to the fit of the model. Example: multivariate adaptive regression splines (MARS).  Regularization methods: use the entire dictionary but restrict the coefficients. Example: ridge regression. The lasso combines regularization and selection.

7 Piecewise Polynomials  Assume X is one-dimensional.  Divide the domain of X into contiguous intervals, and represent f(X) by a separate polynomial in each interval.  Simplest case: piecewise constant.

8 Piecewise Polynomials  piecewise linear Three additional basis functions are needed:

9 Piecewise Polynomials  Piecewise linear, with continuity required at the knots.

10 Piecewise Polynomials Lower-right: Cubic spline

11 Spline  An order-M spline with knots ξ_j, j = 1, ..., K is a piecewise polynomial of order M with continuous derivatives up to order M − 2. A cubic spline is an order-4 spline; a piecewise-constant function is an order-1 spline.  Truncated-power basis functions: h_j(X) = X^(j−1), j = 1, ..., M, and h_(M+ℓ)(X) = (X − ξ_ℓ)_+^(M−1), ℓ = 1, ..., K.  In practice the most widely used orders are M = 1, 2, and 4.
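The truncated-power basis for a cubic spline (order M = 4) can be written down directly. A sketch of this one standard basis choice; note that B-splines are usually preferred in practice for numerical stability:

```python
import numpy as np

def cubic_spline_basis(x, knots):
    # Truncated-power basis for an order-4 (cubic) spline:
    #   h_j(x) = x^(j-1) for j = 1..4, then h_{4+k}(x) = (x - xi_k)_+^3.
    cols = [x**j for j in range(4)]                        # 1, x, x^2, x^3
    cols += [np.clip(x - xi, 0, None)**3 for xi in knots]  # (x - xi)_+^3
    return np.vstack(cols).T

x = np.linspace(0, 1, 30)
H = cubic_spline_basis(x, knots=[0.25, 0.5, 0.75])  # K = 3 interior knots
# 4 + K = 7 basis functions in total.
```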

12 Natural Cubic Splines  Polynomials fit to data tend to be erratic near the boundaries, and extrapolation can be dangerous.  With splines, the polynomial fit beyond the boundary knots behaves even more wildly than a global polynomial in that region.  A natural cubic spline adds additional constraints: the function is required to be linear beyond the boundary knots.

13 Natural Cubic Splines

14 FIGURE 5.4. Fitted natural-spline functions for each of the terms in the final model selected by the stepwise procedure. Included are pointwise standard-error bands. South African Heart Disease data.

15 Smoothing Splines  Avoids the knot-selection problem completely.  Uses a maximal set of knots.  The complexity of the fit is controlled by regularization.  Setup: among all functions f(x) with two continuous derivatives, find the one minimizing the penalized residual sum of squares RSS(f, λ) = Σ_i [y_i − f(x_i)]^2 + λ ∫ [f''(t)]^2 dt.  λ is the smoothing parameter.  The second term penalizes curvature in the function.

16 Smoothing Splines  The solution is a natural cubic spline with knots at the unique values of x_i, i = 1, ..., N.  The curvature penalty translates into a penalty on the spline coefficients, which are shrunk toward the linear fit (as λ → ∞ the fit approaches the least-squares line).
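The shrink-toward-linear behavior can be illustrated with a simplified stand-in: a truncated-cubic basis with knots at every interior x value and a ridge-style penalty on the nonlinear coefficients only. This is a sketch of the idea, not the exact curvature penalty ∫(f'')² that a true smoothing spline uses:

```python
import numpy as np

# Toy data for the penalized fit.
rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 1, 40))
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(40)

def fit_penalized(x, y, lam):
    # Basis: 1, x (left unpenalized) plus (x - xi)_+^3 at every interior x.
    knots = x[1:-1]
    H = np.column_stack([np.ones_like(x), x] +
                        [np.clip(x - xi, 0, None)**3 for xi in knots])
    D = np.eye(H.shape[1])
    D[0, 0] = D[1, 1] = 0.0   # do not penalize the linear part
    # Penalized normal equations: (H'H + lam * D) beta = H'y.
    beta = np.linalg.solve(H.T @ H + lam * D, H.T @ y)
    return H @ beta

rough = fit_penalized(x, y, lam=1e-4)   # small lambda: wiggly, close to the data
smooth = fit_penalized(x, y, lam=1e2)   # large lambda: shrunk toward a line
```

Increasing λ trades fidelity to the data for smoothness, exactly the trade-off the slide describes.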

17 Smoothing Splines

18 Smoothing Splines Effective degrees of freedom of a smoothing spline: df_λ = trace(S_λ), where S_λ is the smoother matrix mapping y to the fitted values.

19 Smoothing Splines Bias-variance trade-off

20 Multidimensional Splines  Basis of functions h_1k(X_1), k = 1, ..., M_1 for X_1.  Basis of functions h_2k(X_2), k = 1, ..., M_2 for X_2.  The tensor-product basis g_jk(X) = h_1j(X_1) h_2k(X_2) has M_1 × M_2 elements.  The coefficients can be fit by least squares, as before.  But the dimension of the basis grows exponentially fast with the number of inputs.
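The tensor-product construction is just a row-wise outer product of the two one-dimensional bases. A sketch with polynomial bases for both inputs (degrees chosen for illustration):

```python
import numpy as np

def poly1d(x, degree):
    # One-dimensional polynomial basis: columns 1, x, ..., x^degree.
    return np.vstack([x**m for m in range(degree + 1)]).T

x1 = np.linspace(0, 1, 25)
x2 = np.linspace(0, 1, 25)
H1 = poly1d(x1, 3)   # M1 = 4 basis functions for X1
H2 = poly1d(x2, 3)   # M2 = 4 basis functions for X2

# Row-wise outer products give all M1 * M2 products h_1j(x1) * h_2k(x2).
H = np.einsum('ij,ik->ijk', H1, H2).reshape(len(x1), -1)
# H has 4 * 4 = 16 columns: the dimension multiplies across inputs.
```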

21 Multidimensional Splines

22 Generalized Additive Models  The model: g(μ(X)) = α + f_1(X_1) + ... + f_p(X_p), where the f_j() are unspecified smooth functions.  If we model each function using an expansion of basis functions, the model can be fit by simple least squares.  g(μ) = μ: identity link, used for linear and additive models with Gaussian response data.  g(μ) = logit(μ) or g(μ) = probit(μ): for modeling binomial probabilities.  g(μ) = log(μ): for log-linear or log-additive models for Poisson count data.

23 Generalized Additive Models  The penalized least squares criterion: PRSS(α, f_1, ..., f_p) = Σ_i [y_i − α − Σ_j f_j(x_ij)]^2 + Σ_j λ_j ∫ f_j''(t_j)^2 dt_j, where the λ_j ≥ 0 are tuning parameters.  The minimizer of (9.7) is an additive cubic spline model: each f_j is a cubic spline in the component X_j, with knots at each of the unique values of x_ij, i = 1, ..., N.  To make the solution unique, we impose Σ_i f_j(x_ij) = 0 for all j.

24 Generalized Additive Models  Fitting by backfitting, the additive-model analogue of multiple regression for linear models: cycle through the predictors, applying a smoother S_j (the cubic smoothing spline) to the partial residuals against X_j.  Other univariate regression smoothers, such as local polynomial regression and kernel methods, can be used as S_j.
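A minimal backfitting sketch. For simplicity a box-kernel moving average stands in for the cubic smoothing spline as S_j; any univariate smoother fits in the same slot (the data, smoother, and bandwidth are all illustrative assumptions):

```python
import numpy as np

# Toy additive data: y = alpha + f1(x1) + f2(x2) + noise.
rng = np.random.default_rng(3)
n = 200
x1 = rng.uniform(-1, 1, n)
x2 = rng.uniform(-1, 1, n)
y = 1.0 + np.sin(np.pi * x1) + x2**2 + 0.1 * rng.standard_normal(n)

def smooth(x, r, width=0.2):
    # Simple box-kernel smoother of residuals r against x,
    # standing in for a cubic smoothing spline S_j.
    out = np.empty_like(r)
    for i, xi in enumerate(x):
        mask = np.abs(x - xi) <= width
        out[i] = r[mask].mean()
    return out

# Backfitting: initialize, then cycle, smoothing partial residuals.
alpha = y.mean()
f1 = np.zeros(n)
f2 = np.zeros(n)
for _ in range(10):
    f1 = smooth(x1, y - alpha - f2)
    f1 -= f1.mean()          # center each f_j for identifiability
    f2 = smooth(x2, y - alpha - f1)
    f2 -= f2.mean()

fitted = alpha + f1 + f2
```

The centering step enforces the uniqueness constraint Σ_i f_j(x_ij) = 0 from the previous slide.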

25 Multidimensional Splines

26 MARS: Multivariate Adaptive Regression Splines  An adaptive procedure for regression, well suited to high-dimensional problems.  MARS uses expansions in piecewise-linear basis functions of the form (X − t)_+ and (t − X)_+, a "reflected pair" with a knot at t.

27 MARS: Multivariate Adaptive Regression Splines  The idea is to form reflected pairs for each input X_j with knots at each observed value x_ij of that input.  The collection of basis functions: C = {(X_j − t)_+, (t − X_j)_+ : t ∈ {x_1j, ..., x_Nj}, j = 1, ..., p}.  If all of the input values are distinct, there are 2Np basis functions altogether.  Model: f(X) = β_0 + Σ_m β_m h_m(X), where each h_m(X) is a function in C, or a product of two or more such functions.
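The reflected pair is simple to write down. A sketch of the two piecewise-linear building blocks (knot location chosen for illustration):

```python
import numpy as np

def reflected_pair(x, t):
    # The MARS building blocks with knot t:
    #   (x - t)_+  and  (t - x)_+
    return np.clip(x - t, 0, None), np.clip(t - x, 0, None)

x = np.linspace(0, 1, 11)
pos, neg = reflected_pair(x, t=0.5)

# MARS forms such a pair at every observed value x_ij of every input X_j,
# giving 2*N*p candidate basis functions when all input values are distinct.
```

Note that the pair sums to |x − t|, and at most one member is nonzero at any x.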

28 MARS: Multivariate Adaptive Regression Splines  Model building is forward stepwise: at each iteration, select a function from the set C or a product of such functions.  The coefficients β_m are estimated by standard linear regression.  Terms are added in the form β_(M+1) h_ℓ(X)(X_j − t)_+ + β_(M+2) h_ℓ(X)(t − X_j)_+, where h_ℓ is a basis function already in the model.

29 MARS: Multivariate Adaptive Regression Splines At each stage we consider all products of a candidate reflected pair with a basis function already in the model. The product that decreases the residual error the most is added to the current model.

30 MARS: Multivariate Adaptive Regression Splines

31 MARS: Multivariate Adaptive Regression Splines  At the end of this process we have a large model that typically overfits the data.  A backward deletion procedure is then applied: remove, one at a time, the term whose removal causes the smallest increase in residual squared error. This produces the best model of each size (number of terms) λ. Use (generalized) cross-validation to compare the models and select the best λ.

32 MARS: Multivariate Adaptive Regression Splines

