Basis Expansions and Generalized Additive Models (2)


Basis Expansions and Generalized Additive Models (2). Outline: Splines, Generalized Additive Models, MARS.

Piecewise Polynomials (figure). Lower right panel: cubic spline.

Spline. An order-M spline with knots ξ_l, l = 1,...,K, is a piecewise polynomial of order M (degree M − 1) with continuous derivatives up to order M − 2. A cubic spline is an order-4 spline; a piecewise-constant function is an order-1 spline. Truncated power basis: h_j(X) = X^(j−1) for j = 1,...,M, together with h_(M+l)(X) = (X − ξ_l)_+^(M−1) for l = 1,...,K. In practice the most widely used orders are M = 1, 2, and 4.
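
As an illustration, here is a minimal numpy sketch of the truncated power basis for a cubic spline (M = 4); the data, knot locations, and function names are made up for the example:

```python
import numpy as np

def truncated_power_basis(x, knots, order=4):
    """Truncated power basis of an order-M spline (order=4: cubic).

    Columns: 1, x, ..., x^(M-1), then (x - xi_l)_+^(M-1), one per knot.
    """
    x = np.asarray(x, dtype=float)
    cols = [x**j for j in range(order)]                            # polynomial part
    cols += [np.maximum(x - k, 0.0)**(order - 1) for k in knots]   # one hinge per knot
    return np.column_stack(cols)

# Least-squares cubic-spline fit on toy data.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 10.0, 100))
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)
H = truncated_power_basis(x, knots=[2.5, 5.0, 7.5])
beta, *_ = np.linalg.lstsq(H, y, rcond=None)
yhat = H @ beta
```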

Natural Cubic Splines. Polynomials fit to data tend to be erratic near the boundaries, and extrapolation can be dangerous. A natural cubic spline adds the constraint that the function is linear beyond the boundary knots; this requires the second derivative to be zero at the first and the last knot. (Figure, right: a logistic regression example.)

Natural Cubic Splines

Natural Cubic Splines. How many knots, and where should they be placed? Common choices are evenly spaced knots over the range of the data, or knots at percentiles of the data. One can examine the fit visually, or select the best knot setting by cross-validation, as in the sketch below.
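
A minimal sketch of choosing the number of knots by cross-validation, on synthetic data; note that sklearn's SplineTransformer builds a B-spline basis rather than a natural cubic spline basis, so this only illustrates the selection procedure:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 10.0, 200)).reshape(-1, 1)
y = np.sin(x).ravel() + rng.normal(scale=0.3, size=200)

# Cubic spline basis with evenly spaced knots; pick the knot count by 5-fold CV.
pipe = make_pipeline(SplineTransformer(degree=3, knots="uniform"),
                     LinearRegression())
search = GridSearchCV(pipe, {"splinetransformer__n_knots": [4, 6, 8, 12, 16]}, cv=5)
search.fit(x, y)
print(search.best_params_)
```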

Example: logistic regression on the South African heart disease data. With X1 representing sbp, h1(X1) is a basis consisting of four basis functions, and likewise for the other predictors. Model selection uses a backward stepwise deletion process that preserves the group structure of each term: all basis functions for a variable are kept or dropped together. (A rough sketch follows.)
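
A rough sketch of the general recipe (spline-expand each predictor, then fit a logistic regression), on synthetic stand-in data rather than the actual heart-disease dataset; the B-spline basis, the basis size, and the omission of grouped backward deletion are all simplifications for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer

# Stand-in for the 462 x 7 predictor matrix (sbp, tobacco, ldl, ...).
rng = np.random.default_rng(2)
X = rng.normal(size=(462, 7))
y = rng.integers(0, 2, size=462)

# Each column gets its own small spline basis; the expanded features
# then enter an ordinary logistic regression.
model = make_pipeline(SplineTransformer(degree=3, n_knots=3),
                      LogisticRegression(max_iter=1000))
model.fit(X, y)
```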

Smoothing Splines. Avoids the knot selection problem completely by using a maximal set of knots; the complexity of the fit is controlled by regularization. Setup: among all functions f(x) with two continuous derivatives, find the one that minimizes the penalized residual sum of squares RSS(f, λ) = Σ_{i=1}^N (y_i − f(x_i))² + λ ∫ (f''(t))² dt. Here λ is the smoothing parameter, and the second term penalizes curvature in the function.
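
A minimal sketch using SciPy's smoothing-spline routine, which solves this penalized criterion (make_smoothing_spline requires SciPy >= 1.10; the data here are synthetic):

```python
import numpy as np
from scipy.interpolate import make_smoothing_spline  # SciPy >= 1.10

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0.0, 10.0, 150))
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)

# lam is the curvature penalty lambda: larger lam -> smoother fit.
# With lam=None the routine picks lambda by generalized cross-validation.
spline = make_smoothing_spline(x, y, lam=1.0)
yhat = spline(x)
```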

Smoothing Splines. The solution is a natural cubic spline with knots at the unique values of the x_i, i = 1,...,N. Writing f(x) = Σ_{j=1}^N N_j(x) θ_j in a natural-spline basis, the penalty term translates to a penalty on the spline coefficients: minimize (y − Nθ)ᵀ(y − Nθ) + λ θᵀ Ω_N θ, where N_ij = N_j(x_i) and {Ω_N}_{jk} = ∫ N_j''(t) N_k''(t) dt. The solution θ̂ = (NᵀN + λ Ω_N)⁻¹ Nᵀy is a generalized ridge regression, and the coefficients are shrunk toward the linear fit.
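
The closed form is a one-liner; a small numpy sketch, assuming the basis matrix N and penalty matrix Omega have been built elsewhere:

```python
import numpy as np

def penalized_ls(N, Omega, y, lam):
    """Generalized ridge solution: theta = (N'N + lam*Omega)^(-1) N'y.

    N:     basis matrix, N[i, j] = N_j(x_i)
    Omega: penalty matrix, Omega[j, k] = integral of N_j'' * N_k''
    """
    return np.linalg.solve(N.T @ N + lam * Omega, N.T @ y)
```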

Smoothing Splines

Smoothing Splines. The fitted values are f̂ = S_λ y, where the smoother matrix S_λ = N (NᵀN + λ Ω_N)⁻¹ Nᵀ depends only on the x_i and λ. The effective degrees of freedom of a smoothing spline is df_λ = trace(S_λ).
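
A small numpy sketch of this quantity, with the same assumed N and Omega as above:

```python
import numpy as np

def effective_df(N, Omega, lam):
    """df(lambda) = trace(S_lambda) with S_lambda = N (N'N + lam*Omega)^(-1) N'."""
    A = np.linalg.solve(N.T @ N + lam * Omega, N.T)  # (N'N + lam*Omega)^(-1) N'
    return np.trace(N @ A)
```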

Smoothing Splines. Bias-variance trade-off: increasing df_λ (smaller λ) lowers bias but raises variance; decreasing df_λ does the reverse.

Multidimensional Splines. Given a basis of functions h_{1j}(X1), j = 1,...,M1 for X1 and a basis h_{2k}(X2), k = 1,...,M2 for X2, the M1 × M2 tensor product basis g_{jk}(X) = h_{1j}(X1) h_{2k}(X2) can represent a two-dimensional function g(X) = Σ_j Σ_k θ_{jk} g_{jk}(X).

Multidimensional Splines. The coefficients can be fit by least squares, as before, but the dimension of the basis grows exponentially fast with the number of inputs.
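
A minimal numpy sketch of the tensor product construction, assuming the two univariate basis matrices have already been evaluated at the data:

```python
import numpy as np

def tensor_product_basis(H1, H2):
    """Row-wise tensor product of two univariate bases.

    H1: (n, M1) basis in X1; H2: (n, M2) basis in X2.
    Returns an (n, M1*M2) matrix: the dimension grows multiplicatively.
    """
    n = H1.shape[0]
    return np.einsum("ij,ik->ijk", H1, H2).reshape(n, -1)
```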

Generalized Additive Models. The model has the form g(μ) = α + f1(X1) + ··· + fp(Xp), where the fj() are unspecified smooth functions. If we model each function using an expansion of basis functions, the model can be fit by ordinary regression. Common links: g(μ) = μ, the identity link, used for linear and additive models with Gaussian response data; g(μ) = logit(μ) or g(μ) = probit(μ) for modeling binomial probabilities; g(μ) = log(μ) for log-linear or log-additive models for Poisson count data.
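
A minimal sketch using the third-party pygam package (pip install pygam); the data are synthetic and the smoother settings are defaults, so treat this as illustrative only:

```python
import numpy as np
from pygam import LogisticGAM, s  # third-party: pip install pygam

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 2))
p = 1.0 / (1.0 + np.exp(-(np.sin(X[:, 0]) + X[:, 1]**2 - 1.0)))
y = rng.binomial(1, p)

# g(mu) = logit(mu) = alpha + f1(X1) + f2(X2), each fj a penalized spline.
gam = LogisticGAM(s(0) + s(1)).fit(X, y)
```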

Generalized Additive Models. The penalized least squares criterion is PRSS(α, f1,...,fp) = Σ_{i=1}^N ( y_i − α − Σ_{j=1}^p f_j(x_ij) )² + Σ_{j=1}^p λ_j ∫ f_j''(t_j)² dt_j, where the λ_j ≥ 0 are tuning parameters. The minimizer is an additive cubic spline model: each f_j is a cubic spline in the component X_j, with knots at each of the unique values of x_ij, i = 1,...,N. To make the solution unique we impose Σ_{i=1}^N f_j(x_ij) = 0 for all j, in which case α̂ = mean(y_i).

Generalized Additive Models: the backfitting algorithm. (1) Initialize α̂ = mean(y_i) and f̂_j ≡ 0 for all j. (2) Cycle over j = 1,...,p: smooth the current partial residual against X_j, f̂_j ← S_j[ y − α̂ − Σ_{k≠j} f̂_k ], then center it so its average over the data is zero, f̂_j ← f̂_j − (1/N) Σ_i f̂_j(x_ij). (3) Repeat until the functions stabilize. S_j represents the spline smoother for X_j, but other univariate regression smoothers, such as local polynomial regression and kernel methods, can be used as S_j. For linear models the procedure is equivalent to multiple regression.
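
A compact sketch of backfitting on synthetic data; a Gaussian kernel smoother plays the role of S_j here (the slide notes kernel methods are a valid choice), and the bandwidth and iteration count are arbitrary:

```python
import numpy as np

def kernel_smoother(x, r, bandwidth=0.5):
    """Nadaraya-Watson smoother: smooth residuals r against one predictor x."""
    d = (x[:, None] - x[None, :]) / bandwidth
    W = np.exp(-0.5 * d**2)
    return (W @ r) / W.sum(axis=1)

def backfit(X, y, n_iter=20, bandwidth=0.5):
    """Backfitting for the additive model y = alpha + sum_j f_j(X_j) + eps."""
    n, p = X.shape
    alpha = y.mean()
    f = np.zeros((n, p))
    for _ in range(n_iter):
        for j in range(p):
            partial = y - alpha - f.sum(axis=1) + f[:, j]   # residual without f_j
            f[:, j] = kernel_smoother(X[:, j], partial, bandwidth)
            f[:, j] -= f[:, j].mean()                       # center for uniqueness
    return alpha, f

rng = np.random.default_rng(5)
X = rng.uniform(-2.0, 2.0, size=(200, 2))
y = np.sin(2.0 * X[:, 0]) + X[:, 1]**2 + rng.normal(scale=0.2, size=200)
alpha, f = backfit(X, y)
```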

Multidimensional Splines

MARS: Multivariate Adaptive Regression Splines. MARS is an adaptive procedure for regression, well suited for high-dimensional problems. It uses expansions in piecewise linear basis functions of the form (X − t)_+ and (t − X)_+, called a reflected pair, where (·)_+ denotes the positive part.

MARS: Multivariate Adaptive Regression Splines. The idea is to form reflected pairs for each input X_j, with knots at each observed value x_ij of that input. The collection of basis functions is C = { (X_j − t)_+, (t − X_j)_+ : t ∈ {x_1j, ..., x_Nj}, j = 1,...,p }. If all of the input values are distinct, there are 2Np basis functions altogether. Model: f(X) = β_0 + Σ_{m=1}^M β_m h_m(X), where each h_m(X) is a function in C, or a product of two or more such functions.
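
A minimal numpy sketch of building the candidate set C on synthetic data:

```python
import numpy as np

def reflected_pair(x, t):
    """The MARS reflected pair: (x - t)_+ and (t - x)_+."""
    return np.maximum(x - t, 0.0), np.maximum(t - x, 0.0)

rng = np.random.default_rng(6)
X = rng.normal(size=(50, 3))   # N = 50 observations, p = 3 inputs

# One reflected pair per input j and per observed value x_ij.
C = []
for j in range(X.shape[1]):
    for t in X[:, j]:
        C.extend(reflected_pair(X[:, j], t))
assert len(C) == 2 * X.shape[0] * X.shape[1]   # 2*N*p basis functions
```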

MARS: Multivariate Adaptive Regression Splines. Model building is forward stepwise: at each iteration, select a function from the set C or a product of such functions with a term already in the model; the coefficients β_m are estimated by standard linear regression. Terms are added in the form β̂_{M+1} h_ℓ(X) (X_j − t)_+ + β̂_{M+2} h_ℓ(X) (t − X_j)_+, where h_ℓ is a basis function already in the model.

MARS: Multivariate Adaptive Regression Splines. At each stage we consider, as candidates, all products of a reflected pair in C with a basis function already in the model. The product that decreases the residual error the most is added into the current model; a naive sketch of one such step follows.
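
A naive, brute-force sketch of a single forward step on synthetic inputs; real MARS implementations add restrictions (e.g. an input appears at most once per product) and fast update formulas that this sketch omits:

```python
import numpy as np

def forward_step(B, X, y):
    """One MARS forward step: augment basis matrix B (column 0 = constant)
    with the product-of-reflected-pair term that most reduces the RSS."""
    best_rss, best_cols = np.inf, None
    for m in range(B.shape[1]):              # term already in the model
        for j in range(X.shape[1]):          # candidate input
            for t in X[:, j]:                # knot at each observed value
                pos = B[:, m] * np.maximum(X[:, j] - t, 0.0)
                neg = B[:, m] * np.maximum(t - X[:, j], 0.0)
                B_new = np.column_stack([B, pos, neg])
                beta, *_ = np.linalg.lstsq(B_new, y, rcond=None)
                rss = np.sum((y - B_new @ beta)**2)
                if rss < best_rss:
                    best_rss, best_cols = rss, (pos, neg)
    return np.column_stack([B, *best_cols]), best_rss

rng = np.random.default_rng(7)
X = rng.normal(size=(60, 2))
y = np.maximum(X[:, 0] - 0.5, 0.0) + rng.normal(scale=0.1, size=60)
B, rss = forward_step(np.ones((60, 1)), X, y)   # start from the constant model
```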

MARS: Multivariate Adaptive Regression Splines

MARS: Multivariate Adaptive Regression Splines. At the end of this process we have a large model that typically overfits the data, so a backward deletion procedure is applied: the term whose removal causes the smallest increase in residual squared error is removed, one at a time. This produces the best model of each size (number of terms) λ. Generalized cross-validation, GCV(λ) = Σ_{i=1}^N (y_i − f̂_λ(x_i))² / (1 − M(λ)/N)², where M(λ) is the effective number of parameters, is used to compare the models and select the best λ. A link function can be used for discrete or other types of outcomes.
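
The criterion itself is simple to compute; a short sketch (charging three effective parameters per selected knot is the customary default, not a universal rule):

```python
import numpy as np

def gcv(y, yhat, n_terms, n_knots, c=3.0):
    """Generalized cross-validation criterion for a MARS model.

    Effective parameters M(lambda) = n_terms + c * n_knots, where n_terms
    counts the linearly independent basis functions in the model.
    """
    n = len(y)
    m_eff = n_terms + c * n_knots
    rss = np.sum((np.asarray(y) - np.asarray(yhat))**2)
    return rss / (1.0 - m_eff / n)**2
```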

MARS: Multivariate Adaptive Regression Splines Computers and Geotechnics 48 (2013) 82–95

MARS: Multivariate Adaptive Regression Splines