Data mining and statistical learning - lecture 6

1 Data mining and statistical learning - lecture 6
Overview:
- Basis expansion
- Splines
- (Natural) cubic splines
- Smoothing splines
- Nonparametric logistic regression
- Multidimensional splines
- Wavelets

2 Linear basis expansion (1)
Linear regression. True model: y = β0 + β1x1 + β2x2 + β3x3 + ε.
Question: How do we find the coefficients β?
Answer: Solve a system of linear equations (the least squares normal equations) to obtain the estimates.
Example observation:
x1   x2   x3   y
1    -3   6    12

3 Linear basis expansion (2)
Nonlinear model. True model: y = β1h1(x) + ... + βMhM(x) + ε, i.e. nonlinear in the inputs x1, x2, x3 but linear in the parameters β.
Question: How do we find β?
Answer: A) Introduce new variables um = hm(x1, x2, x3).
Example observation:
x1   x2   x3   y
1    -3   -1   12

4 Linear basis expansion (3)
Nonlinear model.
B) Transform the data set: replace each observation (x1, x2, x3) by (u1, u2, u3, u4), where um = hm(x1, x2, x3).
C) Apply linear regression in the new variables to obtain the parameter estimates.
Transformed observation:
u1    u2    u3     u4   y
-3    -1.1  -0.84  1    12

5 Linear basis expansion (4)
Conclusion: We can easily fit any model of the type f(X) = β1h1(X) + ... + βMhM(X), i.e., we can easily undertake a linear basis expansion in X.
Example: If the model is known to be nonlinear but its exact form is unknown, we can try introducing interaction terms such as x1x2.
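The recipe above can be sketched in a few lines of Python. The data and the particular transformations (a cube, an exponential, an interaction term) are made up for illustration; the point is only that a model nonlinear in x becomes an ordinary least squares problem in u:

```python
import numpy as np

# Hypothetical data: one row per observation, columns x1, x2, x3.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))

# Assumed true model (illustration only): linear in the transformed
# variables u = (x1, x2**3, exp(x3), x2*x3).
beta_true = np.array([2.0, -1.0, 0.5, 3.0])
U = np.column_stack([X[:, 0], X[:, 1] ** 3, np.exp(X[:, 2]), X[:, 1] * X[:, 2]])
y = U @ beta_true

# Basis expansion: the model is nonlinear in x but linear in u,
# so ordinary least squares on U recovers the coefficients.
beta_hat, *_ = np.linalg.lstsq(U, y, rcond=None)
print(beta_hat)  # close to beta_true
```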

6 Piecewise polynomial functions
Assume X is one-dimensional.
Def. Assume the domain [a, b] of X is split into intervals [a, ξ1], [ξ1, ξ2], ..., [ξn, b]. Then f(X) is said to be piecewise polynomial if f(X) is represented by separate polynomials on the different intervals.
Note: The points ξ1, ..., ξn are called knots.

7 Piecewise polynomials
Example: a continuous piecewise linear function with two knots ξ1, ξ2.
Alternative A. Introduce a linear function on each interval together with continuity constraints at the knots (4 free parameters).
[Insert figure, lower left]
Alternative B. Use a basis expansion with h1(X) = 1, h2(X) = X, h3(X) = (X − ξ1)+, h4(X) = (X − ξ2)+ (4 free parameters).
Theorem. The two formulations are equivalent.
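Alternative B can be checked directly: fitting the four basis functions by least squares reproduces a continuous piecewise linear target exactly. The knots and target coefficients below are made up for illustration:

```python
import numpy as np

# Continuous piecewise linear fit via the truncated basis
# h1 = 1, h2 = X, h3 = (X - xi1)_+, h4 = (X - xi2)_+ (knots assumed).
xi1, xi2 = 0.3, 0.7
x = np.linspace(0, 1, 101)

# A target that is itself piecewise linear with breaks at the knots.
y = 1.0 + 2.0 * x - 3.0 * np.maximum(x - xi1, 0) + 4.0 * np.maximum(x - xi2, 0)

B = np.column_stack([np.ones_like(x), x,
                     np.maximum(x - xi1, 0), np.maximum(x - xi2, 0)])
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
print(coef)  # recovers (1, 2, -3, 4): the 4 free parameters
```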

8 Splines
Definition. A piecewise polynomial is called an order-M spline if it has continuous derivatives up to order M−2 at the knots.
Alternative definition. An order-M spline is a function that can be represented by the truncated power basis
  hj(X) = X^(j−1), j = 1, ..., M,
  h(M+l)(X) = (X − ξl)+^(M−1), l = 1, ..., K   (K = #knots),
giving M + K basis functions.
Theorem. The definitions above are equivalent.
Terminology. An order-4 spline is called a cubic spline.
[Insert Fig. 5.2; look at the basis and compare the numbers of free parameters]
Note. Cubic splines: the discontinuity at the knots is not visible.
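The truncated power basis is easy to build; a small sketch (knot locations chosen arbitrarily for illustration):

```python
import numpy as np

def truncated_power_basis(x, knots, M=4):
    """Order-M spline basis: x^0, ..., x^(M-1) plus (x - xi)_+^(M-1) per knot."""
    cols = [x ** j for j in range(M)]
    cols += [np.maximum(x - xi, 0.0) ** (M - 1) for xi in knots]
    return np.column_stack(cols)

x = np.linspace(0, 1, 200)
B = truncated_power_basis(x, knots=[0.25, 0.5, 0.75])
print(B.shape)  # (200, 7): M + K = 4 + 3 free parameters
```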

9 Variance of spline estimators – boundary effects
[Insert Fig. 5.3]

10 Natural cubic spline
Def. A cubic spline f is called a natural cubic spline if its 2nd and 3rd derivatives are zero at a and b.
Note: This implies that f is linear on the two extreme intervals.
Basis functions of natural cubic splines (K knots give K basis functions):
  N1(X) = 1,  N2(X) = X,  N(k+2)(X) = dk(X) − d(K−1)(X),
where dk(X) = [(X − ξk)+³ − (X − ξK)+³] / (ξK − ξk).

11 Fitting smooth functions to data
Minimize the penalized sum of squared residuals
  RSS(f, λ) = Σi (yi − f(xi))² + λ ∫ (f''(t))² dt,
where λ is the smoothing parameter:
λ = 0: any function interpolating the data
λ = +∞: the least squares line fit (no second derivative tolerated)

12 Optimality of smoothing splines
Theorem. The function f minimizing RSS(f, λ) for a given λ is a natural cubic spline with knots at all unique values of xi (note: N knots!).
The optimal spline can be computed as follows.

13 A smoothing spline is a linear smoother
Writing f(x) = Σj Nj(x)θj in the natural spline basis, the criterion becomes (y − Nθ)ᵀ(y − Nθ) + λθᵀΩθ with Ωjk = ∫ Nj''(t) Nk''(t) dt, so
  f̂ = N(NᵀN + λΩ)⁻¹Nᵀy = Sλ y.
The fitted function is linear in the response values.

14 Degrees of freedom of smoothing splines
The effective degrees of freedom is dfλ = trace(Sλ), i.e., the sum of the diagonal elements of Sλ.
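A minimal numerical sketch of both ideas, using a discrete second-difference penalty matrix K as a stand-in for the spline penalty (the true smoothing-spline K differs, but the structure f̂ = Sλ y and dfλ = trace(Sλ) is the same):

```python
import numpy as np

# Linear smoother: f_hat = S @ y with S = (I + lam*K)^(-1),
# K = D.T @ D, D the second-difference matrix (a discrete stand-in
# for the smoothing-spline penalty matrix).
n, lam = 30, 5.0
D = np.diff(np.eye(n), n=2, axis=0)   # (n-2, n) second differences
K = D.T @ D
S = np.linalg.inv(np.eye(n) + lam * K)

rng = np.random.default_rng(1)
y = np.sin(np.linspace(0, 3, n)) + 0.1 * rng.normal(size=n)
f_hat = S @ y                          # fit is linear in y
df = np.trace(S)                       # effective degrees of freedom
print(round(df, 2))                    # between 2 (line fit) and n (interpolation)
```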

15 Smoothing splines and eigenvectors
It can be shown that Sλ = (I + λK)⁻¹, where K is the so-called penalty matrix.
Furthermore, the eigen-decomposition is
  Sλ = Σk ρk(λ) uk ukᵀ,  with ρk(λ) = 1 / (1 + λdk).
Note: dk and uk are the eigenvalues and eigenvectors, respectively, of K.

16 Smoothing splines and shrinkage
The smoothing spline decomposes the vector y with respect to the basis of eigenvectors and shrinks the respective contributions:
  f̂ = Sλ y = Σk uk ρk(λ) ⟨uk, y⟩.
The eigenvectors, ordered by decreasing ρk(λ), increase in complexity. The higher the complexity, the more the contribution is shrunk.
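The eigen-view can be verified numerically, again with the discrete second-difference K as an illustrative stand-in for the spline penalty matrix:

```python
import numpy as np

# Eigen-decomposition view of the smoother S = (I + lam*K)^(-1).
n, lam = 20, 2.0
D = np.diff(np.eye(n), n=2, axis=0)
K = D.T @ D
d, U = np.linalg.eigh(K)               # K = U diag(d) U^T
rho = 1.0 / (1.0 + lam * d)            # shrinkage factors, in (0, 1]

y = np.linspace(0, 1, n) ** 2
# Decompose y in the eigenbasis and shrink each contribution:
f_hat = U @ (rho * (U.T @ y))
S = np.linalg.inv(np.eye(n) + lam * K)
print(np.allclose(f_hat, S @ y))  # True
```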

17 Smoothing splines and local curve fitting
The eigenvalues ρk(λ) are decreasing functions of λ: the higher λ, the stronger the penalization.
The smoother matrix has a banded nature -> local fitting method.
[Insert Fig. 5.8]

18 Fitting smoothing splines in practice (1)
Reinsch form: Sλ = (I + λK)⁻¹ with K = QR⁻¹Qᵀ.
Theorem. If f is a natural cubic spline with values f = (f(ξ1), ..., f(ξN))ᵀ at the knots and second derivatives γ at the interior knots, then Qᵀf = Rγ, where Q and R are band matrices depending on ξ only.
Theorem. The minimizing values satisfy (R + λQᵀQ)γ = Qᵀy and f = y − λQγ.

19 Fitting smoothing splines in practice (2)
Reinsch algorithm:
1. Evaluate Qᵀy.
2. Compute R + λQᵀQ and find its Cholesky decomposition (in linear time, since the matrix is banded!).
3. Solve the matrix equation (R + λQᵀQ)γ = Qᵀy (in linear time!).
4. Obtain f = y − λQγ.
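A dense sketch of these steps for equally spaced knots (spacing h). The band matrices Q and R below use the standard equal-spacing weights; the plain dense solve stands in for the banded Cholesky solver that gives the real algorithm its linear running time:

```python
import numpy as np

# Reinsch algorithm sketch, equally spaced knots with spacing h.
n, h, lam = 50, 0.1, 1.0
x = h * np.arange(n)
rng = np.random.default_rng(2)
y = np.sin(x) + 0.1 * rng.normal(size=n)

# Q: n x (n-2), columns hold the second-difference weights (1, -2, 1)/h.
Q = np.diff(np.eye(n), n=2, axis=0).T / h
# R: (n-2) x (n-2) tridiagonal with 2h/3 on the diagonal, h/6 off it.
m = n - 2
R = (2 * h / 3) * np.eye(m) + (h / 6) * (np.eye(m, k=1) + np.eye(m, k=-1))

# Steps 1-3: form Q^T y and solve (R + lam*Q^T Q) gamma = Q^T y
# (banded Cholesky in practice; dense solve here for brevity).
gamma = np.linalg.solve(R + lam * Q.T @ Q, Q.T @ y)
# Step 4: recover the fitted values.
f = y - lam * Q @ gamma
print(np.allclose(Q.T @ f, R @ gamma))  # True: the natural-spline relation holds
```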

20 Automated selection of smoothing parameters (1)
What can be selected:
Regression splines
- Degree of the spline
- Placement of the knots -> MARS procedure
Smoothing splines
- Penalization parameter λ

21 Automated selection of smoothing parameters (2)
Fixing the degrees of freedom: if we fix dfλ, then we can find λ by solving the equation trace(Sλ) = dfλ numerically.
One could try two different values of dfλ and choose one based on F-tests, residual plots, etc.
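Since trace(Sλ) decreases monotonically in λ, the equation can be solved by simple bisection. A sketch, again with the discrete second-difference penalty K standing in for the spline penalty matrix:

```python
import numpy as np

# Solve trace(S_lambda) = df for lambda by bisection on a log scale.
n = 40
D = np.diff(np.eye(n), n=2, axis=0)
K = D.T @ D

def df_of(lam):
    return np.trace(np.linalg.inv(np.eye(n) + lam * K))

target = 6.0
lo, hi = 1e-8, 1e8            # df_of is decreasing: ~n at lo, ~2 at hi
for _ in range(200):
    mid = np.sqrt(lo * hi)    # geometric midpoint
    if df_of(mid) > target:
        lo = mid
    else:
        hi = mid
lam = np.sqrt(lo * hi)
print(round(df_of(lam), 3))   # ~ 6.0
```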

22 Automated selection of smoothing parameters (3)
The bias-variance trade-off
[Insert Fig. 5.9]
EPE – integrated squared prediction error, CV – cross-validation

23 Nonparametric logistic regression
Logistic regression model: log[ Pr(Y = 1 | X = x) / Pr(Y = 0 | X = x) ] = f(x). Note: X is one-dimensional.
What can f be?
- Linear -> ordinary logistic regression (Chapter 4)
- Sufficiently smooth -> nonparametric logistic regression (splines and other methods)
- Other choices are possible

24 Nonparametric logistic regression
Problem formulation: maximize the penalized log-likelihood
  l(f; λ) = Σi [ yi f(xi) − log(1 + e^f(xi)) ] − ½ λ ∫ (f''(t))² dt.
Good news: the solution is still a natural cubic spline.
Bad news: there is no analytic expression for that spline function.

25 Nonparametric logistic regression
How to proceed? Use Newton-Raphson to compute the spline numerically, i.e.:
1. Compute (analytically) the first and second derivatives of the penalized log-likelihood.
2. Compute the Newton direction using the current parameter value and the derivative information.
3. Compute the new parameter value from the old value and the update formula.
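These steps can be sketched for a penalized logistic fit in a fixed basis. The polynomial basis and diagonal penalty below are illustrative stand-ins for the natural-spline basis and its roughness penalty; the Newton iteration itself is the point:

```python
import numpy as np

# Newton-Raphson for a penalized logistic fit in a fixed basis B,
# with quadratic penalty 0.5*lam * beta^T Omega beta.
rng = np.random.default_rng(3)
x = np.linspace(-2, 2, 100)
y = (rng.uniform(size=100) < 1 / (1 + np.exp(-2 * x))).astype(float)

B = np.column_stack([np.ones_like(x), x, x ** 2, x ** 3])
Omega = np.diag([0.0, 0.0, 1.0, 1.0])   # penalize only the curvature terms
lam = 0.5
beta = np.zeros(4)

def penalized_loglik(beta):
    f = B @ beta
    return np.sum(y * f - np.log1p(np.exp(f))) - 0.5 * lam * beta @ Omega @ beta

for _ in range(20):
    p = 1 / (1 + np.exp(-(B @ beta)))           # current probabilities
    W = p * (1 - p)                              # Newton weights
    grad = B.T @ (y - p) - lam * Omega @ beta    # step 1: derivatives
    H = B.T @ (B * W[:, None]) + lam * Omega
    beta = beta + np.linalg.solve(H, grad)       # steps 2-3: update
print(round(penalized_loglik(beta), 2))
```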

26 Multidimensional splines
How to fit data smoothly in higher dimensions?
A) Use bases of one-dimensional functions and produce a basis by taking tensor products: gjk(X) = h1j(X1) h2k(X2).
Problem: exponential growth of the basis with the dimension.
[Insert Fig. 6.10]
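The tensor-product construction, and its multiplicative growth, can be seen in a short sketch (the 1-D bases here are small piecewise linear bases chosen for illustration):

```python
import numpy as np

# Tensor-product basis in two dimensions: every product h1j * h2k.
x1 = np.linspace(0, 1, 50)
x2 = np.linspace(0, 1, 50)
H1 = np.column_stack([np.ones_like(x1), x1, np.maximum(x1 - 0.5, 0)])  # 3 functions
H2 = np.column_stack([np.ones_like(x2), x2, np.maximum(x2 - 0.5, 0)])  # 3 functions

# Evaluate all products h1j(x1[i]) * h2k(x2[i]) at each point i.
G = np.einsum('ij,ik->ijk', H1, H2).reshape(len(x1), -1)
print(G.shape)  # (50, 9): 3 x 3 basis functions, multiplicative growth
```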

27 Multidimensional splines
How to fit data smoothly in higher dimensions?
B) Formulate a new problem: minimize Σi (yi − f(xi))² + λJ[f], where J[f] generalizes the one-dimensional roughness penalty (in two dimensions, J[f] = ∫∫ [f²x1x1 + 2f²x1x2 + f²x2x2] dx1 dx2).
The solution is the thin-plate spline, with properties similar to the one-dimensional case; for λ = 0 it interpolates the data.
The solution in two dimensions is essentially a sum of radial basis functions hj(x) = η(||x − xj||) with η(z) = z² log z².

28 Wavelets
Introduction
The idea: fit bumpy functions while removing noise.
Application areas: signal processing, compression.
How it works: the function is represented in a basis of bumpy functions, and the small coefficients are filtered out.
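The transform-threshold-reconstruct idea can be shown with the Haar wavelet in plain numpy (real applications would use a wavelet library; the signal and threshold here are made up for illustration):

```python
import numpy as np

def haar(v):
    """Full orthonormal Haar analysis of a length-2^k signal."""
    out = v.astype(float).copy()
    n = len(out)
    while n > 1:
        a = (out[:n:2] + out[1:n:2]) / np.sqrt(2)   # averages
        d = (out[:n:2] - out[1:n:2]) / np.sqrt(2)   # details
        out[:n // 2], out[n // 2:n] = a, d
        n //= 2
    return out

def ihaar(w):
    """Inverse of haar()."""
    out = w.astype(float).copy()
    n = 1
    while n < len(out):
        a, d = out[:n].copy(), out[n:2 * n].copy()
        out[0:2 * n:2] = (a + d) / np.sqrt(2)
        out[1:2 * n:2] = (a - d) / np.sqrt(2)
        n *= 2
    return out

y = np.array([4.0, 4.1, 3.9, 4.0, 8.0, 8.1, 7.9, 8.0])  # noisy step
w = haar(y)
w[np.abs(w) < 0.5] = 0.0        # filter out the small coefficients
y_denoised = ihaar(w)
print(np.round(y_denoised, 2))  # the underlying step function
```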

29 Wavelets
Basis functions (Haar wavelets, Symmlet-8 wavelets)
[Insert Fig. 5.13]

30 Wavelets
Example
[Insert Fig. 5.14]

