
Dept. Computer Science & Engineering, Shanghai Jiao Tong University




1 Kernel Methods
Dept. Computer Science & Engineering, Shanghai Jiao Tong University

2 Outline
One-Dimensional Kernel Smoothers
Local Regression
Local Likelihood
Kernel Density Estimation
Naive Bayes
Radial Basis Functions
Mixture Models and EM
2018/7/4 Kernel Methods

3 One-Dimensional Kernel Smoothers
k-NN running mean: $\hat f(x) = \mathrm{Ave}(y_i \mid x_i \in N_k(x))$. The 30-NN curve is bumpy, since the neighborhood $N_k(x)$ is discontinuous in $x$: as $x$ moves, neighbors enter and leave the average in a discrete way, leading to a discontinuous $\hat f(x)$.

4 One-Dimensional Kernel Smoothers
Nadaraya-Watson kernel-weighted average: $\hat f(x_0) = \frac{\sum_{i=1}^N K_\lambda(x_0, x_i)\, y_i}{\sum_{i=1}^N K_\lambda(x_0, x_i)}$
Epanechnikov quadratic kernel: $K_\lambda(x_0, x) = D\!\left(\frac{|x - x_0|}{\lambda}\right)$, where $D(t) = \frac{3}{4}(1 - t^2)$ if $|t| \le 1$ and $0$ otherwise.
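As a minimal sketch (not from the slides), the Nadaraya-Watson average with the Epanechnikov kernel can be written in a few lines of NumPy; the function names are my own:

```python
import numpy as np

def epanechnikov(t):
    """D(t) = 3/4 (1 - t^2) for |t| <= 1, else 0."""
    return np.where(np.abs(t) <= 1, 0.75 * (1 - t**2), 0.0)

def nadaraya_watson(x0, x, y, lam):
    """Kernel-weighted average: sum_i K(x0, xi) yi / sum_i K(x0, xi)."""
    w = epanechnikov(np.abs(x - x0) / lam)
    if w.sum() == 0:          # x0 outside the support of every kernel
        return np.nan
    return np.sum(w * y) / np.sum(w)
```

On data that are exactly linear and symmetric around $x_0$, the weighted average reproduces the true value; the compact support means the estimate is undefined far from the data.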

5 One-Dimensional Kernel Smoothers
More general kernel: $K_\lambda(x_0, x) = D\!\left(\frac{|x - x_0|}{h_\lambda(x_0)}\right)$, where $h_\lambda(x_0)$ is a width function that determines the width of the neighborhood at $x_0$.
For the quadratic kernel, $h_\lambda(x_0) = \lambda$ is constant.
For the k-NN kernel, $h_k(x_0) = |x_0 - x_{[k]}|$, the distance to the $k$-th closest point, so the window adapts to the local density and the variance of the estimate stays roughly constant.
The Epanechnikov kernel has compact support.

6 One-Dimensional Kernel Smoothers
Three popular kernels for local smoothing:
Epanechnikov: $D(t) = \frac{3}{4}(1 - t^2)$ for $|t| \le 1$
Tri-cube: $D(t) = (1 - |t|^3)^3$ for $|t| \le 1$
Gaussian: $D(t) = \phi(t)$
The Epanechnikov and tri-cube kernels have compact support, but the tri-cube is smoother: it has two continuous derivatives at the boundary of its support. The Gaussian kernel has infinite support.

7 Local Linear Regression
Boundary issue: the kernel-weighted average is badly biased on the boundaries because of the asymmetry of the kernel in that region. Locally weighted linear fitting removes this bias exactly to first order.

8 Local Linear Regression
Locally weighted linear regression makes a first-order correction. Solve a separate weighted least squares problem at each target point $x_0$: $\min_{\alpha(x_0), \beta(x_0)} \sum_{i=1}^N K_\lambda(x_0, x_i)\,[y_i - \alpha(x_0) - \beta(x_0) x_i]^2$
The estimate: $\hat f(x_0) = b(x_0)^T (\mathbf{B}^T \mathbf{W}(x_0) \mathbf{B})^{-1} \mathbf{B}^T \mathbf{W}(x_0) \mathbf{y}$, where $b(x)^T = (1, x)$; $\mathbf{B}$ is the $N \times 2$ regression matrix with $i$-th row $b(x_i)^T$; and $\mathbf{W}(x_0)$ is the $N \times N$ diagonal matrix with $i$-th diagonal element $K_\lambda(x_0, x_i)$.
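The weighted least squares step above can be sketched directly in NumPy (a minimal illustration, assuming the Epanechnikov kernel; function name is my own):

```python
import numpy as np

def local_linear(x0, x, y, lam):
    """Local linear regression estimate at x0.

    Solves min over (alpha, beta) of sum_i K(x0, xi) [yi - alpha - beta*xi]^2,
    then evaluates the fitted line at x0.
    """
    t = np.abs(x - x0) / lam
    w = np.where(t <= 1, 0.75 * (1 - t**2), 0.0)   # Epanechnikov weights
    B = np.column_stack([np.ones_like(x), x])      # N x 2 regression matrix
    W = np.diag(w)
    theta = np.linalg.solve(B.T @ W @ B, B.T @ W @ y)
    return theta[0] + theta[1] * x0
```

Because the fit is exact to first order, a linear truth is recovered without bias even at the boundary of the data.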

9 Local Linear Regression
Writing the estimate as $\hat f(x_0) = \sum_{i=1}^N l_i(x_0)\, y_i$, the weights $l_i(x_0)$ combine the weighting kernel $K_\lambda(x_0, \cdot)$ and the least squares operations; $l_i(x_0)$ is called the equivalent kernel.

10 Local Linear Regression
The expansion for $E\hat f(x_0)$, using the linearity of local regression and a series expansion of the true function $f$ around $x_0$: $E\hat f(x_0) = \sum_{i=1}^N l_i(x_0) f(x_i) = f(x_0) \sum_{i=1}^N l_i(x_0) + f'(x_0) \sum_{i=1}^N (x_i - x_0)\, l_i(x_0) + \frac{f''(x_0)}{2} \sum_{i=1}^N (x_i - x_0)^2\, l_i(x_0) + R$
For local linear regression, $\sum_{i=1}^N l_i(x_0) = 1$ and $\sum_{i=1}^N (x_i - x_0)\, l_i(x_0) = 0$, so the bias $E\hat f(x_0) - f(x_0)$ depends only on quadratic and higher-order terms in the expansion of $f$.

11 Local Polynomial Regression
Fit local polynomials of any degree $d$: $\min_{\alpha(x_0), \beta_j(x_0)} \sum_{i=1}^N K_\lambda(x_0, x_i)\Big[y_i - \alpha(x_0) - \sum_{j=1}^d \beta_j(x_0)\, x_i^j\Big]^2$, with solution $\hat f(x_0) = \hat\alpha(x_0) + \sum_{j=1}^d \hat\beta_j(x_0)\, x_0^j$.

12 Local Polynomial Regression
The bias has components only of degree $d+1$ and higher. The reduction in bias comes at the cost of increased variance.

13 Kernel Width Selection
$\lambda$ is a parameter of the kernel $K_\lambda$ that controls its width:
For a compactly supported kernel, $\lambda$ is the radius of the support region.
For the Gaussian kernel, $\lambda$ is the standard deviation.
For the k-NN method, $\lambda$ is $k/N$.
Kernel width is a model selection problem: a wide window gives large bias and small variance; a narrow window gives small bias and large variance.

14 Structured Local Regression
Structured kernels: introduce structure by imposing appropriate restrictions on the matrix $A$ in $K_{\lambda, A}(x_0, x) = D\!\left(\frac{(x - x_0)^T A (x - x_0)}{\lambda}\right)$.
Structured regression functions: introduce structure by eliminating some of the higher-order interaction terms (e.g., additive or varying coefficient models).

15 Local Likelihood & Other Models
Any parametric model can be made local.
Parameter associated with $x_i$: $\theta(x_i) = x_i^T \beta$
Log-likelihood: $l(\beta) = \sum_{i=1}^N l(y_i, x_i^T \beta)$
Local likelihood at $x_0$: $l(\beta(x_0)) = \sum_{i=1}^N K_\lambda(x_0, x_i)\, l(y_i, x_i^T \beta(x_0))$
A varying coefficient model fits the coefficients $\beta$ as functions of $x$, maximizing the local likelihood at each point.

16 Local Likelihood & Other Models
Local logistic regression: the local log-likelihood for the $J$-class model is $\sum_{i=1}^N K_\lambda(x_0, x_i)\Big\{\beta_{g_i 0}(x_0) + \beta_{g_i}(x_0)^T (x_i - x_0) - \log\big[1 + \sum_{k=1}^{J-1} \exp\big(\beta_{k0}(x_0) + \beta_k(x_0)^T (x_i - x_0)\big)\big]\Big\}$
Centering the local regressions at $x_0$ means the fitted intercepts $\hat\beta_{k0}(x_0)$ give the posterior probabilities at $x_0$ directly.

17 Kernel Density Estimation
A natural local estimate: $\hat f_X(x_0) = \frac{\# x_i \in \mathcal N(x_0)}{N \lambda}$, where $\mathcal N(x_0)$ is a small metric neighborhood of width $\lambda$ around $x_0$.
The smooth Parzen estimate: $\hat f_X(x_0) = \frac{1}{N \lambda} \sum_{i=1}^N K_\lambda(x_0, x_i)$
For the Gaussian kernel $K_\lambda(x_0, x) = \phi(|x - x_0| / \lambda)$, the estimate becomes $\hat f_X(x) = \frac{1}{N} \sum_{i=1}^N \phi_\lambda(x - x_i)$, an average of Gaussian densities centered at the observations.
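The Parzen estimate with a Gaussian kernel is a few lines of NumPy (a sketch; the function name is my own, and `scipy.stats.gaussian_kde` offers a production version):

```python
import numpy as np

def kde_gaussian(x_grid, x, lam):
    """Parzen estimate with Gaussian kernel:
    f_hat(x) = (1/N) sum_i phi_lam(x - x_i)."""
    x_grid = np.asarray(x_grid)[:, None]             # (G, 1)
    z = (x_grid - np.asarray(x)[None, :]) / lam      # (G, N) standardized
    phi = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)   # standard normal density
    return phi.mean(axis=1) / lam                    # rescale by bandwidth
```

Since each $\phi_\lambda$ is a proper density, the estimate is nonnegative and integrates to one.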

18 Kernel Density Estimation
A kernel density estimate for systolic blood pressure. The density estimate at each point is the average contribution from each of the kernels at that point.

19 Kernel Density Classification
Bayes' theorem: $\hat P(G = j \mid X = x_0) = \frac{\hat\pi_j \hat f_j(x_0)}{\sum_{k=1}^J \hat\pi_k \hat f_k(x_0)}$, with class density estimates $\hat f_j$ and sample priors $\hat\pi_j$.
The estimate for CHD uses the tri-cube kernel with k-NN bandwidth.

20 Kernel Density Classification
The population class densities and the posterior probabilities

21 Naïve Bayes
The naïve Bayes model assumes that, given a class $G = j$, the features $X_k$ are independent: $f_j(X) = \prod_{k=1}^p f_{jk}(X_k)$
Each $f_{jk}$ is a one-dimensional kernel density estimate, or a Gaussian, for coordinate $X_k$ in class $j$. If $X_k$ is categorical, a histogram estimate is used instead.
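Combining the product-of-marginals assumption with per-feature kernel density estimates gives a compact classifier. A minimal sketch (function names and the choice of a shared bandwidth are my own assumptions):

```python
import numpy as np

def kde1d(t, data, lam):
    """One-dimensional Gaussian kernel density estimate at points t."""
    z = (np.asarray(t)[:, None] - np.asarray(data)[None, :]) / lam
    return np.exp(-0.5 * z**2).mean(axis=1) / (lam * np.sqrt(2 * np.pi))

def naive_bayes_posterior(x_new, X, g, lam=0.5):
    """P(G=j | X=x) proportional to pi_j * prod_k f_jk(x_k), f_jk a 1-D KDE."""
    classes = np.unique(g)
    scores = []
    for j in classes:
        Xj = X[g == j]
        pi_j = len(Xj) / len(X)                       # sample prior
        dens = np.prod([kde1d(x_new[k:k+1], Xj[:, k], lam)[0]
                        for k in range(X.shape[1])])  # independence assumption
        scores.append(pi_j * dens)
    scores = np.array(scores)
    return classes, scores / scores.sum()
```

For well-separated classes the posterior concentrates sharply on the nearest class, despite the (usually false) independence assumption.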

22 Radial Basis Function & Kernel
Radial basis functions combine the local behavior of kernel methods with the flexibility of basis expansions. Each basis element is indexed by a location (prototype) parameter $\xi_j$ and a scale parameter $\lambda_j$; a popular choice for the basis function is the standard Gaussian density.

23 Radial Basis Function & Kernel
For simplicity, focus on least squares methods for regression, and use the Gaussian kernel. RBF network model: $f(x) = \sum_{j=1}^M \beta_j \exp\!\left(-\frac{\|x - \xi_j\|^2}{2 \lambda_j^2}\right)$
Estimate the $\{\xi_j, \lambda_j\}$ separately from the $\{\beta_j\}$ (e.g., by clustering), then fit the $\beta_j$ by least squares. An undesirable side effect is the creation of holes: regions of $\mathbb{R}^p$ where none of the kernels has appreciable support.

24 Radial Basis Function & Kernel
Renormalized radial basis functions: $h_j(x) = \frac{D(\|x - \xi_j\| / \lambda)}{\sum_{k=1}^M D(\|x - \xi_k\| / \lambda)}$
Gaussian radial basis functions with fixed width can leave holes; the expansion in renormalized radial basis functions avoids this, producing basis functions similar in some respects to B-splines.

25 Mixture Models & EM
Gaussian mixture model: $f(x) = \sum_{m=1}^M \alpha_m \phi(x; \mu_m, \Sigma_m)$, where the $\alpha_m$ are mixture proportions with $\sum_m \alpha_m = 1$.
EM algorithm for mixtures. Given the log-likelihood $l(\theta; Z) = \sum_{i=1}^N \log\big[\sum_{m=1}^M \alpha_m \phi(x_i; \mu_m, \Sigma_m)\big]$, direct maximization is difficult. Suppose instead we observed a latent binary indicator assigning each observation to a component; maximization would then be easy, and EM alternates between estimating these indicators and maximizing.
2018/7/4 Kernel Methods

26 Mixture Models & EM
E-step: given the current parameters, compute the responsibilities $\hat\gamma_{im} = \frac{\hat\alpha_m \phi(x_i; \hat\mu_m, \hat\Sigma_m)}{\sum_{k=1}^M \hat\alpha_k \phi(x_i; \hat\mu_k, \hat\Sigma_k)}$
M-step: use the responsibilities as weights to update $\hat\alpha_m$, $\hat\mu_m$, and $\hat\Sigma_m$.
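The E- and M-steps above can be sketched for a two-component one-dimensional mixture (a minimal illustration; the function name and the min/max initialization are my own choices):

```python
import numpy as np

def em_gmm(x, n_iter=100):
    """EM for a two-component 1-D Gaussian mixture.
    Returns (alpha, mu, sigma) after n_iter iterations."""
    # crude initialization from the data range
    mu = np.array([x.min(), x.max()])
    var = np.array([x.var(), x.var()])
    alpha = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of component m for observation i
        phi = np.exp(-0.5 * (x[:, None] - mu)**2 / var) / np.sqrt(2 * np.pi * var)
        gamma = alpha * phi
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted maximum likelihood updates
        nm = gamma.sum(axis=0)
        mu = (gamma * x[:, None]).sum(axis=0) / nm
        var = (gamma * (x[:, None] - mu)**2).sum(axis=0) / nm
        alpha = nm / len(x)
    return alpha, mu, np.sqrt(var)
```

On well-separated data the responsibilities quickly become near-binary and the updates converge to the component means and scales.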

27 Mixture Models & EM
Application of mixtures to the heart disease risk factor study.

28 Mixture Models & EM
Mixture model used for classification of the simulated data





