Kernel Methods Dept. Computer Science & Engineering, Shanghai Jiao Tong University.


1 Kernel Methods Dept. Computer Science & Engineering, Shanghai Jiao Tong University

2 Kernel Methods (2016-1-3) Outline:
– One-Dimensional Kernel Smoothers
– Local Regression
– Local Likelihood
– Kernel Density Estimation
– Naive Bayes
– Radial Basis Functions
– Mixture Models and EM

3 One-Dimensional Kernel Smoothers k-NN: the 30-nearest-neighbor running-mean curve is bumpy, since the neighborhood N_k(x) is discontinuous in x. The average f̂(x) = Ave(y_i | x_i ∈ N_k(x)) changes in a discrete way as x moves, leading to a discontinuous fit.

4 One-Dimensional Kernel Smoothers Nadaraya-Watson kernel-weighted average: f̂(x0) = Σ_i K_λ(x0, x_i) y_i / Σ_i K_λ(x0, x_i). Epanechnikov quadratic kernel: K_λ(x0, x) = D(|x − x0| / λ), where D(t) = (3/4)(1 − t²) if |t| ≤ 1, and 0 otherwise.
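As an illustration, here is a minimal sketch of the Nadaraya-Watson average with the Epanechnikov kernel; the sine data and the bandwidth lam=0.5 are made up for this example:

```python
import numpy as np

def epanechnikov(t):
    # D(t) = 3/4 (1 - t^2) for |t| <= 1, else 0
    return np.where(np.abs(t) <= 1, 0.75 * (1.0 - t**2), 0.0)

def nadaraya_watson(x0, x, y, lam):
    # kernel-weighted average of the y_i, with weights K_lam(x0, x_i)
    w = epanechnikov((x - x0) / lam)
    return np.sum(w * y) / np.sum(w)

# toy data: noisy sine curve (illustrative only)
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 2.0 * np.pi, 100))
y = np.sin(x) + 0.1 * rng.normal(size=100)
fit = np.array([nadaraya_watson(x0, x, y, lam=0.5) for x0 in x])
```

Because the weights die off continuously as points leave the window, the resulting curve is smooth, unlike the discontinuous k-NN running mean.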

5 One-Dimensional Kernel Smoothers More general kernel: K_λ(x0, x) = D(|x − x0| / h_λ(x0)).
– h_λ(x0): a width function that determines the width of the neighborhood at x0.
– For the quadratic (metric window) kernel, h_λ(x0) = λ.
– For the k-NN kernel, h_k(x0) = |x0 − x_[k]|, the distance to the k-th closest point; the variance of the estimate is then constant.
– The Epanechnikov kernel has compact support.

6 One-Dimensional Kernel Smoothers Three popular kernels for local smoothing: the Epanechnikov kernel and the tri-cube kernel have compact support, but the tri-cube kernel has two continuous derivatives at the boundary of its support; the Gaussian kernel has infinite support.

7 Local Linear Regression Boundary issue:
– Kernel smoothers can be badly biased on the boundaries, because of the asymmetry of the kernel in that region.
– Fitting a local linear function, rather than a local constant, removes this bias exactly to first order.

8 Local Linear Regression Locally weighted linear regression makes a first-order correction. Solve a separate weighted least squares problem at each target point x0: minimize Σ_i K_λ(x0, x_i) [y_i − α(x0) − β(x0) x_i]² over α(x0), β(x0). The estimate is f̂(x0) = α̂(x0) + β̂(x0) x0 = b(x0)ᵀ (Bᵀ W(x0) B)⁻¹ Bᵀ W(x0) y, where b(x)ᵀ = (1, x), B is the N×2 regression matrix with i-th row b(x_i)ᵀ, and W(x0) is the N×N diagonal matrix with i-th diagonal element K_λ(x0, x_i).
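The weighted least squares step can be sketched directly in code; the tri-cube weight function and the toy bandwidth below are assumptions of this example, not part of the slides:

```python
import numpy as np

def local_linear(x0, x, y, lam):
    """Weighted least squares line at x0: f_hat(x0) = b(x0)^T (B^T W B)^{-1} B^T W y."""
    t = np.abs(x - x0) / lam
    w = np.where(t <= 1, (1.0 - t**3) ** 3, 0.0)   # tri-cube kernel weights
    B = np.column_stack([np.ones_like(x), x])       # N x 2 regression matrix
    BtW = B.T * w                                    # equals B^T W for diagonal W
    alpha, beta = np.linalg.solve(BtW @ B, BtW @ y)
    return alpha + beta * x0
```

On exactly linear data the local linear fit reproduces the line everywhere, including at the boundary, which is the first-order bias correction the slide describes.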

9 Local Linear Regression Writing f̂(x0) = Σ_i l_i(x0) y_i, the weights l_i(x0) combine the weighting kernel K_λ(x0, ·) and the least squares operations; they are called the equivalent kernel.

10 Local Linear Regression The expansion for E f̂(x0), using the linearity of local regression and a series expansion of the true function f around x0:
E f̂(x0) = Σ_i l_i(x0) f(x_i) = f(x0) Σ_i l_i(x0) + f′(x0) Σ_i (x_i − x0) l_i(x0) + (f″(x0)/2) Σ_i (x_i − x0)² l_i(x0) + R,
where R involves third- and higher-order terms. For local linear regression, Σ_i l_i(x0) = 1 and Σ_i (x_i − x0) l_i(x0) = 0, so the bias E f̂(x0) − f(x0) depends only on quadratic and higher-order terms in the expansion of f.

11 Local Polynomial Regression Fit local polynomials of any degree d: minimize Σ_i K_λ(x0, x_i) [y_i − α(x0) − Σ_{j=1..d} β_j(x0) x_i^j]², with the estimate f̂(x0) = α̂(x0) + Σ_{j=1..d} β̂_j(x0) x0^j.

12 Local Polynomial Regression The bias has only components of degree d+1 and higher. This reduction in bias is paid for with increased variance.
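A sketch of the degree-d generalization, centering the basis at x0 so the intercept is the fitted value; the function name, tri-cube weights, and bandwidth are illustrative choices:

```python
import numpy as np

def local_poly(x0, x, y, lam, d=2):
    """Local degree-d polynomial fit at x0; centered so the intercept is f_hat(x0)."""
    t = np.abs(x - x0) / lam
    w = np.where(t <= 1, (1.0 - t**3) ** 3, 0.0)      # tri-cube kernel weights
    B = np.vander(x - x0, N=d + 1, increasing=True)    # columns 1, (x-x0), ..., (x-x0)^d
    BtW = B.T * w                                       # B^T W for diagonal W
    beta = np.linalg.solve(BtW @ B, BtW @ y)
    return beta[0]
```

Consistent with the bias statement above, a degree-2 local fit reproduces a quadratic target exactly: all bias components of degree d+1 and higher vanish.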

13 Selecting the Width of the Kernel In the kernel K_λ, λ is a parameter that controls the width:
– For a kernel with compact support, λ is the radius of the support region.
– For the Gaussian kernel, λ is the standard deviation.
– For k-nearest neighborhoods, λ is the fraction k/N.
The window width implies a bias-variance tradeoff:
– A narrow window gives high variance and low bias.
– A wide window gives low variance and high bias.

14 Structured Local Regression Structured kernels:
– Introduce structure by imposing appropriate restrictions on the matrix A in the kernel K_{λ,A}(x0, x) = D(((x − x0)ᵀ A (x − x0))^{1/2} / λ), downweighting or omitting coordinates.
Structured regression functions:
– Introduce structure by eliminating some of the higher-order interaction terms.

15 Local Likelihood & Other Models Any parametric model can be made local:
– Parameter associated with y_i: θ_i = θ(x_i) = x_iᵀ β.
– Log-likelihood: l(β) = Σ_{i=1..N} l(y_i, x_iᵀ β).
– Likelihood local to x0: l(β(x0)) = Σ_{i=1..N} K_λ(x0, x_i) l(y_i, x_iᵀ β(x0)).
– A varying coefficient model θ(z): l(θ(z0)) = Σ_{i=1..N} K_λ(z0, z_i) l(y_i, η(x_i, θ(z0))).

16 Local Likelihood & Other Models Local logistic regression:
– Local log-likelihood for the J-class model:
Σ_{i=1..N} K_λ(x0, x_i) { β_{g_i 0} + β_{g_i}ᵀ (x_i − x0) − log[1 + Σ_{k=1..J−1} exp(β_{k0} + β_kᵀ (x_i − x0))] }.
– Centering the local regressions at x0 gives the fitted posterior probabilities P̂(G = j | X = x0) = exp(β̂_{j0}) / (1 + Σ_{k=1..J−1} exp(β̂_{k0})).

17 Kernel Density Estimation A natural local estimate: f̂_X(x0) = #{x_i ∈ N(x0)} / (N λ), where N(x0) is a small metric neighborhood of width λ around x0. The smooth Parzen estimate: f̂_X(x0) = (1/(N λ)) Σ_{i=1..N} K_λ(x0, x_i).
– For the Gaussian kernel K_λ(x0, x) = φ(|x − x0| / λ),
– the estimate becomes f̂_X(x) = (1/N) Σ_{i=1..N} φ_λ(x − x_i), the average of Gaussian densities centered at the observations.
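The Parzen estimate with a Gaussian kernel is a one-liner in vectorized form; the function name and bandwidth below are assumptions of this sketch:

```python
import numpy as np

def parzen_gaussian_kde(x0, x, lam):
    """Smooth Parzen estimate: average of Gaussian densities phi_lam centered at the x_i."""
    z = (x0[:, None] - x[None, :]) / lam   # pairwise standardized distances
    phi = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return phi.mean(axis=1) / lam          # (1/N) sum_i phi_lam(x0 - x_i)
```

Because each contribution is itself a density, the estimate integrates to one and inherits the smoothness of the Gaussian kernel.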

18 Kernel Density Estimation A kernel density estimate for systolic blood pressure. The density estimate at each point is the average contribution from each of the kernels at that point.

19 Kernel Density Classification Bayes' theorem: P̂(G = j | X = x0) = π̂_j f̂_j(x0) / Σ_{k=1..J} π̂_k f̂_k(x0), where f̂_j is a density estimate for class j and π̂_j is the class prior. The estimate for CHD uses the tri-cube kernel with a k-NN bandwidth.

20 Kernel Density Classification The population class densities and the posterior probabilities.

21 Naïve Bayes The naïve Bayes model assumes that, given a class G = j, the features X_k are independent: f_j(X) = Π_{k=1..p} f_{jk}(X_k).
– Each f_{jk} is a kernel density estimate, or a Gaussian, for coordinate X_k in class j.
– If X_k is categorical, use a histogram estimate instead.
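A minimal sketch of the Gaussian variant, where each f_jk is a univariate Gaussian fitted per feature and class; the class name and variance floor are choices of this example:

```python
import numpy as np

class GaussianNaiveBayes:
    """Naive Bayes with one univariate Gaussian f_jk per feature k and class j."""

    def fit(self, X, y):
        self.classes = np.unique(y)
        self.priors = np.array([np.mean(y == c) for c in self.classes])
        self.mu = np.array([X[y == c].mean(axis=0) for c in self.classes])
        # small floor keeps the variance strictly positive
        self.var = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes])
        return self

    def predict(self, X):
        # log prior + sum_k log f_jk(X_k): the independence assumption given the class
        sq = (X[:, None, :] - self.mu[None]) ** 2 / self.var[None]
        log_lik = -0.5 * (np.log(2.0 * np.pi * self.var[None]) + sq).sum(axis=2)
        return self.classes[np.argmax(np.log(self.priors) + log_lik, axis=1)]
```

Working in log space turns the product over features into a sum, which avoids underflow when p is large.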

22 Radial Basis Function & Kernel Radial basis functions combine the local nature of kernel methods with the flexibility of basis expansions.
– Each basis element is indexed by a location (prototype) parameter ξ_j and a scale parameter λ_j: f(x) = Σ_{j=1..M} D(|x − ξ_j| / λ_j) β_j.
– A popular choice for D is the standard Gaussian density function.

23 Radial Basis Function & Kernel For simplicity, focus on least squares methods for regression, and use the Gaussian kernel. RBF network model: f(x) = Σ_{j=1..M} β_j exp{−(x − ξ_j)ᵀ (x − ξ_j) / (2 λ_j²)}. Estimating the {ξ_j, λ_j} separately from the {β_j} reduces the fit to a linear problem, but has the undesirable side effect of creating holes: regions of IRᵖ where none of the kernels has appreciable support.

24 Radial Basis Function & Kernel Gaussian radial basis functions with a fixed width can leave holes. Renormalized Gaussian radial basis functions, h_j(x) = D(|x − ξ_j| / λ) / Σ_{k=1..M} D(|x − ξ_k| / λ), produce basis functions similar in some respects to B-splines. The expansion in renormalized RBFs: f(x) = Σ_{j=1..M} h_j(x) β_j.
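A sketch of the renormalized expansion fitted by least squares in one dimension; the sine data, the number of centers, and the width are made up for this example:

```python
import numpy as np

def rbf_features(x, centers, lam, renormalize=True):
    """Gaussian radial basis features h_j(x); renormalizing removes the 'holes'."""
    H = np.exp(-0.5 * ((x[:, None] - centers[None, :]) / lam) ** 2)
    if renormalize:
        H = H / H.sum(axis=1, keepdims=True)   # rows now sum to one everywhere
    return H

# toy regression: fit f(x) = sum_j h_j(x) beta_j by least squares
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, 80))
y = np.sin(4.0 * x) + 0.05 * rng.normal(size=80)
centers = np.linspace(0.0, 1.0, 8)
H = rbf_features(x, centers, lam=0.15)
beta, *_ = np.linalg.lstsq(H, y, rcond=None)
fit = H @ beta
```

Since the renormalized features sum to one at every x, the basis has appreciable support everywhere between the prototypes, unlike fixed-width Gaussian bumps.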

25 Mixture Models & EM Gaussian mixture model: f(x) = Σ_{m=1..M} α_m φ(x; μ_m, Σ_m), where the α_m are mixture proportions with Σ_m α_m = 1. EM algorithm for a two-component mixture:
– Observed-data log-likelihood: l(θ; Z) = Σ_{i=1..N} log[(1 − π) φ_{θ1}(y_i) + π φ_{θ2}(y_i)].
– Suppose we observed latent binary variables Δ_i ∈ {0, 1}, with Δ_i = 1 if y_i comes from component 2 and Δ_i = 0 otherwise; then the complete-data log-likelihood separates.

26 Mixture Models & EM Given the current parameters θ, compute the responsibilities (E-step): γ_i(θ) = E(Δ_i | θ, Z) = Pr(Δ_i = 1 | θ, Z) = π φ_{θ2}(y_i) / [(1 − π) φ_{θ1}(y_i) + π φ_{θ2}(y_i)]. The M-step then updates the component means and variances by responsibility-weighted averages, and the mixing proportion by the average responsibility.
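The E- and M-steps above can be sketched for a one-dimensional two-component mixture; the starting values and iteration count are choices of this example:

```python
import numpy as np

def em_gmm_1d(y, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture; returns (means, variances, pi)."""
    def phi(v, m, s2):  # univariate Gaussian density
        return np.exp(-0.5 * (v - m) ** 2 / s2) / np.sqrt(2.0 * np.pi * s2)

    # crude starting values: extreme points and the overall variance
    mu = np.array([y.min(), y.max()])
    sig2 = np.array([y.var(), y.var()])
    pi = 0.5
    for _ in range(n_iter):
        # E-step: responsibility gamma_i = Pr(Delta_i = 1 | theta, y_i)
        num = pi * phi(y, mu[1], sig2[1])
        gamma = num / ((1.0 - pi) * phi(y, mu[0], sig2[0]) + num)
        # M-step: responsibility-weighted means, variances, mixing proportion
        w0, w1 = np.sum(1.0 - gamma), np.sum(gamma)
        mu = np.array([np.sum((1.0 - gamma) * y) / w0, np.sum(gamma * y) / w1])
        sig2 = np.array([np.sum((1.0 - gamma) * (y - mu[0]) ** 2) / w0,
                         np.sum(gamma * (y - mu[1]) ** 2) / w1])
        pi = gamma.mean()
    return mu, sig2, pi
```

Each iteration cannot decrease the observed-data log-likelihood, which is the standard convergence guarantee for EM.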

27 Mixture Models & EM Application of mixtures to the heart disease risk factor study.

28 Mixture Models & EM Mixture model used for classification of the simulated data.

