
1 Recursively Adapted Radial Basis Function Networks and its Relationship to Resource Allocating Networks and Online Kernel Learning Weifeng Liu, Puskal Pokharel, Jose Principe CNEL, University of Florida Acknowledgment: This work was partially supported by NSF grant ECS and ECS

2 Outline
One framework
Two algorithms: RA-RBF-1 and RA-RBF-2 (kernel least-mean-square)
Convergence analysis
Well-posedness analysis
Experiments

3 Learning problem
Desired signal D, input signal U.
Problem statement: find a function f in a hypothesis space (a reproducing kernel Hilbert space) H such that the following empirical risk is minimized:
$\min_{f \in H} \sum_{i=1}^{N} \big(d_i - f(u_i)\big)^2$

4 Radial Basis Function Network
By regularization theory, the well-known solution is $f(u) = \sum_{i=1}^{N} a_i\,\kappa(u, u_i)$, where the coefficients satisfy the linear equation $(G + \lambda I)\,a = d$; here G is the Gram matrix with $G_{ij} = \kappa(u_i, u_j)$ and $\lambda$ is the regularization parameter. The RBF network therefore boils down to a matrix inversion problem.
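A minimal NumPy sketch of this solution; the Gaussian kernel width, the regularization parameter lam, and the helper names are illustrative placeholders rather than values or notation from the slides.

import numpy as np
from scipy.spatial.distance import cdist

def gaussian_gram(U, V, width=1.0):
    """Gaussian kernel matrix kappa(u, v) = exp(-||u - v||^2 / (2*width^2))."""
    return np.exp(-cdist(U, V, 'sqeuclidean') / (2.0 * width**2))

def rbf_fit(U, d, lam=0.1, width=1.0):
    """Regularized RBF network: solve (G + lam*I) a = d for the coefficients."""
    G = gaussian_gram(U, U, width)                       # Gram matrix over the training inputs
    return np.linalg.solve(G + lam * np.eye(len(U)), d)  # the matrix-inversion step

def rbf_predict(U_train, a, U_query, width=1.0):
    """f(u) = sum_i a_i * kappa(u, u_i)."""
    return gaussian_gram(U_query, U_train, width) @ a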

5 A general learning model
Loop over the following iteration:
start from an initial estimate f_{t-1};
test this estimate on the training data to obtain the deviation measures {e_i};
improve the estimate to f_t by combining the previous estimate f_{t-1} with the deviations {e_i}.
End the loop when |f_{t-1} - f_t| < ε.

6 Two algorithms: RA-RBF-1
Algorithm 1: RA-RBF-1
Initialization
Learning step: loop until convergence {
1. evaluate the network output at every training point
2. compute the errors
3. update the estimate
}
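The update equation on the original slide did not survive the transcript. The sketch below is one plausible reading in coefficient form, f_t(.) = eta * sum_i e_i * kappa(., u_i) with e_i = d_i - f_{t-1}(u_i), chosen because its fixed point matches Theorems 1 and 3 later in the deck; the helper name, initialization f_0 = 0, and default parameters are assumptions.

import numpy as np
from scipy.spatial.distance import cdist

def ra_rbf_1(U, d, eta=0.01, width=1.0, max_iters=5000, tol=1e-10):
    """RA-RBF-1 sketch: at every iteration the network is recomposed from the
    global errors, f_t(.) = eta * sum_i e_i * kappa(., u_i), where
    e_i = d_i - f_{t-1}(u_i) is evaluated on the whole training set."""
    G = np.exp(-cdist(U, U, 'sqeuclidean') / (2.0 * width**2))  # Gram matrix
    a = np.zeros(len(U))                      # initialization: f_0 = 0 (assumed)
    for _ in range(max_iters):
        e = d - G @ a                         # steps 1-2: outputs and errors at every training point
        a_new = eta * e                       # step 3: coefficients are the scaled global errors
        if np.linalg.norm(a_new - a) < tol:   # stop when the estimate no longer changes
            return a_new
        a = a_new
    return a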

7 Two algorithms: RA-RBF-2
Algorithm 2: RA-RBF-2
Initialization
Learning step: loop over the input-output pairs (u_t, y_t) {
1. evaluate the network output at the present point
2. compute the present error
3. improve the estimate
}
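A minimal sketch of these three steps, assuming the standard KLMS update f_t = f_{t-1} + eta * e_t * kappa(., u_t) with f_0 = 0; the helper name and default parameters are illustrative.

import numpy as np

def ra_rbf_2(U, d, eta=0.2, width=1.0):
    """RA-RBF-2 / KLMS sketch: a single pass over the pairs (u_t, d_t),
    growing one unit per sample with coefficient eta * e_t."""
    centers, coeffs = [], []
    for u_t, d_t in zip(U, d):
        if centers:                            # step 1: evaluate the current network at u_t
            dists = np.sum((np.asarray(centers) - u_t) ** 2, axis=1)
            y_t = np.dot(coeffs, np.exp(-dists / (2.0 * width**2)))
        else:
            y_t = 0.0                          # f_0 = 0 (assumed initialization)
        e_t = d_t - y_t                        # step 2: a-priori error
        centers.append(u_t)                    # step 3: allocate a unit at u_t
        coeffs.append(eta * e_t)               #         with coefficient eta * e_t
    return np.asarray(centers), np.asarray(coeffs)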

8 Similarity and difference
Similarities: both have a recursive RBF network structure and use the error directly to compose the network.
Differences: RA-RBF-2 is online whereas RA-RBF-1 is not; RA-RBF-2 uses the 'a-priori' error whereas RA-RBF-1 uses the 'global' error information.

9 Convergence of RA-RBF-1
Theorem 1: The necessary and sufficient condition for the RA-RBF-1 to converge is $0 < \eta < 1/\lambda_{\max}(G)$, where $\lambda_{\max}(G)$ is the largest eigenvalue of the Gram matrix G.
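Assuming the condition takes the reconstructed form eta < 1/lambda_max(G), the admissible step-size range can be computed directly from the training data; the helper name is illustrative, not from the paper.

import numpy as np
from scipy.spatial.distance import cdist

def ra_rbf_1_max_stepsize(U, width=1.0):
    """Largest admissible step size for RA-RBF-1 on a given training set,
    under the assumed form of Theorem 1: eta < 1 / lambda_max(G)."""
    G = np.exp(-cdist(U, U, 'sqeuclidean') / (2.0 * width**2))
    return 1.0 / np.linalg.eigvalsh(G)[-1]    # eigvalsh returns eigenvalues in ascending order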

10 Convergence of RA-RBF-2
RA-RBF-2 is the least-mean-square algorithm performed in the RKHS, so it is also named kernel LMS (KLMS). By Mercer's theorem, $\kappa(u, u') = \varphi(u)^{T}\varphi(u')$, where $\varphi$ is a nonlinear mapping and $\varphi(u)$ is the transformed feature vector lying in the feature space F.

11 Convergence of RA-RBF-2 (cont’d)
Denote the weight vector in F by $\Omega$. The LMS recursion in the feature space is then $e_t = d_t - \Omega_{t-1}^{T}\varphi(u_t)$ and $\Omega_t = \Omega_{t-1} + \eta\, e_t\, \varphi(u_t)$, starting from $\Omega_0 = 0$.

12 Convergence of RA-RBF-2 (cont’d)
Theorem 2: By the small-step-size theory, the RA-RBF-2 (KLMS) converges if $0 < \eta < 2/\lambda_{\max}(R_\varphi)$, where $\lambda_{\max}(R_\varphi)$ is the largest eigenvalue of the auto-correlation matrix $R_\varphi = E[\varphi(u)\varphi(u)^{T}]$ of the transformed input.
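The nonzero eigenvalues of the empirical auto-correlation matrix of the transformed data, (1/N) * sum_t phi(u_t) phi(u_t)^T, coincide with those of G/N, so the bound can be estimated from the Gram matrix. The sketch below assumes the 2/lambda_max form of the condition reconstructed above and uses an illustrative helper name.

import numpy as np
from scipy.spatial.distance import cdist

def klms_max_stepsize(U, width=1.0):
    """Empirical step-size bound for RA-RBF-2 under the assumed form of
    Theorem 2: eta < 2 / lambda_max(R), with R estimated from the data as
    (1/N) sum_t phi(u_t) phi(u_t)^T, whose nonzero spectrum equals that of G/N."""
    N = len(U)
    G = np.exp(-cdist(U, U, 'sqeuclidean') / (2.0 * width**2))
    return 2.0 * N / np.linalg.eigvalsh(G)[-1]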

13 Well-posedness of RA-RBF-1
Theorem 3: The RA-RBF-1 converges uniquely to the regularized RBF solution whose coefficients satisfy $(G + (1/\eta) I)\,a = d$. The reciprocal of the step size serves as the regularization parameter.
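A quick numerical check of this equivalence on made-up toy data; it reuses the ra_rbf_1 sketch from the RA-RBF-1 slide, and the data, step size, and kernel width are arbitrary.

import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
U = rng.standard_normal((50, 2))
d = np.sin(U[:, 0]) + 0.1 * rng.standard_normal(50)

eta, width = 0.01, 1.0
G = np.exp(-cdist(U, U, 'sqeuclidean') / (2.0 * width**2))

a_iter = ra_rbf_1(U, d, eta=eta, width=width, max_iters=20000, tol=1e-13)
a_reg = np.linalg.solve(G + (1.0 / eta) * np.eye(len(U)), d)
print(np.allclose(a_iter, a_reg, atol=1e-8))   # expect True: 1/eta acts as the regularizer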

14 Well-posedness of RA-RBF-2
Theorem 4: Under the H∞ stability condition, the norm of the a-priori errors in the RA-RBF-2, and consequently the norm of the solution, are upper-bounded. Assume the transformed data in the feature space satisfy the following multiple linear regression model: $d_t = \Omega^{\circ T}\varphi(u_t) + v_t$, where $\Omega^{\circ}$ is the underlying weight vector and $v_t$ is the modeling disturbance.

15 Well-posedness of RA-RBF-2 (cont’d)
Further, since the solution is $f_N = \eta\sum_t e_t\,\kappa(\cdot, u_t)$, its norm satisfies $\|f_N\|_H^2 = \eta^2\, e^{T} G\, e \le \eta^2\, \lambda_{\max}(G)\, \|e\|^2$, where $\lambda_{\max}(G)$ is the largest eigenvalue of the Gram matrix G and e is the vector of a-priori errors. The significance of an upper bound on the solution norm is well studied by Poggio and Girosi in the context of regularization network theory.
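The bound can be verified numerically; the following sketch reuses the ra_rbf_2 helper from the RA-RBF-2 slide on made-up toy data (step size, kernel width, and data are arbitrary).

import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(1)
U = rng.standard_normal((200, 2))
d = np.sin(U[:, 0]) + 0.1 * rng.standard_normal(200)

eta, width = 0.2, 1.0
centers, coeffs = ra_rbf_2(U, d, eta=eta, width=width)
e = coeffs / eta                                  # recover the a-priori errors

G = np.exp(-cdist(centers, centers, 'sqeuclidean') / (2.0 * width**2))
norm_sq = coeffs @ G @ coeffs                     # ||f_N||_H^2 = eta^2 * e^T G e
bound = eta**2 * np.linalg.eigvalsh(G)[-1] * np.sum(e**2)
print(norm_sq <= bound + 1e-12)                   # expect True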

16 Relation to resource allocating network (RAN) and online kernel learning (OKL)
RAN and OKL are variants of the proposed learning model.
RA-RBF-2 is a special case of both RAN and OKL.
The OKL employs explicit regularization.
The understanding gained here about the well-posedness of RA-RBF-2 brings new insights into these two existing algorithms.

17 Simulation: Chaotic signal prediction
Mackey-Glass chaotic time series with delay parameter τ = 30
Time-embedding dimension: 10
500 training points, 100 test points
Additive Gaussian noise: zero mean, 0.1 variance
Kernel width: 1
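A sketch of this experimental setup. The slide fixes τ = 30, the embedding dimension, the data split, the noise variance, and the kernel width; the integration scheme, the Mackey-Glass constants a, b, n, and where the noise is injected are assumptions of this sketch.

import numpy as np

def mackey_glass(n_samples, tau=30, a=0.2, b=0.1, n=10, dt=0.1, x0=1.2, discard=500):
    """Mackey-Glass series dx/dt = a*x(t-tau)/(1 + x(t-tau)^n) - b*x(t), Euler
    integration, sampled once per time unit (a, b, n, dt, x0 are assumed)."""
    steps = int(round(1.0 / dt))
    delay = int(round(tau / dt))
    total = (n_samples + discard) * steps
    x = np.full(total + delay, x0)
    for t in range(delay, total + delay - 1):
        x[t + 1] = x[t] + dt * (a * x[t - delay] / (1.0 + x[t - delay] ** n) - b * x[t])
    return x[delay::steps][-n_samples:]

def embed(series, dim=10):
    """Time-embedding: predict series[t] from the previous `dim` samples."""
    U = np.array([series[t - dim:t] for t in range(dim, len(series))])
    return U, series[dim:]

series = mackey_glass(620)
series = series + np.sqrt(0.1) * np.random.default_rng(0).standard_normal(len(series))  # zero-mean noise, variance 0.1
U, d = embed(series, dim=10)
U_train, d_train = U[:500], d[:500]      # 500 training points
U_test, d_test = U[500:600], d[500:600]  # 100 test points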

18 Learning curve of RA-RBF-1

19 Learning curves of RA-RBF-2, LMS, OKL

20 Results

21 Novelty criterion
The novelty criterion used in RAN can also be employed in the RA-RBF-2 (KLMS).
The advantages: a sparser network, better generalization, and simple computation.
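A sketch of KLMS with a RAN-style novelty criterion, assuming ε thresholds the a-priori error magnitude and δ thresholds the distance to the nearest existing center (the pairing is not spelled out in the transcript); samples that are not novel are simply skipped here.

import numpy as np

def klms_novelty(U, d, eta=0.2, width=1.0, eps=0.1, delta=0.5):
    """KLMS with a RAN-style novelty criterion (a sketch; eps = error threshold,
    delta = distance threshold of the (eps, delta) pair)."""
    centers = [np.asarray(U[0], dtype=float)]
    coeffs = [eta * d[0]]                       # the first sample always allocates a unit (f_0 = 0)
    for u_t, d_t in zip(U[1:], d[1:]):
        dists = np.linalg.norm(np.asarray(centers) - u_t, axis=1)
        y_t = np.dot(coeffs, np.exp(-dists**2 / (2.0 * width**2)))
        e_t = d_t - y_t                         # a-priori error of the current network
        if dists.min() > delta and abs(e_t) > eps:
            centers.append(np.asarray(u_t, dtype=float))  # allocate a unit only for novel samples
            coeffs.append(eta * e_t)
    return np.asarray(centers), np.asarray(coeffs)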

22 Performance using novelty criterion
TABLE II: Prediction performance of KLMS with the novelty criterion (ε, δ)

                 KLMS     (0.2, 0.7)   (0.1, 0.5)   (0.08, 0.3)   (0.05, 0.1)
Training MSE     0.018    0.057        0.037        0.020         0.019
Test MSE         0.049    0.034        0.021
Network size     500      19           81           290           324

23 Conclusions
Proposed two recursively adapted RBF networks.
Theoretically explained the convergence properties of the recursively adapted RBF networks.
Theoretically explained the well-posedness of the recursively adapted RBF networks.
Established the connections of the resource allocating network and online kernel learning with the least-mean-square algorithm.

