Dynamical Analysis of LVQ type algorithms, WSOM 2005 Dynamical analysis of LVQ type learning rules Rijksuniversiteit Groningen Mathematics and Computing.

Dynamical Analysis of LVQ type algorithms, WSOM 2005
Dynamical analysis of LVQ type learning rules
Michael Biehl, Anarta Ghosh
Rijksuniversiteit Groningen, Mathematics and Computing Science
http://www.cs.rug.nl/~biehl
m.biehl@rug.nl
Barbara Hammer
Clausthal University of Technology, Institute of Computing Science

Learning Vector Quantization (LVQ)
- identification of prototype vectors from labelled example data
- parameterization of distance-based classification schemes
- often: heuristically motivated variations of competitive learning

classification: assignment of a vector ξ to the class of the closest prototype w
aim: generalization ability, i.e. classification of novel data after learning from examples

example: basic LVQ scheme [Kohonen], "LVQ 1"
- initialize prototype vectors for different classes
- present a single example
- identify the closest prototype, i.e. the so-called winner
- move the winner
  - closer towards the data (same class)
  - away from the data (different class)
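The basic scheme can be sketched in a few lines of plain Python. This is only an illustration of the steps listed on the slide: the function names (`sq_dist`, `classify`, `lvq1_step`), the use of squared Euclidean distance, and the example numbers are assumptions made here, not taken from the presentation.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(prototypes, labels, xi):
    """Return (index, label) of the prototype closest to xi, i.e. the winner."""
    winner = min(range(len(prototypes)), key=lambda j: sq_dist(prototypes[j], xi))
    return winner, labels[winner]

def lvq1_step(prototypes, labels, xi, sigma, eta):
    """One LVQ1 step: only the winner moves, towards xi if its label
    equals the example label sigma, away from xi otherwise."""
    j, label = classify(prototypes, labels, xi)
    sign = 1.0 if label == sigma else -1.0
    prototypes[j] = [w + sign * eta * (x - w) for w, x in zip(prototypes[j], xi)]
    return j

# Illustrative (made-up) numbers: two prototypes, one per class.
prototypes = [[0.0, 0.0], [2.0, 2.0]]
labels = [+1, -1]
winner = lvq1_step(prototypes, labels, xi=[0.5, 0.0], sigma=+1, eta=0.5)
```

Here the first prototype wins, has the same class as the example, and is therefore moved halfway towards it; the second prototype is left untouched.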

LVQ algorithms ...
- frequently applied in a variety of practical problems
- plausible, intuitive, flexible
- fast, easy to implement
- often based on heuristic arguments or cost functions with unclear relation to generalization
- limited theoretical understanding of
  - dynamics and convergence properties
  - achievable generalization ability

here: analysis of LVQ algorithms w.r.t.
- dynamics of the learning process
- performance, i.e. generalization ability
- typical properties in a model situation

Model situation: two clusters of N-dimensional data
- random vectors ξ ∈ ℝ^N according to a mixture of two Gaussians
- orthonormal center vectors: B_+, B_- ∈ ℝ^N, (B_±)^2 = 1, B_+ · B_- = 0
- prior weights of classes: p_+, p_-, with p_+ + p_- = 1
- separation ∝ ℓ
- independent components with variance v_+, v_-
[Figure: two clusters in ℝ^N, centered at ℓ B_+ (weight p_+) and ℓ B_- (weight p_-)]
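A minimal sketch of how examples could be drawn from such a model, using only the standard library. Several choices here are assumptions for illustration: B_+ and B_- are taken to be the first two unit vectors of ℝ^N (one convenient orthonormal pair), the cluster means are ℓ B_±, each component has variance v_± depending on the class, and the function name `sample_example` is hypothetical.

```python
import random

def sample_example(N, ell, p_plus, v_plus, v_minus, rng):
    """Draw one labelled example (xi, sigma) from the two-cluster mixture.

    Assumption: B_+ = e_1 and B_- = e_2 (first two unit vectors of R^N).
    """
    sigma = +1 if rng.random() < p_plus else -1      # class with prior p_+ / p_-
    variance = v_plus if sigma == +1 else v_minus
    std = variance ** 0.5
    xi = [rng.gauss(0.0, std) for _ in range(N)]     # independent components
    xi[0 if sigma == +1 else 1] += ell               # shift by ell * B_sigma
    return xi, sigma

rng = random.Random(0)
data = [sample_example(N=50, ell=2.0, p_plus=0.8, v_plus=4.0, v_minus=9.0, rng=rng)
        for _ in range(2000)]
frac_plus = sum(1 for _, s in data if s == +1) / len(data)
```

With these settings roughly 80% of the examples should carry the label +1, and the first component of those examples should average close to ℓ.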

Dynamics of on-line training
- sequence of new, independent random examples drawn according to the model distribution
- update of two prototype vectors w_+, w_-: change of prototype towards or away from the current data
- the update is controlled by the learning rate (step size) and by a term for competition, direction of update etc.
- example: LVQ1, original formulation [Kohonen], a Winner-Takes-All (WTA) algorithm
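The update formula itself did not survive in the transcript. The sketch below assumes the generic form w_s ← w_s + (η/N) f_s (ξ - w_s), where f_s encodes competition and direction; the 1/N scaling of the step and the names `online_step` and `f_lvq1` are assumptions made for illustration. For LVQ1 the term f_s is nonzero only for the winner and carries the sign s·σ.

```python
def heaviside(x):
    """Step function: 1 for positive arguments, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def f_lvq1(s, d, sigma):
    """LVQ1 competition term: nonzero only for the winner, sign s * sigma."""
    return heaviside(d[-s] - d[s]) * s * sigma

def online_step(w, xi, sigma, eta, f):
    """Apply w_s <- w_s + (eta / N) * f_s * (xi - w_s) for s in {+1, -1}."""
    N = len(xi)
    # squared distances of the example from both prototypes
    d = {s: sum((x - ws) ** 2 for x, ws in zip(xi, w[s])) for s in (+1, -1)}
    strength = {s: f(s, d, sigma) for s in (+1, -1)}
    return {s: [ws + (eta / N) * strength[s] * (x - ws) for ws, x in zip(w[s], xi)]
            for s in (+1, -1)}

# Made-up numbers: the winner w_+ has the wrong class and is pushed away.
w = {+1: [1.0, 0.0], -1: [0.0, 1.0]}
w_new = online_step(w, xi=[2.0, 0.0], sigma=-1, eta=1.0, f=f_lvq1)
```

Passing the competition term as a function keeps the step itself unchanged when a different learning rule is plugged in.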

Dynamical Analysis of LVQ type algorithms, WSOM 2005  recursions Mathematical analysis of the learning dynamics random vector ξ μ enters only through its length and the projections projections into the (B +, B - )-plane length and relative position of prototypes 1. description in terms of a few characteristic quantitities ( here: ℝ 2N  ℝ 7 )

2. average over the current example
- random vector ξ^μ according to the model distribution: average length
- in the thermodynamic limit N → ∞: correlated Gaussian random quantities, completely specified in terms of first and second moments
- → averaged recursions, closed in the characteristic quantities

3. self-averaging property
- characteristic quantities depend on the random sequence of example data
- their variance vanishes with N → ∞ (here: ∝ N^-1)
- learning dynamics is completely described in terms of averages

4. continuous learning time
- α = (# of examples) / N, i.e. # of learning steps per degree of freedom
- stochastic recursions → deterministic ODE
- integration yields evolution of projections

5. learning curve
- generalization error ε_g(α) after training with α N examples: probability for misclassification of a novel example
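The slides obtain ε_g from the integrated ODE. As a rough numerical cross-check, the misclassification probability for a given pair of prototypes can also be estimated by Monte Carlo sampling. This sketch is not the analytical expression from the presentation; it assumes B_+ = e_1, B_- = e_2, class-dependent component variances v_±, and a hypothetical function name `estimate_eps_g`.

```python
import random

def estimate_eps_g(w, N, ell, p_plus, v_plus, v_minus, n_test, rng):
    """Fraction of fresh examples whose closest prototype has the wrong label."""
    errors = 0
    for _ in range(n_test):
        sigma = +1 if rng.random() < p_plus else -1
        std = (v_plus if sigma == +1 else v_minus) ** 0.5
        xi = [rng.gauss(0.0, std) for _ in range(N)]
        xi[0 if sigma == +1 else 1] += ell           # cluster mean ell * B_sigma
        d = {s: sum((x - ws) ** 2 for x, ws in zip(xi, w[s])) for s in (+1, -1)}
        predicted = +1 if d[+1] < d[-1] else -1
        errors += predicted != sigma
    return errors / n_test

rng = random.Random(1)
N, ell = 10, 4.0
# Prototypes placed exactly on the cluster centers (made-up test case).
w_good = {+1: [ell] + [0.0] * (N - 1), -1: [0.0, ell] + [0.0] * (N - 2)}
eps_good = estimate_eps_g(w_good, N, ell, 0.5, 1.0, 1.0, 2000, rng)
# Swapping the labels of the prototypes should give an error close to 1.
w_swapped = {+1: w_good[-1], -1: w_good[+1]}
eps_swapped = estimate_eps_g(w_swapped, N, ell, 0.5, 1.0, 1.0, 2000, rng)
```

Evaluating such an estimate after every N training examples, i.e. at α = 1, 2, 3, ..., would trace out a simulated learning curve.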

LVQ1: The winner takes it all
- only the winner w_s is updated according to the class label
- initialization w_s(0) ≈ 0
- theory and simulation (N=100): p_+ = 0.8, v_+ = 4, v_- = 9, ℓ = 2.0, η = 1.0, averaged over 100 indep. runs
[Figure: characteristic quantities Q_++, Q_--, Q_+- and R_sσ vs. α]
[Figure: trajectories of w_+ and w_- in the (B_+, B_-)-plane, axes R_s+ and R_s-, cluster centers ℓ B_+ and ℓ B_-; markers at α = 20, 40, ..., 140; dotted line: optimal decision boundary; solid line: asymptotic position]

Learning curve
[Figure: ε_g vs. α for η = 2.0, 1.0, 0.2; p_+ = 0.2, ℓ = 1.0, v_+ = v_- = 1.0]
- suboptimal, non-monotonic behavior for small η
- stationary state: ε_g(α → ∞) grows linearly with η
- well-defined asymptotics: η → 0, α → ∞, (η α) → ∞
[Figure: asymptotic ε_g vs. η]

achievable generalization error:
[Figure: ε_g vs. p_+ for v_+ = v_- = 1.0 and for v_+ = 0.25, v_- = 0.81; dotted line: best linear boundary; solid line: LVQ1]

"LVQ 2.1" [Kohonen]
- here: update correct and wrong winner
- theory and simulation (N=100): p_+ = 0.8, ℓ = 1, v_+ = v_- = 1, η = 0.5, averages over 100 independent runs
- problem: instability of the algorithm due to repulsion of wrong prototypes
- trivial classification for α → ∞: ε_g = min{p_+, p_-}
[Figure: prototype trajectories, axes R_s+ and R_s-]
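A minimal sketch of this variant for two prototypes: the prototype of the correct class is attracted and the other one is repelled at every step. The step size η/N and the function name `lvq21_step` are assumptions for illustration; the slide only states that both the correct and the wrong winner are updated.

```python
def lvq21_step(w, xi, sigma, eta):
    """Move the correct-class prototype towards xi and the other one away.

    The factor s * sigma is +1 for the prototype whose label matches the
    example and -1 for the one that does not.
    """
    N = len(xi)
    return {s: [ws + (eta / N) * (s * sigma) * (x - ws) for ws, x in zip(w[s], xi)]
            for s in (+1, -1)}

# Made-up numbers: the example belongs to class +1.
w = {+1: [1.0, 0.0], -1: [0.0, 1.0]}
w_new = lvq21_step(w, xi=[2.0, 0.0], sigma=+1, eta=1.0)
```

Each repulsive step multiplies the distance between the example and the wrong prototype by (1 + η/N), with no term limiting that growth, which is one way to see the instability noted on the slide.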

suggested strategy: selection of data in a window close to the current decision boundary
- slows down the repulsion, system remains unstable

Early stopping: end training process at minimal ε_g (idealized)
- pronounced minimum in ε_g(α), depends on initialization and cluster geometry
- lowest minimum assumed for η → 0
[Figure: ε_g vs. α for η = 2.0, 1.0, 0.5]
[Figure: ε_g vs. p_+ for v_+ = 0.25, v_- = 0.81; solid line: LVQ1; long-dashed line: early stopping]

"Learning From Mistakes (LFM)"
- LVQ2.1 update only if the current classification is wrong
- crisp limit of Soft Robust LVQ [Seo and Obermayer, 2003]
- learning curves: η-independent asymptotic ε_g
[Figure: projected trajectory, axes R_s+ and R_s-, cluster centers ℓ B_+ and ℓ B_-]
[Figure: learning curves ε_g vs. α for η = 2.0, 1.0, 0.5]
parameters quoted on the slide: p_+ = 0.8, ℓ = 3.0, v_+ = 4.0, v_- = 9.0 and p_+ = 0.8, ℓ = 1.2, v_+ = v_- = 1.0
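The rule can be sketched as a guarded version of the two-prototype update: nothing happens when the example is classified correctly, otherwise the correct prototype is attracted and the wrong one repelled. The step size η/N and the name `lfm_step` are assumptions for illustration.

```python
def lfm_step(w, xi, sigma, eta):
    """Learning From Mistakes: update both prototypes only on a misclassification."""
    N = len(xi)
    d = {s: sum((x - ws) ** 2 for x, ws in zip(xi, w[s])) for s in (+1, -1)}
    predicted = +1 if d[+1] < d[-1] else -1
    if predicted == sigma:
        # correct classification: leave both prototypes unchanged
        return {s: list(w[s]) for s in (+1, -1)}
    # wrong classification: attract the correct prototype, repel the wrong one
    return {s: [ws + (eta / N) * (s * sigma) * (x - ws) for ws, x in zip(w[s], xi)]
            for s in (+1, -1)}

# Made-up numbers: the example is closest to w_+.
w = {+1: [1.0, 0.0], -1: [0.0, 1.0]}
unchanged = lfm_step(w, xi=[2.0, 0.0], sigma=+1, eta=1.0)   # classified correctly
corrected = lfm_step(w, xi=[2.0, 0.0], sigma=-1, eta=1.0)   # misclassified
```

In the second call the example is labelled -1 but lies closer to w_+, so w_+ is pushed away and w_- is pulled towards it.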

Comparison: achievable generalization ability
[Figure: ε_g vs. p_+, equal cluster variances, v_+ = v_- = 1.0]
[Figure: ε_g vs. p_+, unequal variances, v_+ = 0.25, v_- = 0.81]
legend: dotted line: best linear boundary; solid line: LVQ1; dashed line: LVQ2.1 (early stopping); dash-dotted line: LFM

Summary
- prototype-based learning: Vector Quantization and Learning Vector Quantization
- a model scenario: two clusters, two prototypes
- dynamics of online training
- comparison of algorithms:
  - LVQ 1: close to optimal asymptotic generalization
  - LVQ 2.1: instability, trivial (stationary) classification
  - LVQ 2.1 + stopping: potentially very good performance
  - LFM: far from optimal generalization behavior

work in progress, outlook
- multi-class, multi-prototype problems
- optimized procedures: learning rate schedules, variational approach / Bayes optimal on-line

Perspectives
- Self-Organizing Maps (SOM): (many) N-dim. prototypes form a (low) d-dimensional grid; representation of data in a topology preserving map
- neighborhood preserving: SOM, Neural Gas (distance based)
- Generalized Relevance LVQ [e.g. Hammer & Villmann]: adaptive metrics, e.g. distance measure training
- applications

Outlook:



investigation and comparison of given algorithms (N → ∞)
- repulsive/attractive fixed points of the dynamics
- asymptotic behavior for α → ∞
- dependence on learning rate, separation, initialization
- ...

optimization and development of new prescriptions
- time-dependent learning rate η(α)
- variational optimization w.r.t. f_s[...]
- ...

LVQ1: The winner takes it all
- only the winner w_s is updated according to the class label
- initialization w_s(0) = 0
- theory and simulation (N=100): p_+ = 0.8, v_+ = 4, v_- = 9, ℓ = 2.0, η = 1.0, averaged over 100 indep. runs
[Figure: characteristic quantities Q_++, Q_--, Q_+- and R_sσ vs. α]

self-averaging property (mean and variances)
[Figure: R_++ (α = 10) vs. 1/N]

high-dimensional data (formally: N → ∞)
- ξ^μ ∈ ℝ^N, N = 200, ℓ = 1, p_+ = 0.4, v_+ = 0.44, v_- = 0.44; (● 240) (○ 160)
[Figure: projections into the plane of center vectors B_+, B_-; axes y_± = B_± · ξ^μ]
[Figure: projections on two independent random directions w_1, w_2; axes x_1 = w_1 · ξ^μ, x_2 = w_2 · ξ^μ]
