國立雲林科技大學 National Yunlin University of Science and Technology Application of LVQ to novelty detection using outlier training data Hyoung-joo Lee, Sungzoon Cho, Pattern Recognition Letters, 2006. (article in press). Presenter : Wei-Shen Tai Advisor : Professor Chung-Chian Hsu 2006/7/5
N.Y.U.S.T. I. M. Outline: Introduction; Learning vector quantization for novelty detection; Codebook update for an LVQ for novelty detection; Determining local thresholds; Parameters for the proposed approach; Experimental results; Conclusion and discussion; Comments.
Motivation: In novelty detection, a model learns the characteristics of normal patterns in the training data and detects outliers or novel patterns. The problem with the original LVQ is that it cannot deal with a highly imbalanced dataset. The codebook update is therefore modified so that codebooks are located close to normal patterns and far away from novel patterns.
Objective: Determine local thresholds that effectively exclude novel patterns lying outside the boundaries of the normal class. LVQ for novelty detection (LVQ-ND) should generate more accurate and tighter boundaries than approaches that use only the normal class of patterns.
Training algorithm and classification
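The classification step implied by the later slides (nearest codebook plus a local threshold per Voronoi region) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `classify`, the Euclidean distance, and the toy codebooks and radii are all assumptions.

```python
import math

def classify(x, codebooks, radii):
    """Return (label, k): label is +1 (normal) or -1 (novel), k is the winning codebook.

    Assumption: a pattern is normal when it falls inside the hypersphere
    of radius radii[k] around its nearest codebook w_k.
    """
    dists = [math.dist(x, w) for w in codebooks]
    k = min(range(len(codebooks)), key=dists.__getitem__)  # nearest codebook
    label = 1 if dists[k] <= radii[k] else -1
    return label, k

# Toy values chosen only for illustration.
codebooks = [(0.0, 0.0), (5.0, 5.0)]
radii = [1.0, 2.0]
print(classify((0.5, 0.5), codebooks, radii))  # inside the first hypersphere
print(classify((3.0, 3.0), codebooks, radii))  # nearest to the second codebook, but outside its radius
```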
Results on an artificial dataset: First figure, effects of the modified LVQ update: (a) true boundaries, (b) SOM, (c) LVQ-ND and (d) LVQ. With the original LVQ, effectively no training took place: all codebooks were assigned to the normal class, and training stopped prematurely because of the class imbalance. Second figure: (a) SOM-G, (b) SOM-L, (c) LVQ-ND and (d) LVQ.
Results on real-world datasets: When applied to Rätsch's benchmark datasets and the pump vibration dataset, LVQ-ND performed better than other widely used novelty detectors.
Codebook update rule: The initial codebooks are generated by training a SOM; note that only the normal patterns are used in this process. A modified error function is then minimized, with a label y_i = +1 or -1 attached to each pattern x_i. The resulting update of the codebooks can be stated as follows. If x_i does not belong to the Voronoi region S_k that w_k represents, w_k remains unchanged. If x_i does belong to S_k, w_k moves toward x_i if x_i is normal, and moves away from x_i otherwise.
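The update rule above can be sketched in a few lines. This is a hedged illustration rather than the paper's exact algorithm: it assumes y_i = +1 for normal and -1 for novel patterns, a fixed learning rate `eta` (a hypothetical parameter), Euclidean distance for the Voronoi assignment, and codebooks that have already been initialized (for example by a SOM).

```python
import math

def nearest(x, codebooks):
    """Index k of the codebook whose Voronoi region S_k contains x."""
    return min(range(len(codebooks)), key=lambda k: math.dist(x, codebooks[k]))

def lvq_nd_epoch(patterns, labels, codebooks, eta=0.1):
    """One pass over the data; labels are +1 (normal) or -1 (novel).

    Only the winning codebook w_k is changed: it moves toward x_i when
    y_i = +1 and away from x_i when y_i = -1. All other codebooks stay put.
    """
    for x, y in zip(patterns, labels):
        k = nearest(x, codebooks)
        w = codebooks[k]
        codebooks[k] = [wj + eta * y * (xj - wj) for wj, xj in zip(w, x)]
    return codebooks

# Toy data for illustration: one normal pattern and one novel pattern.
cb = lvq_nd_epoch([(1.0, 0.0), (4.0, 5.0)], [1, -1],
                  [[0.0, 0.0], [4.0, 4.0]], eta=0.5)
print(cb)  # first codebook pulled toward (1, 0); second pushed away from (4, 5)
```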
Determining local thresholds: For each Voronoi region S_k, a hypersphere centered at w_k with a minimal radius is sought so that it surrounds as many normal patterns and as few novel patterns as possible. Finding the radius is an optimization problem: a large radius can surround many normal patterns but may increase false acceptance, while a small radius can exclude many novel patterns but may increase false rejection.
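The trade-off can be made concrete with a small search over candidate radii. The cost used below, r^2 + c1 * (normal patterns outside) + c2 * (novel patterns inside), is an assumed form chosen to reflect the trade-off described on this slide; it is not taken from the paper, and `local_threshold`, `c1` and `c2` are hypothetical names.

```python
import math

def local_threshold(w, normals, novels, c1=1.0, c2=1.0):
    """Pick a radius around codebook w by minimizing an assumed cost.

    Candidate radii are 0 and the distances from w to each normal pattern,
    since under this cost the best radius never lies strictly between them.
    """
    d_norm = [math.dist(x, w) for x in normals]
    d_nov = [math.dist(x, w) for x in novels]
    best_r, best_cost = 0.0, float("inf")
    for r in sorted(set([0.0] + d_norm)):
        false_rej = sum(d > r for d in d_norm)   # normal patterns left outside
        false_acc = sum(d <= r for d in d_nov)   # novel patterns enclosed
        cost = r * r + c1 * false_rej + c2 * false_acc
        if cost < best_cost:
            best_r, best_cost = r, cost
    return best_r

# Toy region: three normal patterns and one novel pattern around w = (0, 0).
r = local_threshold((0.0, 0.0),
                    normals=[(1.0, 0.0), (0.0, 2.0), (3.0, 0.0)],
                    novels=[(0.0, 2.5)], c1=5.0, c2=5.0)
print(r)
```

In this toy case the farthest normal pattern is sacrificed, because enclosing it would also enclose the novel pattern and enlarge the sphere.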
Parameter setting: The number of codebooks, K, is chosen to minimize the misclassification error. For C_1 and C_2, larger normal regions are defined with a larger C_1, while tighter boundaries are obtained with a larger C_2. Suppose that for all k, O_k = ∅ and x_1, ..., x_|T_k| ∈ T_k. If FR_k denotes the false rejection rate (FRR) in Voronoi region S_k, a relation holds in which |T_k| - u_k is the number of normal patterns outside the hypersphere.
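As a small worked example of the quantity involved, the FRR in one region follows directly from its definition, FR_k = (|T_k| - u_k) / |T_k|. The sketch below assumes that u_k counts the normal patterns of T_k lying inside the hypersphere, which is how the slide's parenthetical reads; the function name and the toy data are hypothetical.

```python
import math

def false_rejection_rate(w, radius, normals_in_region):
    """FR_k = (|T_k| - u_k) / |T_k| for one Voronoi region.

    normals_in_region plays the role of T_k; u_k is assumed to be the
    number of those patterns inside the hypersphere around w.
    """
    total = len(normals_in_region)
    if total == 0:
        return 0.0  # no normal patterns, so nothing can be falsely rejected
    u_k = sum(math.dist(x, w) <= radius for x in normals_in_region)
    return (total - u_k) / total

# Two of the four normal patterns fall outside a radius of 2.
print(false_rejection_rate((0.0, 0.0), 2.0,
                           [(1.0, 0.0), (0.0, 2.0), (3.0, 0.0), (0.0, 4.0)]))
```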
Average AUROCs (%) with respect to |O|/|T|: (a) Banana, (b) Breast-cancer, (c) Diabetes, (d) German, (e) Heart and (f) Titanic.
Conclusions: Utilizing information on the novel class can improve novelty detection performance; LVQ-ND and SVDD outperformed their counterparts at least slightly. A codebook-based method with well-determined thresholds (as in SOM-L) can be good enough for novelty detection tasks. As the number of novel patterns gradually increases, that is, as |O|/|T| increases, however, LVQ-ND outperforms the other models.
Comments: Is the method feasible for classification with more than two classes? The paper focuses on outlier processing, but those functions do not seem to have been exercised in the experiments. If so, the effectiveness of LVQ-ND has so far been shown only for binary classification. Further experiments with multiple classes are essential for demonstrating the effectiveness of the proposed method and its functions.