Negative Selection Algorithms at GECCO 2005 7/22/2005.

1 Negative Selection Algorithms at GECCO 2005 7/22/2005

2 AIS track of GECCO 2005
11 regular papers
–5 related to negative selection algorithms
–3 related to immune network models
–others: multi-agent simulation, gene library, antigenic search
2 posters
–Immune network model, clonal selection

3 Papers on negative selection algorithms
–Ji & Dasgupta, Estimating the detector coverage in a negative selection algorithm
–Gonzalez et al., Discriminating and visualizing anomalies using negative selection algorithm and self-organizing maps
–Stibor et al., Is negative selection appropriate for anomaly detection?
–Shapiro et al., An evolutionary algorithm to generate hyper-ellipsoid detectors for negative selection
–Hang et al., Applying both positive and negative selection to supervised learning for anomaly detection

4 Discriminating and visualizing anomalies using negative selection algorithm and self-organizing maps
Main idea:
–Combine negative selection (NS) with a self-organizing map (SOM)
–Visualize the anomalies

5 Key feature: negative selection is used to produce artificial anomalies instead of detectors
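The idea of producing artificial anomalies rather than detectors can be sketched as plain rejection sampling: draw random points and keep those that lie farther than a self radius from every normal sample. This is only an illustration under stated assumptions. It is not the RRNS algorithm the paper uses, and the function name, the unit-hypercube feature space, and the radius value are all hypothetical.

```python
import math
import random

def generate_artificial_anomalies(self_samples, n_anomalies, self_radius, rng, max_tries=100000):
    """Rejection sampling in the unit hypercube (assumed feature space).

    A candidate is kept as an artificial anomaly only if it is farther than
    self_radius from every self (normal) sample.
    """
    dim = len(self_samples[0])
    anomalies = []
    tries = 0
    while len(anomalies) < n_anomalies and tries < max_tries:
        tries += 1
        candidate = [rng.random() for _ in range(dim)]
        nearest = min(math.dist(candidate, s) for s in self_samples)
        if nearest > self_radius:
            anomalies.append(candidate)
    return anomalies

rng = random.Random(0)
# Hypothetical self set: a small cluster around (0.5, 0.5).
self_samples = [[0.5 + rng.uniform(-0.05, 0.05), 0.5 + rng.uniform(-0.05, 0.05)]
                for _ in range(50)]
anomalies = generate_artificial_anomalies(self_samples, 20, 0.1, rng)
```

The labeled artificial anomalies can then be fed to any two-class learner together with the normal samples, which is the role the SOM plays in the slides that follow.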

6 SOM
–A type of neural network
–Captures the features of the input and provides a structural representation
–Output neurons are organized in a one- or two-dimensional lattice
–The weight vectors of these neurons represent prototypes (cluster centroids)
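The SOM description above can be made concrete with a small pure-Python sketch: neurons sit on a 2-D lattice, each holds a weight vector, and every training sample pulls its best-matching unit and that unit's lattice neighbors toward itself. The learning-rate and neighborhood schedules below are assumptions chosen for brevity. They are not the settings of SOM-PAK or of the paper.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return the lattice coordinate whose weight vector is closest to x."""
    return min(weights, key=lambda node: math.dist(weights[node], x))

def train_som(samples, rows, cols, epochs, rng, lr0=0.5):
    """Train a rows x cols SOM; weights maps (row, col) -> prototype vector."""
    dim = len(samples[0])
    sigma0 = max(rows, cols) / 2.0  # assumed initial neighborhood width
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(samples, len(samples)):  # shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                   # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3      # shrinking neighborhood
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                lattice_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-lattice_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights

rng = random.Random(0)
# Hypothetical data: two clusters around (0.2, 0.2) and (0.8, 0.8).
samples = [[center + rng.uniform(-0.05, 0.05) for _axis in range(2)]
           for center in (0.2, 0.8) for _point in range(20)]
weights = train_som(samples, rows=4, cols=4, epochs=20, rng=rng)
```

After training, `best_matching_unit(weights, x)` gives the grid cell for a sample, which is what a visualization would color by category.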

7 Three phases of NS-SOM

8 NS-SOM model
–Training a SOM with only normal samples produces a map that reflects only the structure of the self space and ignores the non-self space
–N-dimensional, real-valued data
–During the second phase: if the input samples are labeled, … (moving to the third phase)
–The first phase is executed just once, but the second and third phases can be executed as many times as new sets of samples become available
–Visual representation by a 2-D grid corresponding to the network

9 SOM output
A visual representation of the feature (self/non-self) space can be generated by drawing the 2-dimensional grid corresponding to the network and assigning each node a color that depends on the category it represents (normal, unknown anomaly, or known anomaly). Two different SOM topologies were used, with rectangular output layers of 8×8 and 16×16 nodes.

10 Output visualization

11 Implementation
–NS: RRNS algorithm by Gonzalez et al.
–SOM: the SOM-PAK package from Helsinki University of Technology
Experiments
–Iris data set
–Wisconsin Breast Cancer data set

12 Is negative selection appropriate for anomaly detection?
–Problems in negative selection (specific schemes and applications)
–Comparison with SVM (Support Vector Machine): does it require examples of one class or two classes?

13 General problem: candidates are generated by a simple random search
Shape space, affinity
Holes are necessary to generalize beyond the training set
–No holes: overfitting
–Too many holes: underfitting

14 Criticism of the binary representation
The Hamming shape-space and the r-chunk matching rule are only appropriate and applicable for anomaly detection problems with a small value of l (e.g. 0<l<32)
–Based entirely on Esponda et al.'s analysis of the number of holes
* I want to focus on introducing rather than criticizing this work, but the authors seem to confuse Hamming and r-chunk matching.
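To make the terms on this slide concrete, here is a small sketch of r-chunk matching on binary strings of length l, with exhaustive detector generation and a brute-force search for holes (non-self strings that no detector can match). The Hamming rule shown is one common formulation (agreement in at least r positions) and is an assumption, as are all function names. Exhaustive enumeration like this is only feasible for small l.

```python
from itertools import product

def rchunk_matches(detector, s):
    """An r-chunk detector (pos, chunk) matches s if s contains chunk at pos."""
    pos, chunk = detector
    return s[pos:pos + len(chunk)] == chunk

def hamming_matches(detector, s, r):
    """Assumed Hamming rule: detector and s agree in at least r positions."""
    return sum(a == b for a, b in zip(detector, s)) >= r

def generate_rchunk_detectors(self_set, l, r):
    """Enumerate every r-chunk detector that matches no self string."""
    detectors = []
    for pos in range(l - r + 1):
        self_chunks = {s[pos:pos + r] for s in self_set}
        for bits in product("01", repeat=r):
            chunk = "".join(bits)
            if chunk not in self_chunks:
                detectors.append((pos, chunk))
    return detectors

def find_holes(self_set, detectors, l):
    """Non-self strings that no detector matches."""
    holes = []
    for bits in product("01", repeat=l):
        s = "".join(bits)
        if s not in self_set and not any(rchunk_matches(d, s) for d in detectors):
            holes.append(s)
    return holes

self_set = {"0110", "1101"}
detectors = generate_rchunk_detectors(self_set, l=4, r=2)
holes = find_holes(self_set, detectors, l=4)
```

With this self set, every window of "0101" and "1110" also occurs at the same position in some self string, so both are holes even though neither is in the self set.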

15 Criticism of the real-valued representation
Positive selection (Self Detector Classification) is more straightforward.
It is not clear how to choose the self radius.
–"From our point of view, it is an approach which requires two classes in the learning phase in order to determine the self-radius." (no reason given)
It is a problem how to find an optimal distribution of the detectors (Gonzalez et al.'s method takes a vast amount of time).
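Positive selection in this sense can be sketched in a few lines: a point is classified as self if it lies within the self radius of at least one training sample, and as an anomaly otherwise. The Euclidean distance, the function name, and the radius value are assumptions for illustration. The slide itself notes that choosing the radius is the open question.

```python
import math

def is_self(sample, training_self, self_radius):
    """Classify by directly checking the sample against every training sample."""
    return any(math.dist(sample, s) <= self_radius for s in training_self)

# Hypothetical training set of normal samples.
training_self = [[0.50, 0.52], [0.48, 0.47], [0.55, 0.50]]
near = is_self([0.50, 0.50], training_self, self_radius=0.05)
far = is_self([0.90, 0.90], training_self, self_radius=0.05)
```

Each query costs one pass over the training set, and no detector generation step is needed.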

16 Occam's razor principle: when two competing theories make exactly the same predictions, the simpler one is better.

17 Comparison with SVM
–SVM is a machine learning algorithm for two-class classification problems.
–The input data is mapped into a higher-dimensional feature space, where a linear decision region is constructed.
–A one-class SVM was proposed by Schölkopf et al.
–It provides good results in high-dimensional space (no details or results provided)

18 Summary
Unfortunately, the paper cites several related works and then makes a scary claim. Little was done to analyze the problem or propose alternatives, apart from proposing Self Detector Classification: detection by directly checking all training samples.

19 Applying both positive and negative selection to supervised learning for anomaly detection
Use synthetic anomalies to deal with anomaly detection (supervised learning from class-imbalanced data sets)
–GA: positive selection
–Synthetic data: negative selection
Categorical/discrete data

20 Two categories of methods
At the data level: mainly focusing on re-sampling
–Under-sampling the normal class
–Over-sampling the anomaly class
–A combination of both
At the algorithm level
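The two data-level options can be illustrated with plain random re-sampling. This is a generic sketch, not the method of the paper under discussion, and the function names and data are hypothetical.

```python
import random

def random_undersample(majority, target_size, rng):
    """Keep a random subset of the normal (majority) class, without replacement."""
    return rng.sample(majority, target_size)

def random_oversample(minority, target_size, rng):
    """Grow the anomaly (minority) class by duplicating random members."""
    extra = [rng.choice(minority) for _ in range(target_size - len(minority))]
    return list(minority) + extra

rng = random.Random(0)
normal = [("normal", i) for i in range(100)]
anomaly = [("anomaly", i) for i in range(5)]
fewer_normal = random_undersample(normal, 20, rng)
more_anomaly = random_oversample(anomaly, 20, rng)
```

Over-sampling by duplication adds no new information, which is the motivation for generating synthetic samples instead.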

21 Other works using this strategy
–Gonzalez et al.
–SMOTE (Synthetic Minority Over-sampling TEchnique): takes each minority class sample and introduces synthetic examples along the line segments joining it to any/all of its k nearest minority class neighbors.
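The SMOTE interpolation step can be sketched as follows: pick a minority sample, pick one of its k nearest minority neighbors, and place a synthetic point at a random position on the segment between them. This is a simplified illustration. It draws base samples at random instead of iterating over every minority sample, and the function name and parameters are assumptions.

```python
import math
import random

def smote(minority, n_synthetic, k, rng):
    """Generate synthetic minority samples by interpolating toward near neighbors."""
    synthetic = []
    for _ in range(n_synthetic):
        x = rng.choice(minority)
        # k nearest minority neighbors of x, excluding x itself.
        neighbors = sorted((m for m in minority if m is not x),
                           key=lambda m: math.dist(x, m))[:k]
        nb = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append([xi + gap * (ni - xi) for xi, ni in zip(x, nb)])
    return synthetic

rng = random.Random(1)
# Hypothetical minority class: the four corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote(minority, 10, 2, rng)
```

With k=2 in this example, each corner's nearest neighbors are the two adjacent corners, so every synthetic point falls on an edge of the square.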

22 How SMOTE generates synthetic samples

23 Phase 1: co-evolving patterns of the normal data (positive selection)
–A number of non-interbreeding subpopulations: no cooperation, no competition
–Randomly initialized
–All converged schemata together form the decision boundary
–Individuals consist of four sections:

24 GA settings
–Fitness-proportionate selection
–Uniform crossover
–Bit-flipping mutation
–Subpopulation size = 100
–Crossover rate = 0.65
–Mutation rate = 0.15
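A minimal sketch of one generation with these operators and rates is given below. Several details are assumptions, not taken from the paper: individuals are plain bit lists, the mutation rate is applied per bit, and the fitness function is a hypothetical placeholder where the real one would score how well a schema covers normal data.

```python
import random

SUBPOP_SIZE = 100
CROSSOVER_RATE = 0.65
MUTATION_RATE = 0.15  # assumed to be a per-bit probability

def roulette_select(population, fitnesses, rng):
    """Fitness-proportionate (roulette-wheel) selection."""
    pick = rng.uniform(0, sum(fitnesses))
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running >= pick:
            return individual
    return population[-1]

def uniform_crossover(a, b, rng):
    """Swap each gene between the two parents with probability 0.5."""
    child1, child2 = [], []
    for x, y in zip(a, b):
        if rng.random() < 0.5:
            x, y = y, x
        child1.append(x)
        child2.append(y)
    return child1, child2

def bit_flip(individual, rate, rng):
    """Flip each bit independently with the given probability."""
    return [1 - g if rng.random() < rate else g for g in individual]

def next_generation(population, fitness_fn, rng):
    fitnesses = [fitness_fn(ind) for ind in population]
    new_population = []
    while len(new_population) < len(population):
        p1 = roulette_select(population, fitnesses, rng)
        p2 = roulette_select(population, fitnesses, rng)
        if rng.random() < CROSSOVER_RATE:
            c1, c2 = uniform_crossover(p1, p2, rng)
        else:
            c1, c2 = list(p1), list(p2)
        new_population.append(bit_flip(c1, MUTATION_RATE, rng))
        new_population.append(bit_flip(c2, MUTATION_RATE, rng))
    return new_population[:len(population)]

def placeholder_fitness(individual):
    # Hypothetical stand-in fitness: number of ones, plus 1 to stay positive.
    return sum(individual) + 1

rng = random.Random(0)
population = [[rng.randint(0, 1) for _ in range(16)] for _ in range(SUBPOP_SIZE)]
population = next_generation(population, placeholder_fitness, rng)
```

In the setting described on the previous slide, each subpopulation would run this loop independently, since subpopulations neither interbreed nor compete.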

25 Phase 2: synthetic generation of anomalous samples
Strategy 1: with seed
–Start with the vacant neighbors of the examples of the anomaly class
–2n neighbors for n-dimensional data
–Vacant means neither normal nor anomaly
–Check whether each candidate is covered by a schema of the normal class; those covered are removed
Strategy 2: without seed (when there are no anomaly examples)
–Start from random positions
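Strategy 1 can be sketched as a single expansion round around the anomaly seeds. The encoding is an assumption made for illustration: attributes are integer-coded so that the 2n neighbors are the points one step away along each dimension, and a schema is a tuple in which `None` acts as a wildcard. The paper's actual representation may differ, and all names are hypothetical.

```python
def neighbors(point, domain_sizes):
    """Up to 2n neighbors of an n-dimensional point: +/-1 along each dimension."""
    result = []
    for i, size in enumerate(domain_sizes):
        for delta in (-1, 1):
            value = point[i] + delta
            if 0 <= value < size:  # stay inside the attribute's domain
                result.append(point[:i] + (value,) + point[i + 1:])
    return result

def covered(point, schema):
    """A schema covers a point if every non-wildcard position agrees."""
    return all(s is None or s == p for s, p in zip(schema, point))

def synthesize_anomalies(anomaly_seeds, normal_samples, normal_schemata, domain_sizes):
    """One expansion round: keep vacant neighbors not covered by any normal schema."""
    occupied = set(normal_samples) | set(anomaly_seeds)
    synthetic = []
    for seed in anomaly_seeds:
        for candidate in neighbors(seed, domain_sizes):
            if candidate in occupied:
                continue  # not vacant: already normal or anomaly
            if any(covered(candidate, schema) for schema in normal_schemata):
                continue  # falls inside the learned normal region
            occupied.add(candidate)
            synthetic.append(candidate)
    return synthetic

domain_sizes = (3, 3)
anomaly_seeds = [(1, 1)]
normal_samples = [(0, 0), (0, 1)]
normal_schemata = [(0, None)]  # "first attribute is 0" describes normal data
synthetic = synthesize_anomalies(anomaly_seeds, normal_samples, normal_schemata, domain_sizes)
```

Here the neighbor (0, 1) is rejected because it is a normal sample, leaving (2, 1), (1, 0), and (1, 2) as synthetic anomalies.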

26 Experiments
UCI data sets: 14 used
Multi-class data are mapped into a 2-class data set
–Version 1: natural distribution
–Version 2: balanced natural distribution
–Version 3: balanced extreme distribution
(balanced means processed by the approach described in this paper)
Classifiers used: C4.5 and Naive Bayes
Result: v2>v3>>v1
