An LVQ-based neural network anti-spam email approach. Professor 楊婉秀. 詹元順, first-year master's student, Information Management, 94722001. 2005/12/07.


1 An LVQ-based neural network anti-spam email approach
Professor 楊婉秀
詹元順, first-year master's student, Information Management, 94722001
2005/12/07

2 Outline
1. Introduction
2. Sample and data preprocessing
– 2.1 Email representation
– 2.2 Feature extraction
3. Anti-spam LVQ model
– 3.1 Spam category
– 3.2 Learning vector quantization neural network model
– 3.3 Anti-spam LVQ algorithm
– 3.4 Parameter setting
4. Experiments and result
5. Conclusion

3 1. Introduction (1/2)
Spam wastes users' time, money, and network bandwidth. It also clutters users' mailboxes and can even be harmful, e.g. pornographic content.
In America, spam emails cost enterprises up to 9 billion per year.
Without appropriate countermeasures, the situation will keep worsening and spam will eventually undermine the usability of email.

4 1. Introduction (2/2)
Duhong Chen et al. compared four algorithms (Bayes, decision tree, neural networks, and Boosting) and concluded that the neural network algorithm performs best.
Experiments show that the LVQ-based anti-spam filter performs better than the Bayes-based and BP neural network-based approaches.

5 2. Sample and data preprocessing (1/2)
2.1 Email representation
TFIDF_i = TF_i × log(N / DF_i) (1)
– TF_i: the number of times word t_i appears in document d
– N: the total number of training documents
– DF_i: the number of documents that contain word t_i
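Equation (1) can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: it assumes each document is already tokenized into a list of words, and it assumes a natural logarithm, since the slide does not state the base. The function name `tfidf_vectors` and the three toy documents are made up for the example.

```python
import math
from collections import Counter

def tfidf_vectors(documents):
    """Return one {word: TFIDF} dict per tokenized document, following equation (1)."""
    n_docs = len(documents)  # N: total number of training documents
    # DF_i: number of documents that contain word t_i at least once
    doc_freq = Counter(word for doc in documents for word in set(doc))
    vectors = []
    for doc in documents:
        term_freq = Counter(doc)  # TF_i: occurrences of t_i in this document
        vectors.append(
            {w: tf * math.log(n_docs / doc_freq[w]) for w, tf in term_freq.items()}
        )
    return vectors

# Toy corpus (hypothetical): three already-tokenized emails
docs = [["cheap", "pills", "cheap"], ["meeting", "agenda"], ["cheap", "meeting"]]
vectors = tfidf_vectors(docs)
```

A word that appears in every training document gets weight 0 under this formula, because log(N / DF_i) = log(1) = 0.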

6 2. Sample and data preprocessing (2/2)
2.2 Feature extraction
– A: the number of emails that contain word t and belong to class s
– B: the number of emails that contain word t but do not belong to class s
– C: the number of emails that belong to class s but do not contain word t
– N: the total number of emails in the training corpus
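The scoring formula that uses these four counts is not preserved in the transcript. One well-known statistic built from exactly A, B, C and N is the mutual-information estimate log(A × N / ((A + C) × (A + B))). Whether the slide used this particular statistic is an assumption; the sketch below only shows how such counts can be turned into a word score. The function name and the sample counts are hypothetical.

```python
import math

def mutual_information(a, b, c, n):
    """Estimate log(P(t, s) / (P(t) * P(s))) from document counts.

    a: emails containing word t and in class s
    b: emails containing word t but not in class s
    c: emails in class s but not containing word t
    n: total emails in the training corpus
    ASSUMPTION: this statistic is a stand-in; the slide's own formula is missing.
    """
    if a == 0:
        return float("-inf")  # the word never co-occurs with the class
    return math.log((a * n) / ((a + c) * (a + b)))

# Hypothetical counts: the word appears in 50 emails, 40 of them in class s
score_related = mutual_information(40, 10, 60, 200)
# Word and class statistically independent: the score is 0
score_independent = mutual_information(25, 25, 25, 100)
```

Under this statistic, words with the highest scores for a class would be kept as its feature words.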

7 3. Anti-spam LVQ model (1/5)
3.1 Spam category

8 3. Anti-spam LVQ model (2/5)
3.2 Learning vector quantization neural network model
– The model is divided into two layers. The first is a competitive layer, in which each neuron represents a subclass.
– The second is an output layer, in which each neuron represents a class.
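The two-layer structure can be illustrated as follows. This is a minimal sketch, assuming Euclidean distance in the competitive layer and a fixed lookup table from subclass neurons to classes. The prototype vectors, the labels and the name `classify` are invented for the example and do not come from the slides.

```python
import math

def classify(x, prototypes, subclass_to_class):
    """Competitive layer: pick the nearest prototype (a subclass neuron).
    Output layer: map that subclass to its class."""
    winner = min(range(len(prototypes)), key=lambda j: math.dist(x, prototypes[j]))
    return winner, subclass_to_class[winner]

# Hypothetical model: two spam subclasses and one legitimate subclass
prototypes = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
subclass_to_class = ["spam", "spam", "legitimate"]

result_a = classify([0.9, 0.1], prototypes, subclass_to_class)
result_b = classify([0.1, 0.2], prototypes, subclass_to_class)
```

Several competitive neurons can share one output class, which is how a single "spam" class can cover several content subclasses.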

9 3. Anti-spam LVQ model (3/5)
3.3 Anti-spam LVQ algorithm (1/2)

10 3. Anti-spam LVQ model (4/5)
3.3 Anti-spam LVQ algorithm (2/2)
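The algorithm steps on these two slides did not survive in the transcript. For orientation only, here is a sketch of the textbook LVQ1 update rule: the winning prototype moves toward a sample of its own class and away from a sample of another class. It is not a reconstruction of the presenter's algorithm. The learning rate, epoch count, function name and data are all assumptions.

```python
import math

def lvq1_train(samples, labels, prototypes, proto_labels, lr=0.1, epochs=10):
    """Textbook LVQ1 (assumed variant): attract the winner on a class match,
    repel it on a mismatch. Returns updated copies of the prototypes."""
    protos = [list(p) for p in prototypes]  # copy so the caller's list is untouched
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # Competitive step: index of the nearest prototype
            j = min(range(len(protos)), key=lambda k: math.dist(x, protos[k]))
            sign = 1.0 if proto_labels[j] == y else -1.0
            # w_j <- w_j + sign * lr * (x - w_j)
            protos[j] = [w + sign * lr * (xi - w) for w, xi in zip(protos[j], x)]
    return protos

# Hypothetical data: one sample, two prototypes, a single pass
start = [[0.0, 0.0], [5.0, 5.0]]
proto_labels = ["spam", "legitimate"]
attracted = lvq1_train([[1.0, 1.0]], ["spam"], start, proto_labels, lr=0.5, epochs=1)
repelled = lvq1_train([[1.0, 1.0]], ["legitimate"], start, proto_labels, lr=0.5, epochs=1)
```

In practice the learning rate is usually decreased over time; the constant rate here keeps the sketch short.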

11 3. Anti-spam LVQ model (5/5)
3.4 Parameter setting

12 4. Experiments and result (1/4)
This project uses a corpus from an openly available source.
1000 emails were selected at random from the corpus: 580 spam emails and 420 legitimate emails.

13 4. Experiments and result (2/4)
Anti-spam filter performance is often measured in terms of spam precision (SP) and spam recall (SR).
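The formulas on this slide are not in the transcript. The usual definitions are SP = (spam classified as spam) / (all emails classified as spam) and SR = (spam classified as spam) / (all spam emails). The sketch below assumes those standard definitions. The counts are made up for illustration and are not results from the experiment.

```python
def spam_precision_recall(n_spam_as_spam, n_legit_as_spam, n_spam_as_legit):
    """Return (SP, SR) under the standard definitions (assumed, not from the slide)."""
    flagged = n_spam_as_spam + n_legit_as_spam      # everything the filter called spam
    actual_spam = n_spam_as_spam + n_spam_as_legit  # everything that really was spam
    sp = n_spam_as_spam / flagged if flagged else 0.0
    sr = n_spam_as_spam / actual_spam if actual_spam else 0.0
    return sp, sr

# Hypothetical counts: 90 spam caught, 10 legitimate emails wrongly flagged, 30 spam missed
sp, sr = spam_precision_recall(90, 10, 30)
```

With these invented counts, SP is 90/100 and SR is 90/120.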

14 4. Experiments and result (3/4)
A criterion F1, which combines spam precision and spam recall.
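The F1 formula itself is missing from the transcript. F1 is conventionally the harmonic mean of precision and recall, F1 = 2 × SP × SR / (SP + SR), and the sketch below assumes that convention. The input values are illustrative only.

```python
def f1_score(sp, sr):
    """Harmonic mean of spam precision and spam recall (standard F1, assumed)."""
    if sp + sr == 0:
        return 0.0  # avoid division by zero when both measures are 0
    return 2 * sp * sr / (sp + sr)

# Illustrative values, not experimental results
f1 = f1_score(0.9, 0.75)
```

Because it is a harmonic mean, F1 stays low unless both SP and SR are high.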

15 4. Experiments and result (4/4)

16 5. Conclusion
Both neural network-based algorithms usually perform better than the Bayes-based one.
The LVQ-based method classifies spam emails into several subclasses by content, so the feature words within each subclass are more closely related and the characteristics of each subclass are easier to identify.

