Fighting Spam Enterprise Spam Filtering Using Open Source Tools.


1 Fighting Spam Enterprise Spam Filtering Using Open Source Tools

2 Introduction
- Newsflash: SPAM is a problem
- SRJC: 60-80% of mail received is spam!
- Commercial solutions exist, but are expensive
- Open Source tools are a powerful alternative

3 Tonight’s Agenda
- SpamAssassin Overview
- Additional Spam Rules (S.A.R.E.)
- Integrating with Multiple Mail Servers
- Bayesian Filtering

4 SpamAssassin – How It Works
- Uses the combined score from multiple types of checks to determine if a given message is spam:
  - Header tests
  - Body phrase tests
  - Bayesian filtering
  - Automatic address whitelist/blacklist
  - Manual address whitelist/blacklist
  - Collaborative spam identification databases (DCC, Pyzor, Razor2)
  - DNS blocklists ("RBLs")
  - Character sets and locales
- Even though any one of these tests might, by itself, misidentify a ham or spam message, their combined score is very difficult to fool.
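
The combined-score idea can be sketched as a sum of per-test scores compared against a threshold. This is only an illustration: the rule names and scores below are made up, and the `classify` helper is hypothetical. The threshold of 5.0 matches SpamAssassin's default `required_score`.

```shell
#!/bin/sh
# Sum hypothetical per-rule scores and compare the total to a threshold.
# Rule names and scores are illustrative, not real SpamAssassin output.
classify() {
    threshold="$1"; shift
    # Remaining arguments are "RULE=score" pairs, one per test that fired.
    printf '%s\n' "$@" | awk -F= -v t="$threshold" '
        { total += $2 }
        END {
            verdict = (total >= t) ? "spam" : "ham"
            printf "%.1f %s\n", total, verdict
        }'
}

# One negative score (a whitelist hit) does not rescue the message.
classify 5.0 HEADER_TEST=1.2 BODY_PHRASE=2.1 BAYES_HIGH=3.5 WHITELIST=-0.5
```

No single test decides the outcome here; a message is flagged only when the scores from several independent checks add up past the threshold.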

5 SpamAssassin - Advantages
- Wide spectrum of different tests
- Open Source and Free!
- Flexible: works with many platforms and servers
- Easy Configuration

6 SpamAssassin Rules Emporium
- http://rulesemporium.com/
- Popular Repository for Third Party SpamAssassin Rules
- “Actively” Updated between SpamAssassin releases

7 SARE Usage Guidelines
- Just download rules into the SpamAssassin directory (e.g.: /etc/spamassassin)
- Restart the daemon if necessary
- Most popular rules have “levels” (e.g.: 0 = conservative, 3 = aggressive)
- Choose the rules you use carefully!

8 Rules Du Jour
- http://www.exit0.us/index.php?pagename=RulesDuJour
- Automates updating, downloading and installation of most popular SARE rules

9 Rules Du Jour
- Install the script in $PATH (e.g.: /usr/local/sbin) and make it executable
- Create a blank configuration file at /etc/rulesdujour/config
- Add a TRUSTED_RULESETS line to your config file that contains the names of the rulesets you chose, e.g.:
    TRUSTED_RULESETS="SARE_ADULT SARE_OBFU0 SARE_OBFU1 SARE_URI0 SARE_URI1"
- Configure any local settings. Examples:
    SA_DIR="/etc/mail/spamassassin"
    MAIL_ADDRESS="administrator@example.com"
    SA_RESTART="killall -HUP spamd"
- Run this script periodically (manually or via crontab)
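
A minimal sketch of the configuration step, using the values from the slide. The `write_rdj_config` helper is hypothetical, and the crontab line assumes the script was installed as /usr/local/sbin/rules_du_jour; adjust both to your setup.

```shell
#!/bin/sh
# Write a Rules Du Jour config file into the given directory.
# The values mirror the slide; change them for your site.
write_rdj_config() {
    conf_dir="$1"
    mkdir -p "$conf_dir" || return 1
    cat > "$conf_dir/config" <<'EOF'
TRUSTED_RULESETS="SARE_ADULT SARE_OBFU0 SARE_OBFU1 SARE_URI0 SARE_URI1"
SA_DIR="/etc/mail/spamassassin"
MAIL_ADDRESS="administrator@example.com"
SA_RESTART="killall -HUP spamd"
EOF
}

# Example (as root): write_rdj_config /etc/rulesdujour
# Hypothetical crontab entry that runs the script daily at 03:15,
# assuming it was installed as /usr/local/sbin/rules_du_jour:
#   15 3 * * * /usr/local/sbin/rules_du_jour
```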

10 SpamAssassin Serving Multiple Servers
- Problem:
  - How do you keep multiple mail servers synchronized?
  - Spam checking adds load to the mail server

11 SpamAssassin Serving Multiple Servers
- Solution: Use a single machine to manage spam sitewide!
- Logs and configuration unified on a single machine

12 SA/multi-server – set up server
- Server must be running SpamAssassin as a daemon (spamd -d)
- Server must accept outside connections (e.g.: spamd -A 127.0.0.1,192.168.1.10,192.168.1.11)
- Make sure the server can listen on port 783 (spamd’s default port)
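
The server-side options can be combined into one command line, sketched below. One assumption goes beyond the slide: spamd binds to 127.0.0.1 by default, so a listen address (`-i`) is typically needed in addition to `-A` before remote clients can connect. The `spamd_args` helper is hypothetical and only prints the arguments so they can be reviewed before use.

```shell
#!/bin/sh
# Build the spamd argument list for a central scanning host.
# $1: comma-separated client IPs allowed to connect (slide's example values)
# $2: port, defaulting to spamd's standard 783
spamd_args() {
    allowed="$1"
    port="${2:-783}"
    # -d daemonize, -i listen address, -A allowed client IPs, -p port
    printf '%s\n' "-d -i 0.0.0.0 -A $allowed -p $port"
}

# Print the command instead of running it.
echo "spamd $(spamd_args 127.0.0.1,192.168.1.10,192.168.1.11)"
```

Listening on 0.0.0.0 exposes the port on every interface, so the `-A` list (and ideally a firewall rule) is what keeps unknown hosts out.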

13 SA/multi-server – set up client
- Use the “spamc” command instead of “spamassassin”
- Use the -d switch for the remote server: spamc -d 192.168.1.10, and so forth …
- Test:
    spamc -d my.server.net < /path/to/sample/email
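
For the test step, one convenient sample is a message containing the GTUBE string, which SpamAssassin is designed to score as spam. The sketch below builds such a message and sends it to the remote spamd only if `spamc` is installed and `SPAMD_HOST` is set. `SPAMD_HOST`, the addresses, and the file location are placeholders, not part of the original setup.

```shell
#!/bin/sh
# Create a minimal test message containing the GTUBE test string.
make_gtube_sample() {
    cat > "$1" <<'EOF'
Subject: GTUBE test
From: sender@example.com
To: recipient@example.com

XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X
EOF
}

sample="${TMPDIR:-/tmp}/gtube-sample.eml"
make_gtube_sample "$sample"

# SPAMD_HOST is a placeholder; set it to your scanning server.
if command -v spamc >/dev/null 2>&1 && [ -n "${SPAMD_HOST:-}" ]; then
    # -c prints "score/threshold" and exits with status 1 if the message is spam.
    spamc -d "$SPAMD_HOST" -p 783 -c < "$sample"
    echo "spamc exit status: $?"
fi
```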

14 Bayesian Filtering - Introduction
- “Bayesian Filtering uses statistics from previously-classified messages to estimate the likelihood that a particular message is spam.”*
- “This likelihood estimate is converted to a (possibly negative) weight which is added to the ad hoc spamminess score.”*
* Gordon V. Cormack and Thomas R. Lynam, University of Waterloo
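
To make "statistics from previously-classified messages" concrete, here is a deliberately simplified per-token estimate: how often a word appeared in known spam versus known ham. This is not SpamAssassin's actual combining algorithm, and the `token_spamminess` helper and the counts are invented for illustration.

```shell
#!/bin/sh
# Estimate how "spammy" one token is from training counts.
# Args: spam messages containing the token, total spam messages,
#       ham messages containing the token, total ham messages.
# Assumes both totals are non-zero and the token was seen at least once.
token_spamminess() {
    awk -v s="$1" -v ns="$2" -v h="$3" -v nh="$4" 'BEGIN {
        ps = s / ns      # frequency of the token in spam
        ph = h / nh      # frequency of the token in ham
        printf "%.2f\n", ps / (ps + ph)
    }'
}

# A token seen in 40 of 200 spam messages but only 2 of 200 ham messages.
token_spamminess 40 200 2 200
```

A value near 1 suggests spam and a value near 0 suggests ham. A real filter combines many such token estimates before turning the result into a score.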

15 Bayes – Getting Started
- Enable Bayes in the config:
    use_bayes 1
- Put aside space for the Bayes DB (either file-based or SQL):
    bayes_path /var/local/spamassassin/bayes
  or
    bayes_store_module Mail::SpamAssassin::BayesStore::SQL

16 Bayes – Getting Started
- Feed Bayes “ham” and “spam”
- You MUST feed it samples of good and bad messages to start!
- At least 200 samples of each, but use as many as possible
    sa-learn --spam --dir /path/to/directory/full/of/spam/msgs
    sa-learn --ham --dir /path/to/directory/full/of/ham/msgs
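
The initial training step could be wrapped in a small guard that enforces the 200-sample minimum before calling sa-learn. This is a sketch under assumptions: `count_msgs` and `train_bayes` are hypothetical helpers, each directory is assumed to hold one message per file, and `find -maxdepth` requires GNU or BSD find.

```shell
#!/bin/sh
# Train Bayes only when both sample directories hold at least MIN messages.
MIN=200

count_msgs() {
    # Count regular files directly inside the directory (one message per file).
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

train_bayes() {
    spam_dir="$1"; ham_dir="$2"
    n_spam=$(count_msgs "$spam_dir")
    n_ham=$(count_msgs "$ham_dir")
    if [ "$n_spam" -lt "$MIN" ] || [ "$n_ham" -lt "$MIN" ]; then
        echo "need $MIN of each; have $n_spam spam, $n_ham ham" >&2
        return 1
    fi
    sa-learn --spam --dir "$spam_dir" &&
    sa-learn --ham --dir "$ham_dir" &&
    sa-learn --dump magic   # prints the nspam/nham counters for a sanity check
}

# Example: train_bayes /path/to/spam/msgs /path/to/ham/msgs
```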

17 Bayes – Enhancing
- Enable automated learning:
    bayes_auto_learn 1
    bayes_auto_learn_threshold_nonspam 0.1
    bayes_auto_learn_threshold_spam 6.0
- “Teach” Bayes
  - Create a mailbox each for “ham” and “spam” and scan them periodically
  - Note: “Resend” email, don’t forward!
- You can’t overtrain the Bayes database!
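
Periodic scanning of the two teaching mailboxes might look like the script below, suitable for running from cron. The mailbox paths are hypothetical defaults and the `learn_mbox` helper is an illustration; it assumes the mailboxes are in mbox format, which sa-learn reads with `--mbox`.

```shell
#!/bin/sh
# Periodic retraining from two shared mailboxes (mbox format).
# The paths are hypothetical; point them at wherever users resend samples.
SPAM_MBOX="${SPAM_MBOX:-/var/mail/spam-samples}"
HAM_MBOX="${HAM_MBOX:-/var/mail/ham-samples}"

learn_mbox() {
    kind="$1"; mbox="$2"
    # Nothing to do if the mailbox is missing or empty.
    [ -s "$mbox" ] || { echo "skip $kind: no mail in $mbox"; return 0; }
    sa-learn "--$kind" --mbox "$mbox"
}

learn_mbox spam "$SPAM_MBOX"
learn_mbox ham "$HAM_MBOX"
```

sa-learn keeps track of messages it has already learned, so rescanning a mailbox that still contains older samples should not count them twice.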

18 Bayes – Enhancing
- Give more “weight” to Bayesian results:
    score BAYES_00 -4
    score BAYES_05 -2
    score BAYES_95 6
    score BAYES_99 9
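
One way to apply these overrides is to keep them in their own rules file and syntax-check the configuration before restarting spamd. The file name bayes_scores.cf and the `write_bayes_scores` helper are hypothetical. `spamassassin --lint` is the standard configuration check and runs here only if the command is available.

```shell
#!/bin/sh
# Write the Bayes score overrides from the slide into a rules file.
write_bayes_scores() {
    cat > "$1" <<'EOF'
score BAYES_00 -4
score BAYES_05 -2
score BAYES_95 6
score BAYES_99 9
EOF
}

# Example (as root):
#   write_bayes_scores /etc/mail/spamassassin/bayes_scores.cf
# Then check the whole configuration for syntax errors before restarting spamd.
check_sa_config() {
    command -v spamassassin >/dev/null 2>&1 || { echo "spamassassin not found"; return 0; }
    spamassassin --lint
}
```

With a required score of 5.0, a BAYES_99 hit worth 9 points is enough to flag a message on its own, so weights this high only make sense once the database is well trained.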

19 Conclusion
- World-class spam prevention is possible with freely available tools!
- SRJC stats:
  - Processes 30,000 – 60,000 messages per day with one dual-processor server
  - Most messages are scanned in < 10 seconds (< 1 second without network tests)
  - < 0.007% false positives/negatives

