Spam, Phishing and Security Last updated: 6/15/09 © Prof. Amir Herzberg Computer Science Department, Bar Ilan University

Spam, Phishing and Email Security Last updated: 6/15/09 © Prof. Amir Herzberg Computer Science Department, Bar Ilan University http://AmirHerzberg.com

References DNS-based Email Sender Authentication Mechanisms: a Critical Review, at http://amir.herzberg.googlepages.com/some recentpapers. (another paper there, too) ‏ Many relevant RFCs... mentioned within.

Agenda: Email Security and Spam Intro to email security & spam  goals, threats, motivation Anti-spam defenses and their use  Hidden email address  White/Blacklists/Greylisting  SMTP-Sender Authorization (SPF, SIDF) ‏  Signed Email (DKIM) ‏  Content Filtering  Reputation and Accreditation  Penalties

Email Threats, Goals, Tools Exposure of sensitive email  Goal: confidentiality Of message Of who sent to whom, bcc recipients Of location  Tools: encryption (S/MIME, PGP,...) [hidden] Fake email (phishing) ‏  Goal: integrity / authentication Spam: undesirable, excessive email  Goals: block / limit / punish Our focus: spam and phishing

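Example: Email capture by DNS Poisoning
[Figure: Client, MTA, Local DNS, NS for con.co.in (6.6.6.1), fake NS for co.in (6.6.6.2), fake bank.co.in (6.6.6.3)]
1. Attacker sends email From: con.co.in, To: alice.
2. The local DNS resolves con.co.in; the NS for con.co.in answers con.co.in -> 6.6.6.1 and adds the gratuitous record co.in -> 6.6.6.2.
3. Alice sends email From: alice, To: bank.co.in ("My acct is…").
4. The local DNS resolves bank.co.in via the fake NS for co.in, which answers bank.co.in -> 6.6.6.3.
5. The email (From: alice, To: bank.co.in, "My acct is…") is delivered to the fake bank.co.in.
The attack works if the local DNS accepts gratuitous resolutions.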

Email Security – Authentication Internet email was designed for connectivity and functionality, not for security (e.g. authenticity) ‏ Sender Identification and Authenticity Goals:  Correct sender identity  Message received as sent (no changes) ‏  (Long-lasting) Evidences? Of identity of sender Of sending message Timestamp Requires trusted third party; not in scope No authenticity  spoofed email attacks

Clear / Detached Signatures Many MUAs don’t support OpenPGP, S/MIME  No crypto or other systems: Legacy PGP, PGP/MIME, MOSS, PEM…  OpenPGP, S/MIME incompatible with each other Incompatible recipients get only an unknown MIME type (do not get the message) ‏ Separate signature from “plain” message  S/MIME: Use “Clear Signing” content type  OpenPGP: Use “Detached signatures”  Receive message + `unknown MIME` attachment

End-to-End Crypto End-to-End: Sender & recipient must interoperate Two standards: OpenPGP & S/MIMEv3  OpenPGP based on PGP (Pretty Good Privacy) ‏  PGP: A popular freeware file encryption package  S/MIMEv1,2 developed by consortium of vendors  Other variants: PGP/MIME, PEM, MOSS Similar concepts, services, mechanisms  Encryption: public key, shared key, hybrid (envelop) ‏  Digital signatures and message authentication  Key distribution: add certs to message

Clear / Detached Signatures (1) ‏ With “Regular” OpenPGP and S/MIME signatures:  Message is part of a signature object Problem: Many users / mail readers incompatible  Many legacy users have no crypto  OpenPGP & S/MIME not compatible with each other  Other systems: legacy PGP, PGP/MIME, MOSS, PEM… Incompatible recipients get only an unknown MIME type – they do not get message

Clear / Detached Signatures Solution: Separate signature from “plain” message  S/MIME uses “Clear Signing” content type  OpenPGP uses “Detached signatures”  Signatures still appear as meaningless attachments

Incompatible Recipient MUA OpenPGP & S/MIME messages to incompatible Mail User Agent (MUA) ‏  Recipients get an unknown MIME type and “plain” message as usual (if detached / clear signed) ‏  Or – recipients get a short text explaining message is encrypted Is this OK?  Fine for security…  Fine with compatible mail user agents (MUA) ‏  Somewhat weird for naive users with incompatible MUA

Usability Beats Security OpenPGP & S/MIME messages to incompatible Mail User Agent (MUA) ‏  Assume: 30% incompatible users  Assume: 1/3 of them get worried & call helpdesk  What is the cost of a call from 10% of your users?  Too high  Very few corporations sign email Usability beats security (and it will always be so) ‏

Transparent Signatures and Keys Idea: Use “transparent” signatures & keys  Detected and used by compatible recipient MUA  Ignored by non-compatible MUA  Proposed in DKIM (Domain Keys Identified Mail) ‏  Also proposed in MetaSignatures proposal  And some other proposals (e.g. Stream) ‏ Use new email message headers  DomainKeys-Signature: (e.g. mail sent by gmail.com) ‏  And others… (ignored by incompatible mail agents) ‏ Standards still being drafted

Key Distribution – PGP and S/MIME PGP and S/MIME differ in their trust model PGP: User manages public keys of peers  Peer to peer – “web of trust” model  Each user acts as a certificate authority (CA) ‏  User can accept public keys “certified” by a trusted peer  Problematic for establishing trust between remote entities S/MIME: List of trusted Certificate Authorities (CAs) ‏  Predefined list installed in MUA  In theory, user may edit (remove, add, etc.) – very few do  Getting certificate is not too difficult or expensive  Yet few do – see usability problem

Opportunistic Key Distribution Opportunistic public key distribution:  Send public key with each message (to new recipient) ‏  In new (transparent) email header field – no impact on incompatible MUA users Recipients collect public keys and identities from incoming email  To defend against spoofing – send challenge to sender and piggybacked on response  Again in a header field for interoperability  Add public key only when received with response

Network-Based Email Crypto End-to-End crypto is hard to deploy  Need compatible mechanisms in sender & recipient  Host encryption prevents gateway content filtering Alternative – network-based crypto email security Gateway-to-Gateway Crypto:  Mail agents on sending & receiving domains  Mail security gateway: content filtering and crypto  Can encrypt & authenticate (opportunistically) all email Hop-by-Hop Crypto:  Protect domain-to-domain links & intra-domain links  Use IP-Sec or SSL/TLS to protect between servers  No signatures

Incompatible Recipient MUA OpenPGP and S/MIME messages to incompatible Mail User Agent (MUA) ‏  Recipients get an unknown MIME type  Get “plain” message as usual (if detached / clear signed) ‏  If encrypted: get short text explaining how to decrypt Is this OK?  Seems fine for security…  Fine with compatible Mail User Agents (MUA) ‏  But weird for naive users with incompatible MUA  Many customer-service calls!

Usability Beats Security OpenPGP and S/MIME messages to incompatible Mail User Agent (MUA) ‏  Recipients get an unknown MIME type  Get “plain” message as usual (if detached / clear signed) ‏  If encrypted: get short text explaining how to decrypt Is this OK?  Assume: 30% incompatible users  Assume: 1/3 of them get worried, call helpdesk  What is the cost of a call from 10% of your users?  Too high, so – very few corporations sign Email

Abusive Email – Scams Scams: misleading users, sting cases Nigerian 4-1-9 Scam  `Sting` scam from 1980s  Ask victim to help launder money, etc.  4-1-9: Nigerian penal code section on fraud schemes  Billions in damages, many victims  Top 5 Google replies to ‘Nigerian’ are on this fraud  Not only via email Recently more common: stock `tips` scams  Plus: no `trace` of money movements…

Abusive Email – Spam & Phishing Spam:  Advertising, offensive e-mail, usually mass-mailed  Without a proper warning label (e.g. ADV:) ‏ Or trivially filtered  Major nuisance – reduces email utility  Cause for changes in email operation & usage Phishing email:  `Spoofed` message: fake sender identity, e.g. of bank  Goal: lead user to expose credential (at fake bank site) ‏  Or other exposure, e.g. install malware

Spoofed Email Attacks (1) Attacker sends email to (many) victim(s) Email comes with fake (spoofed) sender identity  Of trustworthy source (e.g. bank) Email contains misleading / coercive subject or contents Attack vector(s):  Scam, e.g. Nigerian 4-1-9  Malware Manual infect: virus, Trojan, … Auto infect (worm): exploit e-mail client vulnerability  Phishing: direct user to malicious (fake/spoofed) web site

Spoofed Email Attacks (2) [Figure: spoofed email, spoofed site / page, scam, malware]

Spam Channels Spam is mostly associated with Email, but… Instant Messages and SMS spam (aka spim) ‏ Spam in file-sharing systems  “Poison” shared files as crude anti-piracy method  Or: hide spam (ads, etc.) in `fake files` Spam in newsgroups, forums, chat rooms, Web spam: Wiki, blogs, … direct or for spamdexing Spamdexing: search engine spamming / poisoning  Trick page rank algorithms (e.g. links from blogs) ‏  Defense: rel="nofollow" attribute for links We focus on email, though

Web-Based Spam Web Email Services Spam  E.g. send spam by abuse of greeting-cards site Web-Spam: spam-content on the web Spam Reviews:  Fake good / bad reviews – e.g. exposed at “Amazon glitch” Comment Spam  Spam via “talkback” comments in web sites  Wikispam – spam open-authoring sites, e.g. Wikipedia Goal is often spamdexing

Spamdexing: New Major Threat? Spamdexing: Search-Engine spamming / poisoning Google page-ranking: based on referring pages  References from many pages with high rank  Blogs and blog-hosting sites often have high page ranks  Link spam: via comments, referrer, spam blogs (spblogs) ‏ Result: Spammer web page in top search results  Users often visit & trust top search results  Reach unrelated, spam-full pages  Page with malware (by scripts or automatic-download) ‏  Spoofed page of bank, download site or search engine We’ll focus on email security / spam

Blocking and Open Relays Initially: report abuse to ISP (abuse@xxx.net,...)  Spammers disconnected by ISP  Fails if ISP cooperates (“pink contract”) Blacklists:  Refuse email from mail servers / ISP sending spam  Mail server / ISP responsible for mail (spam) it relays Example: Sendmail blocking rules in /etc/mail/deny  E-mail addresses, domain names or IP address blocks:
    spammer@ISP.net   REJECT   [specific address]
    spammer.com       REJECT   [entire domain]
    6.6.6             REJECT   [entire 6.6.6.* IP address block]
Which IP address?  Of SMTP-Sender  Of MAIL-FROM domain (if exists)
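The three kinds of rule in the example above (address, domain, IP prefix) can be sketched as a small matcher. This is only an illustration of the idea: the names `DENY` and `is_rejected` are hypothetical, and the matching rules are simplifying assumptions, not sendmail's actual lookup algorithm.

```python
# Deny table mirroring the three example rules (all entries mean REJECT).
DENY = {"spammer@isp.net", "spammer.com", "6.6.6"}

def is_rejected(sender_ip: str, mail_from: str) -> bool:
    """Return True if the SMTP sender's IP or the MAIL FROM address is denied."""
    address = mail_from.strip("<>").lower()
    domain = address.rpartition("@")[2]
    # Rule kinds 1 and 2: a specific address, or an entire domain.
    if address in DENY or domain in DENY:
        return True
    # Rule kind 3: any leading prefix of the dotted IP, e.g. "6.6.6" for 6.6.6.*
    octets = sender_ip.split(".")
    return any(".".join(octets[:n]) in DENY for n in range(1, len(octets) + 1))

print(is_rejected("6.6.6.77", "alice@good.org"))
print(is_rejected("1.2.3.4", "alice@good.org"))
```

Note that the sketch checks both candidates raised by the "Which IP address?" question: the IP of the SMTP sender and the domain taken from MAIL FROM.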

Blacklisting by IP: Assumptions Identify spamming SMTP-senders by IP address Fail if attacker uses many / spoofed IP addresses  Attacker must receive packets (to respond) ‏  Fails for Man-In-The-Middle (MITM) attacker Assumes cost per IP address / domain  But: Cost < 10$ Block large IP address range Collateral damage: harms “good” domains  Cost per IP may decline with IPv6

Aggressive Blacklisting Not much spam is from known sources  Example: Spamhaus Block List (SBL): 12% Aggressive blacklisting: beyond “known sources”  XBL: Exploit Block List  Open proxies / relays, Open port 25,…  Catches 51% – including many spambots Spambots: Spamming malware Block Email with URL to spam site Example: Spamhaus “URL on SBL” Source: Spamhaus

Blacklists Evolution Initially: individual blacklists  Overhead on system administrators  Manual  Too late: only responds to damage Early shared blacklists:  Manually shared (email, files, etc.)  Inefficient 1996: MAPS Realtime Blackhole List (RBL®)  MAPS = Mail Abuse Prevention System  “Blackhole”: no Email from IP on list  Efficient lookup, distribution via DNS  Currently: many DNS blacklists (DNSBL)

Spam and Phishing Emails Spam: unsolicited, undesirable emails  Lots of it (`bulk') ‏  Without `warning label` (AD: xxx...) ‏  Mostly: (illegal) ads and phishing Phishing: spam trying to fool the user:  To enter spoofed website (steal password, etc.) ‏  To run attached malware, etc.  Major risk

Why we hate Spam Spam annoys Spam motivates net-bots Spam carries:  Scams: Pyramid schemes, stock `tips`, …  Phishing: Trick user into exposing credentials to fake site  Malware: Viruses, Trojans, worms, bacteria, etc.  Undesired content: e.g. pornographic  And more…

The Spam Tragedy Spam is an example of the tragedy of the commons  Picture a common pasture, OK for 101 sheep  Ten farmers, each with ten sheep: all is fine  But if each farmer adds even one...  Each reasons: I will have 11 sheep, and the pasture only 101  If all ten do so, the pasture has 110 sheep and is overgrazed  Similar: bandwidth, pollution, ISPs allowing IP spoofing… Marketing & commercial Email could be valuable  But: Costs of sending Email are negligible Solutions: impose limits or costs  Technical and/or legal

Anti-Spam Legislation Spammers often violate many laws  Often sent illegally, from Zombies (spambots)  Often contains scams, malware, porn, etc.  Other laws: privacy, copyright, minor-protection,... Special anti-spam legislation  Opt-in: mandatory prior consent to Email  Opt-out Benefits:  Reduced spam by law-abiding entities  Also: opt-out, `ad` labels -> easier to filter

Anti-Spam Legislation (2) US CAN-SPAM act of 2003:  Senders must implement opt-out: honor recipient requests to be removed from list  US FTC to maintain “do-not-email-me” registry  Must clearly identify sexual connotation  Heavy sanctions to offending persons & corporations Criticism:  Still burdens the recipient  Exposes / validates email address to spammers  Most recipients do not opt-out

Anti-Spam Legislation (3) ‏ European directive 2002/58/EC  Opt-in – mandatory prior consent to Email marketing  Must also support opt-out Korea: mandatory “ADV” prefix in subject line  Few other laws requiring labels (often in subject line) ‏ Important benefits of legislation  Reduced spam by law-abiding entities  Also: CAN-SPAM, ADV labels  easier to filter Challenge: definition of spam

Definition: UBE and UCE Legislation is easier against well-defined threats Two common definitions for spam: UBE and UCE Unsolicited Bulk Email (UBE) ‏  Bulk: Sent to many users (possibly with minor variations) ‏  Unsolicited: Without prior request from recipient (opt-in) ‏ Unsolicited Commercial Email (UCE) ‏ Problems:  No simple test whether email is UBE / UCE  Not all unsolicited bulk commercial messages are spam  Not all spam is unsolicited  Not all spam is commercial e.g. “Jesus coming”  Not all spam is bulk e.g. `spear phishing`

Our Definition of Spam Our Definition: Spam: message belonging to one of predefined categories without label Requires definition of categories & labels  Categories: Advertising, commercial, porn,...  Well defined for given message  Labels: ADV, ADULT, VIRUS, etc.  It is easy to filter labeled mail  allows filtering Allows simple & universal test of given message  Any missing / incorrect label(s) ?  Easier for use in litigation

Sending Spam, Phishing: How? From legit IP addr: but quickly blocked So often: use `zombies`; but how? Send directly from Zombie to target server  Simple, efficient... most common  But: often Zombie can't send directly (port 25 blocked) ‏ Send via Zombie's mail submission server  Using Zombie's credentials  Harder, slower, more visible... but hard to block

Defenses from Spam & Phishing Prevent phishing from succeeding (not today!) ‏  Improved login indicators / process  Anti-malware: VM, sandbox, signed code... Penalize phishermen & `collaborators` (ISP?) ‏ Hide email address (from `harvesting') ‏  Con: reduced connectivity [how: hidden] Block spam & phishing emails  Challenges: efficiency, accuracy  False-negatives more harmful for phishing

Email Address as Secret Use “hidden” Email addresses  Reduces Email connectivity Email eventually exposed  By spyware in computer of user, recipient, or somebody copied on message / reply / forward  By sniffer on subnet (of user or recipient) ‏ Exposure  must change Email address  Can’t identify source of exposure  Inconvenience  Overhead

Hiding Recipient Email Addresses Harvesting: Collection of Email addresses  By crawling web pages, searching in public forums, etc.  Reason not to “opt-out” Result: Users hide Email addresses  Use separate “public” email for forums, etc. Mangle published address  To combat automated harvesting  Not difficult to circumvent  Result: Reduced connectivity among Email users Alice at wonderland dot com
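Why mangling is "not difficult to circumvent" can be seen from a short sketch. The functions `mangle` and `demangle` are hypothetical names, and the exact " at " / " dot " spelling is an assumption taken from the example on the slide.

```python
import re

def mangle(address: str) -> str:
    """Turn 'alice@wonderland.com' into 'alice at wonderland dot com'."""
    local, domain = address.split("@", 1)
    return f"{local} at {domain.replace('.', ' dot ')}"

def demangle(text: str) -> str:
    """What a slightly smarter harvester could do to undo the mangling."""
    text = re.sub(r"\s+at\s+", "@", text, count=1)
    return re.sub(r"\s+dot\s+", ".", text)

mangled = mangle("alice@wonderland.com")
print(mangled)
print(demangle(mangled))
```

Two regular expressions are enough to recover the address, so this kind of mangling only stops the most naive crawlers while still costing legitimate senders some effort.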

Secret Address Unique to Each Sender Secret Email address, different for each sender Example: Bob’s incoming email addresses:  Alice sends to Bob+24224@b.com  Abe sends to Bob+85622@b.com  Reject other Email to Bob How Alice & Abe get these addresses for Bob?  Manually  By response from Bob’s “Public Email”: Bob@b.com When Bob+85622@b.com is exposed – send a new dedicated address to Abe  Can’t know who exposed address
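A minimal sketch of the per-sender address scheme follows. The class `DedicatedAddresses`, the five-digit random tag, and the rule that a tag is accepted only from the sender it was issued to are assumptions made for illustration; the slides only give the `Bob+24224@b.com` style of address.

```python
import secrets

class DedicatedAddresses:
    """Tracks one secret incoming address per sender for a single user."""

    def __init__(self, user: str, domain: str):
        self.user = user
        self.domain = domain
        self.active = {}      # tag -> sender the tag was issued to
        self.retired = set()  # tags that were exposed and replaced

    def issue(self, sender: str) -> str:
        """Give `sender` a fresh dedicated address, retiring any old one."""
        sender = sender.lower()
        for tag, owner in list(self.active.items()):
            if owner == sender:
                del self.active[tag]
                self.retired.add(tag)
        while True:
            tag = f"{secrets.randbelow(100_000):05d}"
            if tag not in self.active and tag not in self.retired:
                break
        self.active[tag] = sender
        return f"{self.user}+{tag}@{self.domain}"

    def accepts(self, sender: str, rcpt_to: str) -> bool:
        """Accept only mail sent to the address dedicated to this sender."""
        local, _, domain = rcpt_to.partition("@")
        user, _, tag = local.partition("+")
        if (user.lower(), domain.lower()) != (self.user.lower(), self.domain.lower()):
            return False
        return self.active.get(tag) == sender.lower()

bob = DedicatedAddresses("Bob", "b.com")
for_abe = bob.issue("abe@a.com")
print(for_abe, bob.accepts("abe@a.com", for_abe), bob.accepts("carl@c.com", for_abe))
```

Calling `issue` again for the same sender models the "address exposed, send a new dedicated address" step: the old tag is retired and stops being accepted.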

Secret Address for Each Sender (cont’) ‏ Problem: multiple parties on Email messages Abe sends both to Carl and to Bob (at Bob+85622@b.com) Carl responds: To Abe and to Bob+85622@b.com What to do?

Challenge-Response Solution Bob receives email from Carl  To Bob’s `open` email or to Bob+85622@b.com  Not dedicated to Carl Problem: Rejection to Carl, a good user! Solution: Bob sends Carl rejection with challenge  With new dedicated user Bob+33333@b.com  May also contain “puzzle” to make sure Carl is a person Problems:  Hassle to Carl: new kind of spam?  What if Carl will also send a challenge back?

Blocking Spam/Phishing: Goals Do no harm  Avoid false-positives (lost mail) ‏ Esp.: avoid `framing' (Joe-Job) ‏  No significant extra delay  Abuse-free, e.g. do not spam others Block most spam/phishing emails  False-negative phishing emails esp. harmful Acceptable, `minimal' overhead By recipients or by mail servers

Blocking Spam/Phishing: Tools Content Classification  Cons: computationally expensive, errors Authentication / Authorization  Did domain authorize server's IP address? (SPF) ‏ Corrupted by forwarding services (some fixes) ‏  Did domain send (authorize sending) this message? By digital signature (DKIM) – fails if msg modified Reputation: blacklists, whitelists and beyond  By server's IP address or (domain) name

Reputation Mechanisms Blacklists: known `bad` senders  Identify (mainly) by IP of sending server Block entire `suspect' address blocks Motivating ISP/companies to block port 25 (SMTP) ‏ Fails if sending from valid acct at legit mail server Whitelists: known `good` senders (by IP or name) ‏  Whitelist of (domain) names: allows flexibility, requires authentication (DKIM) or authorization of IP (SPF) ‏ Greylist: recently / already seen senders, blocked emails  Most spammers are `new`, short lived, & do not resend Reputation services (guaranteed compensation/screening) ‏

[Figure: filtering flowchart for an email connection from an IP. Labels in the figure: "Is IP in one of the lists?"; IP in blacklist: block; IP in whitelist (for this domain); IP in greylist; not in any list: temporary failure, add IP to greylist after delay; optional rDNS check (fail); "Domain authorized this IP? (SPF)": unauthorized IP (SPF hard fail), optionally add IP to blacklist; yes, trusted domain; "Check Signature (DKIM)": sig OK, trusted domain; no/bad signature (and sig required); else: content-based classifier, then display or block.]

1st Filtering Stage: Lists Blacklists: known/suspect spammers  By IP of sending MTA  By domain of URLs (in `content filtering`) Whitelists: trusted (`good') senders  By IP of sending MTA and domain Non-matching IP/domain: phishing? Domain: in HELO, Mail-From,... Greylist: recent attempts to send  Assuming spambots usually do not retry

Blacklists (DNSBL) ‏ Most blacklists use DNS (hence DNSBL) ‏  Efficient distribution – caching by ISPs To query if IP address ww.xx.yy.zz is in list EBL.net  DNS “A” query for zz.yy.xx.ww.EBL.net  DNS addresses: right to left  IP addresses: left to right Response  ww.xx.yy.zz in EBL.net  Can reply for addr block: xx.ww.EBL.net  Value can be “Type of Abuse” May provide details, e.g. via DNS “TXT”
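The reversed-octet query name described above can be sketched as follows. `EBL.net` is the slide's example list name; the function names are hypothetical, and `is_listed` simply treats any failed lookup as "not listed", which is a simplifying assumption.

```python
import ipaddress
import socket

def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query: octets reversed, then the list's zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

def is_listed(ip: str, zone: str) -> bool:
    """Issue the DNS "A" query; an answer means the IP is on the list."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No such name (or the lookup failed): treated here as "not listed".
        return False

print(dnsbl_query_name("1.2.3.4", "EBL.net"))
```

The reversal is needed because DNS names are most specific on the left while IP addresses are most specific on the right; reversing lets a list delegate or answer for whole address blocks.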

Blacklist Policies Addresses added and removed from blacklist by blacklist owners Some blacklists publish policies, e.g. MAPS:  Spam originating from address  Open relay / proxy / open port-25  URL (or IP) of `spammer’s web server`  Other address owned by spammer  ISP that refuses to disclose spammer IP block  Company providing spam support services Hey, is this ethical?

Criticism of Blacklists Imprecise, negligent, obsolete listings  No accountability  negligence Unfair, hidden interests & agendas  False listings, e.g. by competitor Collateral damages  Some lists block entire countries! Over-zealousness / censorship:  Blocking open relay / proxy before abuse  Blocking “spam supporters”  Too broad definitions Fail for new spammer's IP (spambots!) ‏

Greylisting Greylisting: for unknown / suspect SMTP-senders  E.g.: rDNS failed, or not in white-list Idea: Most spam is sent by spambots Spambots often less tolerant than mail servers  Need efficiency and simplicity / not reliability Instead of flatly refusing (like blacklist)…  Respond with delay (of 30 seconds or so) ‏  Or: Respond with error code 451 (transient error)  This will force retransmission after delay Acceptable extra load on compliant SMTP-sender Most spambots fail (to wait / retransmit) ‏   Spam lost !

Greylisting – Details Details of transient-error greylisting
Upon receiving packet from suspect SMTP-sender:
  Let ww.xx.yy.zz be the sender’s IP address
  After receiving MAIL FROM and RCPT TO:
    X =
    If Time[X] = NotFound or time < Time[X] + 1 hour
    then { Time[X] = time; respond with 451 Transient error }
    else { Time[X] = time; respond with 250 OK }
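The rule above can be sketched in a few lines. The definition of the key X did not survive in the slide text, so this sketch assumes X is the triple (sender IP, MAIL FROM, RCPT TO), a common choice for greylisting; the class name `Greylist` and its method are hypothetical.

```python
from datetime import datetime, timedelta

class Greylist:
    """Transient-error greylisting: the first attempt and early retries get 451."""

    def __init__(self, min_delay: timedelta = timedelta(hours=1)):
        self.min_delay = min_delay
        self.last_seen = {}  # key X -> time of the most recent attempt

    def check(self, ip: str, mail_from: str, rcpt_to: str, now: datetime) -> int:
        # Assumption: X = (sender IP, MAIL FROM, RCPT TO).
        key = (ip, mail_from.lower(), rcpt_to.lower())
        previous = self.last_seen.get(key)
        # Both branches of the rule record the current time for X.
        self.last_seen[key] = now
        if previous is None or now < previous + self.min_delay:
            return 451  # transient error: a compliant sender will retry later
        return 250      # retried after the delay: accept

g = Greylist()
t0 = datetime(2009, 6, 15, 12, 0)
print(g.check("1.2.3.4", "abe@a.com", "bob@b.com", t0))
print(g.check("1.2.3.4", "abe@a.com", "bob@b.com", t0 + timedelta(hours=2)))
```

As in the rule above, a retry that arrives too early also resets the recorded time, so an impatient sender keeps receiving 451.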


SMTP-Sender Authorization Email Outgoing MTA Authorization Policy  Identify IP address(es) of domain’s outgoing MTA(s) “Our users send only via these MTA(s)”  Never via another mail server (e.g. hotel) Receiver uses policy of sending domain  Found, valid sending-MTA's IP: accept  Found, invalid sending-MTA's IP: reject or tag  No policy for sending domain: no indication Which domain's policy?  SPF: Mail-From or HELO; Why not `From'???

Outgoing MTA Authorization Process a.com Authoritative DNS Local DNS Local DNS Domain a.com Domain b.com Out-MTA In-MTA Request Email Policy Record for a.com a.com’s Email Policy Record a.com’s Email Policy Record = Email from a.com

Authorization – Which Domain? Policy record: “Our domain users always send via this MTA(s)” Question: Which domain to query? Naive choice: domain of From: address  From: address is shown to user  Agent (person / program) who wrote the message  But mailing lists keep “from:” address of Email SPF (Sender Policy Framework) ‏  MAIL FROM address and (optionally) HELO address Sender-ID: Result of PRA algorithm (later) ‏

SMTP Service to Mobile User Scenario: Abe works for a.com, visits UK Uses hotel x.co.uk domain, mail server Sender-SMTP: x.co.uk From: abe@a.com Problem: Is this really abe@a.com?  x.co.uk can’t tell  Allows spoofing  Bad for a.com? [Figure: Alice's MUA, MSA, Abe's MUA and MTA A in domain a.com; Bob's MUA, MDA and MTA B in domain b.com; x.co.uk MTA in domain x.co.uk]

Mailing list scenario (1) Sending-MTA not of `From' domain [Figure: Abe's MUA and MTA A in domain a.com; List-In MTA, list server (ListServ) and List-Out MTA in domain ML.com; MTA B in domain b.com]

Mailing list scenario (2) Sending-MTA not of `From' domain MTA A to List-In MTA: S: 220 ML.com C: HELO a.com S: 250 OK C: MAIL FROM: abe@a.com S: 250 OK C: RCPT TO: L@ML.com S: 250 OK C: DATA S: 354 Enter C: from: abe@a.com C: to: L@ML.com C: C: New to list [Figure: a.com, ML.com and b.com topology]

Mailing list's MTA not of `From' domain MTA A to List-In MTA: S: 220 ML.com C: HELO a.com S: 250 OK C: MAIL FROM: abe@a.com S: 250 OK C: RCPT TO: L@ML.com S: 250 OK C: DATA S: 354 Enter C: from: abe@a.com C: to: L@ML.com C: C: New to list List-Out MTA to MTA B: S: 220 B.com C: HELO ML.com S: 250 OK C: MAIL FROM: L@ML.com S: 250 OK C: RCPT TO: Bob@b.com S: 250 OK C: DATA S: 354 Enter C: from: abe@a.com C: to: bob@b.com C: sender: L@ML.com C: C: New … [Figure: a.com, ML.com and b.com topology]

Typical Mailing List Scenario (4) MTA A to List-In MTA: S: 220 ML.com C: HELO a.com S: 250 OK C: MAIL FROM: abe@a.com S: 250 OK C: RCPT TO: L@ML.com S: 250 OK C: DATA S: 354 Enter C: from: abe@a.com C: to: L@ML.com C: C: New to list List-Out MTA to MTA B: S: 220 B.com C: HELO ML.com S: 250 OK C: MAIL FROM: L@ML.com S: 250 OK C: RCPT TO: Bob@b.com S: 250 OK C: DATA S: 354 Enter C: from: abe@a.com C: to: bob@b.com C: sender: L@ML.com C: C: New to list Change RCPT TO: address to reach recipients Change MAIL FROM to the mailing list’s address to capture bounces (DSNs) Often: same `from:` header; new `sender` and/or `resent` headers [Figure: a.com, ML.com and b.com topology]

Typical Mailing List Scenario (5) MTA A to List-In MTA: S: 220 ML.com C: HELO a.com S: 250 OK C: MAIL FROM: abe@a.com S: 250 OK C: RCPT TO: L@ML.com S: 250 OK C: DATA S: 354 Enter C: from: abe@a.com C: to: L@ML.com C: C: New to list List-Out MTA to MTA B: S: 220 B.com C: HELO ML.com S: 250 OK C: MAIL FROM: L@ML.com S: 250 OK C: RCPT TO: Bob@b.com S: 250 OK C: DATA S: 354 Enter C: from: abe@a.com C: to: bob@b.com C: sender: L@ML.com C: C: New to list Do not change message (some lists do, e.g. add list-name) May change from: & to: headers Often not (or not `from:`) May add some headers, e.g.: sender: L@ML.com, Resent-from,… [Figure: a.com, ML.com and b.com topology]
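The envelope and header changes a list makes when relaying can be sketched as below. The `Message` type and the function `relay_via_list` are hypothetical; the sketch follows the "often not" case in which the `from:` and `to:` headers are left untouched and only a `sender:` header is added.

```python
from dataclasses import dataclass

@dataclass
class Message:
    mail_from: str   # envelope MAIL FROM (bounce address)
    rcpt_to: str     # envelope RCPT TO
    headers: dict    # message headers such as from:, to:, sender:
    body: str

def relay_via_list(msg: Message, list_address: str, subscriber: str) -> Message:
    """Re-send a message received by the list to one subscriber."""
    headers = dict(msg.headers)       # copy, so the original stays unchanged
    headers["sender"] = list_address  # new header naming the list
    return Message(
        mail_from=list_address,       # bounces (DSNs) now go to the list
        rcpt_to=subscriber,           # envelope recipient is the subscriber
        headers=headers,              # from: still names the original author
        body=msg.body,                # message itself is not changed
    )

incoming = Message("abe@a.com", "L@ML.com",
                   {"from": "abe@a.com", "to": "L@ML.com"}, "New to list")
outgoing = relay_via_list(incoming, "L@ML.com", "Bob@b.com")
print(outgoing.mail_from, outgoing.rcpt_to, outgoing.headers)
```

After relaying, the sending MTA belongs to ML.com while the `from:` header still names a.com, which is the mismatch the scenario slides point out.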

DSN and MAIL FROM Address If list fails to deliver to b.com:  Try alternate routes, retransmit on transient errors  Permanent failure: most lists do not send Delivery Status Notification (DSN) to sender abe@a.com  If failure persists: Remove bob@b.com If B.com cannot deliver after sending 250 OK:  Send DSN to MAIL FROM address: L@ML.com Also called bounce address  Allows list to remove bad addresses If bounce DSN fails to deliver:  Do not bounce: can cause loop  DSN itself uses empty MAIL FROM address

Sender Policy Framework (SPF) ‏ Most deployed outgoing MTA authorization proposal Authorizes the Email envelope MAIL-FROM  The “bounce-to” address  Used to send Delivery Status Notifications (DSN) ‏ Optionally: also the HELO address  Always, or at least when no MAIL-FROM exists (in a DSN) ‏ SPF policy: a DNS record  A new SPF DNS record, or format for TXT DNS record Limit length (<450chars), so DNS response is sent w/ UDP  Record is for the domain name (no prefix), e.g. x.com Or wildcard: *.x.com Plus: can define at top-level domain (not SPF.x.com) ‏ Minus: conflict with other TXT records (try SPF first) ‏

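SPF Process [Figure: sending MTA (IP 1.2.3.4); b.org's incoming MTA; b.org's DNS proxy server; a.com's authoritative DNS server]
1. Sending MTA (IP 1.2.3.4) connects to b.org's incoming MTA: ... mailfrom=alice@a.com ...
2. b.org's incoming MTA requests the SPF RR of a.com through b.org's DNS proxy server, which gets it from a.com's authoritative DNS server.
3. SPF RR of a.com is returned, e.g.: v=spf1 +ip4:1.2.3.4 ~all
4. Validate that sending MTA's IP (1.2.3.4) is authorized by policy. Yes: continue accepting email. No: reject (break SMTP connection).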

SPF Policy DNS Record SPF policy records kept in DNS  In TXT and/or SPF records  Define which hosts can be Sender-SMTP for domain Simple format, e.g. v=spf1 +IP4:133.3.3.3 -all (version 1, 133.3.3.3 is outgoing MTA, none other) SPFrecord = “v=spf” version terms  version = {1, 2.0}; version 2.0 is actually Sender-ID [later] terms = *( SP ( ([qualifier] mechanism) / modifier ) )  qualifier: one of +, -, ~, ?  mechanism: method of identification, e.g. by IP address  mechanism = ( all / include / A / MX / PTR / IP4 / IP6 / exists )

SPF Qualifiers Basic SPF example: v=spf1 +IP4:133.3.3.3 ~all (version 1, 133.3.3.3 is outgoing MTA, any other sender is a probable fail) Qualifiers: + [default] pass: allow use  E.g. this IP address allowed to send for this domain - fail: forbid use ? don’t know (Neutral)  Policy not enforced yet (e.g. experimentation phase) ~ probable fail (SoftFail)  Testing phase

SPF: Basic Mechanisms (Operators) “all”  matches everything (used at the end of the record) “ip4:” ip4addr [“/” length], “ip6:” … (same)  IPv4 / IPv6 address, e.g. +ip4:6.6.6.0/24 “a” [“:” domain-spec] [“/” length]  address specified by A record for domain-spec “mx” [“:” domain-spec] [“/” length]  address specified by MX record for domain-spec “ptr” [“:” domain-spec]  Tests if rDNS for <ip> points to domain-spec Domain-spec: a domain name, or macro…
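How qualifiers and the simplest mechanisms combine can be sketched with a tiny evaluator. It handles only `ip4`, `ip6` and `all` and skips every other term, so it is an illustration under those assumptions and not a complete SPF implementation; `parse_spf` and `check_ip` are hypothetical names.

```python
import ipaddress

# Qualifier character -> result returned when the mechanism matches.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

def parse_spf(record: str):
    """Split a record into (qualifier, mechanism, argument) terms."""
    version, *terms = record.split()
    if version.lower() != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    parsed = []
    for term in terms:
        qualifier = "+"  # '+' is the default qualifier
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        name, _, arg = term.partition(":")
        parsed.append((qualifier, name.lower(), arg))
    return parsed

def check_ip(record: str, sender_ip: str) -> str:
    """Evaluate terms left to right; the first matching mechanism decides."""
    ip = ipaddress.ip_address(sender_ip)
    for qualifier, name, arg in parse_spf(record):
        if name == "all":
            return QUALIFIERS[qualifier]
        if name in ("ip4", "ip6") and ip in ipaddress.ip_network(arg, strict=False):
            return QUALIFIERS[qualifier]
        # Other mechanisms (a, mx, ptr, include, exists) are skipped here.
    return "neutral"  # nothing matched

record = "v=spf1 +ip4:133.3.3.3 -all"
print(check_ip(record, "133.3.3.3"), check_ip(record, "6.6.6.6"))
```

Because evaluation stops at the first match, `all` only makes sense as the last term: anything written after it would never be reached.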

SPF domain-spec & other macros Many mechanisms and modifiers can use macros  Complex syntax… simplified here; see ch. 8 of RFC 4408
domain-spec = *(macro / DomainString)
DomainString = *(alphanum / “.” alphanum)
macro = “%{” macro-letter [*DIGIT] [‘r’] “}”
macro-letter =
  “s” : <sender>
  “l” : local-part of <sender>
  “d” : <domain>
  “i” : <ip> (in dot-format)
  “h” : HELO/EHLO domain
  “r” : domain name of host checking
  “t” : current timestamp
`r`: reverse, e.g.: <ip>=1.2.3.4 -> %{ir}=4.3.2.1
Number of parts, e.g.: <ip>=1.2.3.4 -> %{2ir}=2.1

“exists:” Mechanism and its Use “exists:” domain-spec  Tests for existence of domain-spec Use for:  Check on blacklist: -exists:%{ir}.DNSBL.org  Generate log to identify use of domain in testing phase, e.g.: v=spf1 exists:%{h}.%{l}.%{i}.log.%{d} ?all  Create fine-grained restrictions, e.g. per-user, per-time,…

include:domain Mechanism Returns result of applying SPF policy of specified domain E.g., if a.com sends mail via a.net or a.org, it may have: IN TXT “v=spf1 include:a.net include:a.org -all” If result is `pass` (+): this is a match; if Fail/Neutral: no match; if Error: abort with Error
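The include rule can be sketched with a dictionary standing in for DNS. The records for a.net and a.org below are made-up examples, the names `check_host`, `SPFError` and `RECORDS` are hypothetical, and only `ip4`, `include` and `all` are evaluated; treating a missing record for an included domain as an error is an assumption about the "Error" case.

```python
import ipaddress

RESULT = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

class SPFError(Exception):
    """Evaluation had to be aborted."""

def check_host(ip: str, domain: str, records: dict, depth: int = 0) -> str:
    if depth > 10:
        raise SPFError("too many nested includes")
    record = records.get(domain)
    if record is None:
        return "none"  # the domain publishes no policy
    terms = record.split()
    if terms[0] != "v=spf1":
        raise SPFError(f"bad record for {domain}")
    for term in terms[1:]:
        qualifier = term[0] if term[0] in RESULT else "+"
        mechanism = term[1:] if term[0] in RESULT else term
        name, _, arg = mechanism.partition(":")
        if name == "all":
            matched = True
        elif name == "ip4":
            matched = ipaddress.ip_address(ip) in ipaddress.ip_network(arg, strict=False)
        elif name == "include":
            inner = check_host(ip, arg, records, depth + 1)
            if inner == "none":
                raise SPFError(f"no SPF record for included domain {arg}")
            matched = inner == "pass"  # fail / softfail / neutral: no match
        else:
            continue  # mechanisms outside this sketch are skipped
        if matched:
            return RESULT[qualifier]
    return "neutral"

RECORDS = {  # stand-in for DNS; a.net and a.org contents are invented
    "a.com": "v=spf1 include:a.net include:a.org -all",
    "a.net": "v=spf1 ip4:192.0.2.0/24 -all",
    "a.org": "v=spf1 ip4:198.51.100.7 -all",
}
print(check_host("198.51.100.7", "a.com", RECORDS))
```

The point to notice is that a `fail` from a.net does not fail the whole check: it only means `include:a.net` did not match, so evaluation moves on to `include:a.org`.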

Limiting SPF's DNS Queries SPF validation causes DNS queries:  For policy (four: HELO/MailFrom, TXT/SPF)  Due to some (A, MX, PTR, Include,..) mechanisms Concerns:  Overhead and abuse for DDoS  Abuse for DNS-poisoning attacks Spec limits to 10 mech, 10 queries/mech  Still: can cause 100 queries by single email! Other concern: forwarding

Forwarding with SPF Forwarding retains Mail-From address  To inform sender (Abe) of problems  Hence: (typical) forwarding breaks SPF Solutions?  Don’t use SPF  Whitelist forwarders in SPF policy or by receiver: include:b.edu or “skip SPF if sender-SMTP is b.edu”  Change forwarding: SPF-compatible forwarding  Change MAIL-FROM to forwarder (b.edu)  But then how can senders get Delivery Status Notifications (DSN)?

Typical Forwarding Scenario (1) Forwarding by b.edu (Bob’s old school) [Figure: Abe's MUA and MTA A in domain a.com; b.edu In MTA, forward service and b.edu Out MTA in domain b.edu; MTA B in domain b.com]

Typical Forwarding Scenario (2) MTA A to b.edu In MTA: S: 220 b.edu C: HELO a.com S: 250 OK C: MAIL FROM: abe@a.com S: 250 OK C: RCPT TO: bob@b.edu S: 250 OK C: DATA S: 354 Enter C: from: abe@a.com C: to: bob@b.edu C: C: Hi pal! [Figure: a.com, b.edu and b.com topology]

Typical Forwarding Scenario (3) MTA A to b.edu In MTA: S: 220 b.edu C: HELO a.com S: 250 OK C: MAIL FROM: abe@a.com S: 250 OK C: RCPT TO: bob@b.edu S: 250 OK C: DATA S: 354 Enter C: from: abe@a.com C: to: bob@b.edu C: C: Hi pal! b.edu Out MTA to MTA B: S: 220 b.com C: HELO b.edu S: 250 OK C: MAIL FROM: abe@a.com S: 250 OK C: RCPT TO: bob@b.com S: 250 OK C: DATA S: 354 Enter C: from: abe@a.com C: to: bob@b.edu C: C: Hi pal! [Figure: a.com, b.edu and b.com topology]

Typical Forwarding Scenario (4) MTA A to b.edu In MTA: S: 220 b.edu C: HELO a.com S: 250 OK C: MAIL FROM: abe@a.com S: 250 OK C: RCPT TO: bob@b.edu S: 250 OK C: DATA S: 354 Enter C: from: abe@a.com C: to: bob@b.edu C: C: Hi pal! b.edu Out MTA to MTA B: S: 220 b.com C: HELO b.edu S: 250 OK C: MAIL FROM: abe@a.com S: 250 OK C: RCPT TO: bob@b.com S: 250 OK C: DATA S: 354 Enter C: from: abe@a.com C: to: bob@b.edu C: C: Hi pal! b.edu does not change MAIL FROM  SPF validation by MTA B (b.com) fails, since b.edu's MTA is not authorized by a.com's policy  Message considered as spam [Figure: a.com, b.edu and b.com topology]

SPF-Compatible Forwarding
[Figure: Abe's MUA -> MTA A (domain a.com) -> b.edu incoming MTA -> forward service -> b.edu outgoing MTA -> MTA B (domain b.com).]

Hop 1: MTA A (a.com) delivers to b.edu (unchanged)
  S: 220 b.edu
  C: HELO a.com
  S: 250 OK
  C: MAIL FROM: abe@a.com
  S: 250 OK
  C: RCPT TO: bob@b.edu
  S: 250 OK
  C: DATA
  S: 354 Enter
  C: from: abe@a.com
  C: to: bob@b.edu
  C:
  C: New to list

Hop 2: b.edu forwards to MTA B (b.com), with MAIL FROM changed to the forwarder
  S: 220 b.com
  C: HELO b.edu
  S: 250 OK
  C: MAIL FROM: bob@b.edu
  S: 250 OK
  C: RCPT TO: bob@b.com
  S: 250 OK
  C: DATA
  S: 354 Enter
  C: from: abe@a.com
  C: to: bob@b.edu
  C:
  C: New to list
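A minimal sketch of why the second hop fails SPF with typical forwarding and passes once MAIL FROM is rewritten. The policy table `TOY_SPF`, the IP addresses (taken from documentation ranges) and the function `spf_pass` are assumptions for illustration only; a real check would fetch and evaluate the domain's SPF record from DNS.

```python
import ipaddress

# Hypothetical SPF policies: which IP ranges each domain authorizes to send.
TOY_SPF = {
    "a.com": ["192.0.2.0/28"],      # assumed address range of MTA A
    "b.edu": ["198.51.100.7/32"],   # assumed address of b.edu's outgoing MTA
}


def spf_pass(mail_from: str, sender_ip: str) -> bool:
    """Return True if the MAIL FROM domain authorizes the connecting IP."""
    domain = mail_from.rsplit("@", 1)[1].lower()
    ip = ipaddress.ip_address(sender_ip)
    return any(ip in ipaddress.ip_network(net) for net in TOY_SPF.get(domain, []))


# Hop 1: a.com sends directly, so the check passes.
print(spf_pass("abe@a.com", "192.0.2.5"))
# Hop 2, typical forwarding: b.edu's IP is not authorized by a.com.
print(spf_pass("abe@a.com", "198.51.100.7"))
# Hop 2, SPF-compatible forwarding: MAIL FROM is now in b.edu.
print(spf_pass("bob@b.edu", "198.51.100.7"))
```

The receiver only ever compares the envelope domain with the connecting IP, which is why rewriting MAIL FROM is enough to pass, and also why bounces then go to the forwarder.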

SPF Forwarding Breaks DSN
- Mail bounce: the recipient (MTA B) sends a Delivery Status Notification (DSN) with an empty MAIL FROM
  S: 220 b.edu
  C: HELO b.com
  S: 250 Ok
  C: MAIL FROM: <>
  S: 250 Ok
  C: RCPT TO: bob@b.edu
  S: 250 Ok
  C: DATA
  S: 354 Enter
  C: from: mail@b.com
  C: to: bob@b.edu
  C: Subject: DSN
- How can b.edu inform Abe? Abe's address is "deep within" the DSN

SPF Forwarding and Bounces
- With SPF-compatible forwarding, bounces (DSNs) are sent to the forwarder (b.edu)
- But they should reach the originator (abe@a.com)!
- The forwarder cannot retrieve the original bounce address from the DSN: too difficult, risky...
Solutions:
- Fix / extend / replace SPF, so that bounces go directly to the originator
- Sender Rewriting Scheme (SRS): the forwarder encodes the original bounce address in the bounce address it sends, and retrieves it from there when a bounce arrives

SRS: Sender Rewriting Scheme
- SPF-compatible forwarding: bounces (DSNs) are sent via the forwarder (b.edu)
- On forwarding: encapsulate the original MAIL FROM in the forwarded MAIL FROM, e.g.:
  MAIL FROM: SRS0=hh=tt=a.com=abe@b.edu
- On receiving a bounce (DSN):
  - Extract the original bounce (MAIL FROM) address from the received RCPT TO address
  - Forward the DSN to the original bounce (MAIL FROM) address

SPF Forwarding with SRS
[Figure: sender's MUA -> MTA A (domain a.com) -> b.edu incoming MTA -> forward service -> b.edu outgoing MTA -> MTA B (domain b.com).]

Hop 1: MTA A (a.com) delivers to b.edu
  S: 220 b.edu
  C: HELO a.com
  S: 250 OK
  C: MAIL FROM: abe@a.com
  S: 250 OK
  C: RCPT TO: bob@b.edu
  S: 250 OK
  C: DATA
  S: 354 Enter
  C: from: abe@a.com
  C: to: bob@b.edu
  C:
  C: New to list

Hop 2: b.edu forwards to MTA B (b.com) with an SRS-encoded MAIL FROM
  ...
  C: MAIL FROM: SRS0=hh=tt=a.com=abe@b.edu
  S: 250 OK
  C: RCPT TO: bob@b.com
  S: 250 OK
  C: DATA
  S: 354 Enter
  C: from: abe@a.com
  C: to: bob@b.edu
  C:
  C: New to list

Forwarding with SRS: Bounce
- Mail bounce: send a DSN (Delivery Status Notification) with
  - RCPT TO = incoming MAIL FROM (bounce address)
  - MAIL FROM = empty

Step 1: MTA B (b.com) sends the DSN to the forwarder (b.edu)
  ...
  C: MAIL FROM: <>
  S: 250 OK
  C: RCPT TO: SRS0=hh=tt=a.com=abe@b.edu
  S: 250 OK
  C: DATA
  S: 354 Enter
  C: from: mail@b.com
  C: Subject: DSN

Step 2: b.edu extracts the original address and forwards the DSN to MTA A (a.com)
  ...
  C: MAIL FROM: <>
  S: 250 OK
  C: RCPT TO: abe@a.com
  S: 250 OK
  C: DATA
  S: 354 Enter
  C: from: mail@b.com
  C: Subject: DSN
  ...

Abuse of Naive SRS
[Figure: a spammer (or `Joe Job` attacker) connects directly to the b.edu forward service, which relays to MTA A (domain a.com).]
  ...
  C: MAIL FROM: <>
  S: 250 OK
  C: RCPT TO: SRS0=hh=tt=a.com=Joe@b.edu
  S: 250 OK
  C: DATA
  S: 354 Enter
  C: from: mail@b.com
  C: Subject: F&*^ you Joe!
- SRS as presented has a bug: a spammer can cause the SRS forwarder to send a "spam bounce" (Joe-job) to any address encoded in RCPT TO

SRS Authentication
- To prevent abuse, SRS forwarders authenticate the encapsulated bounce (MAIL FROM) address:
  MAIL FROM: SRS0=hhh=tt=a.com=abe@b.edu
- tt: timestamp (limits the time in which a bounce is accepted)
- hhh: Message Authentication Code (MAC) applied to the bounce address, typically using HMAC_h:
  - h: cryptographic hash function, e.g. SHA-1, MD5, SHA-256
  - Simplified: hhh = h(k || h(k || h(tt || "a.com=abe"))); standard HMAC additionally XORs k with fixed inner and outer pads
  - Key k is known only to the forwarder (e.g. b.edu)
  - Typically, a short prefix of the MAC is used, e.g. 24 bits
  - Each forwarder can choose its own MAC
- Hidden foils: SRS extended for multiple forwarders
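The authenticated rewriting above can be sketched as follows. This is a simplified illustration, not the SRS reference format: the function names, the decimal day-number timestamp, the hex encoding and the choice of HMAC-SHA256 truncated to 24 bits are all assumptions made for the example.

```python
import hashlib
import hmac

MAC_HEX_LEN = 6  # 6 hex digits = 24-bit MAC prefix, as suggested on the slide


def _mac(key: bytes, tt: str, domain: str, local: str) -> str:
    """Truncated HMAC over the timestamp and the original bounce address."""
    msg = f"{tt}={domain}={local}".encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()[:MAC_HEX_LEN]


def srs_encode(key: bytes, mail_from: str, forwarder: str, day: int) -> str:
    """Forwarder side: wrap the original MAIL FROM in an SRS0 address."""
    local, domain = mail_from.rsplit("@", 1)
    tt = str(day)
    return f"SRS0={_mac(key, tt, domain, local)}={tt}={domain}={local}@{forwarder}"


def srs_decode(key: bytes, srs_addr: str, today: int, max_age_days: int = 7) -> str:
    """Bounce side: verify MAC and age, then return the original address."""
    local_part, _forwarder = srs_addr.rsplit("@", 1)
    tag, mac, tt, domain, local = local_part.split("=", 4)
    if tag != "SRS0":
        raise ValueError("not an SRS0 address")
    if not hmac.compare_digest(mac, _mac(key, tt, domain, local)):
        raise ValueError("bad MAC: address was not issued by this forwarder")
    if not 0 <= today - int(tt) <= max_age_days:
        raise ValueError("bounce address expired")
    return f"{local}@{domain}"


key = b"secret known only to b.edu"   # hypothetical forwarder key
addr = srs_encode(key, "abe@a.com", "b.edu", day=100)
print(addr)
print(srs_decode(key, addr, today=103))
```

A forged recipient such as the `Joe@b.edu` example fails the MAC check, because the attacker does not know the forwarder's key, and a replayed genuine address stops working once the timestamp is too old.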

SRS with Two Forwarders
- Regular SRS authenticated MAIL FROM address: SRS0=hhh=tt=a.com=abe@b.edu
- Two forwarders, SRS authenticated MAIL FROM: SRS1=hhh=b.edu==hhh=tt=a.com=abe@acm.org
[Figure: Abe -> a.com MTA -> b.edu MTA (first forwarder) -> acm.org MTA (second forwarder) -> b.com MTA.]

SRS with Multiple Forwarders
- SRS also supports multiple forwarders
- The encoding must not be too long, or it won't pass!
- MAIL FROM only needs to include:
  - The authenticated MAIL FROM as sent by the first forwarder
  - The identity and authentication (MAC) of the last forwarder
  SRS1=hhh=b.edu==hhh=tt=a.com=abe@acm.org
[Figure: Abe -> a.com MTA -> b.edu MTA (first forwarder) -> ... -> acm.org MTA (last forwarder) -> b.com MTA.]

SPF vs. SIDF (Sender ID Framework)
- SPF identifiers discussed so far:
  - IP address: in the IP header
  - HELO/EHLO domain: in the greeting
  - MAIL FROM (bounce address): in the envelope
- Almost all MUAs identify messages using From:
  - And/or (less commonly) other message (RFC 822) headers
- SIDF (Sender ID Framework, by Microsoft)
  - Idea: authorize use of the MUA-visible identity
  - Requires defining and displaying an appropriate identity
  - Identify the party responsible for the sender-SMTP
  - Then look up the appropriate SPF record
  - PRA: Purported Responsible Address

PRA: Purported Responsible Address
- The last "entity" responsible for message transfer
- Hence it should authorize the sender-SMTP
- May differ from the originator (From:) identity
  - E.g. a mailing list identity (usually in "Sender")
- A function of message header fields, e.g. From:, Sender:
  - Algorithm: in hidden foil
- Uses SPF records: existing (SPF v=1) or "spf2.0/pra", "spf2.0/mfrom", "spf2.0/mfrom,pra"
  - Criticism: modified use of SPF v=1 records
- "Hybrid identity", to be displayed by mail clients

PRA Algorithm (Simplified)
1. If a "Resent-Sender" or "Resent-From" header exists: select the first such header as the PRA
2. If there is a single "Sender" header: select it as the PRA
3. If there is more than one "Sender" or "From" header: no PRA
4. If there is a single "From" header: select it as the PRA
5. Otherwise the message is ill-formed: no PRA
Example: mailing lists use "Sender" or, if "Sender" already exists, "Resent-Sender"
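The simplified selection steps can be sketched with Python's standard `email` package. The function name `select_pra` is an assumption for illustration, and the sketch follows only the simplified steps on this slide, not the full Sender ID specification.

```python
from email import message_from_string
from email.utils import parseaddr


def select_pra(raw_message: str):
    """Return the Purported Responsible Address, or None if there is no PRA."""
    msg = message_from_string(raw_message)

    # Step 1: the first Resent-Sender or Resent-From header, in message order.
    for name, value in msg.items():
        if name.lower() in ("resent-sender", "resent-from"):
            return parseaddr(value)[1] or None

    senders = msg.get_all("Sender", [])
    froms = msg.get_all("From", [])

    # Step 2: a single Sender header.
    if len(senders) == 1:
        return parseaddr(senders[0])[1] or None
    # Step 3: more than one Sender or From header means no PRA.
    if len(senders) > 1 or len(froms) > 1:
        return None
    # Step 4: a single From header.
    if len(froms) == 1:
        return parseaddr(froms[0])[1] or None
    # Step 5: ill-formed message.
    return None


forwarded = "From: abe@a.com\nResent-From: bob@b.edu\nTo: bob@b.edu\n\nNew to list\n"
print(select_pra(forwarded))
```

For the forwarded message the result is the forwarding user's address rather than the originator's, which is the behaviour the following slides question.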

Typical Forwarding and PRA
[Figure: sender's MUA -> MTA A (domain a.com) -> b.edu incoming MTA -> forward service -> b.edu outgoing MTA -> MTA B (domain b.com).]

Hop 1: MTA A (a.com) delivers to b.edu
  S: 220 b.edu
  C: HELO a.com
  S: 250 OK
  C: MAIL FROM: abe@a.com
  S: 250 OK
  C: RCPT TO: bob@b.edu
  S: 250 OK
  C: DATA
  S: 354 Enter
  C: From: abe@a.com
  C: to: bob@b.edu
  C:
  C: New to list

Hop 2: b.edu forwards to MTA B (b.com)
  S: 220 b.com
  C: HELO b.edu
  S: 250 OK
  C: MAIL FROM: abe@a.com
  S: 250 OK
  C: RCPT TO: bob@b.com
  S: 250 OK
  C: DATA
  S: 354 Enter
  C: From: abe@a.com
  C: Resent-From: bob@b.edu
  C: to: bob@b.edu

- The forwarding service adds Resent-From, so the forwarding user is identified by the PRA

Filtering by PRA & Submitter
- The PRA is derived from header fields
- Disadvantage: the receiver must receive (part of) the message
- Blocking during the envelope is more efficient
- Solution: the Submitter extension to MAIL FROM:
  S: 220 b.com
  C: HELO b.edu
  S: 250 OK
  C: MAIL FROM: abe@a.com SUBMITTER=abe@a.com
  S: 250 OK
- Allows filtering at the MAIL FROM (envelope) stage
- If it passes: validate the PRA after receiving the message

PRA Usability
- PRA spec says: "… message SHOULD NOT be displayed … without the PRA header field"
- Few mail clients support PRA
- Microsoft Outlook support: "From bob@b.edu on behalf of abe@a.com"
- No usability data published so far
  - Do users understand what this means?
- Does PRA improve security?
  - We work hard to teach users "don't trust email From"
  - Can they trust "on behalf of"?

PRA: False Sense of Security? (1)
- Does PRA improve security, or degrade it?
  From bob@b.edu on behalf of abe@a.com
- What is secure now (for bob@b.com, an Outlook user)?
- If SPF/PRA (Sender ID) was validated by the MDA, then we know the "last sender" is really bob@b.edu
- The user can't know whether validation was done!
  - It will not always be done: this is a mail server decision
- But anyway, the recipient only cares about the originator
  - Bob trusts bob@b.edu
  - Can Bob trust that the message was from abe@a.com?

PRA: False Sense of Security? (2)
- Does PRA improve security, or degrade it?
  From bob@b.edu on behalf of abe@a.com
- Suppose SPF/PRA validation was done
- And: the recipient (Bob) trusts bob@b.edu
- Can he trust that the message was from abe@a.com?
- No! He would also see this if b.edu got the message from:
  - A mailing list (not using SPF/PRA screening)
  - Another forwarder (not using SPF/PRA screening)
  - A spammer pretending to be such a list / forwarder
- Misleading: an illusion of security?

SPF and Sender-ID: Summary
- Many proposals to authorize sender-SMTP MTAs
- Most important: SPF, Sender ID
  - 35% of email, 75% of Fortune 100 [IronPort, 2006]
- Authorization can cover multiple aspects; the receiving MTA may check one or more:
  - MAIL FROM (SPF): may require SRS for forwarders
  - PRA ("responsible sender", Sender ID)
  - Net-block owner approval (via rDNS)
  - HELO identity
- Only PRA tries to prevent spoofing, with a new UI
  - But does it improve security?
  - Or is it an illusion of security?

[Figure: flowchart for handling an email connection from an IP address. Decision nodes: "Is IP in one of the lists?", "Opt: rDNS?", "Domain authorized this IP? (SPF)", "Check Signature (DKIM)", "Content-based Classifier". Edge labels: IP in blacklist; IP in greylist; IP in whitelist (for this domain); Not; Fail; Unauthorized IP (SPF hard fail); Yes, trusted domain; Sig OK, trusted domain; No/bad signature (and sig required); Else; Ok. Outcomes: Block; Opt: add IP to blacklist; Temporary failure, add IP to greylist after delay; Display.]

Origin Authentication is Good
- If we can authenticate originators:
  - We can safely white-list trusted originators
  - We can safely black-list spammers
  - We can filter based on reputation and guarantees
- Authentication -> Accountability -> Good

Authentication: Origin or Sender-SMTP?
- Accountability: the goal of SPF, SIDF, etc.?
  - By authorizing the sender-SMTP for specific origin identifiers
  - OK for direct email from outgoing MTA to incoming MTA
  - Problems with intermediaries, especially forwarders
  - Fails with reflectors: sending unmodified content to a list
  - Fails for web-generated email
  - Fails for non-email spam, e.g. spam comments, spam blogs, …
- We need to authenticate the origin
  - Not necessarily the sender-SMTP
  - Allow forwarding of an authenticated (signed) message…

DomainKeys Identified Mail (DKIM)
- Transparent, compatible with existing MUAs
  - Only adds new headers, e.g. DKIM-Signature:
  - Ignored by legacy clients and servers
  - No unknown MIME type, so no questions!
- Email signed by:
  - Outgoing MTA (usually)
  - MUA (or intermediate MTA)
  - But could be any DNS entity
- Simple key management

Example of DKIM Signature, Tags
- a= algorithm
- d= domain
- s= selector, subdividing the namespace for the "d=" (domain) tag
- q= key query method(s) (default: dns)
- i= `signing address`: email address of the user or agent accountable for the message; its domain must be `under` the d= domain
- t= signature creation timestamp
- x= signature expiration time (default: no expiration time)
- h= signed header fields
- z= copied header fields
- b= (encoded) signature value

DKIM-Signature: a=rsa-sha1; d=example.net; s=brisbane; q=dns;
  i=@eng.example.net; t=1117574938; x=1118006938;
  h=from:to:subject:date;
  z=From:foo@eng.example.net|To:joe@example.com|Subject:demo%20run;
  b=dzdVyOfAKCdLXdJOc9G2q8LoXSlEniSbav+yuU4zGeeruD00lszZVoG

DKIM Key Management
- DKIM goal: simple key management
- Default: fetch the public key from DNS (q=dns)
  - Other methods to fetch the key are optional (and not specified yet)
- DKIM keys are stored in the "_domainkey" subdomain
  - DKIM signatures identify the public key by domain (d=) and selector (s=)
  - Given DKIM-Signature: d=example.net; s=brisbane…, the DNS query will be for brisbane._domainkey.example.net
- DNS records, type: TXT or DKK
- DKK: new binary DNS type
  - Binary, to allow large keys in a single DNS UDP packet (<512B)
  - Try to get it first; if that fails, get the TXT record
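As an illustration of how a verifier gets from the signature header to the DNS name it queries, here is a minimal sketch. The helper names `parse_tag_list` and `dkim_query_name` are assumptions for this example, and the parser is deliberately simplified (it drops all whitespace inside values and does no validation).

```python
def parse_tag_list(value: str) -> dict:
    """Split a 'tag=value; tag=value' list into a dictionary."""
    tags = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, val = part.partition("=")
        # Simplification: remove folding whitespace anywhere in the value.
        tags[name.strip()] = "".join(val.split())
    return tags


def dkim_query_name(tags: dict) -> str:
    """Build the DNS name <selector>._domainkey.<domain> from s= and d=."""
    return f"{tags['s']}._domainkey.{tags['d']}"


header_value = (
    "a=rsa-sha1; d=example.net; s=brisbane; q=dns; "
    "h=from:to:subject:date; "
    "b=dzdVyOfAKCdLXdJOc9G2q8LoXSlEniSbav+yuU4zGeeruD00lszZVoG"
)
tags = parse_tag_list(header_value)
print(dkim_query_name(tags))  # brisbane._domainkey.example.net
```

The selector lets one domain publish several keys at once, for example one per outgoing MTA, without changing the d= domain that recipients see.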

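DKIM Process with Keys in DNS
[Figure: message sequence between the sending MTA (IP 1.2.3.4), b.org's incoming MTA, b.org's DNS proxy server and a.com's authoritative DNS server.]
1. Sending MTA -> b.org's incoming MTA: ... mailfrom=alice@a.com ... DKIM-Signature ... q=dns
2. b.org's incoming MTA -> DNS: request the DKIM RR of a.com
3. a.com's authoritative DNS server replies with the DKIM RR of a.com: the encoded signature-validation key
4. b.org's incoming MTA receives the DKIM RR, validates the signature, and rejects the email if it is invalid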

DKIM Key Record
- Simple textual, tag-value-list format
- Tags:
  - v=: version; default (and only valid value now) is v=DKIM1
  - g=: the local-part of the signing address (i=) specified in the signature, with possible use of * as a wildcard, e.g. g=Alice+* (for Alice+home@a.com)
  - h=, k=: hash and signature algorithms
  - p=: the public key data
  - s=: restrict usage to a service, e.g. email
  - t=: currently the only value defined is `y`, for testing DKIM; the recipient should ignore the DKIM result (even if the signature fails)
  - Other (undefined) tags are ignored (also in the signature)

DKIM: DNS Security Issues
- DKIM retrieves the public key using DNS (if q=dns)
  - Other methods to fetch the key are optional (and not specified yet)
- Problem 1: DNS poisoning (fake response)
  - `Excuses` (in spec): complex attack, limited damage…
- Solution: use Secure DNS (DNSSEC)
  - The domain owner signs all records of the domain
  - Including the public keys of secured sub-domains
  - If the DKIM record is in a secure domain, we are OK
- Validate that i=abe@a.b.com signed the message, if:
  - The message contains DKIM-Signature: d=b.com; s=staff
  - Bob makes a DNS query for staff._domainkey.b.com
  - Bob gets a response r such that Valid_v(r)=Ok, where v is a valid key of ___
  - And abe@a.b.com is in scope (using the g= tag)

`Domain Policy Records` and Secure DNS
- Problem 2: the attacker blocks the DNS response
  - Validation fails… false positive / negative?
- How do we know that a domain uses DKIM?
  - So that we can ignore unsigned (fake) messages from it
  - ADSP (Author Domain Signing Practices) in DNS
  - ADSP says: this domain always uses DKIM (cf. SPF)
- What if the attacker blocks ADSP??
- A Secure DNS server signs a response range:
  Sign_{ac.il.s}(from="abe", fkey=a8us…, to="carl", tkey=acjk7…)
  - We get proof that there is no record for Alice, Bob!

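DKIM ADSP Process
[Figure: message sequence between the sending MTA (IP 1.2.3.4), b.org's incoming MTA, b.org's DNS proxy server and a.com's authoritative DNS server.]
1. Sending MTA -> b.org's incoming MTA: ... mailfrom=alice@a.com ... (no DKIM-Signature)
2. b.org's incoming MTA -> DNS: request the ADSP RR of a.com
3. a.com's authoritative DNS server replies with the ADSP RR of a.com: "all my email has DKIM-Signature!"
4. b.org's incoming MTA receives the ADSP RR and rejects: email from a.com must have a DKIM-Signature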

DKIM and Email Mangling (1)
- Mail agents may modify ("mangle") email:
  - Add headers
  - Insert "white space"
  - Add prefix and/or suffix text ("scanned by…")
- Cryptographic authentication:
  - Sensitive to any change in content
  - $\mathrm{Validate}_v(m', \mathrm{Sign}_s(m)) = \mathrm{False}$
  - For any $m' \neq m$, even $m' \in \mathrm{mangle}(m)$
- Solutions in DKIM [hidden]
  - Scope limitations
  - Canonicalization

DKIM and Email Mangling (2)
- Solutions to mangling in DKIM
- Scope limitations, scope(m):
  - l=n attribute: sign only the first n bytes of the body
  - Sign only specified email headers (h= attribute)
  - Save copies of selected headers (z= attribute)
- Canonicalization function, canon(m):
  - Remove mangling (white space, etc.)
  - Property: return the same value for every mangling
  - c= attribute: canonicalization method for header, body
  - If $m' \in \mathrm{mangle}(m)$ or $m' = m$ then $\mathrm{canon}(m') = \mathrm{canon}(m)$
  - Else ($m$, $m'$ are "different"): $\mathrm{canon}(m) \neq \mathrm{canon}(m')$
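A minimal sketch of the canonicalization property, loosely modelled on DKIM's "relaxed" body canonicalization (collapse runs of spaces and tabs, drop trailing whitespace on each line, drop trailing empty lines). The function name `canon_body` is an assumption, and the code is an illustration of the idea rather than a conforming implementation.

```python
import re


def canon_body(body: str) -> str:
    """Map a message body and its whitespace manglings to one canonical form."""
    lines = body.replace("\r\n", "\n").split("\n")
    # Collapse runs of spaces/tabs to one space, then strip trailing spaces.
    lines = [re.sub(r"[ \t]+", " ", line).rstrip(" ") for line in lines]
    # Ignore empty lines at the end of the body.
    while lines and lines[-1] == "":
        lines.pop()
    return "".join(line + "\r\n" for line in lines)


original = "Hi pal!\r\n"
mangled = "Hi  pal! \r\n\r\n"        # extra spaces and a trailing empty line
different = "Hi pal?\r\n"            # genuinely different content

print(canon_body(original) == canon_body(mangled))     # True
print(canon_body(original) == canon_body(different))   # False
```

The signer hashes `canon_body(m)` instead of `m`, so whitespace mangling in transit does not invalidate the signature, while a real content change still does.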


Refresh: Reverse DNS (rDNS)
- Resolve an IP address to a domain name
  - Uses the special domains in-addr.arpa, ip6.arpa
  - Uses PTR records
- IP addresses are more specific left to right
- Domain names are more specific right to left
- Result: rDNS IP listings are reversed
- Example: to find the domain name for 133.6.7.8, use a DNS PTR query for 8.7.6.133.in-addr.arpa
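The reversal can be checked with Python's standard `ipaddress` module, whose `reverse_pointer` attribute returns the name used in the PTR query. The hand-written `ptr_name_v4` helper is an assumption added only to make the reversal explicit.

```python
import ipaddress


def ptr_name_v4(ip: str) -> str:
    """Reverse the dotted octets and append the in-addr.arpa suffix."""
    return ".".join(reversed(ip.split("."))) + ".in-addr.arpa"


print(ptr_name_v4("133.6.7.8"))                            # 8.7.6.133.in-addr.arpa
print(ipaddress.ip_address("133.6.7.8").reverse_pointer)   # same name, from the stdlib
print(ipaddress.ip_address("2001:db8::1").reverse_pointer.endswith("ip6.arpa"))
```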

Reverse DNS Validation
- Goal: validate the SMTP sender using its IP address
- Valid servers have rDNS entries
  - Controlled by the ISP
- The incoming MTA issues an rDNS query for the SMTP sender
  - If there is no entry, or it differs from HELO: validation failed
- rDNS validation is only an indicator:
  - There are legitimate servers without rDNS entries
  - There are spammers with rDNS entries

[Figure: flowchart for handling an email connection from an IP address, simplified variant. Decision nodes: "Is IP in one of the lists?", "Check SPF record", "Check Signature (DKIM)", "Content-based Classifier". Edge labels: IP in blacklist; IP in greylist; IP in whitelist; Not; SPF hard fail; SPF OK, trusted domain; DKIM hard fail; Sig OK, trusted domain; Else; Ok. Outcomes: Block; Opt: add IP to blacklist; Temporary failure, add IP to greylist after delay; Display.]

Content Filtering
- Content filtering: inspect message contents to categorize messages and block spam
  - Incoming and outgoing messages
  - By rules, statistics, machine learning [see hidden]
- Actions:
  - Allow
  - Block the entire message
  - Block/change a part, e.g. a suspect attachment or URL
  - Label, e.g.: X-Spam-Status: No, score=1.7

Spam Content Filtering: Motivation
- Content filtering is often used to block spam
  - Server-based: protects the entire network
  - Host-based: a feature of many MUAs
- Can block spam from spambots
  - Spambot: malware sending spam
  - Spamhaus estimate: 70% of spam comes from spambots
- New spambots send spam using the default MSA
  - Spam appears to come from the ISP, etc.
  - Blacklisting, greylisting, SPF, DKIM are ineffective
  - Content filtering is the main defense

Rule-based Content Filtering
- Most contemporary mail clients support filter rules
  - Example: `FREE` in body -> discard
- Rules set by: users, sys-admin, filtering service
  - In reality: this is beyond most (naive) users
  - Easy to get "false positives", e.g. `FREE` may appear in non-spam messages
- A natural technique that was effective initially
- But: spammers adapt and send `FREEE` instead
- It is difficult to maintain a good rule base
  - Motivation for using "updated spam rules" services
  - But: spammers subscribe to the services and "tune up" their spam
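A toy illustration of the rule from the slide and its two weaknesses. The rule list `RULES` and the function `is_spam` are hypothetical; real mail clients express such rules in their own configuration formats.

```python
import re

# Hypothetical rule base: discard if the body contains the word FREE.
RULES = [re.compile(r"\bFREE\b")]


def is_spam(body: str) -> bool:
    """Return True if any rule matches the message body."""
    return any(rule.search(body) for rule in RULES)


print(is_spam("Get your FREE pills now"))          # True: rule fires on spam
print(is_spam("Get your FREEE pills now"))         # False: trivial evasion
print(is_spam("FREE parking at the new office"))   # True: false positive on ham
```

Adding a second rule for `FREEE` only starts an arms race, which is the maintenance problem the slide describes.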

Statistical Bayesian Filtering
- Rule-based filtering is hard to maintain
- Rules are too simplistic
  - Binary decisions over one or very few attributes
  - The final decision is binary
  - How to combine several attributes?
- Statistical / Bayesian filtering
  - Based on conditional probability: Bayes' formula
  - Estimate probabilities for many terms and attributes
  - Combine terms to minimize false positives / negatives
  - Very good detection ratios against current spam

Bayesian Filtering: Example
- Sample n=1000 "typical" messages
- Categorize (carefully) as spam / ham (non-spam)

# of messages   Contain keyword (e.g. FREE)   Do not contain keyword   Total
Spam            360                           300                      x=660
Ham             z=40                          300                      340
Total           y=400                         600                      n=1000

Detection of FREE as spam   Positive (spam) detection (FREE)   Negative (ham) detection (no FREE)
Correct                     1 - z/y = 90%                      50%
False                       z/y = 10%                          50%
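The percentages follow directly from the sample counts (360 spam and 40 ham messages with the keyword, 300 spam and 300 ham without it). The short check below recomputes them and confirms that Bayes' formula gives the same posterior; the variable names are chosen for this illustration only.

```python
# Counts from the 1000-message sample on the slide.
spam_kw, spam_no = 360, 300   # spam with / without the keyword
ham_kw, ham_no = 40, 300      # ham with / without the keyword
n = spam_kw + spam_no + ham_kw + ham_no

# Direct estimate: fraction of keyword messages that are spam.
p_spam_given_kw = spam_kw / (spam_kw + ham_kw)

# Same quantity via Bayes' formula: P(kw|spam) * P(spam) / P(kw).
p_kw_given_spam = spam_kw / (spam_kw + spam_no)
p_spam = (spam_kw + spam_no) / n
p_kw = (spam_kw + ham_kw) / n
bayes = p_kw_given_spam * p_spam / p_kw

false_positive = ham_kw / (spam_kw + ham_kw)     # ham flagged as spam
false_negative = spam_no / (spam_no + ham_no)    # spam passed as ham

print(round(p_spam_given_kw, 3), round(bayes, 3))           # 0.9 0.9
print(round(false_positive, 3), round(false_negative, 3))   # 0.1 0.5
```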

Using Multiple Keywords
- We got 10% false positives and 50% false negatives
- The false positive rate is critical
  - Valid messages considered as spam are lost
  - 10% is certainly too high
- Real filters use multiple keywords
- Example: using two keywords
  - B: message contains "FREE"
  - C: message contains "hardcore"

Positive Keywords
- Positive keywords indicate ham
  - Important, to avoid false positives
  - But: not if known to the attacker
- Work well for organizations and individuals
  - This motivates organizational / personal filtering
- Spammers try to learn positive keywords
  - Any response to spam, even a negative one, can help them
  - Or: learn by spyware

Spam Content Filtering: Trends
- Currently very effective
  - Identifies spam with removal instructions easily: a benefit of anti-spam laws
  - Detects "intentional typos", etc.
  - Example: Viagra -> suspect spam; Viaggra -> almost definite spam
- Spammers adapt
  - Ultra-short spam: "You must see "
    - URL blacklists become critical
  - Mimic legitimate mail (keywords, ...), especially phishing
  - Spam comment/blog: immediate feedback

Phish-or-Real Classification
- Classifiers work well for most ad-spam
  - Keywords, URLs, lots of learning samples, ...
- Challenge: classify as `phish` vs. `non-phish`
  - Phishing emails mimic legitimate emails
  - Very different phishing email styles
  - Often few samples (esp. `spear phishing`)
- Idea: combine the classifier with authentication (SPF, ...)
  - Classify as `Phish-or-Real` of foo.com, or `other`
  - Phishing: `Phish-or-Real` of foo.com that fails authentication
  - Works if `real` foo.com always passes SPF/DKIM/IP

Phish-or-Real Classification
- Classification into three types:
  - Identified spam/phishing emails
  - `PhishOrReal` (of foo.com)
  - Other emails
[Figure: flowchart. Input: email not from the bank (e.g. the bank always signs). Node "Content-based classifier" with outputs Phish/spam, PhishOrReal from foo.com, Other. Node "foo.com always identified by IP, DKIM or SPF?" with outputs Yes, No. Outcomes: Block; Add IP to blacklist with prob. p; Display.]

Phisher's Economics Analysis
[Figure: decision tree. Phish received -> "`White-listed` by IP, SPF or DKIM?" (branches: Yes / Trusted; No; Hard fail -> Block) -> Content-based classifier: Suspect with probability $p$ (Block, cost $c$), Other/Else with probability $1-p$ (Display) -> User: suspects with probability $p_u$ (cost $c_u$), otherwise with probability $1-p_u$ (gain $g$).]
Phisher's dilemma:
- Better mimic (lower $p_u$) -> classifier more effective
- Weaker mimic (higher $p_u$) -> user likely to suspect/ignore

Simplified Economics Analysis
[Figure: same decision tree as on the previous slide, with classifier probability $p$ and cost $c$, user probability $p_u$ and cost $c_u$, and gain $g$.]
Simplifications: $c_u \approx 0$, $p_u = x - p$ $(0 < x < 2)$, $c > 2g$
Phisher's utility: $U = -pc + (1-p)(1-x+p)g$

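When is phishing unprofitable?
Simplifications: $c_u \approx 0$, $p_u = x - p$, $c > 2g$
Phisher's utility: $U = -pc + (1-p)(1-x+p)g = -gp^2 + (xg - c)p + (1-x)g$
- $U' = -2gp + xg - c$, so the unconstrained optimum is $p^* = x/2 - c/(2g)$
- $c > 2g \Rightarrow p^* = x/2 - c/(2g) < x/2 - 1 < 0$
- Since $p \ge 0$ and $p_u = x - p \le 1$, the optimal choice is $p^* = \max(0, x-1)$, $p_u^* = \min(x, 1)$
Optimal utility:
- If $x > 1$: $p^* = x - 1$, $p_u^* = 1$, $U = -p^* c < 0$
- If $x < 1$: $p^* = 0$, $p_u^* = x$, $U = (1-x)g > 0$
- Hence the adversary is profitable if and only if $x < 1$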
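The optimum can be checked numerically. The sketch below evaluates the phisher's utility on a grid over the feasible values of p (those keeping both p and p_u = x - p within [0, 1]); the function names and the sample values c = 3, g = 1 are assumptions chosen only to satisfy c > 2g.

```python
def utility(p: float, x: float, c: float, g: float) -> float:
    """Phisher's utility U = -p*c + (1-p)*(1-p_u)*g with p_u = x - p and c_u = 0."""
    p_u = x - p
    return -p * c + (1 - p) * (1 - p_u) * g


def best_p(x: float, c: float, g: float, steps: int = 1000) -> float:
    """Grid search for the utility-maximizing p over the feasible interval."""
    lo, hi = max(0.0, x - 1.0), min(1.0, x)
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: utility(p, x, c, g))


c, g = 3.0, 1.0                      # assumed values with c > 2g
for x in (0.5, 1.5):
    p = best_p(x, c, g)
    print(x, p, utility(p, x, c, g))
```

For x = 0.5 the best choice is p = 0 with positive utility (1 - x)g, and for x = 1.5 it is p = x - 1 with negative utility, matching the closed-form result max(0, x - 1).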

Agenda: Email Security and Spam
- Intro to email security & spam
  - Goals, threats, motivation
- Anti-spam defenses and their use
  - Hidden email address
  - White/Blacklists/Greylisting
  - SMTP-Sender Authorization (SPF, SIDF)
  - Signed Email (DKIM)
  - Content Filtering
  - Hidden: Reputation, Accreditation, Penalties
- Summary

Conclusions
- Spam and phishing are major problems
- Combine authentication, reputation and classification to filter spam and phishing
- An appropriate combination can make phishing unprofitable
  - Further research is required on `phish-or-real` filtering (and on user detection)

Summary
- Spam is a critical threat
  - Helps and motivates malware, break-ins
  - Wastes resources, annoys users
  - Lost email (false positives) harms email
- Combine authentication, reputation and classification to filter spam and phishing
  - Can make phishing and spam unprofitable
  - Further research is required on `phish-or-real` filtering (and on user detection)

Beyond Authentication
- DKIM authenticates the sender
- Allows white-listing of known and trusted senders
- What about unknown and untrusted senders?
- Reputation and accreditation:
  - Senders provide certificates ("recommendations") from a trusted authority
  - Reputation: based on the history of the sender
  - Accreditation: the sender pays or deposits at the authority
- Cost-based:
  - Postage: pay for each message
  - Penalty: pay only for spam

Reputation Services
- Allow receiving messages from reputable senders, known to a trusted authority
- Senders provide proof(s) of reputation
  - From reputation / rating service providers
  - Cryptographically authenticated: signature/MAC
  - Or authenticated by routing, e.g. a DNS query
- A reputation record contains:
  - Identifiers: public key, email and/or IP addresses
  - Reputation data: rank, history, sources, address
- Who pays the reputation service? Is it biased?
- How to collect a history of non-spamming?

Accreditation Services (1)
- Allow receiving messages from:
  - Senders without a (good) reputation and history
  - Senders that are new or recovering from loss of reputation
  - E.g. after removing spamware / a spambot…
[Figure: Sender (Alice) registers with the Accreditation Service; Alice sends mail (and certificate?) to Recipient (Bob); Bob sends feedback / queries to the Accreditation Service.]

Accreditation Services (2)
- Senders register with the accreditation service
  - The sender gives a bonded pledge and/or pays fees
  - The certificate is signed by the accreditation service
  - The signature identifies sender and recipient
  - This prevents using one certificate for many recipients

Accreditation Services (3)
- The sender provides an accreditation certificate
  - E.g. in "hidden" email header extensions with the message
  - Or by DNS lookup
- The recipient provides feedback to the accreditation service
[Figure: 1. Sender (Alice) buys a certificate from the Accreditation Service; 2. Alice sends mail and certificate to Recipient (Bob); 3. Bob sends feedback (optional) to the Accreditation Service.]

Bonded Sender Accreditation Service
- White-list / accreditation service for (commercial) non-spam senders
- Senders deposit a bond, certified by TRUSTe
- Recipients query the service for the status of senders
- The bond is charged if complaints are received
[Figure: Sender (Alice) registers and deposits a bond with the Bonded Sender Service; Alice sends mail to Recipient (Bob); Bob sends a DNS query and feedback to the Bonded Sender Service; TRUSTe provides certificates and resolutions.]

Refundable Postage `Accreditation`
- The sender buys a stamp and sends it with the email
  - If the sender is willing to pay, this is hopefully not spam…
- The recipient may deposit the stamp (e.g. to use for sending), or refund the sender (if non-spam)
[Figure: 1. Sender (Alice) buys a stamp from the Post Office; 2. Alice sends mail + stamp to Recipient (Bob); 3. Bob deposits and/or refunds the stamp (optional).]

Responsibility for Spambots, Zombies?
- Penalty mechanisms will fine victim users
  - Whose machine runs spamware
- Is this fair?
  - The user is responsible for damages due to her computer
  - It motivates users to improve security
- Is this feasible?
  - Users may object to paying "spam fines"
  - Or: the Mail Service Provider (MSP) pays spam fines… and provides a `flat fee anti-spam` service

Goodmail CertifiedEmail Service
- Accreditation service for (commercial) non-spam senders, (to be?) adopted by Yahoo!, AOL
- Spammers/spoofers are penalized (or disconnected)
  - A substantial `deposit`, so that the penalty is substantial
  - Random feedback identifies double use of tokens
[Figure: entities: Sender (Alice), CertifiedEmail Service, MTA (e.g. AOL), Recipient (Bob). Labels: h(m); Token=Sign(h(m)); Fees ($$$) (per quantity); m, Token=Sign(h(m)); m; Feedback; Kick-back ($$); Fees ($/0).]

Who is Guarding the Guards?
- Can we trust reputation and accreditation services?
- Accreditation services charge the senders
  - So do some reputation services
- Incentive to be permissive
  - Allow (some) customers to spam
- False positives are also possible (if not signed, or sent by a zombie)
  - "Denial of reputation": bad-mouthing attack
  - "I got spam from xxx": maybe a lie? Or due to spoofing?
- Interoperable, `open` accreditation services?
- Solutions:
  - Proofs of spamming, not just complaints
  - Use well-defined spam: email without a "warning" label
  - Compensation to the victim
  - Penalty to spammers

Financial Penalties for Spam
- Spam = mail with an incorrect label
  - The sender/server digitally signs the mail and the label
  - The recipient may select content based on labels
  - Penalize if the label is incorrect
  - In the following foils: label $\in$ {'OK', 'spam'}
- Penalties may be:
  - Loss of trust / reputation
  - Financial compensation / fine (money)
- Challenges:
  - Automated / secure resolution process
  - Bootstrapping, "do no harm"…

SeARP: Secure Automated Resolution
- Entities:
  - Sender (Alice) / Recipient (Bob): legacy MUAs
  - Mail Service Providers (MSP): MSP-A (sender's) / MSP-B (Bob's)
    - Trust each other (to pay `spam penalties`)
    - Agree on the definition of spam and on the resolution process/agent
  - Resolution Agent/Authority (RA): was it spam?
- Multiple scenarios:
  - Deployed by both MSPs, with a `legacy` RA (e.g. Spamhaus)
  - Deployed by both MSPs and the RA
  - Later: adding payment server(s)

SeARP: Deployed by MSPs (legacy RA)
(MSP_B trusts MSP_A to pay; no PSP)
[Figure: message flow between Sender (Alice), MSP_A, MSP_B, Recipient (Bob) and the Resolution Authority (RA): Norton. Labels:]
- m
- m, Sign_{A.s}(m, "Ok", expires): "I'll pay 10$ for each message with a virus, as defined by Norton's list"
- m
- Virus
- m, virus (?)
- Virus: {…, m, …}, date (if date<expires, MSP_A owes 10$)

SeARP: Deployed by MSPs and RA
(MSP_B trusts MSP_A to pay; no PSP)
[Figure: message flow between Sender (Alice), MSP_A, MSP_B, Recipient (Bob) and the Resolution Authority (RA). Labels:]
- m
- m, Sign_{A.s}(m, "Ok", expires, 10$, RA.v): "I'll pay 10$ for each spam, as defined by RA.v; I sign by A.v"
- m
- Spam/Ok
- m, Spam (?)
- Sign_{RA.s}(m, "is/isn't spam", date) (if date<expires, MSP_A owes 10$)

SeARP: Deployed by MSPs, RA and PSP
(MSP_B trusts the PSP to pay; can be extended to many PSPs)
[Figure: message flow between Sender (Alice), MSP_A (trusts PSP.v), MSP_B (trusts PSP.v), Recipient (Bob), RA and PSP. Labels:]
- A.v, ValidPeriod, MAC_k(A.v, ValidPeriod, MSP_B)
- Cert(t) = Sign_{PSP.s}(v, ValidPeriod, total, MSP_B)
- m
- m, Cert(t), Sign_{A.s}(m, "Ok", RA.v)
- m
- Spam/Ok
- m, Spam (?)
- R = Sign_{RA.s}(m, "is/isn't spam", date)
- Deposit(R, m, Cert(t), Sign_{A.s}(m, "Ok", RA.v))

SeARP: Deployed by MSP_A, PSP only
(MSP_B trusts accreditation by the PSP, and a (legacy) RA to decide)
[Figure: message flow between Sender (Alice), MSP_A, MSP_B (DKIM), Recipient (Bob), RA and PSP. Labels:]
- s(t), MAC_k(s(t), t)
- Cert(t) = Sign_{PSP.s}(s(t), expires)
- m
- m, Cert(t), Sign_{s(t)}(m, "Ok", RA.s, amt)
- m
- Spam/Ok
- m, Spam (?)
- R = Sign_{RA.s}(m, "is/isn't spam", date)
- Spammer penalized by the PSP or by other recipients
- Use the DKIM process
