Blocking Spam
Premises:
- If every domain has an SPF (DNS TXT) record, it will be easy to block all spam.
- Because it would be easy to block all spam, it is appropriate to expect all domains to have an SPF record.
- It is easy to send errors to users of domains without SPF configured, without sending backscatter to forged addresses.
SPF records are an easy way for the owner of a domain name to list the servers that legitimately send email for their domain. In effect, the record is a whitelist published through the domain's own DNS servers (a single TXT record for each domain and sub-domain).
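To illustrate what such a record contains, here is a minimal sketch that splits an SPF TXT string into its terms. The record, the domain names, and the helper name parse_spf are made up for this example, and the parser ignores details (such as modifiers and macros) that a real SPF library handles.

```python
def parse_spf(txt):
    """Split an SPF TXT record into (qualifier, mechanism, value) tuples."""
    terms = txt.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF record")
    parsed = []
    for term in terms[1:]:
        qualifier = "+"  # "+" (pass) is the default when no qualifier is given
        if term[0] in "+-~?":
            qualifier, term = term[0], term[1:]
        # Split on the first ":" only, so IPv6 addresses stay intact.
        name, _, value = term.partition(":")
        parsed.append((qualifier, name, value))
    return parsed


# Hypothetical record: one sending server, one included list, neutral default.
record = "v=spf1 ip4:192.0.2.10 include:_spf.example.net ?all"
for qualifier, name, value in parse_spf(record):
    print(qualifier, name, value)
```

Each tuple is one entry of the whitelist: the qualifier says what result a match produces, and the trailing "all" term says what to do with every server that was not listed.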
Block all email without a matching SPF record, and blacklist all spammer domains using SPF, and you block all spam.
I expect blacklisting SPF domains to be much easier than IPs.
It is easy to reject email lacking an SPF match during delivery, so un-forged From: addresses get an error message, and you don't send backscatter to forged addresses.
While the percentage of domains using SPF is still growing, it might be better to add a rule to a spam filter (like SpamAssassin) so that only a small additional fraction of non-spam is classified as spam. With this approach it is a little more difficult to give error messages to legitimate senders, but still easy.
As of February 2010, 28.7% of my non-spam does not have an SPF record.
SpamAssassin rule for not matching an SPF record
meta SPF_NOT_PASS !(SPF_PASS || NO_RELAYS)
score SPF_NOT_PASS 4.506 # flag 10% of non-spam that hits this rule as spam.
describe SPF_NOT_PASS Not fully validated by SPF.
/etc/spamassassin/local.cf is a good place to put this.
Remember you're blocking all spammer domains with SPF using a domain blacklist.
| Spams without SPF blocked | Non-spams without SPF blocked | SpamAssassin score for SPF_NOT_PASS |
| 100% | 28% | 100 |
| 99.92% | 10% | 4.506 |
| 99.52% | 1% | 2.356 |
| 99.15% | 0.1% | 0.285 |
These numbers are estimated from the 97.84% spam accuracy of SpamAssassin and the 73.2% spam accuracy I'm getting due to significantly pre-filtering spam (RBL, greylisting, etc.).
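The trade-off behind the table can be sketched as follows, assuming SpamAssassin's usual behaviour of adding a rule's score to the message total and flagging the message once the total reaches the threshold (5.0 by default). The base scores are invented sample data, not real mail, and fraction_flagged is a hypothetical helper.

```python
def fraction_flagged(base_scores, rule_score, threshold=5.0):
    """Fraction of messages at or above the threshold after adding rule_score."""
    flagged = sum(1 for score in base_scores if score + rule_score >= threshold)
    return flagged / len(base_scores)


# Invented SpamAssassin totals for non-spam messages that lack an SPF pass.
non_spam = [-2.0, -0.5, 0.1, 0.4, 0.9, 1.5, 2.0, 2.7, 3.1, 4.8]

# Try the rule scores from the table above against the sample data.
for rule_score in (0.285, 2.356, 4.506, 100):
    print(rule_score, fraction_flagged(non_spam, rule_score))
```

Running the same calculation over real score distributions for spam and non-spam is one way to pick a score for SPF_NOT_PASS that matches the false-positive rate you can tolerate.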
Blacklist spammer domains using SPF
For Postfix, in main.cf, add "check_sender_access hash:/etc/postfix/sender_access" to "smtpd_recipient_restrictions =". The format of /etc/postfix/sender_access is one domain per line, for example "example.com REJECT Domain blacklisted for sending spam." Run "postmap /etc/postfix/sender_access" after each edit so Postfix picks up the change.
It is the domain from the SMTP MAIL FROM command that you need to blacklist, often stored in the Return-Path: header. Not the From: address.
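As a sketch of that distinction, the snippet below pulls the envelope sender's domain out of the Return-Path: header of a stored message and formats a sender_access line for it. The sample message, its domains, and the helper names are hypothetical.

```python
from email import message_from_string
from email.utils import parseaddr


def envelope_domain(raw_message):
    """Return the lowercased domain of the Return-Path address, or None."""
    msg = message_from_string(raw_message)
    _, addr = parseaddr(msg.get("Return-Path", ""))
    if "@" not in addr:
        return None  # an empty Return-Path (<>) is used by bounces
    return addr.rsplit("@", 1)[1].lower()


def access_line(domain, reason="Domain blacklisted for sending spam."):
    """Format one /etc/postfix/sender_access entry."""
    return f"{domain} REJECT {reason}"


# Hypothetical spam: the From: header is forged, the Return-Path is not.
raw = (
    "Return-Path: <bounce@spammer.example>\n"
    "From: Someone You Trust <friend@example.com>\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)
print(access_line(envelope_domain(raw)))
```

Note that the From: domain (example.com here) is never consulted; only the envelope sender's domain ends up in the blacklist.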
Domain name blacklists (bottom of the page) from Jeff Makey
Domain blacklists from spamlinks.net
Postfix syntax:
reject_rhsbl_sender hostkarma.junkemailfilter.com=127.0.0.2
reject_rhsbl_sender block.rhs.mailpolice.com
reject_rhsbl_client block.rhs.mailpolice.com
I'm not using these domain blacklists yet because it's too easy to maintain my own list.
Filter spam during delivery
Filtering during delivery means error messages go only to non-forged sending addresses.
With the Postfix mail server: Spampd as a Before-Queue Content Filter
"I found that with this setup on my server, SpamAssassin couldn't determine the envelope sender as needed for certain rules (e.g. DNS_FROM_*, NO_DNS_FOR_FROM, SPF_*). I fixed this by passing the --sef (--seh could work as well; but see documentation first) switch to spampd and then adding envelope_sender_header X-Envelope-From to my SpamAssassin config. - JoshuaPettett"
Postfix main.cf:
message_size_limit = 10485760
spampd:
--maxsize=10240
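This limits email size in both to 10 megabytes (Postfix counts in bytes, spampd in kilobytes). Set it to whatever you like, but if your MTA accepts larger emails than spampd, spampd will skip spam filtering on them.
To find missing optional components, search the output of "spamassassin -D" for "not available", then install them, for example: aptitude install razor pyzor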
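A quick way to check that the two limits line up, assuming Postfix's message_size_limit is in bytes and spampd's --maxsize is in kilobytes (which is what the values above imply). The helper name filtering_gap is made up for this sketch, and the exact boundary handling of either program is not modelled.

```python
def filtering_gap(postfix_limit_bytes, spampd_maxsize_kb):
    """Return the byte range Postfix accepts but spampd skips, or None."""
    spampd_limit_bytes = spampd_maxsize_kb * 1024
    if postfix_limit_bytes <= spampd_limit_bytes:
        return None
    return (spampd_limit_bytes + 1, postfix_limit_bytes)


# The settings shown above: both limits are 10 megabytes.
print(filtering_gap(10485760, 10240))

# A hypothetical 20 megabyte Postfix limit with the same spampd setting.
print(filtering_gap(20971520, 10240))
```

Any message whose size falls inside a reported range would be delivered without being scanned.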
Asking people to create SPF records
Please create this DNS TXT record for [example.com]:
[insert SPF TXT record]
It won't cause any SPF verification failures because the "?all" indicates the list is incomplete, but it will cause these listed servers to get a "pass" instead of a "none" from SPF verification. That helps with spam filters, like mine, that consider email without a "pass" more spammy.
http://www.openspf.org/
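A record of the kind described in the request above can be assembled from a list of sending server addresses. This is only a sketch: the addresses are documentation examples, build_spf is a hypothetical helper, and a real record may need other mechanisms (mx, include, and so on) that are not covered here.

```python
import ipaddress


def build_spf(addresses, default="?all"):
    """Build an SPF TXT value listing each address, ending with the default."""
    terms = ["v=spf1"]
    for text in addresses:
        ip = ipaddress.ip_address(text)  # raises ValueError on a bad address
        terms.append(f"ip{ip.version}:{ip}")
    terms.append(default)
    return " ".join(terms)


# Hypothetical sending servers for example.com.
print(build_spf(["192.0.2.10", "2001:db8::25"]))
```

Keeping "?all" as the default matches the low-risk request above; a domain owner who is sure the list is complete could pass a stricter default instead.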
Some domains using SPF
walmart.com
exxonmobil.com
chevron.com
verizon.com
homedepot.com
cvs.com
boeing.com
costco.com
target.com
dell.com
walgreens.com
sprint.com
bestbuy.com
disney.com
americanexpress.com
macys.com
3m.com
google.com
gmail.com
aol.com
hotmail.com
amazon.com
ebay.com
apple.com
microsoft.com
schwab.com
hulu.com
gentoo.org
zappos.com
facebook.com
youtube.com
blogger.com
msn.com
twitter.com
myspace.com
craigslist.org
bbc.co.uk
photobucket.com
about.com
Stuff from 2006
The mail server software (MTA) I use is Postfix.
Greylisting, SpamAssassin, SpamProbe, Image Spam, DNSWL, and Viruses
It's not necessary to do this all at once. That would probably be overwhelming. I recommend setting up one piece at a time.
- Greylisting
- I use Postgrey for greylisting. Basically, if an email server which I have not heard from before tries to deliver email to me, it is immediately given a temporary error and asked to try again later. Well-behaved mail servers will all try again; many spammers won't.
- SpamAssassin
- SpamAssassin is rule-based, and uses a number of online services. It does things like "This includes a big html font and the sender address has been reported for spamming, therefore this is spam."
- SpamProbe
- SpamProbe is a (multi-word token bayesian) pattern learning filter. You tell it "here are the spams I'm getting, and here are the non-spams I'm getting", and it figures out on its own what the differences in the patterns are.
- Image Spam
- All image spams are blocked by postfix body_checks
- DNSWL
- DNSWL is a list of mail servers known not to send spam (White List), accessible in useful ways (including DNS).
- Viruses
- Viruses are filtered with ClamAV
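The greylisting idea from the list above can be sketched in a few lines. This is not Postgrey's code: the (client IP, sender, recipient) key, the 300 second delay, and the reply texts are assumptions chosen for illustration, and a real implementation also expires old entries and persists them to disk.

```python
import time


class Greylist:
    """Temporarily reject mail until the same triplet retries after a delay."""

    def __init__(self, delay_seconds=300, clock=time.time):
        self.delay = delay_seconds
        self.clock = clock
        self.first_seen = {}  # (client_ip, sender, recipient) -> first attempt time

    def check(self, client_ip, sender, recipient):
        key = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Remember the first attempt; later attempts reuse that timestamp.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.delay:
            return "250 OK"
        return "450 Greylisted, please try again later"


# Simulated clock so the example does not have to wait five minutes.
fake_now = [0]
greylist = Greylist(delay_seconds=300, clock=lambda: fake_now[0])
print(greylist.check("192.0.2.10", "a@example.com", "me@example.org"))
fake_now[0] = 600  # the sending server retries ten minutes later
print(greylist.check("192.0.2.10", "a@example.com", "me@example.org"))
```

A sender that never retries never gets past the temporary error, which is the whole point of the technique.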
How
- Anything that might be image spam is rejected by my postfix body_checks.
- Anything ranked high or medium by DNSWL.org skips all other filtering and goes right to my inbox. (Postfix Configuration)
- The rest is greylisted via postgrey from postfix.
- Procmail filters out viruses via ClamAV.
- Procmail filters remaining spam via SpamAssassin and SpamProbe. If they agree it's spam the email goes in my spam folder and I never look at it. If they agree it's not spam it goes to my inbox. If they disagree it goes in other folders which I check.
- After verifying everything is in the right place I run my retraining script.
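The agree/disagree step in the list above amounts to a small decision table. The sketch below expresses it as a function; the folder names are hypothetical, and the real sorting is done by procmail rules rather than Python.

```python
def route(spamassassin_says_spam, spamprobe_says_spam):
    """Pick a folder from the two filters' verdicts."""
    if spamassassin_says_spam and spamprobe_says_spam:
        return "spam"    # both agree it is spam: never looked at
    if not spamassassin_says_spam and not spamprobe_says_spam:
        return "inbox"   # both agree it is not spam
    return "review"      # the filters disagree: checked by hand


for verdicts in [(True, True), (False, False), (True, False), (False, True)]:
    print(verdicts, route(*verdicts))
```

Only the messages that land in the review folder need human attention, and those are also the most useful ones to feed back into retraining.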
Image spams are, I believe, a uniquely effective way to circumvent filtering, because all of their spamminess is contained in an inlined attached image. So I reject all email containing (basically) 'src="cid:' using the Postfix body_checks regex (pattern definition):
/\bsrc\s*=(?:3D)?\s*["']?cid:/ REJECT Your email was rejected because you embedded an attached image in the body.
Rejecting during delivery ensures that legitimate senders get an error message, and that no error messages are sent to the forged sender addresses used by spam.
The postfix body_checks are implemented by putting this line in your main.cf:
# image spam regex goes in this file
body_checks = pcre:/etc/postfix/body_checks
Then in the file /etc/postfix/body_checks you put the line:
/\bsrc\s*=(?:3D)?\s*["']?cid:/ REJECT Your email was rejected because you embedded an attached image in the body.
The regex is designed to match things that look similar to '<img src="cid:spam.jpg">'. The "pcre" part of the main.cf line indicates this is a Perl Compatible Regular Expression, which means you can look it up in the perlre man page. The meaning of each piece is as follows:
| / | beginning of the regex |
| \b | Matches "word boundaries", the point between the whitespace before "src" and the beginning of "src" |
| src | the "src" part of the img tag |
| \s* | any amount of whitespace (spaces, tabs, etc.), or none |
| = | the "=" in the img tag |
| (?:3D)? | quoted printable email encoding can replace an "=" with "=3D", this handles it |
| \s* | any amount of whitespace (spaces, tabs, etc.), or none |
| ["']? | single or double quote, "?" allows for it to be missing |
| cid: | "cid:" the part of an img tag url that replaces http: and means it's an attached file, not hosted on a webserver |
| / | end of the regex |
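The pattern can be tried out before it goes into Postfix. The sketch below uses Python's re module, which accepts this particular pattern unchanged; the sample lines are made up, and re.IGNORECASE is set on the assumption that Postfix pcre tables match case-insensitively by default.

```python
import re

# Same pattern as in /etc/postfix/body_checks, without the surrounding slashes.
IMAGE_SPAM = re.compile(r"""\bsrc\s*=(?:3D)?\s*["']?cid:""", re.IGNORECASE)

samples = [
    '<img src="cid:spam.jpg">',                 # plain HTML body line
    '<img src=3D"cid:spam.jpg">',               # quoted-printable encoded "="
    "<IMG SRC = 'cid:part1'>",                  # upper case, spaces, single quotes
    '<img src="http://example.com/logo.png">',  # remote image, not an attachment
]
for line in samples:
    print(bool(IMAGE_SPAM.search(line)), line)
```

Postfix applies body_checks one line at a time, so a tag that is split across two lines of the message body would not be caught by this pattern.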
"REJECT Your email was rejected because you embedded an attached image in the body." defines the error message that will be sent only to legitimate senders.
my ~/.procmailrc config file
The relevant part of my postfix main.cf:
smtpd_recipient_restrictions =
    permit_mynetworks
    permit_sasl_authenticated
    reject_unauth_destination
    # add X-DNSWL headers
    check_client_access cidr:/home/darxus/dnswl/postfix-dnswl-header
    # skip greylisting
    check_client_access cidr:/home/darxus/dnswl/postfix-dnswl-permit
    # greylisting
    check_policy_service inet:127.0.0.1:60000
# image spam regex goes in this file
body_checks = pcre:/etc/postfix/body_checks
To get SpamAssassin to pay attention to DNSWL ranks I added these rules to /etc/spamassassin/local.cf:
header RCVD_IN_DNSWL X-DNSWL =~ /^none/
score RCVD_IN_DNSWL -0.1
describe RCVD_IN_DNSWL Sender listed at http://www.dnswl.org/, no trust
header RCVD_IN_DNSWL_LOW X-DNSWL =~ /^low/
score RCVD_IN_DNSWL_LOW -1
describe RCVD_IN_DNSWL_LOW Sender listed at http://www.dnswl.org/, low trust
header RCVD_IN_DNSWL_MED X-DNSWL =~ /^med/
score RCVD_IN_DNSWL_MED -4
describe RCVD_IN_DNSWL_MED Sender listed at http://www.dnswl.org/, medium trust
header RCVD_IN_DNSWL_HI X-DNSWL =~ /^hi/
score RCVD_IN_DNSWL_HI -8
describe RCVD_IN_DNSWL_HI Sender listed at http://www.dnswl.org/, high trust
header RCVD_IN_DNSWL_NO X-DNSWL =~ /^No$/
score RCVD_IN_DNSWL_NO 0.1
describe RCVD_IN_DNSWL_NO Sender *not* listed at http://www.dnswl.org/
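To see which of these rules a given header value would trigger, the rules can be mirrored in a short script. The header values tested below ("med", "No", and so on) are inferred from the regexes above rather than taken from the DNSWL files, and dnswl_hits is a hypothetical helper. Matching is case-sensitive, as it is for SpamAssassin header rules without the /i flag, which is what keeps "none" and "No" apart.

```python
import re

# (rule name, pattern, score) copied from the local.cf rules above.
DNSWL_RULES = [
    ("RCVD_IN_DNSWL", r"^none", -0.1),
    ("RCVD_IN_DNSWL_LOW", r"^low", -1.0),
    ("RCVD_IN_DNSWL_MED", r"^med", -4.0),
    ("RCVD_IN_DNSWL_HI", r"^hi", -8.0),
    ("RCVD_IN_DNSWL_NO", r"^No$", 0.1),
]


def dnswl_hits(header_value):
    """Return the (rule, score) pairs that match an X-DNSWL header value."""
    return [
        (name, score)
        for name, pattern, score in DNSWL_RULES
        if re.search(pattern, header_value)
    ]


for value in ("none", "low", "med", "hi", "No"):
    print(value, dnswl_hits(value))
```

Each value should hit exactly one rule; a value that hits none or several would point to a mismatch between the header Postfix adds and these patterns.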
SpamAssassin rules for use without Postfix (causes more network load for everyone):
header RCVD_IN_DNSWL eval:check_rbl('dnswl-firsttrusted', 'list.dnswl.org.')
score RCVD_IN_DNSWL -0.1
describe RCVD_IN_DNSWL Sender listed at http://www.dnswl.org/, no trust
header RCVD_IN_DNSWL_LOW eval:check_rbl_sub('dnswl-firsttrusted', '127.0.\d+.1')
score RCVD_IN_DNSWL_LOW -1
describe RCVD_IN_DNSWL_LOW Sender listed at http://www.dnswl.org/, low trust
header RCVD_IN_DNSWL_MED eval:check_rbl_sub('dnswl-firsttrusted', '127.0.\d+.2')
score RCVD_IN_DNSWL_MED -4
describe RCVD_IN_DNSWL_MED Sender listed at http://www.dnswl.org/, medium trust
header RCVD_IN_DNSWL_HI eval:check_rbl_sub('dnswl-firsttrusted', '127.0.\d+.3')
score RCVD_IN_DNSWL_HI -8
describe RCVD_IN_DNSWL_HI Sender listed at http://www.dnswl.org/, high trust
meta RCVD_IN_DNSWL_NO !RCVD_IN_DNSWL
score RCVD_IN_DNSWL_NO 0.1
describe RCVD_IN_DNSWL_NO Sender *not* listed at http://www.dnswl.org/
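These rules rely on an ordinary DNS list lookup: the client's IPv4 octets are reversed and prepended to the list zone, and the answer comes back as a 127.0.x.y address. The sketch below builds the query name and decodes an answer without touching the network. The trust mapping follows the rules above (last octet 1, 2, 3 for low, medium, high); treating 0 as "none" and the third octet as a category number are assumptions, and both helper names are made up.

```python
import ipaddress

TRUST = {0: "none", 1: "low", 2: "medium", 3: "high"}


def dnswl_query_name(ip, zone="list.dnswl.org"):
    """Build the DNS name to look up for an IPv4 client address."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def decode_answer(answer):
    """Split a 127.0.x.y answer into (category number, trust label)."""
    a, b, category, trust = (int(part) for part in answer.split("."))
    if (a, b) != (127, 0):
        raise ValueError("unexpected answer: " + answer)
    return category, TRUST[trust]


# Hypothetical client address and answer.
print(dnswl_query_name("192.0.2.10"))
print(decode_answer("127.0.5.2"))
```

Doing this lookup once in Postfix and passing the result along in a header avoids repeating the same DNS query in SpamAssassin for every message.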
To get SpamProbe to pay attention to the DNSWL I ran this command (which will overwrite your config file):
spamprobe -H+x-dnswl create-config # needs to be lowercase
To install all relevant packages on a Debian based system, run:
aptitude update;aptitude install postgrey spamassassin spamprobe clamav
Versions on Debian Stable are typically pretty ancient. To keep up with spam it can be pretty important to verify you're running a reasonably recent version of this software.
How do I keep my spammers off my network?
Submit your mail server to DNSWL.org to be whitelisted.
Sun Feb 7 01:51:13 EST 2010