IP Reputation

This is an automated, free, public email IP reputation system. For people contributing data, the results are already better than anything else used with SpamAssassin. Now we just need more data to make it more useful for everybody else.

The primary goal is a whitelist. Other data is provided as a consequence.

It is usable and fully automated as of 2011-03-31.

For each IP, the data gives the actual percentage of its email that is ham (normalized like SpamAssassin's S/O score) and a count of the total emails seen from that IP (as a logarithm).
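
As a rough illustration of those two values, here is a minimal Python sketch. The function names, the base-10 logarithm, and the exact normalization are assumptions, not taken from iprep.pl. The normalization mirrors the way SpamAssassin's S/O ratio is computed from per-corpus hit rates, flipped to give a ham fraction.

```python
import math

def ham_percentage(ham_hits, spam_hits, total_ham, total_spam):
    """Percentage of an IP's mail that is ham, normalized by corpus size.

    Assumption: each count is first turned into a rate within its own
    corpus, so a corpus with far more spam than ham does not skew the result.
    """
    ham_rate = ham_hits / total_ham if total_ham else 0.0
    spam_rate = spam_hits / total_spam if total_spam else 0.0
    if ham_rate + spam_rate == 0:
        return None  # no mail seen from this IP
    return 100.0 * ham_rate / (ham_rate + spam_rate)

def log_count(ham_hits, spam_hits):
    """Total emails from the IP as a logarithm (base 10 is an assumption)."""
    total = ham_hits + spam_hits
    return math.log10(total) if total > 0 else None
```

With equally sized ham and spam corpora this reduces to the plain share of ham among the mail seen from the IP.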

Data provided (updated daily).
Bind zone file (DNS format) (intended to be replaced by rsync and a SpamAssassin plugin).

The above two data files are released as public domain.

Reporting Script


Run as:

./iprep.pl ham:dir:~/mail/ham spam:dir:~/mail/spam/

Arguments are the same "targets" used by SpamAssassin's mass-check: mail folders containing email that has been hand verified to be entirely ham or entirely spam. Each target has the form <class>:<format>:<location>, where:

<class> is "spam" or "ham"
<format> is "dir" (maildir), "file", "mbx", "mbox", or "detect"
<location> is a file or directory name. globbing of ~ and * is supported
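
To make the target syntax concrete, here is a small Python sketch that splits a target into its three parts and expands the location. It is only an illustration of the format described above, not the parsing code used by iprep.pl or mass-check; the names parse_target and expand_location are hypothetical.

```python
import glob
import os

CLASSES = {"ham", "spam"}
FORMATS = {"dir", "file", "mbx", "mbox", "detect"}

def parse_target(target):
    """Split '<class>:<format>:<location>' into its three parts."""
    parts = target.split(":", 2)  # the location itself may contain ':'
    if len(parts) != 3:
        raise ValueError(f"expected <class>:<format>:<location>, got {target!r}")
    mail_class, mail_format, location = parts
    if mail_class not in CLASSES:
        raise ValueError(f"unknown class {mail_class!r}")
    if mail_format not in FORMATS:
        raise ValueError(f"unknown format {mail_format!r}")
    return mail_class, mail_format, location

def expand_location(location):
    """Expand ~ and *; returns a sorted list of the paths that exist."""
    return sorted(glob.glob(os.path.expanduser(location)))
```

For example, parse_target("ham:dir:~/mail/ham") yields the class "ham", the format "dir", and the location "~/mail/ham", which expand_location then resolves against your home directory.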

Config file ~/.ipreprc:

$trusted_networks = '<space delimited list of trusted hosts>';
$user = 'username';
$pass = 'password';

$trusted_networks is very important: it prevents you from reporting the IP address of your own trusted relays instead of the IP that actually sent the email. Include the IPs (or CIDRs) from both the trusted_networks and internal_networks SpamAssassin settings, which SpamAssassin documents under "network test options" and "trust path".
It's pretty normal for $trusted_networks to be empty.
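
The idea behind $trusted_networks can be sketched in Python as follows. This is an assumption about the general approach (walk the relays from newest to oldest and stop at the first one outside your trusted networks), not a copy of what iprep.pl does; the function name and the input shape are hypothetical.

```python
import ipaddress

def first_untrusted_ip(relay_ips, trusted_networks):
    """Return the first relay IP outside the trusted networks.

    relay_ips: IPs from the Received headers, newest (top) first.
    trusted_networks: space-delimited IPs or CIDRs, as in ~/.ipreprc.
    """
    networks = [ipaddress.ip_network(n, strict=False)
                for n in trusted_networks.split()]
    for ip_text in relay_ips:
        ip = ipaddress.ip_address(ip_text)
        if not any(ip in net for net in networks):
            return ip_text  # the host that handed the mail to your side
    return None  # every relay was trusted; nothing to report
```

With an empty $trusted_networks the newest relay is returned, which is the right answer when your mail server receives mail directly from the outside.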

Please run as a daily cron job.

Another option is to feed the email through STDIN with the --live-ham or --live-spam arguments, and later upload the data with the --upload argument (probably from cron):

cat ham.txt | ./iprep.pl --live-ham
./iprep.pl --upload


Email me for an account to allow you to upload. Please email me from a non-freemail account. Major examples of freemail accounts, which I do not want you to email me from, are gmail.com, yahoo.com, and hotmail.com. SpamAssassin has a more complete list of freemail providers. This is just an attempt to make it slightly more difficult for spammers to send me bad data.

Please let me know what username you'd like, so I don't have to guess. And I'd be curious to hear how you found out about this project.

DNS White / Black list

While I don't want to use DNS to provide the data long term, I am doing it now for testing.

SpamAssassin Rules

ifplugin Mail::SpamAssassin::Plugin::DNSEval
header   __RCVD_IN_IPREPDNS     eval:check_rbl('iprep-firsttrusted', 'iprep.chaosreigns.com.')
tflags   __RCVD_IN_IPREPDNS     nice net

header   RCVD_IN_IPREPDNS_100   eval:check_rbl_sub('iprep-firsttrusted', '^127\.\d+\.\d+\.100$')
describe RCVD_IN_IPREPDNS_100   Sender listed at http://www.chaosreigns.com/iprep/, 100% ham
tflags   RCVD_IN_IPREPDNS_100   nice net

header   RCVD_IN_IPREPDNS_50    eval:check_rbl_sub('iprep-firsttrusted', '^127\.\d+\.\d+\.50$')
describe RCVD_IN_IPREPDNS_50    Sender listed at http://www.chaosreigns.com/iprep/, 50% ham
tflags   RCVD_IN_IPREPDNS_50    nice net

header   RCVD_IN_IPREPDNS_0     eval:check_rbl_sub('iprep-firsttrusted', '^127\.\d+\.\d+\.0$')
describe RCVD_IN_IPREPDNS_0     Sender listed at http://www.chaosreigns.com/iprep/, 0% ham
tflags   RCVD_IN_IPREPDNS_0     net

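# Assumption: "not listed" means the base lookup above returned no answer
meta     RCVD_NOT_IN_IPREPDNS   !__RCVD_IN_IPREPDNS
describe RCVD_NOT_IN_IPREPDNS   Sender not listed at http://www.chaosreigns.com/iprep/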
tflags   RCVD_NOT_IN_IPREPDNS   net

score    RCVD_IN_IPREPDNS_100   -0.1
score    RCVD_IN_IPREPDNS_50    -0.0001
score    RCVD_IN_IPREPDNS_0     0.1
score    RCVD_NOT_IN_IPREPDNS   0.0001
endif

The zone is iprep.chaosreigns.com, with the typical reversed IP address lookup, and 127.0.0.<type> values. The values are 0, 50, and 100. 0 means 0% of the mail from the IP has been ham, 100 means it was 100%, and 50 means anything in the middle. Only 0.04% of the data is between 0% and 100%, which is why I'm not currently providing more ranges. So to look up, do:

$ host <reversed IP>.iprep.chaosreigns.com
<reversed IP>.iprep.chaosreigns.com has address 127.0.0.<type>
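
The lookup name and the meaning of the answer can be sketched in Python like this. It only covers IPv4, and the function names are hypothetical; the zone name and the 0/50/100 values come from the description above.

```python
import ipaddress

ZONE = "iprep.chaosreigns.com"

def query_name(ipv4_text, zone=ZONE):
    """Build the reversed-octet lookup name for an IPv4 address."""
    ip = ipaddress.IPv4Address(ipv4_text)  # raises ValueError if malformed
    reversed_octets = ".".join(reversed(str(ip).split(".")))
    return f"{reversed_octets}.{zone}"

def ham_bucket(answer_text):
    """Map a 127.x.x.<type> answer to its bucket: 0, 50, or 100."""
    octets = answer_text.split(".")
    if len(octets) != 4 or octets[0] != "127":
        raise ValueError(f"unexpected answer {answer_text!r}")
    bucket = int(octets[3])
    if bucket not in (0, 50, 100):
        raise ValueError(f"unknown type {bucket}")
    return bucket
```

To resolve the name for real you could pass query_name(...) to socket.gethostbyname and feed the result to ham_bucket; an IP that is not listed makes the lookup fail with socket.gaierror.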


Training on 400 of my emails, then testing on 100 of my own emails (not used in training):
RCVD_IN_IPREPDNS_100 hit 79.3% of ham and no spam.
RCVD_IN_IPREPDNS_0 hit 27.8% of spam and no ham.

Those are crazy good numbers alone.

After training on 170,000 emails from myself and one other person, testing 10,000 of our emails:
RCVD_IN_IPREPDNS_100 hit 94.1% of ham and 0.010% of spam.
RCVD_IN_IPREPDNS_0 hit 64% of spam and no ham.

Also crazy good numbers.

So for people contributing data, the results are better than anything else available for SpamAssassin. But for it to be useful for people not contributing data, we need more data.

Uploaded Data

The actual data you upload looks like this, just a timestamp and IP address from each email:

$ head ~/iprep/iprep-spam-darxus.log


I'm planning to provide the data only via rsync, because I think this will reduce bandwidth loads. I'll create a SpamAssassin plugin to retrieve the data directly and create the SpamAssassin tests for it.


IPv6 is supported. IPs are aggregated to /48 blocks. So all IPs in 1234:5678:9012:* are lumped together. It is entirely possible this will change.
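
For illustration, aggregating an IPv6 address to its /48 block can be done with Python's standard ipaddress module. This is a sketch of the aggregation described above, not the code the project uses; the function name is hypothetical.

```python
import ipaddress

def aggregate_ipv6(ip_text, prefix_len=48):
    """Return the /48 block (as text) that contains an IPv6 address."""
    # strict=False lets us pass a host address and have the host bits masked off
    network = ipaddress.ip_network(f"{ip_text}/{prefix_len}", strict=False)
    return str(network)
```

Every address in 1234:5678:9012:* maps to the same block, 1234:5678:9012::/48, so their ham and spam counts end up under one key.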

Mutt (mail reader) colorization

The mail reader I use is mutt. In my ~/.muttrc I have the following, to easily see what hasn't been flagged as ham by this data:

color index     yellow     default   ~hX-Spam-Status:.*RCVD_NOT_IN_IPREPDNS
color index     yellow     default   ~hX-Spam-Status:.*RCVD_IN_IPREPDNS_0

Google's white paper on reputation systems

Google presented a white paper on their email reputation system at CEAS 2006.

"Seems like this could all be more useful if there was a good way to automatically report addresses that sent non-spam."
- Darxus, November 2006, discussing dnswl.org. This sort of automation is still not used by dnswl.org, and that is a substantial part of my reason for creating this project.
I have been involved with DNSWL since then. I have provided a DNSWL DNS mirror since March 2007.

Mon Feb 27 16:00:54 EST 2012
Contact Darxus