As I promised in a comment on my previous entry, I want to show you how I configured SpamAssassin to use a nonexistent mail address as a spam trap to improve the quality of the database it uses for Bayesian filtering. One day I was fed up with the false negatives I had to feed to sa-learn manually to keep the Bayes database up to date. So I decided to publish a mail address that I will never use for anything other than poisoning spambots, and then let SpamAssassin score mail to this address so high that it will always be classified as spam. As a result, SpamAssassin will remember these messages as spam in its database. In my case the trap address is s.pemtrep@rompe.org. I realized that there are some other addresses that seem to exist only in some spammers' address lists, so I simply added them to the same rule. The rule for SpamAssassin is quite simple:
header ROMPE_BADRECIPS To =~ /(kuk|s\.pemtrep|ballepromp)\@rompe\.org/i
score ROMPE_BADRECIPS 9.0
describe ROMPE_BADRECIPS Spam trap recipient
Add something like this to your /etc/spamassassin/local.cf and you are done. Publish the address on the web (but don't forget to mark it as a spam trap, since you don't want humans to write to this address!) and soon the spambots will begin to feed your database with high-quality spam. This will, of course, increase your traffic a bit, but it will definitely reduce the number of false negatives.
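The recipient pattern can be sanity-checked outside SpamAssassin before it goes live. What follows is only a sketch in Python, whose re syntax is close to, but not the same as, the Perl regexes SpamAssassin uses. It tests a variant of the pattern with the dot in s.pemtrep escaped and a leading \b (as suggested in the comments); the function name and the sample headers are made up for illustration.

```python
import re

# Variant of the rule's pattern: the dot in "s.pemtrep" is escaped so it only
# matches a literal dot, and \b keeps "akuk@..." from matching "kuk@...".
# Assumption: Python's re behaves like Perl for a pattern this simple.
TRAP_RE = re.compile(
    r"\b(?:kuk|s\.pemtrep|ballepromp)@rompe\.org",
    re.IGNORECASE,
)


def hits_trap(to_header: str) -> bool:
    """Return True if the To header value names one of the trap addresses."""
    return TRAP_RE.search(to_header) is not None


if __name__ == "__main__":
    # Hypothetical header values, not taken from real mail.
    samples = [
        "s.pemtrep@rompe.org",
        "Some Name <KUK@ROMPE.ORG>",
        "sxpemtrep@rompe.org",
        "akuk@rompe.org",
    ]
    for value in samples:
        print(value, "->", hits_trap(value))
```

Without the escaped dot, the third sample would also match, because an unescaped dot stands for any character.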
The next logical step would be to combine this with something like teergrube and/or temporary host blocking, but we will have to accept the first mail for our database before we can start sanctioning. I will have to think about this. Comments are welcome.
Comments
Does this rule actually work
Does this rule actually work for anyone? I'm using it in user_prefs and it doesn't seem to have any effect. I'd appreciate any pointers.
Update: Here's what I've
Update:
Here's what I've learned:
1. The rule should look like this for best results: ToCc =~ /\b(?:uucp|majordomo|root)\@mydomain\.com/i
2. cPanel users: user_prefs may contain a variety of settings that work, but custom rules are typically disabled for security reasons.
3. This rule only contributes *header* rule points for Bayes autolearning. There must be at least 3.0 body rule points as well for SpamAssassin to learn from messages flagged by this rule.
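Point 3 can be put into numbers. The sketch below is a simplified model in Python, not SpamAssassin's actual code: it assumes the documented default of 12.0 for bayes_auto_learn_threshold_spam and the 3.0-point header and body minimums mentioned above, and it ignores details such as which rules are excluded from the autolearn score. The body rule name is hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable

# Assumed values: 12.0 is the documented default spam autolearn threshold;
# the 3.0 minimums are the header/body requirements described above.
SPAM_THRESHOLD = 12.0
MIN_HEADER_POINTS = 3.0
MIN_BODY_POINTS = 3.0


@dataclass
class Hit:
    name: str
    score: float
    kind: str  # "header" or "body"


def would_autolearn_as_spam(hits: Iterable[Hit]) -> bool:
    """Simplified check: enough header points, body points, and total score."""
    hits = list(hits)
    header = sum(h.score for h in hits if h.kind == "header")
    body = sum(h.score for h in hits if h.kind == "body")
    return (
        header >= MIN_HEADER_POINTS
        and body >= MIN_BODY_POINTS
        and header + body >= SPAM_THRESHOLD
    )


if __name__ == "__main__":
    trap = Hit("ROMPE_BADRECIPS", 9.0, "header")
    body_rule = Hit("SOME_BODY_RULE", 3.0, "body")  # hypothetical rule name
    print(would_autolearn_as_spam([trap]))
    print(would_autolearn_as_spam([trap, body_rule]))
```

Under these assumptions the 9.0-point trap rule alone never triggers autolearning; it needs at least 3.0 points of body hits on top, which happens to land exactly on the 12.0 threshold.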
apt-get install sa-exim ;)
Combination with postfix (2.0?)
A teergrube might be a nice next step, but I'd suggest doing it the other way round: deny such incoming spam at the doorstep and close the door on recognized spam.
If I remember correctly, postfix (2.0?) shall have a feature like this: feed the current smtp-connection to some filter and depending on the result, deny the transport of that mail or allow it.
With a well-trained database, it should be easy to just send a "5xx" response to the spammers. If wanted, further checks can be used to classify an incoming SMTP session as spam, e.g. RBLs, hash values, etc.
On heavily used systems this will of course increase system load, but hey, computers are made for doing the work, not for idling around :-)
Best of all would be a DNS entry that lets you verify the computer behind an incoming session, so that it can show "yes, I'm the MTA for that email sender". But that's far in the future, I guess.
jens
Show us the code! :-)
> If I remember correctly, postfix (2.0?) shall have a feature like this: feed the current smtp-connection to some filter and depending on the result, deny the transport of that mail or allow it.
Yeah, that's a feature I've been trying to use since, umm, well, I don't remember; it's too long ago. But even if you succeed in rejecting a mail after receiving it, you won't change much. The traffic is used anyway, and the typical spam sender (aka wormed Windows system) doesn't really care whether you send SMTP 550 or just drop the mail silently.
So if you want to save some bandwidth, the best way should be to publish loads of poison mail addresses, let your MTA reject the connection after reading the recipient, and then trigger a temporary host blocking, maybe using iptables. The existing IDS projects should give us some ideas and tools for doing things like that.
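To make that idea a bit more concrete, here is a rough sketch in Python of the bookkeeping involved. It is not tied to any MTA: how the client IP and recipient reach on_rcpt (a policy hook, a log watcher, ...) is left open, and the class name, the one-hour block time and the function names are all assumptions. The iptables command is only built as an argument list, not executed.

```python
import ipaddress
import time
from typing import Dict, List, Optional

TRAP_RECIPIENTS = {"s.pemtrep@rompe.org"}  # trap address from the post
BLOCK_SECONDS = 3600.0  # assumed block duration


class TempBlocklist:
    """In-memory list of temporarily blocked client IPs."""

    def __init__(self, ttl: float = BLOCK_SECONDS) -> None:
        self.ttl = ttl
        self._expires: Dict[str, float] = {}

    def on_rcpt(self, client_ip: str, recipient: str,
                now: Optional[float] = None) -> bool:
        """Record a RCPT; return True if it should be rejected."""
        now = time.time() if now is None else now
        ip = str(ipaddress.ip_address(client_ip))  # ValueError on bad input
        if recipient.strip().lower() in TRAP_RECIPIENTS:
            self._expires[ip] = now + self.ttl
        return self.is_blocked(ip, now)

    def is_blocked(self, client_ip: str, now: Optional[float] = None) -> bool:
        """Return True while the block for this IP has not expired."""
        now = time.time() if now is None else now
        expiry = self._expires.get(client_ip)
        if expiry is None:
            return False
        if expiry <= now:
            del self._expires[client_ip]  # block has run out
            return False
        return True


def iptables_drop_command(client_ip: str) -> List[str]:
    """Build (but do not run) a rule dropping SMTP traffic from client_ip."""
    ip = ipaddress.ip_address(client_ip)
    tool = "ip6tables" if ip.version == 6 else "iptables"
    return [tool, "-I", "INPUT", "-s", str(ip),
            "-p", "tcp", "--dport", "25", "-j", "DROP"]
```

Something would still have to remove the iptables rule again once the block expires; the sketch only tracks the expiry time.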