[nylug-talk] SpamAssassin best practices

Ron Guerin ron at vnetworx.net
Sat Nov 24 15:54:34 EST 2007


Chris Knadle wrote:
> On Friday 23 November 2007, Ron Guerin wrote:
>> Sunny Dubey wrote:
>>> If you really want to fight Spam, greylisting is the answer.  It's not for
>>> the faint of heart, but the 50 bazillion threads about it on the postfix@
>>> list should give you a clear idea of what other real mail admins think
>>> about it,
>> The important thing with greylisting is to do it correctly.
>> Implementations where the message never gets through are broken, and I
>> run into them every so often.  Of course if enough people greylist, the
>> spambots will eventually be adapted to queue.
> 
>    Perhaps, but spambots that queue would obviously have to store the queued 
> messages which would make the Trojan that does the work easier to detect as 
> disk space is eaten up.

The reason they don't queue right now isn't that the people writing the
software are too dumb to queue, it's that there's an advantage to not
queuing.  At some point the scales tip, and they'll queue because they
have to.

I don't think detection is much of an issue.  The vast majority of the
machines in these botnets are not under the control of a competent
maintainer.  I'm sure you're right that there'd be some who do a sloppy
job of it and fill up drives.  That's the sort of thing that finally
gets a machine some attention though, and the people writing the bots
will be looking to avoid doing that.

Here's a for-example.  Greylisting works by forcing your sender to come
back and try again.  How long are you willing to wait for your mail
after you've told the sender to buzz off and try later?  Probably not
too long, right?  I'm sure your typical spambot has
a handful of messages to send to long lists of addresses, and doesn't
have any interest in RFC compliance beyond what's necessary to get mail
out.  It's not going to add a whole lot of storage requirements to queue
this because you don't need to store a copy of the message for every
recipient.  As far as the length of the queue goes, think of it in terms
of hours instead of days.  If you haven't gotten the message through to
the victim in 4 hours, it's not going to get there, and you can drop it
from the queue.
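
To make the mechanism concrete, here's a rough sketch of the
receiving side in Python.  It isn't any particular implementation:
the class name, the 5 minute minimum delay and the 4 hour retry
window are all assumptions for illustration.  The point is that a
sender who comes back after the delay gets through, which is what
a correct greylister has to guarantee.

```python
import time


class Greylist:
    """Toy greylister keyed on the (client IP, sender, recipient) triplet.

    Hypothetical sketch: the default delays are illustrative assumptions.
    """

    def __init__(self, min_delay=300, retry_window=4 * 3600, now=time.time):
        self.min_delay = min_delay        # seconds a new triplet must wait
        self.retry_window = retry_window  # how long a first attempt is remembered
        self.now = now                    # injectable clock, handy for testing
        self.first_seen = {}              # triplet -> time of first attempt
        self.passed = set()               # triplets that have already retried OK

    def check(self, client_ip, sender, recipient):
        """Return the SMTP-style reply for this delivery attempt."""
        key = (client_ip, sender.lower(), recipient.lower())
        t = self.now()

        # Known-good triplets are accepted immediately.
        if key in self.passed:
            return "250 OK"

        first = self.first_seen.get(key)

        # Never seen, or the earlier attempt is too old: start the clock.
        if first is None or t - first > self.retry_window:
            self.first_seen[key] = t
            return "450 try again later"

        # Retried too quickly: keep deferring, but do not reset the clock.
        if t - first < self.min_delay:
            return "450 try again later"

        # A real retry after the delay: let it through and remember it.
        self.passed.add(key)
        del self.first_seen[key]
        return "250 OK"
```

A sender that retries once after the minimum delay is accepted, and
later mail for the same triplet isn't delayed at all.  A bot that
queues for a few hours gets through just as well, which is the
problem described above.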

>    There are also plenty of US originated spam organizations [AKA "email 
> marketing"] that use real mail servers already [thus get through 
> greylisting], who "comply" with the CAN-SPAM act.  Their web pages typically 
> have an "opt-out" input field, which will remove an email address from being 
> spammed from one domain name -- but will immediately start spamming from 
> several other domain names.

The one virtue of what are known as "mainsleaze" spammers is that their
IPs are well-known or knowable, which provides some incentive for them
to deal with customers who generate a lot of complaints, but more
importantly, allows us to do anything from subjecting their mail to
special handling to blocking it outright.

>    I don't like using Bayesian filtering right at SpamAssassin because 
> spammers slowly poison auto-training.  From what I've read, the best 
> recommendation for Bayesian filter training is "train only on errors"; it's 
> generally difficult to get users to train the filter remotely, and 
> mistraining is a problem if the filter that is trained is global.

I dumped my Bayes database and started over, tightly controlling what
went through it, and it was grossly misclassifying my mail within 2
weeks.  I think that's the end of the road there.

>    However I've found that Bayesian filtering on mail clients themselves with 
> easy to use buttons for training seems to work well.

I haven't made a serious attempt at using the one in Thunderbird in a
long time now, but I wasn't getting good results from it last I did.

Unfortunately, I'm also a provider of mail services, so other people's
spam is my concern.  I have to provide something on the server-side if
for no other reason than it's becoming dangerous to allow people to
forward their mail out of your system to somewhere else unless you've
filtered the spam out of the mail stream.  If, for example, you have a
user, bob at example.com, and Bob has set his mail up to forward all
incoming mail for bob at example.com to AOL, where he's bobexample at aol.com,
you may have a big problem on your hands.  Bob wanted all his mail in
his AOL account and now that he's got it, he's going through it, and
marking things as spam, including messages he forwarded to himself from
your server.  As a result, AOL's ill-conceived system will decide your
server is a spam source.  And not just for Bob, but everyone at AOL.
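
As a minimal sketch of the "filter before you forward" idea, assuming
SpamAssassin has already scanned the message and added its
X-Spam-Flag header, the forwarding step could refuse anything tagged
as spam.  The function name should_forward is made up for
illustration, and how you hook it into your MTA is up to you.

```python
from email import message_from_string


def should_forward(raw_message: str) -> bool:
    """Return False for mail SpamAssassin has tagged as spam.

    Hypothetical helper: assumes the message was already scanned and
    carries an "X-Spam-Flag: YES" header when it was judged to be spam.
    """
    msg = message_from_string(raw_message)
    flag = (msg.get("X-Spam-Flag") or "").strip().upper()
    return flag != "YES"
```

Tagged mail can stay in the local mailbox or a spam folder instead
of going out to the remote provider under your server's name.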

- Ron

