
Re: spam blocking engine



Theo de Raadt <deraadt_(_at_)_cvs_(_dot_)_openbsd_(_dot_)_org> posted some anti-spam code...

Rejecting spam with a 550 type of error is a good thing.  I use 552 or
553; I'm not sure whether that's better or worse.  You don't want spam to
sit on the other machine's queue (which is what you'd get with most 4xx
class errors) - that just gives that machine the excuse to try to
deliver it to you again, and again, tying up network resources.  Some
mailers are pretty stupid about retries, and will retry really quickly
if they can.  If you aren't going to accept it, you *want* to return a
permanent failure which will leave the other machine with no legitimate
choice but to try to return it to the sender.  Usually that means it
piles up in some postmaster box, but occasionally that means the
spammer actually gets to find out, and refine their list, hopefully to
exclude yours.  As best I can tell, this actually does work but still
leaves a lot that gets through.  I don't think it's worth doing any
form of "stuttering" when you detect a blocked IP address.  I have yet
to see a spammer that pays enough attention to his connection to even
*notice* such a thing going on, much less care.  He's probably got much
worse problems anyways.  It's a sure bet an open relay won't notice -
after all, they didn't notice the spam going through their machine.
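
To make the transient-versus-permanent distinction concrete, here is a
minimal sketch in Python.  It is not taken from any real mailer; the
function names are made up for illustration.  The only fact it relies on
is the SMTP convention that the first digit of a reply code gives its
class: 4yz means "try again later", 5yz means "give up".

```python
# Illustrative only: reply_class() and sender_should_retry() are
# hypothetical helpers, not part of sendmail or any other MTA.

REPLY_CLASSES = {
    2: "success",
    3: "intermediate",
    4: "transient",   # sender keeps the message queued and retries
    5: "permanent",   # sender must give up and bounce to the envelope sender
}


def reply_class(code: int) -> str:
    """Classify an SMTP reply code by its first digit."""
    if not 200 <= code <= 599:
        raise ValueError(f"not an SMTP reply code: {code}")
    return REPLY_CLASSES[code // 100]


def sender_should_retry(code: int) -> bool:
    """A well-behaved sending host retries only after a transient failure."""
    return reply_class(code) == "transient"
```

With this, sender_should_retry(451) is true and sender_should_retry(552)
is false, which is why a 5yz reply is the one to use for mail you never
intend to accept.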

Getting a good list of open relays is hard.  The last time I looked
(which was, admittedly, a while ago) I did not find very good
correspondence between the sites from which I saw spam being
originated, and which sites were blocked in several popular lists.  How
good do you find the data from spews.org to be?

A big problem is when people set up .forwards or aliases that forward
spam.  I can't really complain too much about this; at umich, my
"mdw_(_at_)_umich_(_dot_)_edu" address is just such an alias - it actually forwards to
"mdw_(_at_)_quince_(_dot_)_ifs_(_dot_)_umich_(_dot_)_edu".  Quince runs mail software which tries to
bounce spam with a 552 error and a bible quote.  Unfortunately, if what
quince saw had an invalid return address, as most spam does, the
resulting bounce goes off to postmaster_(_at_)_umich_(_dot_)_edu, where it's an
invisible small drop in a huge flood of other complaints and problems.

Quite a bit of the spam I see was sent via various mailing lists.  This
is basically the same problem as forwarding, except bounces are much
more likely to go off to the list maintainer, who at least has some
incentive to do something about the spam they see.

I've been experimenting with various forms of "content" filtering and
returning error 552 after looking at the SMTP data.  This works very
nicely even with list mail and mail forwarded through forwards and
such.  It also gives me the chance to try to recognize patterns related
to spam software rather than spam machines - which means (I hope) that
I can detect spam more proactively without the need for someone to
complain and register a machine's IP address with somebody.  On the
downside it's hard and labor intensive to write good logic to detect
such spam.  Also this means we now run a highly mutant sendmail.
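
As a rough illustration of rejecting on content after the SMTP data has
been read, here is a small Python sketch.  It is not the logic in our
modified sendmail.  The patterns, the "ExampleBulkMailer" name and the
check_content() function are all invented for the example; real rules
would be keyed to whatever fingerprints the spam software actually
leaves.

```python
import re

# Hypothetical fingerprints of spam *software*, not of sending machines.
# Both patterns are made up for illustration.
SPAM_PATTERNS = [
    re.compile(r"^X-Mailer:\s*ExampleBulkMailer", re.IGNORECASE | re.MULTILINE),
    re.compile(r"^Subject:.*\bmake money fast\b", re.IGNORECASE | re.MULTILINE),
]


def check_content(message: str) -> tuple[int, str]:
    """Return the SMTP reply to give after DATA for this message text."""
    for pattern in SPAM_PATTERNS:
        if pattern.search(message):
            # Permanent failure, so the sending host will not requeue it.
            return 552, "Message content rejected"
    return 250, "OK"
```

Because the decision is made from the message itself, it works the same
way whether the mail arrived directly, through a .forward, or through a
mailing list.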

I run basically the same mail software on grex.org aka cyberspace.org
-- in addition to the content rules, we also block by IP address.  We
have our own custom list of IP ranges based on local intelligence.  In
the case of dialin lines, dsl, spam domains, etc., we try to block the
whole subnet when we notice problems.  In a typical week, we refused
1690 pieces of mail because they came from blocked ranges.  The message
we give in that case includes a URL.  11 people looked at the URL
during that same week.  The rules that bounce mail based on content
checks rejected 2366 pieces of mail during the same interval (some of
these were probably regrettably false hits).  We also rejected 112 from
machines that claimed they were grex, 13 pieces of mail from internet
hosts using unqualified loginids such as "MAIL FROM: nobody", and 981
pieces from hosts that did not use HELO correctly.  During the
same time, grex also delivered 30317 pieces of mail and there are
31534 loginids total on grex.
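
The envelope-level checks described above can be sketched roughly as
follows, again in Python and again only as an illustration, not the
code grex runs.  The blocked ranges are documentation-only addresses,
the reply codes and messages are assumptions, and "<URL>" stands in for
the real URL.  What "using HELO correctly" means is not spelled out
above, so the sketch only treats a missing HELO name as incorrect.

```python
import ipaddress

# Hypothetical local block list; these are reserved documentation ranges.
BLOCKED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# Names that only this host itself should ever announce in HELO.
LOCAL_NAMES = {"grex.org", "cyberspace.org"}


def check_envelope(client_ip: str, helo_name: str, mail_from: str) -> tuple[int, str]:
    """Return the SMTP reply for a connection, before any message data is read."""
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in BLOCKED_RANGES):
        return 550, "Mail from your network is blocked; see <URL>"
    if not helo_name.strip():
        return 550, "HELO name required"  # assumed meaning of "incorrect HELO"
    if helo_name.lower().rstrip(".") in LOCAL_NAMES:
        return 550, "You are not this host"
    # An empty sender is the null return path used by bounces; allow it.
    if mail_from and "@" not in mail_from:
        return 553, "Unqualified sender address"
    return 250, "OK"
```

Blocking by network rather than by single address is what lets one
entry cover a whole dialin or DSL pool.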

grex also bounced 18776 pieces of mail for the over-full mailbox of
some slimeball who we think was probably a student at a university in
Pennsylvania and was working at some cell phone store.  He created a
mailbox on grex and spammed everybody at his university using grex.  We
found out about it too late, locked the account, locked his other
account, blocked the IP address from which he came, and complained to
his ISP and to his university.  Mail to his ISP bounced.  His
university sent out a form letter saying since the spam hadn't come
from them they didn't care.  Grr.

					-Marcus Watts


