I've been working on crafting some new spam filtering rules. Rather than
trying to decipher the content of the message, I'm only looking at where the
message came from and how it was delivered (spammers have gotten really good
at making the message content indistinguishable from legitimate mail, from a
software perspective). Spammers are thieves, and they steal resources from
whatever systems they can gain control of. It's usually pretty easy to tell
the difference between a message sent from a legitimate mail server and
a message sent from somebody's residential DSL connection in a country I
couldn't find on a map.
Unfortunately, I've run into a couple of problems. The first problem is
that a few spammers have set up legitimate-looking mail servers—these
aren't botnet machines; these are real servers colocated in a datacenter
somewhere, with domain names that match the “from” address and
all linked URLs, with static IP addresses and properly configured reverse
DNS and even SPF records. Fortunately there are still a few telltale signs,
and I've been able to collect a database of several thousand IP addresses
and domain names used in this way.
The second problem is legitimate mail sent from horribly broken mail
servers. These are servers with reverse DNS names that either don't resolve
at all, or resolve to the wrong IP address. The server identifies itself
with a HELO line that matches the broken reverse DNS name, just like a lot
of spam. Everything about it looks sleazy, but it's legitimate mail that
my users need to receive.
Some of it is from mailing lists, and some of it is from individuals
with stupid ISPs. However, some of it comes from companies like PayPal
and AT&T Wireless, both of which use a third-party marketing company to
e-mail promotional offers to their customers. The marketing company
apparently doesn't know how to configure its servers properly, and
there's no real way to distinguish between these legitimate messages
and well-done phishing attempts.
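
A rough sketch of the broken-reverse-DNS test described above, in Python. This isn't the actual filter code; the function name, the status strings, and the swappable lookup parameters are all made up for illustration, and the "no-ptr" case (no reverse name at all) is an extra I've assumed for completeness. It looks up the reverse DNS name for the connecting IP, resolves that name forward again, and reports whether the HELO name matches the reverse name.

```python
import socket


def _reverse(ip):
    """Return the reverse DNS (PTR) name for ip, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None


def _forward(name):
    """Return the set of addresses a name resolves to (empty on failure)."""
    try:
        infos = socket.getaddrinfo(name, None)
    except OSError:
        return set()
    return {info[4][0] for info in infos}


def check_reverse_dns(ip, helo, reverse=_reverse, forward=_forward):
    """Classify the reverse DNS of a connecting IP (hypothetical helper).

    Returns (status, helo_matches), where status is one of:
      "ok"                      reverse name resolves back to the same IP
      "ptr-does-not-resolve"    reverse name exists but doesn't resolve
      "ptr-resolves-elsewhere"  reverse name resolves to a different IP
      "no-ptr"                  no reverse name at all (assumed extra case)
    """
    name = reverse(ip)
    if name is None:
        return "no-ptr", False

    addrs = forward(name)
    if not addrs:
        status = "ptr-does-not-resolve"
    elif ip not in addrs:
        # Plain string comparison; good enough for IPv4 in a sketch.
        status = "ptr-resolves-elsewhere"
    else:
        status = "ok"

    # Compare HELO and reverse name case-insensitively, ignoring a trailing dot.
    helo_matches = helo.rstrip(".").lower() == name.rstrip(".").lower()
    return status, helo_matches
```

The annoying case is anything other than "ok" with helo_matches set to True: the server is consistent with its own broken reverse DNS, which is what both the sloppy legitimate senders and a lot of spam look like. The lookups are passed in as parameters only so the logic can be tried against canned data without touching real DNS.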
I saw something similarly annoying several months ago - a legitimate
message from Wachovia to one of their banking customers where all the
links in the HTML e-mail (including links pretending to go to visa.com)
actually went to a redirector, NOT on wachovia.com. You would think
a bank would know better. How can we educate users to avoid
phishing scams if banks are using the same sleazy tricks that the
phishers use?
Anyway, I don't mind blocking legitimate mail from a stupid marketing
company, but it seems there are a lot of incompetent mail administrators
out there. I've been analyzing five weeks of data, during which time
two of my servers saw about 25,000 messages from servers with broken
reverse DNS. Of those, all but a few hundred were rejected for other
reasons, and many of the rest were quarantined as possible spam. Still,
there's an awful lot of spam hitting users' mailboxes that I could
easily block, if it weren't for the handful of legitimate messages that
would be blocked too.
Ugh.