How I poured a client's emails straight into the spam bin – with one Friday evening change
What's in a word? $string, apparently
Sysadmin blog By misunderstanding how a single word was being used, I caused a boo-boo that counts as "really stepped in it this time".
After a lot of research and testing, I thought that months of "the spam filter is crap, make all the spam go away" warring with "the spam filter is too restrictive because $client can't send me his JavaScript-laden PDFs of ultimate doom" were about to be ended. As it turns out, I was more than a little incorrect.
I had found a set of great spam filters that were actually cheap enough to use, and they all tested out fine. I had, in fact, two separate layers of anti-spam protection I was planning to use; short tests wherein real-world traffic was redirected through a temporary setup were more than promising – they were fantastic.
I figured "hey, it's Friday, I'm going to redirect traffic back through the existing setup for now and go enjoy my weekend." I forgot to disable one little thing and ended up having to ask everyone to check their junk email for false positives that snuck in over the weekend. Grand.
The solution
The client in question insists on running its own Exchange server. There is an attachment to the idea of Outlook + Exchange + Public Folders that no force in the universe is ever going to dislodge.
The "anti-spam" features built into Exchange itself are pathetic to the point of weeping. That makes sense, as if they were worth a tinker's damn Microsoft couldn't sell you Exchange Online Protection (EOP). While EOP is better than its predecessor FOPE, there's still a booming industry out there of competitors providing better service, cheaper service, or both.
My personal preference would be to punt the entire kit and caboodle into Google Apps and be done with it, but price sensitivity combines with a demand for data locality to make finding and providing the right solution… awkward.
Ultimately, I turned to Netgear. I have a UTM 150 to hand and have been testing it extensively over the past few months, and I found it to be a more than acceptable solution. It lives on my premises, it does a bang-up job of blocking crap, and as a bonus it sits between my main infrastructure and the net, reducing the load on the ancient virtual servers that run everything.
In addition to the UTM 150, I have added a copy of the EFA Project's E-mail Filter Appliance (EFA) to the mix. Not too deep into the forums is some info on integrating the EFA with LDAP for user lookups. This helps deal with backscatter, which is something that will get you on a grey list in a heartbeat.
The UTM 150 will sit in the DMZ and the EFA will sit behind it. The EFA will talk to the Exchange server, which in turn will send all mail back out through the EFA (and hence the UTM 150) as a Smart Host. For added security, the Exchange server's default gateway traffic (and hence its OWA page and so forth) actually goes through a separate firewall; the UTM 150 and the EFA are the only things hanging on that IP, and together they make a decent defensive combo for a small to medium biz.
This solution appears to work quite well; what the UTM 150 doesn't get, EFA does.
Matches versus contains
Where this all went wrong is when I pointed the email traffic back to the original outdated frankenspamserver I had been using before.
In preparation for implementing a new anti-spam setup, I created a Transport Rule in the Exchange server to set the Spam Confidence Level (SCL) to "7" for any emails it found that had a header called “X-Spam-Status” whose value was set to “Yes”. This would allow us to use any number of third-party anti-spam features on our email fairly easily – be they subscription-based or the UTM + EFA combo I built – because this is how most third-party anti-spam services flag emails as spam.
Unfortunately, our existing frankenspamserver has Amavis, ClamAV, SpamAssassin, Pyzor, Razor and a few other odds and ends. Somewhere in that mixture of apps the X-Spam-Status header gains a whole lot more information than just "Yes" or "No".
As an example, one of the emails that ultimately got flagged as a "false positive" by the Exchange server's transport rule had an X-Spam-Status header that read “No, score=2.8 required=4.1 tests=BAYES_”. As you can see, despite the “X-Spam-Status” header clearly being labelled as “No” to human readers, the text string “tests=BAYES_” contains the word “Yes” once you ignore case, which the rule evidently does.
Face, meet palm.
I thought I was being all proper: a brief "live" test of less than five minutes to gather stats, followed by a "live" test of about half an hour to gather a broader range of possible emails. Then revert things back and study the results for a few days. Once I was absolutely confident in the new solution, I would switch over permanently.
I even knew, somewhere in the back of my brain, that the existing server added this sort of information to the X-Spam-Status header. It never even occurred to me that it would be a problem for one simple reason: the Exchange 2010 UI says "when the message header matches text patterns".
Some part of my brain – which had spent a good chunk of that Friday buried in a pile of PHP code trying to solve some obscure problem with a middleware app – read the word "matches" and thought...
if ($header == $array_you_input) { set_SCL("7"); }
Translated from code, that means my mind read "matches" to mean "will trigger only if the string matches exactly". Microsoft meant "will trigger if the string contains the string you entered."
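For the curious, the gap between the two readings can be sketched in a few lines of Python. This is an illustration only, not how Exchange is actually implemented: the function names are hypothetical, and the case-insensitive, unanchored search is an assumption inferred from the behaviour described above. The third function shows one stricter alternative, which insists that the header value starts with the pattern.

```python
import re

# Header value quoted earlier in the article.
HEADER = "No, score=2.8 required=4.1 tests=BAYES_"
PATTERN = "Yes"


def exact_match(header: str, pattern: str) -> bool:
    """What I assumed 'matches' meant: the whole value equals the pattern."""
    return header == pattern


def unanchored_match(header: str, pattern: str) -> bool:
    """Assumed rule behaviour: find the pattern anywhere, ignoring case."""
    return re.search(pattern, header, re.IGNORECASE) is not None


def anchored_match(header: str, pattern: str) -> bool:
    """Stricter alternative: the value must begin with the pattern as a whole word."""
    return re.match(re.escape(pattern) + r"\b", header, re.IGNORECASE) is not None


if __name__ == "__main__":
    print("exact:     ", exact_match(HEADER, PATTERN))
    print("unanchored:", unanchored_match(HEADER, PATTERN))
    print("anchored:  ", anchored_match(HEADER, PATTERN))
```

Only the unanchored version fires on the sample header, because "BAYES" contains "YES". Whether an anchored pattern is the right fix for a given Exchange rule is something to verify on the server itself rather than take from this sketch.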
Assume nothing
In this case, basic tests worked just fine. I could cheerfully send email from my home address after swapping the new anti-spam environment for the old one, because my home address is on a whitelist somewhere in the spam server. It never went through any sort of Bayesian scanning and thus never had “tests=BAYES_” added to its X-Spam-Status.
Mail from clients in the white list also made it through, so checking up on the server after the switchover showed what appeared to be perfectly normal traffic.
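Why the basic tests looked clean can be shown with a small, hedged sketch. The rule behaviour here is the same assumption as before (a case-insensitive search anywhere in the header value), and two of the three sample header values are hypothetical stand-ins; only the Bayes-scanned one is quoted from the real mail. The helper names are made up for illustration.

```python
import re


def rule_fires(header_value: str, pattern: str = "Yes") -> bool:
    """Assumed rule behaviour: case-insensitive search anywhere in the value."""
    return re.search(pattern, header_value, re.IGNORECASE) is not None


# (description, X-Spam-Status value, should the mail be treated as spam?)
SAMPLES = [
    ("whitelisted sender (hypothetical value)", "No", False),
    ("Bayes-scanned ham (value quoted in the article)",
     "No, score=2.8 required=4.1 tests=BAYES_", False),
    ("flagged spam (hypothetical value)", "Yes, score=9.1 required=4.1", True),
]


def wrong_verdicts(samples):
    """Return descriptions of samples where the rule disagrees with the expected verdict."""
    return [desc for desc, value, expected in samples
            if rule_fires(value) != expected]


if __name__ == "__main__":
    for desc in wrong_verdicts(SAMPLES):
        print("rule got it wrong for:", desc)
```

Run against the whitelisted sample alone, the check reports nothing wrong, which is roughly the view I had on Friday evening. The false positive only shows up once a header that actually went through Bayesian scanning is in the test set.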
But I made a small assumption and a big oops. Matches versus contains. This is how life's little lessons are learned; even if you think you're being clever by testing everything – analysing results, reverting between tests and so forth – it is still entirely possible to make dumb mistakes based on nothing more than a misinterpretation of language. ®