SpamAssassin 2010 bug

Posted on Saturday, 2 January 2010

I use SpamAssassin to filter the spam out of my incoming email. Last night I noticed that a legitimate email had a particularly high spam score. On further investigation I found that a rule named FH_DATE_PAST_20XX was triggering:

* 3.2 FH_DATE_PAST_20XX The date is grossly in the future

I checked the Date header of the email and it looked totally fine to me. The only notable thing was that the year had just rolled over from 2009 to 2010. Could that be a coincidence? A quick look in /usr/share/spamassassin/ turned up the rule:

header FH_DATE_PAST_20XX Date =~ /20[1-9][0-9]/ [if-unset: 2006]

Oops. That regex matches any year from 2010 to 2099. I googled for the rule and came across this:

The comments there mention the problem I had found: “Note: the current rule in 3.2 will start matching legitimate dates from 2010-01-01. See issue #5852.” According to issue 5852, the problem was first identified on 2008-Nov-05 and was “fixed” in CVS on 2009-Jun-30. I’m using the standard stable Debian package, which doesn’t contain this fix yet, so I had to add the following to my configuration file to give the rule a score of 0:

score FH_DATE_PAST_20XX 0.0

I think a lot of systems will be experiencing false positives on their ham because of this at the moment. At 3.2 points it is a particularly high-scoring rule, considering that the default threshold is 5.0.
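To put those numbers in perspective, here is a rough Python sketch of the arithmetic. It assumes a message is flagged once its summed rule scores reach the threshold; the `is_spam` helper and the scores in `other_rules` are made up for illustration, and only the 3.2 and 5.0 values come from the post.

```python
# Values taken from the post: the rule's score and the default threshold.
RULE_SCORE = 3.2
THRESHOLD = 5.0


def is_spam(rule_scores, threshold=THRESHOLD):
    """Assumed model: flag a message once its summed scores reach the threshold."""
    return sum(rule_scores) >= threshold


# Hypothetical ham message that picks up 2.0 points from other, unrelated rules.
other_rules = [1.2, 0.8]

print(is_spam(other_rules))                 # False
print(is_spam(other_rules + [RULE_SCORE]))  # True
```

In other words, once this rule fires, a legitimate message only needs to collect about 1.8 more points from anywhere else to be treated as spam.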

As I understand it, rules aren’t distributed with SpamAssassin as of the next version (3.3), so hopefully problems like this won’t happen in future. The “fix” which was supplied for this problem about six months ago was to update the regex so it matches 2020-2099 instead.
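The behaviour of the two patterns can be checked with a few lines of Python. This is only a sketch: the old pattern is copied from the rule quoted earlier, while `NEW_PATTERN` is my guess at what a regex matching 2020-2099 looks like and may not be the exact one committed upstream. The `rule_fires` helper and the sample Date headers are made up for illustration.

```python
import re

# Pattern copied from the rule shipped with SpamAssassin 3.2 (quoted in the post).
OLD_PATTERN = re.compile(r"20[1-9][0-9]")
# Assumed form of the updated pattern, based only on the description
# "matches 2020-2099"; the real upstream regex may differ.
NEW_PATTERN = re.compile(r"20[2-9][0-9]")


def rule_fires(pattern, date_header):
    """Return True if the pattern matches anywhere in the Date header."""
    return pattern.search(date_header) is not None


# Hypothetical Date headers for illustration.
headers = {
    2009: "Thu, 31 Dec 2009 23:59:59 +0000",
    2010: "Fri, 01 Jan 2010 00:57:37 +0000",
    2020: "Wed, 01 Jan 2020 00:00:00 +0000",
}

for year, header in headers.items():
    print(year, rule_fires(OLD_PATTERN, header), rule_fires(NEW_PATTERN, header))
```

The old pattern fires on the 2010 header and the updated one does not, but the updated one fires again on the 2020 header, so a hard-coded range like this only postpones the problem by ten years.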

You can read the thread I started about this issue on the SpamAssassin users list here. It’s the one started at “Fri, 01 Jan 2010 00:57:37 GMT” with the subject line “FH_DATE_PAST_20XX”.