Help with Spam filter

Underdog · Apr 13, 2005

Hello,

Does anyone know exactly what the different levels of aggressiveness mean? Is there a technical definition or anything of that kind?
Say, what's the difference between very aggressive, aggressive, normal...
Of course I know what these words mean on their own, but I wanted to know whether there are any set parameters for the classification.

Thanks

SubSpace · Apr 13, 2005

These correspond to certain SpamAssassin spam score thresholds. I believe Normal means a threshold of 5.0. I'm not sure about the others, but they'd be lower than 5.0.

That 5.0 is an arbitrary number. SpamAssassin has a list of rules it applies to emails, and each matching rule modifies the mail's score by a certain amount, ranging from -100 (whitelisted) to +100 (blacklisted). Most content-based rules have values of +0.1 to +2.0.
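To make the scoring idea concrete, here is a minimal sketch in Python. It is not SpamAssassin code: the rule names, patterns, and point values are made up for illustration, and the only thing taken from the description above is the mechanism (each matching rule adds its points, and the total is compared against a threshold such as 5.0).

```python
import re

# Hypothetical rules: (name, pattern, points).
# These are NOT real SpamAssassin rules or scores, just an illustration.
RULES = [
    ("FREE_OFFER", re.compile(r"\bfree\b", re.I), 1.5),
    ("URGENT", re.compile(r"\bact now\b", re.I), 2.0),
    ("MANY_EXCLAIMS", re.compile(r"!{3,}"), 1.0),
    ("CHEAP_PILLS", re.compile(r"\bcheap (pills|meds)\b", re.I), 2.0),
]


def score_message(text, rules=RULES):
    """Return (total score, list of (rule name, points)) for the rules that match."""
    hits = [(name, points) for name, pattern, points in rules if pattern.search(text)]
    return sum(points for _, points in hits), hits


def is_spam(score, threshold=5.0):
    """A message is flagged once its score reaches the threshold."""
    return score >= threshold


if __name__ == "__main__":
    total, hits = score_message("Free shipping, act now")
    print(total, hits)
    print("threshold 5.0:", is_spam(total, 5.0))
    print("threshold 3.0:", is_spam(total, 3.0))
```

The same message can pass at one threshold and be flagged at a lower one, which is all a more aggressive setting amounts to in this sketch.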

Not much can be said about detection effectiveness at certain thresholds. With a threshold of 5.0 (Normal) it's highly unlikely that a genuine (ham) mail would be marked as spam.

Exactly how an email scores depends on the rule files that SpamAssassin uses. There is a standard ruleset, but people also maintain additional rulesets to improve detection. One example is the rulesets from RulesDuJour, which I use in my home antispam configuration. Incidentally, H-Sphere 2.4.3 (which is currently in beta) adds support for RulesDuJour, which should improve spam detection greatly.

More details can be found here:
http://www.spamassassin.org/
http://www.exit0.us/index.php?pagename=RulesDuJour

Basically, what it boils down to is that none of those settings comes with any guarantee of accuracy or effectiveness.

Stephen · Apr 13, 2005

SubSpace did a great job breaking that down, and I believe the numbers given for the threshold are correct, but I will check.

OK, now for a rant from me to the air. It seems that spammers are using more and more techniques to get around spam filters, and it makes no sense at all that they cannot be caught. I have seen such obvious random-letter spam recently, yet it gets a very low score. End rant.

Underdog · Apr 13, 2005

Thank you guys.

So, in the next version of H-Sphere, would we be able to create our own rules? That would be great.

Would that be a domain-based feature?

SubSpace · Apr 13, 2005

Stephen said:
It seems that spammers are using more and more techniques to get around spam filters, and it makes no sense at all that they cannot be caught. I have seen such obvious random-letter spam recently, yet it gets a very low score.

This type of circumvention is pretty hard to battle using regular expression rules such as those used by SpamAssassin. RulesDuJour helps here with frequently updated lists of the latest misspellings of popular products, but it's not perfect.
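A small Python sketch of why this is hard, under stated assumptions: the patterns below are hypothetical and are not taken from SpamAssassin or RulesDuJour. A literal pattern misses a simple character swap, a hand-widened pattern catches that one swap, and a further variation slips past both, so the rule list has to keep growing.

```python
import re

# Hypothetical patterns, written only for this illustration.
STRICT = re.compile(r"viagra", re.I)          # literal product name
LOOSE = re.compile(r"v[i1!]agr[a@4]", re.I)   # tolerates a few character swaps


def matches(pattern, text):
    """True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    for sample in ["Buy viagra", "Buy V1AGRA", "Buy v-i-a-g-r-a"]:
        print(sample, "| strict:", matches(STRICT, sample), "| loose:", matches(LOOSE, sample))
```

Every new spelling needs another pattern or a wider character class, which is why frequently updated rulesets help but never quite catch up.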

I don't have a lot of experience with them, but my guess is that Bayesian filtering would do a good job on these types of emails. The problem is that Bayesian filters need to be trained in order to function. You need to feed them known ham and spam regularly (specifying which is which, of course). On shared hosting this would be especially problematic, I think.
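The training requirement can be seen in a toy naive Bayes classifier. This is a rough sketch in Python, not how SpamAssassin's own Bayes component works; the class name, the tokenizer, and the add-one smoothing are all assumptions chosen to keep it short.

```python
import math
import re
from collections import Counter


class TinyBayes:
    """Toy naive Bayes spam filter (illustrative only)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}  # token counts per label
        self.docs = {"spam": 0, "ham": 0}                    # messages seen per label

    @staticmethod
    def tokens(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        """Feed one known message; label must be 'spam' or 'ham'."""
        self.counts[label].update(self.tokens(text))
        self.docs[label] += 1

    def spam_probability(self, text):
        """Estimated probability that text is spam, with add-one smoothing."""
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        total_docs = self.docs["spam"] + self.docs["ham"]
        log_p = {}
        for label in ("spam", "ham"):
            total = sum(self.counts[label].values())
            lp = math.log((self.docs[label] + 1) / (total_docs + 2))
            for tok in self.tokens(text):
                lp += math.log((self.counts[label][tok] + 1) / (total + len(vocab) + 1))
            log_p[label] = lp
        return 1.0 / (1.0 + math.exp(log_p["ham"] - log_p["spam"]))


if __name__ == "__main__":
    f = TinyBayes()
    print("untrained:", f.spam_probability("buy cheap pills"))
    f.train("cheap pills buy now", "spam")
    f.train("buy cheap meds now", "spam")
    f.train("meeting notes attached", "ham")
    f.train("lunch at noon tomorrow", "ham")
    print("trained:", f.spam_probability("buy cheap pills"))
```

Before any training the filter has no opinion at all, and afterwards its output depends entirely on the ham and spam it was fed, which is the part that gets awkward when many unrelated users share one server.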

Stephen · Apr 13, 2005

SubSpace said:
The problem is that Bayesian filters need to be trained in order to function. You need to feed them known ham and spam regularly (specifying which is which, of course). On shared hosting this would be especially problematic, I think.

Not to mention it will eat CPU and RAM.
