You can also let us know how they need to be trained. This is a beta period because we are still having to tune the filter to tell spam from ham.
However, newsletters are a real pain. We had major problems with them at Postini and Mailwise: I'd say only 2-3% made it through, and the rest stayed quarantined. Newsletters can have a lot of "spam-like" content without being spam, with paid ads and the like.
Stephen,
Trying to spot spam using long lists of rules is very prone to this problem, unless the rule set is trained for the specific person. For example, I use, very successfully, a solution called POPFile (POPFile: POPFileDocumentationProject), which is a Perl-based POP3 proxy server with a Bayesian filter. I've trained it, and it's quite accurate.
This doesn't work for ISPs though, as everyone has a different idea of what spam is - newsletters being a good example.
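To make the per-user point concrete, here is a minimal sketch of a word-count naive Bayes filter in Python. It is not POPFile's code (POPFile is written in Perl and is far more complete); the class name `TinyBayesFilter`, the tokenizer and the sample messages are all made up for illustration. Two users train it on the same newsletter with opposite labels and get opposite verdicts, which is why a single shared training set is hard for an ISP.

```python
import math
import re
from collections import Counter


class TinyBayesFilter:
    """Toy per-user naive Bayes classifier (hypothetical, not POPFile's implementation)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Very crude tokenizer: lowercase runs of letters, digits and apostrophes.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        self.word_counts[label].update(self.tokenize(text))
        self.message_counts[label] += 1

    def score(self, text, label):
        # Log prior with add-one smoothing over the two classes.
        total_messages = sum(self.message_counts.values())
        log_prob = math.log((self.message_counts[label] + 1) / (total_messages + 2))
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        for word in self.tokenize(text):
            # Add-one smoothing so unseen words do not zero out the score.
            count = self.word_counts[label][word]
            log_prob += math.log((count + 1) / (total_words + len(vocabulary) + 1))
        return log_prob

    def classify(self, text):
        return "spam" if self.score(text, "spam") > self.score(text, "ham") else "ham"


newsletter = "weekly newsletter special offer from our sponsors"

# Alice subscribed to the newsletter and trains it as ham.
alice = TinyBayesFilter()
alice.train(newsletter, "ham")
alice.train("cheap pills buy now", "spam")

# Bob never wanted it and trains the same text as spam.
bob = TinyBayesFilter()
bob.train(newsletter, "spam")
bob.train("meeting notes for tuesday", "ham")

next_issue = "newsletter offer from sponsors"
alice_verdict = alice.classify(next_issue)
bob_verdict = bob.classify(next_issue)
```

With this training, the same follow-up issue is scored differently by each user's filter, so neither verdict could be applied to everyone's mailbox.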
One really good solution to this, though, is Brightmail (recently taken over by Symantec - see Symantec Brightmail AntiSpam: Overview - Symantec Corp.). It works by setting up 'honeypot' email addresses, which attract REAL spam; because they are not used for any real email, no genuine mailing lists send them mail. The system then uses the signatures of the spam it receives to spot spam in the mailboxes it scans.
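The honeypot idea can be sketched roughly as follows. This is only an illustration under my own assumptions: I don't know how Brightmail actually computes its signatures, so the sketch simply hashes a whitespace- and case-normalized message body with SHA-256, and the names `signature` and `HoneypotFilter` are hypothetical. A real system would need signatures that survive small variations in the spam.

```python
import hashlib
import re


def signature(body):
    # Assumed signature scheme: collapse whitespace, lowercase, then hash.
    normalized = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class HoneypotFilter:
    """Flags mail whose signature matches something a honeypot address received."""

    def __init__(self):
        self.spam_signatures = set()

    def record_honeypot_mail(self, body):
        # Anything arriving at a honeypot address is treated as spam by definition.
        self.spam_signatures.add(signature(body))

    def is_spam(self, body):
        return signature(body) in self.spam_signatures


filt = HoneypotFilter()
filt.record_honeypot_mail("Buy CHEAP pills   now!\n")

# The same spam arriving in a customer's mailbox matches the stored signature.
caught = filt.is_spam("buy cheap pills now!")

# A genuine newsletter was never sent to a honeypot, so it has no signature on file.
newsletter_flagged = filt.is_spam("Weekly newsletter: sponsored offers inside")
```

Because only mail actually seen at a honeypot produces a signature, a newsletter is not flagged just for looking "spam-like", which is where rule-based filters tend to go wrong.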
It does have a few false negatives (some spam sneaks through, but very little - certainly far less than with your current solution!) but, more importantly, it claims to mark only 1 genuine email in a million as spam - that is, 99.9999% of genuine mail is left alone.
I'd recommend you have a good look at this system, as it really is effective.
Regards
Adam