[PLUG] a question for the spamasassin gurus

sean finney on 30 Mar 2004 15:12:03 -0000



From: sean finney <seanius@seanius.net>

To: plug <plug@lists.phillylinux.org>

Subject: [PLUG] a question for the spamasassin gurus

Date: Tue, 30 Mar 2004 10:11:18 -0500

Reply-to: plug@lists.phillylinux.org

Sender: plug-admin@lists.phillylinux.org

User-agent: Mutt/1.5.5.1+cvs20040105i

hey folks,

we're attempting to get sa's bayesian filtering implemented globally for the 2-3,000 users on our mail system, and were wondering if anyone with similar experience had anything to say about the pros and cons of the different ways of setting this up on a modestly large scale like this.

currently, sa places tagged spam into a junk mailbox, and everything else defaults into an inbox.

one idea we've had is to create a third mailbox, "spam", where users could put spam that made it past sa. this way we wouldn't have to worry about dilution of the bayesian dbs by already-caught spam. the trouble is that it requires teaching users to do something, and it isn't immediately effective (each user's filter has to learn from >= 200 messages before it even starts working).

another idea is to have sa-learn process the junk mailbox. the pros i see are that many folks have already been trained to put their spam there, and there's already a sizeable corpus to learn from. the con is that the majority of this mail was probably already caught by sa, so i could see this diluting the bayesian filter's ability to catch the stuff that sa missed on its own[1]. of course, maybe the messages are still similar enough that this would be helpful?

the third idea we had was to administer a global bayesian db ourselves (us == mail admins, or maybe the ITS dept.). the pros are that there's less work to get it going, no 4 hour cron jobs every night, and more technically skilled folks ensuring the effectiveness of the filter. the con is of course that individual users would not be able to report spam/ham (at least not in an automated way[2]).

has anyone implemented anything like this? other ideas? thoughts would be greatly appreciated.

thanks,
sean

[1] apparently sa-learn automatically ignores spamassassin markup, which is convenient, but i think it'd still be reinforcing sa to catch what it already knows is spam.

[2] we thought about this, but in the end a disgruntled user could exploit it to mark anything from certain higher-ups or internal mailing lists as spam, which wouldn't be all that great with a large database used by everyone.
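
for what it's worth, the first option (a per-user "spam" mailbox fed to sa-learn from a nightly job) might look roughly like the sketch below. it only builds the sa-learn command lines and prints them by default. the mailbox layout (`<mail_root>/<user>/spam` in mbox format), the home directory layout, and the function names are all assumptions made for illustration, not a description of our actual setup. the sa-learn options used (`--spam`, `--mbox`, `--no-sync`, `--sync`, `--dbpath`) are standard, but check them against the man page for your installed version.

```python
import subprocess
from pathlib import PurePosixPath


def build_learn_commands(users, mail_root="/var/spool/mail",
                         home_root="/home", spam_box="spam"):
    """Return sa-learn argv lists for each user's hand-sorted spam mailbox.

    Assumed (hypothetical) layout: <mail_root>/<user>/<spam_box> is an mbox
    file, and each user's bayes db lives at <home_root>/<user>/.spamassassin/bayes.
    """
    commands = []
    for user in users:
        dbpath = str(PurePosixPath(home_root) / user / ".spamassassin" / "bayes")
        mbox = str(PurePosixPath(mail_root) / user / spam_box)
        # learn the missed spam, deferring the db sync until the end
        commands.append(["sa-learn", "--dbpath", dbpath, "--no-sync",
                         "--spam", "--mbox", mbox])
        # flush the journal into this user's db once
        commands.append(["sa-learn", "--dbpath", dbpath, "--sync"])
    return commands


def run_all(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # hypothetical user names, dry run only
    run_all(build_learn_commands(["alice", "bob"]))
```

note that this only covers the spam side. the bayesian filter also needs ham to learn from before it activates, and pointing `--ham` at people's inboxes would also teach it any spam that slipped through, so that part is deliberately left out.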

