bergman on 5 Oct 2007 16:01:34 -0000



Re: [PLUG] spam traps and solutions



In the message dated: Fri, 05 Oct 2007 10:59:12 EDT,
The pithy ruminations from "Sean C. Sheridan" on 
<[PLUG] spam traps and solutions> were:
=> For the last 12 years I've been using email.  At one point it was actually
=> useful, but it's really becoming a burden.
=> 
=> Up until last week I was doing a fairly good job of trapping spam, most of
=> them end up in my Spamassassin (SA) trap.  This week is a different story.
=>  I'm now getting 300-400 spam per day that do not get trapped.
=> 

That's bad.

=> These new emails are short and, of course, use forged headers.  Many of
=> them score 6-6.5 on the SA filter, my cutoff is set at 7.0.

Why is your cutoff so high?

=> 
=> I like the direction Meng's "Sender Policy Framework" was headed, but has
=> it been adopted universally?

No.

=> 
=> I do not like the "use gmail to filter it" approach for a variety of
=> reasons the most important being I don't have any interest in sharing my
=> private email with a public company who will store it and search it at
=> their discretion.
=> 

Agreed.

=> "IMPORTANT NOTE: Spam Assassin is not 100% reliable.
=> It usually does a pretty good job, but it is HIGHLY RECOMMENDED that you
=> DO NOT just discard mail that Spam Assassin has flagged as spam. Rather,
=> save such messages to a separate file or folder where they can be reviewed
=> once a week or so to check for messages that aren't really spam."

So, what's new?

=> 
=> I just do not have time to look through my 27,000 currently trapped emails
=> to see if I am missing an important new client request.

How important is the request? Perhaps it's worth your time...

=> 
=> I could just refuse all the things SA thinks are spam, but then people
=> argue that is a bad solution that leads to endless loops and bandwidth
=> consumption.

Huh? You seem to be confusing "refuse" with "bounce".

Bouncing spam, in today's world of forged headers, is generally a waste of 
bandwidth and annoys the poor shlub whose account was forged as the sender.

In the context of SMTP, "refuse" is ambiguous. Do you mean that you'd:

	refuse connections from mail servers that sent spam

	refuse connections from mail servers after receiving headers and 
	determining (via a milter rule) that the mail is [probably] spam

	accept connections (thus consuming network resources from the spammer,
	which is probably just a hijacked Windoze box) but not deliver the spam

	etc.
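
To make the refuse/bounce distinction concrete, here is a rough Python sketch of the possible dispositions. The names (`Disposition`, `SMTP_REPLY`, `we_send_bounce`) are made up for illustration and are not part of any mail server; the reply codes are only the typical SMTP ones a sending server might see.

```python
from enum import Enum


class Disposition(Enum):
    REFUSE_CONNECTION = "refuse the connection from a known spam source"
    REJECT_IN_SESSION = "reject during the SMTP session (e.g. via a milter)"
    ACCEPT_AND_DISCARD = "accept the message, then silently drop it"
    ACCEPT_THEN_BOUNCE = "accept the message, then mail a bounce later"


# Typical reply code the sending server sees (illustrative, and only
# meaningful if our server answers at all).
SMTP_REPLY = {
    Disposition.REFUSE_CONNECTION: 554,
    Disposition.REJECT_IN_SESSION: 550,
    Disposition.ACCEPT_AND_DISCARD: 250,
    Disposition.ACCEPT_THEN_BOUNCE: 250,
}


def we_send_bounce(disposition):
    """True only when our server mails a notice to the (probably forged) sender."""
    return disposition is Disposition.ACCEPT_THEN_BOUNCE


for d in Disposition:
    print(f"{SMTP_REPLY[d]}  bounce from us: {we_send_bounce(d)!s:5}  {d.value}")
```

Only the last case sends new mail to the forged address; the three "refuse" variants do not.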

=> 
=> note:
=> It is uncommon that I'll get legitimate email from overseas, but uncommon
=> is not equal to never.  Some of our biggest accounts came from Europe and
=> Africa via email inquiries.

Unless you're implementing some pretty drastic filtering, that's not relevant.

=> 
=> Before I go back to the dark ages and turn off email completely... which
=> I'm strongly considering, is there any light at the end of the tunnel?
=> 
=> I suspect I'm not alone.  In fact I've argued for years that email is one
=> of the biggest burdens on American business creating untold hours of
=> inefficiency.
=> 

Right. However, what's the cost of that inefficiency vs. the lost business cost of 
"some of [y]our biggest accounts" or "an important new client request"? Only 
you can answer that, and it's clearly not a technical question.

=> Does anyone have a good solution that I can implement on my fedora box
=> that will trap the crap and never create a false positive?

Absolutely not. Completely impossible. Like the Holy Grail being carried on the back
of a unicorn. Like not having any traffic on I-76 at rush hour and finding a
parking space in Center City in time to walk into Morimoto without a reservation
for a free dinner at your choice of tables.


=> 
=> I've received 15 spam in the time it took to write this email, please save
=> me...

One easy solution is for you to tune your spamassassin filters.

For example, I run SA at my ISP. There, I throw away (do not deliver) mail with 
an SA score over 10. I do log the message headers, and periodically review 
those logs. In this case, "review" means using grep, sed, sort, etc. to check 
for apparently legitimate mail. I've never had a false positive there.
That filtering eliminates about half of my incoming mail.
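
A minimal Python sketch of that ISP-side policy, for illustration only. It assumes SpamAssassin has already tagged each message with an `X-Spam-Status` header containing `score=...`; the function names (`spam_score`, `filter_at_isp`, `review`) and the in-memory log are hypothetical stand-ins, not the actual setup described here.

```python
import re
from email import message_from_string

DISCARD_ABOVE = 10.0  # cutoff mentioned in the text

# Assumed header shape: "X-Spam-Status: Yes, score=12.3 required=5.0 ..."
SCORE_RE = re.compile(r"\bscore=(-?\d+(?:\.\d+)?)")


def spam_score(msg):
    """Return the SA score from X-Spam-Status, or None if it is absent."""
    match = SCORE_RE.search(msg.get("X-Spam-Status", ""))
    return float(match.group(1)) if match else None


def filter_at_isp(raw_message, header_log):
    """Return True to deliver; otherwise log a few headers and drop the mail."""
    msg = message_from_string(raw_message)
    score = spam_score(msg)
    if score is not None and score > DISCARD_ABOVE:
        header_log.append({
            "score": score,
            "from": msg.get("From", ""),
            "subject": msg.get("Subject", ""),
        })
        return False
    return True


def review(header_log, pattern):
    """Rough stand-in for grepping the log for apparently legitimate senders."""
    return [e for e in header_log if re.search(pattern, e["from"], re.IGNORECASE)]
```

Logging only the headers keeps the review cheap: a pattern such as a known client domain can be checked against the whole log without storing the discarded message bodies.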

On my local machine, I run SA with much more customized rules. Anything with a 
score over 5 gets flagged as spam. Periodically, I filter that mail, doing 
frequency counts on text, adding to the SA rules, etc. The automated filtering 
eliminates most of that spam, but there are some false positives. Those 
messages get flagged as "not-spam", the addresses get added to a whitelist, etc.
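
The local pass could be sketched along these lines in Python. This is an assumption-laden illustration, not the real rule set: `is_flagged`, `word_frequencies`, and `mark_not_spam` are hypothetical helpers, and the whitelist is just a set of lower-cased addresses.

```python
import re
from collections import Counter

FLAG_ABOVE = 5.0  # local cutoff mentioned in the text


def is_flagged(score, sender, whitelist):
    """Flag mail scoring over the cutoff unless the sender is whitelisted."""
    if sender.lower() in whitelist:
        return False
    return score > FLAG_ABOVE


def word_frequencies(flagged_texts, top=10):
    """Count words across flagged mail to spot candidates for new rules."""
    counts = Counter()
    for text in flagged_texts:
        counts.update(re.findall(r"[a-z0-9']+", text.lower()))
    return counts.most_common(top)


def mark_not_spam(sender, whitelist):
    """Record a false positive so the sender is not flagged again."""
    whitelist.add(sender.lower())
```

Words that show up often in flagged mail but never in legitimate mail are the natural candidates for new rules; senders rescued once go on the whitelist so they are not flagged again.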


Here are some rough stats:

	Spam SA filtered out at ISP:		~1000/day	(never seen)
	Remaining mail:				~1000/day
	Mail SA flagged as spam locally:	 ~350/day
	Spam received but not flagged:		  ~30/day
	False positives at ISP:			    0/day
	Post SA filtering locally:		 ~150/day	(never seen)
	Mail from daemons, etc. filtered:	 ~300/day	(never seen)
	False positives retrieved automatically:  ~25/day
	Suspected spam to review locally:	 ~175/day	
	False positives needing manual action:	   ~5/day
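
As a quick sanity check on these figures, the short Python sketch below assumes (this reading is not stated explicitly) that the ~350 locally flagged messages split into the three buckets "never seen", "retrieved automatically", and "to review". The variable names are illustrative.

```python
# Approximate daily counts taken from the rough stats above.
flagged_locally = 350
discarded_unseen = 150
rescued_automatically = 25
left_for_review = 175
rescued_manually = 5

# Under the assumed reading, the three buckets account for everything flagged.
buckets_total = discarded_unseen + rescued_automatically + left_for_review
assert buckets_total == flagged_locally

# False positives among locally flagged mail, automatic plus manual.
false_positives = rescued_automatically + rescued_manually
false_positive_share = false_positives / flagged_locally

print(f"false positives: {false_positives}/day "
      f"({false_positive_share:.1%} of locally flagged mail)")
```

On that reading, roughly one flagged message in twelve is legitimate, which is why the local tier flags and reviews rather than discarding outright.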

On top of this, I get almost 400 pieces of legitimate mail a day. Of 
course, most of that is from system daemons, and is also filtered (i.e., if the 
cron job was successful, throw away the mail message unseen).

Mark

=> 
=> 
=> Sean C. Sheridan
=> scs@CampusClients.com
=> 
=> Campus Party, Inc.
=> 444 North Third St.
=> Philadelphia, PA 19123
=> (215) 320-1810, xtn 117
=> (215) 320-1814 fax
=> http://www.CampusClients.com
=> http://www.CampusParty.com

___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug