Everyone hates spam, and its many causes and challenges can make filtering seem practically impossible - or is it? I believe that good spam filtering can be achieved, and that with the right tools you will "see" only a small percentage of the spam that is actually sent to you. Let's take a look at how I set up my spam filters.
Spam filtering can happen at 3 levels on a typical mail server. The first level is the mail server itself. If it detects that mail is coming from an illegitimate source, the mail server can reject it. It does this using something called a "blacklist", which lists the IP addresses of known spammers. This used to be very effective, because spammers sent from servers - usually outside the country - that sat at fixed, known addresses. Now, though, spammers have given up on this. Instead they rely on end users to distribute their mail for them, using a botnet. A botnet is a network of computers that have been "co-opted", or taken over, by malware, worms or other methods. These botnets mean that the spam you receive might come from any of millions of computers spread across the internet. No list of known spam sources is much help against this, because as soon as an IP address appears on the list the botnet just moves the spam source to another address in the network.
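To make the blacklist idea concrete, here is a minimal sketch in Python of the check described above. The names `BLACKLIST` and `should_reject` are made up for illustration, and the addresses come from ranges reserved for documentation. A real mail server would typically query a maintained list rather than a hard-coded set.

```python
import ipaddress

# Hypothetical blacklist of "known spammer" addresses (documentation ranges only).
BLACKLIST = frozenset({
    ipaddress.ip_address("192.0.2.10"),
    ipaddress.ip_address("198.51.100.25"),
})


def should_reject(sender_ip: str, blacklist=BLACKLIST) -> bool:
    """Return True when the connecting address appears on the blacklist."""
    return ipaddress.ip_address(sender_ip) in blacklist


# A fixed, known spam server is caught...
print(should_reject("192.0.2.10"))
# ...but a botnet member at a fresh address is not on the list yet.
print(should_reject("203.0.113.77"))
```

The second call shows the weakness: the list can only ever name addresses that have already been seen sending spam.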
So the next level of spam filtering on the server kicks in: statistical content filtering. At this level the content of the message itself is examined to see if it resembles other spam. This worked OK back when spam email was mostly text. The problem is that it is difficult for a filter to tell mail you might WANT to see from mail you don't. And spammers started salting their emails with lots of random words, even pasting in sections of books, to fool the filters into thinking the content was legitimate. At the server level, statistical spam filtering is doomed to catch only a small percentage of the spam. The reason is that if we hosting companies set the statistical filter too tightly, it starts to filter messages that our users consider legitimate. This can cause all sorts of trouble - users are much more upset about finding that a legit message has been filtered than that some spam has been let through.
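The salting trick is easier to see with a toy example. The sketch below is a tiny naive Bayes style scorer in Python, not the filter any particular server uses. The training messages, the function names and the equal weighting of spam and legitimate mail are all assumptions made for illustration.

```python
import math
from collections import Counter


def train(messages):
    """Count how often each word appears in a list of messages."""
    counts = Counter()
    for text in messages:
        counts.update(text.lower().split())
    return counts


def spam_score(text, spam_counts, ham_counts):
    """Return a score between 0 and 1; higher means more spam-like."""
    vocab = set(spam_counts) | set(ham_counts)
    # Add-one smoothing so a word unseen in one class never gives a zero probability.
    spam_total = sum(spam_counts.values()) + len(vocab)
    ham_total = sum(ham_counts.values()) + len(vocab)
    log_odds = 0.0
    for word in text.lower().split():
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        p_spam = (spam_counts[word] + 1) / spam_total
        p_ham = (ham_counts[word] + 1) / ham_total
        log_odds += math.log(p_spam / p_ham)
    return 1 / (1 + math.exp(-log_odds))


# Made-up training data.
spam_counts = train(["cheap pills buy now", "buy cheap watches now"])
ham_counts = train(["meeting agenda for monday", "lunch on monday with the team"])

plain = "buy cheap pills now"
salted = plain + " meeting agenda monday lunch team"

print(spam_score(plain, spam_counts, ham_counts))
print(spam_score(salted, spam_counts, ham_counts))
```

The salted message carries the same sales pitch but scores lower, because the legitimate-looking words pull the score down. A tighter threshold would catch it, but that same tighter threshold is what starts catching real mail.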
The next level of filtering, offered by some ISPs including OS-Cubed, lets you create rules at the server that discard spam or set it aside before it reaches your mailbox at work. This usually works by taking the messages the server has already labelled as spam and either discarding them or tossing them in a junk mail folder.
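A server-side rule of this kind can be sketched in a few lines of Python. This is only an illustration: it assumes the server labels spam by adding an `X-Spam-Flag: YES` header, and the `route` function and the destination names are hypothetical. Your ISP's labelling and rule syntax may look quite different.

```python
from email import message_from_string


def route(raw_message: str, discard: bool = False) -> str:
    """Decide where a message goes based on a spam label added earlier by the server."""
    msg = message_from_string(raw_message)
    # Assumed label: an "X-Spam-Flag: YES" header set by the server's own filter.
    flagged = msg.get("X-Spam-Flag", "").strip().upper() == "YES"
    if not flagged:
        return "INBOX"
    return "DISCARD" if discard else "Junk"


labelled = "X-Spam-Flag: YES\nSubject: Cheap pills\n\nBuy now"
clean = "Subject: Monday meeting\n\nAgenda attached"

print(route(labelled))
print(route(labelled, discard=True))
print(route(clean))
```

Sending labelled mail to a junk folder rather than discarding it is the looser setting, since a wrongly labelled message can still be recovered.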
But despite all this, over 500 of the 650 messages that came to me this weekend were still headed for my inbox. So how did I cut that down to only 30? I use Outlook 2003 with the spam filter turned on. Outlook versions older than 2003 (and all Outlook Express versions) have very poor, rudimentary spam filters. Outlook 2003, on the other hand, has an excellent spam filter that you can teach what is and is not good mail. And its filters are updated every month at the same time as the Microsoft Windows Updates occur (if you're signed up to update Office as well as Windows).
So 533 of the 550 messages that actually hit my mailbox ended up in my junk mail folder in Outlook. And I had zero false positives - everything in that folder really was junk mail.
So my recommendations are:
- Find an ISP that at least does rudimentary labelling of spam
- Set some filters on the server end to reduce the amount of spam if possible - but set them loosely
- Either use an email client such as Outlook 2003 that has excellent, frequently updated filtering, or get a product such as MailWasher to filter your mail.
Good luck and have a happy Turkey Day!