Can Spam
Here are three techniques to make your users' mailboxes spam-free.
by Keyur Shah
Posted March 23, 2004
Stopping spam is the most vexing problem e-mail administrators face today. Apart from being an annoying time waster, spam often carries viruses or malicious scripts that not only harm your applications, data, and operating system, but also sap productivity, consume network bandwidth, and fill hard disks. Unfortunately, there is no easy way to avoid spam, and given the dubious ethics of spammers, it's doubtful that any legal bar to spam will stop the rising tide anytime soon.
The biggest problem with blocking spam is that there's no method or product that works with total reliability for all users. In my experience, spam filters that run at the desktop tend to mistake legitimate messages for junk and require ongoing user intervention. Server-level filtering protects all users, is professionally managed, and when configured in the most aggressive manner possible, deletes spam before it reaches users. This article looks at some of the techniques antispam tools employ, and weighs their strengths and weaknesses. It also outlines a quick-and-dirty method to stop spam at the desktop if your mail provider doesn't filter spam.
What is the Content of the E-mail?
An e-mail's content, in particular its word choice and grammar, can signal that it's spam. Spam contains certain patterns, such as liberal amounts of capital letters or multiple exclamation marks. The most common types of spam, such as multilevel marketing messages, are often attempts to entice the gullible with fraudulent schemes (see Figure 1).
Savvy people approach the following types of content with caution:
- Advertisements for products or services.
- Offers of money-making opportunities.
- Advertisements for pornographic Web sites.
- Vulgar content.
- Suspicious code in the message body or attachments.
On the other hand, a careful antispam tool will allow variations on the above to pass scrutiny:
- Legitimate messages from buyers or sellers at auction sites.
- Business plans if you are a venture capitalist or business professor.
- Messages from health-related organizations you support.
- The occasional off-color joke from a friend.
- Software patches from a trusted source.
The three most common ways of stopping spam are simple word filtering, pattern recognition, and connection filtering. A defense-in-depth approach uses these three, and sometimes more, to block spam. Here's an overview of each of these approaches.
Connection Filtering
A huge percentage of spam originates from a known list of servers. Blocking connections from servers on this list, or accepting messages and flagging them as highly suspect, eliminates a significant amount of spam. A company called Mail Abuse Prevention System maintains a database, called the Realtime Blackhole List (RBL), containing addresses of systems that should be prevented from making connections to your SMTP server. Other vendors offer similar lists, and it's a good idea to use multiple service providers to create a kind of fault tolerance in the event that one of them is offline or otherwise unreachable.
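Lists of this kind are commonly published over DNS: the connecting server's IPv4 octets are reversed, the list's zone name is appended, and an answer to the lookup means the address is listed. The following Python sketch illustrates that convention and the multiple-provider idea described above. The zone names in the usage note are placeholders rather than real services, and treating every lookup failure as "not listed" is a simplifying assumption.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name used to look up an IPv4 address in a blocklist zone."""
    # Validate the address, then reverse its octets: 192.0.2.99 -> 99.2.0.192
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zones, resolve=socket.gethostbyname):
    """Return True if any of the given zones answers for the address.

    A zone that fails to resolve (no such name, or the provider is
    unreachable) is skipped, so one offline provider does not stop
    the remaining zones from being checked.
    """
    for zone in zones:
        try:
            resolve(dnsbl_query_name(ip, zone))
            return True
        except OSError:  # socket.gaierror is a subclass of OSError
            continue
    return False
```

A caller might run `is_listed(client_ip, ["dnsbl-one.example.org", "dnsbl-two.example.org"])` when a connection arrives, then either refuse the connection or flag the message as highly suspect.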
Fawcette Technical Publications, the publisher of Windows Server System Magazine, recently installed a spam filter that uses connection filtering as its primary mode. A preliminary analysis revealed this technique caught approximately 80 percent of the spam received. Although this is a terrific reduction and a ringing endorsement of the RBL, it indicates that no single technique is a silver bullet when stopping spam.
Exchange 2003 and other modern mail servers offer connection filtering. See "Stop Spam With Exchange 2003" in the March 2004 issue of Windows Server System Magazine for a detailed account of how to configure Exchange 2003 to operate with connection filtering and the RBL.
Pattern Recognition
Some antispam tools use an intelligent algorithm to determine if a message is spam. One algorithm is the Bayesian approach, which is based on the work of Paul Graham. This filtering technology is one of the most exact and reliable ways of differentiating spam from legitimate messages. The algorithm employs a statistical analysis of more than 30,000 spam letters to weed out the chaff with a typical accuracy rate of 97 percent.
A notable example is the SpamBayes project, which developed a Bayesian antispam filter. The major difference between this and similar projects is the emphasis on testing newer approaches to scoring messages. Whereas most antispam projects are still working with the original Graham algorithm, SpamBayes found a number of alternate methods that yield a more accurate result.
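To make the statistical idea concrete, here is a minimal Python sketch of a Graham-style scorer. It is a simplified illustration, not the scoring used by SpamBayes or any commercial product: the tokenizer, the 0.4 default for unseen words, and the 0.01 to 0.99 clamp are assumptions chosen for the example. Each token gets a spam probability from how often it appeared in spam versus legitimate training mail, and the token probabilities are then combined into one score between 0 and 1.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into a set of lowercase tokens (illustrative tokenizer)."""
    return set(re.findall(r"[a-z0-9$'!-]+", text.lower()))


class BayesFilter:
    def __init__(self):
        self.spam_counts = Counter()
        self.ham_counts = Counter()
        self.spam_total = 0
        self.ham_total = 0

    def train(self, text, is_spam):
        """Record one message the user has marked as spam or legitimate."""
        tokens = tokenize(text)
        if is_spam:
            self.spam_counts.update(tokens)
            self.spam_total += 1
        else:
            self.ham_counts.update(tokens)
            self.ham_total += 1

    def token_probability(self, token):
        """Estimate the chance that a message containing this token is spam."""
        spam_rate = self.spam_counts[token] / max(self.spam_total, 1)
        ham_rate = self.ham_counts[token] / max(self.ham_total, 1)
        if spam_rate + ham_rate == 0:
            return 0.4  # unseen token: assumed to be mildly innocent
        p = spam_rate / (spam_rate + ham_rate)
        return min(0.99, max(0.01, p))  # never fully certain either way

    def score(self, text):
        """Combine token probabilities: prod(p) / (prod(p) + prod(1 - p))."""
        probs = [self.token_probability(t) for t in tokenize(text)]
        # Work in log space so long messages do not underflow to zero.
        log_spam = sum(math.log(p) for p in probs)
        log_ham = sum(math.log(1 - p) for p in probs)
        diff = max(-700.0, min(700.0, log_ham - log_spam))
        return 1 / (1 + math.exp(diff))
```

Because `train` can be called every time the user corrects a classification, the scores adapt to that user's own correspondence.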
An antispam tool must be intelligent. A tool that can learn from experience is much more useful than one that is static in its approach to detecting spam. Learning can increase filtering accuracy by analyzing a user's personal correspondence. The more a user works with the program, the more accurate its spam detection becomes.
Many third-party spam filters allow you to designate friends and foes. You can create and use such lists to decide which e-mails to accept and which to reject. A list might include some common information such as names, e-mail addresses, or entire domains. For example, when you run Outlook Spam Filter for the first time, all contacts from your Address Book are automatically added to your friends list. You can edit your friends and foes lists to add contacts, and, more likely, spammers who somehow repeatedly thwart the automatic portion of the spam filter.
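The friends-and-foes check itself is simple. The Python sketch below is an illustration of the idea, not the logic of any particular product; it assumes list entries are stored in lowercase and lets an exact address take precedence over a whole-domain entry, which is one reasonable design choice among several.

```python
def classify_sender(sender, friends, foes):
    """Return "accept", "reject", or "unknown" for a sender address.

    friends and foes are sets of lowercase e-mail addresses or domains.
    """
    address = sender.strip().lower()
    domain = address.rpartition("@")[2]
    # Check the full address first so that one friend inside a blocked
    # domain (or one foe inside a trusted domain) is handled correctly.
    for entry in (address, domain):
        if entry in friends:
            return "accept"
        if entry in foes:
            return "reject"
    return "unknown"
```

Messages from "unknown" senders would then go on to the filter's automatic checks.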
Perhaps a dozen vendors have created pattern recognition products that function either as plug-ins to Outlook or Outlook Express, or as add-ins to Exchange and other e-mail servers (see Table 1).
For small networks or home users, local plug-ins at the desktop are popular, but they are problematic because they require local administration and function only when the user is running his or her e-mail client. Make certain any product you purchase operates with all of the e-mail accounts you access with your client, such as POP3, IMAP, HTTP, and Exchange.
Word Filtration
Most modern e-mail clients and servers offer simple word filtration, which is a crude but effective spam tool. As the name implies, you create a list of obnoxious words or phrases you find in your spam. For space reasons, we won't include a list here. You can, however, find several exhaustive lists on the Internet by entering the search phrase spam word list.
E-mail clients and servers compare each new message to the list, and when they encounter a flagged word, the message is given some form of special treatment. Usually, suspect messages are placed in a subdirectory of your inbox called SPAM, on which some users set a two-week expiration date. This method gives users a chance to review the SPAM subdirectory at their leisure to determine if a legitimate message was snared in the dragnet by mistake. Brave users instruct their clients and servers to delete suspect messages automatically, but I don't recommend this for obvious reasons.
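As a rough illustration of this process, the Python sketch below compiles a word list into one case-insensitive pattern, routes a message to SPAM or the inbox, and checks whether a quarantined message has passed a two-week retention period. The function names are hypothetical, and the whole-word matching assumes every phrase starts and ends with a letter or digit.

```python
import re
from datetime import datetime, timedelta


def compile_word_list(phrases):
    """Compile flagged phrases into one case-insensitive, whole-word pattern."""
    phrases = list(phrases)
    if not phrases:
        raise ValueError("the word list must contain at least one phrase")
    escaped = "|".join(re.escape(p) for p in phrases)
    return re.compile(r"\b(?:" + escaped + r")\b", re.IGNORECASE)


def route_message(subject, body, pattern):
    """Return the folder a message belongs in: "SPAM" or "Inbox"."""
    return "SPAM" if pattern.search(subject + "\n" + body) else "Inbox"


def expired(received, now, keep_days=14):
    """True once a quarantined message is older than the retention period."""
    return now - received > timedelta(days=keep_days)
```

Quarantining and expiring, rather than deleting on sight, is what leaves users time to rescue a legitimate message.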
Creating Spam Filters for Microsoft Outlook
The crudest but most dependable technique for getting rid of spam is to create your own spam filters. Here's how to create a simple spam filter in Outlook. Most other modern e-mail clients offer filters, and although the steps to create filters in other tools vary, they use the same basic concepts. To create a spam filter, open Microsoft Outlook and select the Rules Wizard command from the Tools menu. Next, click on the New button to create a new rule.
When you do, a screen will appear that contains a list of various templates. Select the Move Messages Based on Content rule and click Next.
On the next screen, make sure that the With Specific Words In Subject or Body checkbox is selected and click Next.
On the bottom half of the dialog box, click the Specified link and select the Deleted Items folder. Doing so will automatically move messages that meet your criteria to the Deleted Items folder.
Next comes the real trick. When you click on the Specific Words link, you're given the chance to input words or phrases that Outlook will use to differentiate between spam and legitimate e-mail. As mentioned earlier, the fastest way to get a good list is to peruse the lists others use and share on the Web.
After you've compiled your list of words, you'll have a chance to create an exception rule. It is common to set the exceptions to people in your address book. You might also include some other specific users if you have friends who send you dirty jokes and aren't in your address book.
Click Next and you'll be asked to enter a name for the new rule. Call the rule something like Spam Filter and click Finish and OK to create the rule. Your simple filter is now complete.
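For readers who think in code, the rule built above can be modeled as a small data structure. The Python sketch below mirrors the wizard's choices (words to look for, a target folder, and an exception list); it is an illustration of the rule's logic only and does not use any Outlook API. The class and field names are made up for the example, and it uses plain substring matching.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    name: str
    words: list
    target_folder: str = "Deleted Items"
    exceptions: set = field(default_factory=set)  # lowercase sender addresses

    def apply(self, sender, subject, body):
        """Return the folder where the message should end up."""
        # Exceptions win: mail from listed senders is never moved.
        if sender.lower() in self.exceptions:
            return "Inbox"
        text = (subject + "\n" + body).lower()
        if any(word.lower() in text for word in self.words):
            return self.target_folder
        return "Inbox"
```

For example, `Rule("Spam Filter", ["free money"], exceptions={"friend@example.com"})` moves a stranger's "Free money" message to Deleted Items but leaves the same subject line from the listed friend in the inbox.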
About the Author
Keyur Shah has more than 14 years of IT experience and is involved in software engineering, development, architecture, and design. He is a senior developer at Verizon Communications and the author of several books.