Spam, a Lot

Jun 12, 2007 | 8 mins
Data and Information Security, Email Clients, Malware

Conversations with leading message filtering companies provide insight into the battle for e-mail security

Here are two stunning statistics from the war against spam. First, roughly 75 percent of Internet mail is now spam, which means that for every legitimate e-mail message received, three pieces of spam are also received. There’s a lot of spam, and the overall volume is generally rising (although particular kinds of spam go in and out of fashion).
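The three-to-one figure follows directly from the 75 percent share. A minimal sketch of the arithmetic, assuming a hypothetical helper named `spam_per_legitimate` (not something from MessageLabs):

```python
from fractions import Fraction


def spam_per_legitimate(spam_share):
    """Return how many spam messages arrive for each legitimate one.

    spam_share is the fraction of all mail that is spam (0 <= share < 1).
    """
    if not 0 <= spam_share < 1:
        raise ValueError("spam_share must be at least 0 and below 1")
    return spam_share / (1 - spam_share)


# 75 percent spam leaves 25 percent legitimate mail: 0.75 / 0.25 = 3.
print(spam_per_legitimate(Fraction(75, 100)))  # 3
```

The same calculation shows how quickly the ratio grows: at a 90 percent share, nine spam messages arrive for every legitimate one.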

The second statistic is about the effectiveness of businesses in handling spam. Apparently employees at businesses with 24 users or fewer see nearly 600 spam messages a month. What’s surprising here is that this is more than five times the spam that’s seen, on average, by employees at companies with 10,000 users or more.

Both of these statistics come from MessageLabs, one of the two dominant players in the world of spam filtering today. Spammers aren’t targeting small businesses, MessageLabs wrote in the March issue of its Internet Threat Watch. Instead, employees at small companies are less likely to have effective spam filtering measures.

This might seem like a self-serving finding from MessageLabs, which markets its service primarily to large corporations. But the conclusion is more or less in line with my own experience. Spam filtering is not something that you can set up and forget: An antispam system that works well today will slowly lose its potency as the spammers learn how to evade the filtering techniques that you’ve implemented. Large organizations can dedicate the time and money to staying current with their antispam technology, but small companies generally can’t. As a result, the level of spam seen by employees at small organizations creeps up after each new system is deployed until the amount of spam becomes unbearable, at which point the next system is rolled out.

Recently I had the chance to speak with antispam specialists at MessageLabs and Postini (the other dominant player in the world of antispam). I asked both companies what they thought would be the greatest problems facing spam-fighters in the coming year. To understand the answers, it’s important to understand that spam has a lifecycle, and this lifecycle highlights many of the world’s persistent computer security problems.

Bot Economics

Most of the spam that reaches your mailbox was sent from a bot: an ordinary home or office PC that would be unremarkable except that it has a high-speed Internet connection and is under the control of a malicious third party. I’ve seen estimates that there are between 1 million and 100 million infected computers in the world today. I have no idea how these estimates are made, whether they are reliable, or what they actually mean. But it’s clear that there are a lot of machines infected with bots, and that the existence of these machines represents a failure of today’s antivirus and antispyware approaches.

Because so much of today’s e-mail stream is spam, every message that’s received has to be filtered before it can reach your inbox. Today the best filtering systems perform a variety of tests, including content analysis and attribution—that is, they try to figure out who the real sender of the e-mail message is, as well as what product or website is being promoted, and then check the blacklists to see if the senders are known spammers. Attribution is also important in fighting other forms of Internet crime.
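As a rough illustration of how such tests might be combined, here is a minimal sketch. Everything in it is an assumption made for the example: the blacklist entries, the phrase list, the `Message` fields and the function names are hypothetical, and commercial filters are far more elaborate than this.

```python
import re
from dataclasses import dataclass

# Hypothetical blacklists; real services maintain much larger, constantly updated lists.
SENDER_BLACKLIST = {"203.0.113.7"}
PROMOTED_SITE_BLACKLIST = {"cheap-pills.example"}
SUSPICIOUS_PHRASES = ("act now", "guaranteed returns")

URL_HOST = re.compile(r"https?://([^/\s]+)", re.IGNORECASE)


@dataclass
class Message:
    connecting_ip: str  # machine that actually delivered the message
    from_header: str    # claimed sender, which is easy to forge
    body: str


def promoted_sites(body):
    """Attribution, part two: which websites does the message promote?"""
    return {host.lower() for host in URL_HOST.findall(body)}


def spam_reasons(msg):
    """Run each test and collect the reasons a message looks like spam."""
    reasons = []
    # Attribution, part one: judge the delivering machine, not the From header.
    if msg.connecting_ip in SENDER_BLACKLIST:
        reasons.append("sender on blacklist")
    if promoted_sites(msg.body) & PROMOTED_SITE_BLACKLIST:
        reasons.append("promoted site on blacklist")
    # Content analysis, reduced here to a crude phrase match.
    lowered = msg.body.lower()
    if any(phrase in lowered for phrase in SUSPICIOUS_PHRASES):
        reasons.append("suspicious content")
    return reasons


msg = Message("198.51.100.1", "friend@example.org",
              "Act now: http://Cheap-Pills.example/buy")
print(spam_reasons(msg))  # ['promoted site on blacklist', 'suspicious content']
```

Note that the sending machine in this example is not on any blacklist; the message is caught only because of what it promotes and how it is worded, which is why filters layer several tests rather than rely on one.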

A significant amount of spam that reaches its intended destination contains phishing attacks. These attacks exploit a variety of security problems made possible by the Human/Computer Interface (HCI). As readers of this column know, HCI security is an important research area for both academia and industry.

Closing the cycle, those involved in this underground economy also need to recruit new computers to their botnets. Some spammers do this directly, while others rely on so-called bot-herder specialists. Typical herding techniques include sending out specially written infection programs by e-mail and spamming with the URLs of websites that are designed to exploit browser bugs. These techniques work because some people are still dumb enough to click on programs they receive, while other people are browsing the Internet with unpatched copies of Internet Explorer and Firefox.

Spammers have the upper hand in this cycle. Because herders have been so successful at recruiting bots, spammers have both more computational power and more Internet bandwidth available at their disposal than even the largest antispam providers. Spammers get instant feedback when their spam gets through because people click on the links. Because they are part of the underground economy, spammers generally don’t pay taxes on ill-gotten gains. Spammers can afford to experiment, because when their experiments fail, the worst that happens is that some of their spam doesn’t get sent. One result of this cycle is that spammers will continue to develop more effective spamming techniques as time passes because they are financially rewarded for doing so—there is positive market feedback.

As a result of this positive feedback, spammers are becoming increasingly sophisticated. “It’s become clear to anyone working in antispam that there have been a lot of developments,” says Matt Sergeant, MessageLabs’ senior antispam technologist. “Our speculation is that most of this is coming out of the ex-Soviet Russia and the Eastern Bloc. They really have teams of programmers on hand now. I am sure that somewhere there is a bunch of programmers, quality assurance teams [and other employees], all set up for creating this stuff. That presents a real challenge. They are thinking about this stuff on a technical level: exactly how they can get through our filters, what they can do to stay out of our blacklists.”

For example, one of the most difficult kinds of spam facing the filtering companies today is “stock spam”—spam that promotes a stock worth only a few pennies. Several studies have shown that stocks advertised in this manner generally jump for a few days and, as a result, the spammers can make thousands to tens of thousands of dollars for each batch of messages they send out. But stock spam is particularly difficult for spam filtering companies because there is no consistent brand name, phone number or website URL to be blacklisted. “All you have is a stock ticker symbol,” says Sergeant.

Dodging Blacklists

One way that spammers are avoiding the blacklists is by being much more selective in the way they send out spam. For example, says Sergeant, instead of sending a million messages from a single machine, spammers might instead send a thousand messages from each of a thousand machines. This is especially a problem when those machines are also sending legitimate e-mail, as might be the case when the infected machines are sending spam through the mail servers of their respective ISPs. Right now, says Sergeant, one of the biggest problems for his company is the large number of relatively small and poorly administered Internet service providers doing business in the developing world.
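To see why spreading the load works, consider a filter that blacklists any sender whose volume crosses a fixed limit. The sketch below is only an illustration: the 10,000-message threshold and the `flagged_senders` function are assumptions made for the example, not a description of how MessageLabs builds its blacklists.

```python
def flagged_senders(messages_per_sender, threshold):
    """Return the senders whose message count exceeds the threshold."""
    return {sender for sender, count in messages_per_sender.items()
            if count > threshold}


THRESHOLD = 10_000  # hypothetical per-sender limit

# The same one million messages, sent two different ways.
one_machine = {"bot-0": 1_000_000}
many_machines = {f"bot-{i}": 1_000 for i in range(1_000)}

print(flagged_senders(one_machine, THRESHOLD))    # {'bot-0'}
print(flagged_senders(many_machines, THRESHOLD))  # set()
```

Both campaigns deliver the same total, but in the second no individual machine ever looks unusual, and lowering the threshold far enough to catch it would also catch legitimate senders.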

Another big problem facing antispam companies is that individual spam messages are undergoing more processing by spammers and, as a result, can be more different from each other. “The arms race is chasing how these guys are morphing the context” of the spam, says Scott Petry, Postini’s founder and CTO.
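One way to picture the difficulty is to assume, purely for illustration, that a filter recognizes known spam by an exact fingerprint of the message body. Neither company described its matching method to me in these terms, and the `fingerprint` helper below is hypothetical.

```python
import hashlib


def fingerprint(body):
    """Exact-match fingerprint: any change to the body changes the result."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


original = "Hot stock tip: buy now before the price jumps!"
morphed = "Hot st0ck tip: buy now before the price jumps!!"

known_spam = {fingerprint(original)}

print(fingerprint(original) in known_spam)  # True
print(fingerprint(morphed) in known_spam)   # False
```

A reader sees the same pitch in both messages, but the exact-match lookup treats the morphed copy as something it has never seen before.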

The arms race is also moving into new areas. For example, both MessageLabs and Postini have antispam systems available for instant messaging systems. Recently the folks at Postini got an e-mail about spam on a public-access Web calendar: Somebody had added a repeating event advertising a mortgage broker.

MessageLabs and Postini operate as service bureaus. Companies that subscribe to these firms set up their name servers so that incoming e-mail gets sent directly to one of the bureaus’ data centers, where the mail is received, filtered, optionally archived and eventually sent to the intended destination (or not). One of the big advantages of this model is that the spam that’s filtered out never reaches the customer, so the customer doesn’t need to invest in servers, hard drives and Internet capacity to handle the spam. But a real disadvantage with this approach is that the spam kept in quarantine, including false positives, is usually deleted—typically after 30 days.
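The quarantine disadvantage is easy to sketch. The 30-day window comes from the paragraph above; the record layout and the `purge_quarantine` function are assumptions made for the example and do not describe either company's actual system.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # the typical window mentioned above


def purge_quarantine(quarantine, now):
    """Keep only messages quarantined within the retention window."""
    return [m for m in quarantine if now - m["quarantined_at"] <= RETENTION]


now = datetime(2007, 6, 12)
quarantine = [
    {"subject": "Invitation to review a paper",  # a false positive
     "quarantined_at": now - timedelta(days=45)},
    {"subject": "Hot stock tip",
     "quarantined_at": now - timedelta(days=3)},
]

remaining = purge_quarantine(quarantine, now)
print([m["subject"] for m in remaining])  # ['Hot stock tip']
```

The misclassified invitation disappears along with the real spam once it ages out, so a false positive that goes unnoticed for more than a month cannot be recovered.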

False Positives

My biggest problem with today’s antispam systems is the number of false positives that they generate: mail that is not spam but is nevertheless classified as such. Browsing through my spam folder, I recently found invitations to review a paper for a conference (followed by nasty e-mails asking why I had not sent in my review); a dozen messages from a website for which I had lost a password (I had repeatedly clicked on the “password reset” button); and e-mail from Sprint saying that my phone bill was available to view.

I try to minimize the impact of misclassified e-mail by keeping my spam messages forever. Although I may need to reevaluate this policy if my personal spam levels rise to 90 percent, right now hard drive capacities are growing faster than spam levels. And I’ve had too many important e-mail messages misidentified as spam, only to discover them weeks or months later.

Simson Garfinkel, CISSP, is researching computer forensics and human cognition at Harvard University. Send feedback to