Image Spam: By the Numbers

May 15, 2007 | 4 mins
Data and Information Security, Email Clients

Ransom notes, pixel salad and GIF layering: How image spam dodges your email filters

Image spam, an e-mail solicitation that uses graphical images of text to avoid filters, is not new. Recently, though, it reached an unprecedented level of sophistication and took off. A year ago, fewer than five out of 100 e-mails were image spam, according to Doug Bowers of Symantec. Today, up to 40 percent are. Meanwhile, image spam is the reason spam traffic overall doubled in 2006, according to antispam company Borderware. It is expected to keep rising.

Here’s a graphical look at some of the techniques image spammers have used to try to beat your filters. First we’ll zoom in on some of the details in this sample email.

[image spam email]

1. GIF Layering

Just as word splitting divides words into multiple images to elude spam filters (see section 3), a single image spam message can be divided into multiple images.

Like the transparent plastic overlays in Gray’s Anatomy, pieces of a message are layered to create a complete, legible message. In this rudimentary example, the spam is divided into three pieces (cut in the middle of letters for added obfuscation). But one message could comprise as many as a dozen layered GIFs.
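The layering idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a description of any real spam kit or GIF decoder: images are tiny grids of 0/1 pixels, `None` stands in for GIF transparency, and the names `composite`, `render`, and the 5x5 letter "H" are invented for the example.

```python
from typing import List, Optional

Pixel = Optional[int]      # None marks a transparent pixel; 1 is ink, 0 is background
Layer = List[List[Pixel]]


def composite(layers: List[Layer]) -> List[List[int]]:
    """Stack layers bottom to top; transparent pixels let lower layers show through."""
    height, width = len(layers[0]), len(layers[0][0])
    canvas = [[0] * width for _ in range(height)]
    for layer in layers:
        for y in range(height):
            for x in range(width):
                if layer[y][x] is not None:
                    canvas[y][x] = layer[y][x]
    return canvas


def render(image: List[List[int]]) -> str:
    """Draw ink as '#' and background as '.'."""
    return "\n".join("".join("#" if p else "." for p in row) for row in image)


# A 5x5 letter "H", cut into three layers by column so the cuts run through the letter.
full = [
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
]
cuts = [(0, 2), (2, 3), (3, 5)]  # column ranges, one per layer
layers = [
    [[p if lo <= x < hi else None for x, p in enumerate(row)] for row in full]
    for lo, hi in cuts
]

for layer in layers:
    print(render(composite([layer])), end="\n\n")  # each piece alone is not the letter
print(render(composite(layers)))                   # stacked, the letter reappears
```

A filter that inspects each layer separately sees only fragments. The full message exists only once the mail client stacks the pieces.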

2. Optical Character Recognition Duping (Through Color Alteration)

Optical character recognition (OCR) is the closest to sight that computers get. OCR works by measuring the geometry in images, searching for shapes that match the shapes of letters, then translating a matched geometric shape into real text. To defeat OCR, spammers upset the geometry of letters enough—by altering colors, for example—so that OCR can’t “see” a letter even as the human eye easily recognizes it. The effect is something like blurred characters in an eye test.
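A rough sketch of why color alteration can work, assuming a deliberately naive recognizer that first converts the image to black and white with a fixed brightness threshold and then compares the result to a stored letter shape. Real OCR engines are far more sophisticated. The template, the threshold of 128, and the function names here are hypothetical.

```python
from typing import List

Gray = List[List[int]]  # grayscale pixels, 0 (black) to 255 (white)

# Hypothetical stored shape for the letter "T".
TEMPLATE_T = [
    "#####",
    "..#..",
    "..#..",
    "..#..",
]


def binarize(image: Gray, threshold: int = 128) -> List[str]:
    """Mark every pixel darker than the threshold as ink ('#')."""
    return ["".join("#" if p < threshold else "." for p in row) for row in image]


def match_score(image: Gray, template: List[str]) -> float:
    """Fraction of pixels whose binarized value agrees with the template."""
    bits = binarize(image)
    total = sum(len(row) for row in template)
    same = sum(
        b == t
        for bit_row, tpl_row in zip(bits, template)
        for b, t in zip(bit_row, tpl_row)
    )
    return same / total


def paint(template: List[str], ink: int, paper: int) -> Gray:
    """Draw the template shape using the given ink and background shades."""
    return [[ink if c == "#" else paper for c in row] for row in template]


clean = paint(TEMPLATE_T, ink=0, paper=255)      # black text on white
altered = paint(TEMPLATE_T, ink=150, paper=230)  # mid-gray text on light gray

print("clean:  ", match_score(clean, TEMPLATE_T))
print("altered:", match_score(altered, TEMPLATE_T))
```

In this toy, the gray-on-gray letter stays visible to a person because the two shades still contrast, but both shades fall on the same side of the threshold. The binarized image comes out blank and the shape match drops.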

3. Word Splitting and Ransom Notes

If OCR catches up to the color tricks in image spam, a spammer’s next defense is word splitting. By dividing the image and leaving space between the pieces, the spammer ensures that any image the OCR engine examines is only a piece of a letter, with its own distinct geometry.

[word splitting]

Instead of word splitting, some spammers have employed a ransom note technique in which each letter in the spam message is its own image, and each letter image includes background noise and other baffling techniques. A program cobbles together randomized letter images to make words. The effect looks like a classic ransom note with a mishmash of letters cut out from magazines.

4. Geometric Variance

[geometric variance #2 in image spam]

Many filters can intercept mass mailings based on their sameness. Images, though, can be altered easily without disturbing the message inside them. Thus one spam message will arrive as dozens of differently shaped images, and each time the colors of the text images will have changed, as will the randomly generated speckling and pixel and word salads. No two images are alike despite the fact that they carry similar messages.

Shown are two radically different images containing the same stock tip. The technique is popular as a scheme to boost prices of low-value stocks. In March, the SEC suspended trading on 35 such stocks that were the subject of these image spam messages, including some whose prices rose.

5. Speckling and Pixel Salad

Confetti-like speckles don’t affect the legibility of the necessary information but make every message unique to confuse a filter looking for patterns or high volumes of identical images.

Similarly, a bar of randomly generated color pixels can contain the vast majority of the image data. To a filter it’s full of patternless noise. We can see the words in the message while the image at the bottom doesn’t bother us.
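To make the uniqueness point concrete, here is a minimal sketch that assumes a filter identifies repeat mailings by an exact fingerprint (a SHA-256 digest of the raw pixels). That filter design is an assumption for illustration, not a claim about any particular product. The 20x20 all-white "image" and the helper names are made up.

```python
import hashlib
import random
from typing import List


def fingerprint(pixels: List[int]) -> str:
    """Exact-match fingerprint: SHA-256 of the raw pixel bytes."""
    return hashlib.sha256(bytes(pixels)).hexdigest()


def speckle(pixels: List[int], rng: random.Random, count: int = 5) -> List[int]:
    """Return a copy with `count` randomly chosen pixels nudged by one gray level."""
    out = list(pixels)
    for i in rng.sample(range(len(out)), count):
        out[i] = out[i] + 1 if out[i] < 255 else out[i] - 1
    return out


rng = random.Random(2007)
base = [255] * 400  # stand-in for a 20x20 grayscale image
variants = [speckle(base, rng) for _ in range(50)]

unique = {fingerprint(v) for v in variants}
changed = max(sum(a != b for a, b in zip(base, v)) for v in variants)

print(len(unique), "distinct fingerprints from", len(variants), "variants")
print("at most", changed, "of", len(base), "pixels differ from the original")
```

A handful of one-level pixel changes, invisible to a reader, is enough to give each copy a different exact fingerprint. A filter that wants to catch these needs a similarity measure that tolerates small differences rather than an exact match.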

Filters have improved their ability to find and trace spammy URLs and then block the message based on the inclusion of a bad link. To get around this, spammers will ask recipients to type the URL into their browsers.

[hyperlink elimination, word salad, and animated GIFs]

Other methods include word salads: passages of text, often taken from classic novels, inserted to confuse Bayesian filters and weighted dictionaries, which rely on complex math or word scoring to determine the probability that some combination of words is spam. The filter sees predominantly natural text it can’t flag as illegitimate.
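The effect on a word-scoring filter can be sketched with a tiny naive Bayes calculation. Everything here is an assumption made for illustration: the per-word probabilities are invented, not measured, and real filters train on large corpora and handle unknown words more carefully.

```python
import math
from typing import Dict, List

# Hypothetical P(word | spam) and P(word | legitimate mail), as a trained filter might store them.
SPAM_PROB: Dict[str, float] = {
    "buy": 0.15, "stock": 0.20, "now": 0.10,
    "whale": 0.001, "sea": 0.002, "captain": 0.001,
}
HAM_PROB: Dict[str, float] = {
    "buy": 0.02, "stock": 0.01, "now": 0.05,
    "whale": 0.02, "sea": 0.03, "captain": 0.02,
}


def spam_probability(words: List[str], prior_spam: float = 0.5) -> float:
    """Naive Bayes posterior P(spam | words), in log space; unknown words are ignored."""
    log_spam = math.log(prior_spam)
    log_ham = math.log(1.0 - prior_spam)
    for word in words:
        if word in SPAM_PROB:
            log_spam += math.log(SPAM_PROB[word])
            log_ham += math.log(HAM_PROB[word])
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


pitch = ["buy", "stock", "now"]
salad = ["whale", "sea", "captain", "whale", "sea"]  # innocuous, novel-like words

print(f"pitch alone:      {spam_probability(pitch):.4f}")
print(f"pitch plus salad: {spam_probability(pitch + salad):.4f}")
```

Each innocuous word multiplies the evidence toward "legitimate", so a few of them can outweigh the pitch. In image spam the pitch itself sits in the image, so the text the filter scores may be nothing but salad.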

Another technique used to bypass filters, similar to GIF layering, is to program an animated GIF to overlay its layers slowly. Here, each letter is a GIF layer. As the layers are stacked, it looks to the eye like someone typing the letters into the address bar.