Image spam (e-mail solicitations that use graphical images of text) is not new. But its rising sophistication has made much of it invisible to spam filters, and it now makes up one-third of all spam, according to Doug Bowers, director of antiabuse engineering at Symantec. E-mail traffic, 83 percent of which was spam, rose in 2006, according to antispam company BorderWare, and researchers there expect image spam to grow.
The conceit of image spam is that people see things that computers can't. To fool a spam filter, you put text that humans understand in an image format; the computer sees a code, not letters and numbers.
Some spam filters try to recognize letters inside pictures using optical character recognition (OCR) technology. OCR was originally developed so that documents scanned into computers as images could be converted to text by matching the unique geometry of fonts to a dictionary of those geometries. This bold A has a certain shape that an OCR engine can identify.
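To make the dictionary-matching idea concrete, here is a minimal sketch in Python. It is a toy under stated assumptions, not how any commercial OCR engine works: the 5x5 bitmaps, the `TEMPLATES` dictionary, the `similarity` and `recognize` functions and the 0.9 threshold are all invented for illustration.

```python
# Toy glyph dictionary (hypothetical): each letter is a 5x5 bitmap,
# where '#' is ink and '.' is blank paper.
TEMPLATES = {
    "A": [".###.", "#...#", "#####", "#...#", "#...#"],
    "L": ["#....", "#....", "#....", "#....", "#####"],
    "T": ["#####", "..#..", "..#..", "..#..", "..#.."],
}


def similarity(glyph, template):
    """Fraction of pixels on which two same-sized bitmaps agree."""
    same = 0
    total = 0
    for glyph_row, template_row in zip(glyph, template):
        for g, t in zip(glyph_row, template_row):
            total += 1
            if g == t:
                same += 1
    return same / total


def recognize(glyph, threshold=0.9):
    """Return the best-matching letter, or None if no template is close enough."""
    best_letter, best_score = None, 0.0
    for letter, template in TEMPLATES.items():
        score = similarity(glyph, template)
        if score > best_score:
            best_letter, best_score = letter, score
    return best_letter if best_score >= threshold else None


# A cleanly scanned letter matches its template exactly.
scanned = [".###.", "#...#", "#####", "#...#", "#...#"]
print(recognize(scanned))
```

A shape that resembles nothing in the dictionary, such as a blank square, falls below the threshold and comes back as `None`.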
But the spammers outsmart OCR. They use unusual fonts or put noise in the picture (added color, gaps in letters that the eye overcomes and speckles of color on the page) so that the OCR engine doesn't see letters. The latest image spam uses tactics like word salads (nonsensical quotes from literature) as well as animated and layered GIF images that divide a message into several images layered on top of each other. Some have even gone old-school and removed links to click on and instead instruct users to type a link into their browser, since many filters refer to blacklists of known malicious links.
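The effect of noise can be sketched the same way. In the hypothetical Python example below, a letter with gaps in its strokes and a few stray speckles is still an obvious A to a human reader, but it drops under the matching threshold of a simple pixel-comparison recognizer. The bitmaps, the `agreement` and `looks_like_a` functions and the 0.9 threshold are assumptions made up for this illustration.

```python
# Clean 5x5 template for the letter A ('#' is ink, '.' is blank).
CLEAN_A = [".###.", "#...#", "#####", "#...#", "#...#"]

# The same letter after a spammer adds noise: two gaps in the strokes
# (rows 0 and 2) and two stray speckles (rows 1 and 4).
NOISY_A = [".#.#.", "#.#.#", "##.##", "#...#", "#..##"]


def agreement(glyph, template):
    """Fraction of pixels that are identical in the two bitmaps."""
    pairs = [
        (g, t)
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    ]
    return sum(1 for g, t in pairs if g == t) / len(pairs)


def looks_like_a(glyph, threshold=0.9):
    """Hypothetical recognizer: accept only near-perfect matches."""
    return agreement(glyph, CLEAN_A) >= threshold


print(agreement(NOISY_A, CLEAN_A))  # 21 of 25 pixels still agree
print(looks_like_a(CLEAN_A))        # the clean scan is accepted
print(looks_like_a(NOISY_A))        # the noisy one is rejected
```

Lowering the threshold would let the noisy letter through, but in a real dictionary it would also start confusing letters with one another, which is part of why noise is an effective tactic.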
What's more, the image spam problem is getting mashed up with botnets. Bots distribute most spam, but the botnets are also being programmed to take one spam message and alter the image (by changing the size, shape, colors and other attributes) so that it's still readable but looks different to the filters that weed out identical e-mails.
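Why a tiny alteration defeats a filter that looks for identical messages can be shown with a short Python sketch. It assumes, purely for illustration, that the filter fingerprints an image by hashing its raw pixel bytes with SHA-256; the article does not say how any particular filter works. The 8x8 grayscale "image" and the `fingerprint` and `changed_fraction` functions are hypothetical.

```python
import hashlib


def fingerprint(pixels):
    """Exact fingerprint: SHA-256 over the raw pixel bytes (values 0-255)."""
    return hashlib.sha256(bytes(pixels)).hexdigest()


def changed_fraction(a, b):
    """Fraction of pixel positions whose values differ between two images."""
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


# A stand-in 8x8 grayscale image: white background with one dark row of "text".
original = [255] * 64
for i in range(24, 32):
    original[i] = 0

# The bot changes one background pixel by a single gray level,
# a difference no human reader would notice.
variant = list(original)
variant[0] = 254

print(fingerprint(original) == fingerprint(variant))  # fingerprints no longer match
print(changed_fraction(original, variant))            # only 1 pixel in 64 changed
```

Because each altered copy produces a different fingerprint, a blocklist of known spam images never catches up with a botnet that generates a fresh variant for every message.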
Worst of all, says Andrew Graydon, CTO of BorderWare, image spam files are twice the size of previous spam messages, a network bandwidth and storage headache for companies required to store every received e-mail.
For years now, spam and spam filtering have waged a back-and-forth battle. Spam beats filters, filters improve. Repeat. Image spam is proving harder to filter because computers simply aren't that good at understanding what's in an image.
Some companies have started to update their spam filter engines to try to better control image spam, but companies should have no illusions about such reactive measures controlling the problem. New fronts in the fight—new spam delivery methods, like Google alert feeds and audio and video formats—are ahead.
–Scott Berinato