When BOTS make legal headlines, who wins?

The growing prevalence and sophistication of web scraping bots


In 2013, the Associated Press and The New York Times won a case against Meltwater, which had been scraping news for its own offering.

In 2014, LinkedIn filed a lawsuit against unnamed parties after discovering that bots were used to scrape data from the profiles of hundreds of thousands of users (possibly by a startup trying to build their own database).

More recently, the online coupon code provider Coupon Cabin filed suit against several competitors over stolen codes.

Though these actors are in direct violation of the Computer Fraud and Abuse Act, it's difficult to identify who is actually operating the bots. When an enterprise like LinkedIn suffers a bot attack that scrapes the data of millions of users, the question isn't only who wins the legal suit, but who loses overall. It's bots vs. people.

The purpose of the data scraping? Rami Essaid, CEO of Distil Networks, said, "With LinkedIn, it’s a lot more competitive. Likely a sales navigator who is looking to grow their own business by using its stolen data. LinkedIn is all professional contact information, and the people that want to leverage that are likely providing this bot."

It's hard to identify the actual perpetrators because it's easier than ever to hide who you are as a bot maker. Essaid said, "Across the bot landscape, with every bot attack, 73% originated from more than one IP address, a quarter of them come from over 100 different IP addresses."

The bots are becoming so sophisticated that the perpetrators are mostly undetectable. In the LinkedIn case, said Essaid, "They may even know which companies are using their own data, which is why they named John Doe as the assailant so that they can go further exploring and get the VPNs and hosting providers to turn over the data."

A greater challenge in pursuing the people behind bot attacks is that web scrapers evade detection by hiding behind several layers of obfuscation. "The layers create a lot of steps, and that's the challenge that LinkedIn is trying to verify right now," said Essaid.

These thieves have a similar methodology to other criminals who steal sensitive data for monetary gain, but Essaid said, "They are skating on the backs of others. Many of the perpetrators are legitimate, venture backed companies with lots of money that are doing this. They are trying to build a business on the backs of others."

From the criminal's perspective, why wouldn't they take advantage of these technologies when there are scant legal consequences? "It's rare that we’ve seen criminal prosecution. The case with AT&T where the criminals accessed customer information that they accidentally made available online, the perpetrators got two years jail time, but usually it’s monetary repercussions if they are ever even caught," Essaid said.

So the attackers might have to pay damages for the data and for the use of the victim's resources, but so far it's been a risk worth taking. "For the scale of LinkedIn, if they can figure out who did it, there can be a big price tag associated with it," Essaid said.

But there is a cost impact for the victim as well. While there is likely little chance that LinkedIn will go after everybody that’s ever written a bot against them, they are investing resources to go after the worst of the worst. "They will go after the ones that are directly harming their business. Who are the biggest scarecrows? They are going to go after the biggest people as a warning flag," Essaid said.

Social networks have taken scrapers to court before, but legal action is usually the last resort. "There are strict security and technological barriers, and I’d ask if LinkedIn had done enough on their technological side. For you to prosecute, you need to show they went against your regulations but also that they were able to circumvent your barriers," Essaid said.

While the legal battle goes on, the scraped data is out there being used without the permission of those who entrusted it to LinkedIn. I guess it's all fair in this digital age of big data.

This article is published as part of the IDG Contributor Network.
