The growing prevalence and sophistication of web scraping bots
In 2013, the Associated Press and The New York Times won a case against Meltwater, which had been scraping news for its own offering. In 2014, LinkedIn filed a lawsuit against unnamed parties after discovering that bots were used to scrape data from the profiles of hundreds of thousands of users (possibly by a startup trying to build its own database). Currently, the online coupon code service provider Coupon Cabin is suing several competitors over stolen codes.
Though these actors are in direct violation of the Computer Fraud and Abuse Act, it’s difficult to identify who is actually behind these bots. When an enterprise like LinkedIn suffers a bot attack that scrapes the data of millions of users, the question isn’t only who wins in a legal suit, but who loses overall. It’s bots vs. people.
The purpose of the data scraping? Rami Essaid, CEO of Distil Networks, said, “With LinkedIn, it’s a lot more competitive. Likely a sales navigator who is looking to grow their own business by using its stolen data. LinkedIn is all professional contact information, and the people that want to leverage that are likely providing this bot.”
It’s hard to identify the actual perpetrators because it’s easier than ever to hide who you are as a bot maker. Essaid said, “Across the bot landscape, with every bot attack, 73% originated from more than one IP address, a quarter of them come from over 100 different IP addresses.” The bots are becoming so sophisticated that the perpetrators are mostly undetectable.
In the LinkedIn case, said Essaid, “They may even know which companies are using their own data, which is why they named John Doe as the assailant so that they can go further exploring and get the VPNs and hosting providers to turn over the data.”
A greater challenge in the search for justice in bot attacks is that web scrapers evade detection behind several layers of obfuscation. “The layers create a lot of steps, and that’s the challenge that LinkedIn is trying to verify right now,” said Essaid.
These thieves use a methodology similar to that of other criminals who steal sensitive data for monetary gain, but Essaid said, “They are skating on the backs of others. Many of the perpetrators are legitimate, venture backed companies with lots of money that are doing this. They are trying to build a business on the backs of others.”
From the criminal’s perspective, why wouldn’t they take advantage of these technologies when there are scant legal consequences? “It’s rare that we’ve seen criminal prosecution. The case with AT&T where the criminals accessed customer information that they accidentally made available online, the perpetrators got two years jail time, but usually it’s monetary repercussions if they are ever even caught,” Essaid said.
So the attackers might have to pay damages for the data and the use of those resources, but so far it’s been a risk worth taking. “For the scale of LinkedIn, if they can figure out who did it, there can be a big price tag associated with it,” Essaid said.
But there is a cost impact for the victim as well. While there is little chance that LinkedIn will go after everybody that’s ever written a bot against them, they are investing resources to go after the worst of the worst. “They will go after the ones that are directly harming their business. Who are the biggest scarecrows? They are going to go after the biggest people as a warning flag,” Essaid said.
Social networks have gone after scrapers legally in the past, but legal action is usually the last resort. “There are strict security and technological barriers, and I’d ask if LinkedIn had done enough on their technological side. For you to prosecute, you need to show they went against your regulations but also that they were able to circumvent your barriers,” Essaid said.
While the legal battle goes on, the scraped data is out there being used without the permission of those who entrusted it to LinkedIn. I guess it’s all fair in this digital age of big data.