Data exchanges know all about you; soon they’ll impact cybersecurity

Nov 05, 2018 | 8 mins
Data and Information Security, Data Privacy, Privacy

Our digital lives, physical locations and credit card usage are traded on exchanges. If there’s such a thing as “surveillance capitalism,” data exchanges are the closest thing to three-letter intelligence agencies.

Many consumers have noticed online ads that show a surprising knowledge of the devices they own. These ads display an awareness of purchases, travel and communications made on sites that are theoretically separate from the ad delivery platform. They even reflect purchases at brick-and-mortar stores. This is because information about our digital lives, our physical locations and our credit card usage is now traded and rented on data exchanges. If there is such a thing as “surveillance capitalism,” data exchanges could become the closest thing to three-letter intelligence agencies.

Before taking a peek behind the curtain, let’s walk through how a user is tracked across the web. Typically, the websites a user visits know his or her IP address. Yet an IP address narrows the user down only to an ISP, a network, or the geographic precision of a town. At this stage, customers are placed only in a vague demographic.

Depending on its settings, a user’s device may volunteer his or her geolocation down to a 10-foot radius. At this level of granularity, it is known which building the device is in now and where it has been in the past. This provides a detailed profile of a specific consumer of information and products.

When apps are installed, users are often quick to accept terms of service. These terms may allow the makers of those apps to eavesdrop using the microphone of the device the apps are installed on. For instance, your child’s game apps may be working with a company like Alphonso. Per the New York Times, Alphonso listens on the microphone to track which TV shows are watched, even when the app is not in use. Ever seen a product advertisement magically pop up after only a verbal discussion? No authoritative studies have verified this capability, yet amateur studies inducing ads through verbal cues can be found on the internet. Sound paranoid? Read on.

How to surveil across websites and multiple devices

It used to be that a website could see only its own cookies, but today ad delivery scripts use third-party cookies, which allow ad platforms to see activity spanning numerous sites. Tracking pixels are likewise designed to record the URL of each new page on which they render. Yet these approaches are still tied to a single device and to navigation within its browser.
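
To make the third-party cookie mechanism concrete, here is a minimal sketch in Python. It is a toy simulation, not any real ad platform's code: the AdPlatform class, the "ad_id" cookie name and the example URLs are all hypothetical. The point it illustrates is that one script embedded on many unrelated sites sees the same cookie each time, so a single browser's visits end up in a single history.

```python
import uuid


class AdPlatform:
    """Toy ad server whose script is embedded on many unrelated sites (hypothetical)."""

    def __init__(self):
        # Maps a cookie ID to the list of page URLs where the ad script rendered.
        self.history = {}

    def serve_ad(self, browser_cookies, page_url):
        """Handle one ad request.

        browser_cookies stands in for the browser's cookie jar for the
        ad platform's own domain, which is sent along with every request
        to that domain regardless of which site embeds the script.
        """
        cookie_id = browser_cookies.get("ad_id")
        if cookie_id is None:
            # First sighting of this browser: assign a random identifier.
            cookie_id = uuid.uuid4().hex
            browser_cookies["ad_id"] = cookie_id
        self.history.setdefault(cookie_id, []).append(page_url)
        return cookie_id


platform = AdPlatform()
jar = {}  # one browser's cookies for the ad platform's domain
pages = [
    "https://news.example/story",
    "https://shop.example/shoes",
    "https://travel.example/flights",
]
for url in pages:
    platform.serve_ad(jar, url)

# All three visits are filed under the same cookie ID.
visited = platform.history[jar["ad_id"]]
print(visited)
```

Note that the history is keyed to one cookie jar, which matches the limitation described above: a second browser or device would receive a different ID.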

So far in this account of surveillance tradecraft, one line has not been crossed: trackers may know about activity on and around your device, but the device has not been tied to an identity. Yet the “data industrial complex” had to cross this line to track buyer behavior across devices.

User information is gathered under a device ID, which comes in two types: probabilistic and deterministic. Probabilistic IDs are based on a variety of inferred metadata, crunched by algorithms to associate a device with a user. Some claim they’re 70-90% accurate. Deterministic IDs are matched more directly to a virtual identity: usernames, emails and phone numbers. Add in a credit card, and verification of physical identity is airtight. They know which devices belong to you.
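
The difference between the two ID types can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular vendor does it: the signal names, the point weights and the function names are invented for the example. Deterministic matching compares an exact identifier such as a normalized, hashed login email, while probabilistic matching adds up evidence from signals two devices happen to share.

```python
import hashlib


def deterministic_key(email):
    """Normalize and hash a login email; identical logins yield identical keys."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def probabilistic_score(dev_a, dev_b, weights):
    """Sum the points for every signal the two devices share.

    A higher score means the devices are more likely to belong to the same user.
    """
    return sum(
        points
        for signal, points in weights.items()
        if dev_a.get(signal) is not None and dev_a.get(signal) == dev_b.get(signal)
    )


# Hypothetical signal weights (points out of 100), chosen for illustration only.
WEIGHTS = {"home_ip": 50, "evening_wifi": 30, "timezone": 10, "language": 10}

phone = {"home_ip": "203.0.113.7", "evening_wifi": "net-42", "timezone": "UTC-5", "language": "en"}
laptop = {"home_ip": "203.0.113.7", "evening_wifi": "net-42", "timezone": "UTC-5", "language": "en"}
stranger = {"home_ip": "198.51.100.9", "evening_wifi": "net-77", "timezone": "UTC-5", "language": "en"}

# Deterministic: the same login seen on two devices links them outright.
same_login = deterministic_key("Pat@Example.com ") == deterministic_key("pat@example.com")

# Probabilistic: shared metadata only raises or lowers a likelihood score.
score_match = probabilistic_score(phone, laptop, WEIGHTS)
score_stranger = probabilistic_score(phone, stranger, WEIGHTS)
print(same_login, score_match, score_stranger)
```

A real system would presumably turn such scores into a link only above some threshold, which is where accuracy claims like the 70-90% figure come in.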

Introducing data exchanges

Data exchanges allow companies to trade or rent user data on a platform that earns commissions on these trades. Many businesses participate, either to monetize their visitors by selling surveilled data or to purchase data that helps them reach buyers and build targeted customer experiences. Even credit card companies and brick-and-mortar stores are in on the collaboration.

In a nutshell, they plug into the platform and say, “I know these things about people with this IP, or the person using Device ID xyz. Who can tell me more about them?” There is a caveat: while highly personal data flows into the exchange and is tightly attributed to a person, this attribution is hidden behind a “walled garden.” On the other side of the wall, advertisers pay to reach highly precise groups of users.
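
The “walled garden” arrangement can be pictured with a small Python sketch. Everything here is hypothetical: the DataExchange class, its method names and the sample attributes are invented to illustrate the idea, and real exchanges are far more elaborate. Contributors push attributes keyed by device ID, the exchange merges them into one profile, and advertisers can only ask how many devices match a set of criteria or have an ad delivered to them.

```python
from collections import defaultdict


class DataExchange:
    """Toy exchange: profiles stay inside; advertisers see only audience sizes."""

    def __init__(self):
        # Private store: device ID -> merged attributes from all contributors.
        self._profiles = defaultdict(dict)

    def contribute(self, device_id, attributes):
        """A participant adds what it knows about a device."""
        self._profiles[device_id].update(attributes)

    def _match(self, criteria):
        # Internal only: the list of matching device IDs never leaves the exchange.
        return [
            device_id
            for device_id, attrs in self._profiles.items()
            if all(attrs.get(key) == value for key, value in criteria.items())
        ]

    def audience_size(self, **criteria):
        """What an advertiser sees: a count, not identities."""
        return len(self._match(criteria))

    def deliver_ad(self, ad_text, **criteria):
        """Send an ad to every matching device and report only how many were reached."""
        targets = self._match(criteria)
        # Delivery itself is out of scope for this sketch.
        return len(targets)


exchange = DataExchange()
# Two unrelated participants describe the same device.
exchange.contribute("device-xyz", {"city": "Austin"})
exchange.contribute("device-xyz", {"bought": "running shoes"})
exchange.contribute("device-abc", {"city": "Austin", "bought": "headphones"})

size = exchange.audience_size(city="Austin", bought="running shoes")
reached = exchange.deliver_ad("Marathon sign-ups open", city="Austin")
print(size, reached)
```

The design choice worth noticing is that the merged profile is more revealing than anything either contributor held alone, even though the advertiser-facing methods return nothing but numbers.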

There are many upstarts competing to build exchanges, and, unfortunately, they likely have your data. Big players like Factual, DataMarket and Microsoft Azure Data Marketplace have built large, mature platforms. Per Google’s blog, Google Attribution is working to “measure consumer journeys that now span multiple devices and channels across both digital and physical worlds.” From 2014 to 2017 Google tracked 5 billion physical store visits, and in 2017 it claimed to be capturing “approximately 70% of all credit and debit card transactions.”

Even the U.S. government is proposing personal data exchanges for banks, phone companies and the Post Office. They may even begin selling access to these citizen databases. Your service provider is probably in on it too. For example, Verizon has a program called Precision Market Insights.

What our dossier says about us

We don’t know which companies are sharing our data, or what information is being exchanged, unless we can see behind the paywall and into these platforms. Even if we assume bad faith on the part of Google, it is still in Google’s best interest to keep user information to itself. Google’s ad platform, after all, is sold on having secret knowledge of users that sellers lack.

Yet, surveilled data is now being traded subject to supply and demand. The value of the data is determined by whether advertisers wish to target us based on this information. Considering all that we do online, the physical places we go, our personal communications and purchases, this lays bare our entire lives.

Ethical considerations

At a recent privacy symposium in Brussels, Apple CEO Tim Cook put it best: “Rogue actors and even governments have taken advantage of user trust to deepen divisions, incite violence, and even undermine our shared sense of what is true and what is false. This crisis is real. It is not imagined, or exaggerated, or crazy.”

As a longtime privacy advocate, I find it startling that most citizens don’t care about how their data is being collected and sold. With the full consent of the masses, driven by bribes of free technology, look what has been built. The way forward is to talk about these things and ensure users know what it costs to surrender their data.

We are heading towards a plutocracy where we’re prejudged, rated, herded and easily manipulated for only our value as workers, consumers and voters. China’s social credit system is a frightening example: a ranking system that dishes out rewards and punishments based on things like “bad driving, smoking in non-smoking zones, buying too many video games and posting fake news online.”

While Americans typically push back on authorities more than do citizens in China, does anyone think our private digital lives couldn’t be used to deny us employment or rights in the future?

Impact of data exchanges on cybersecurity

While it might be difficult for cyber practitioners to leverage data exchanges for their work, well-funded threat actors could certainly have access. Could information on a data exchange be rented by a state-backed front group? Nobody knows. But the applications of this dataset to social engineering and foreign information warfare are certainly frightening.

Blackmail is often a prelude to insider threats who leak data or assist breaches from the inside. In the clandestine world of cybersecurity, one could easily advertise to tightly targeted groups of IT, InfoSec and executive insiders. For example, target them based on drug dependencies or extramarital affairs. Get this vulnerable demographic to click on an ad, and one has an opportunity to harvest an identity to blackmail. Of course, even easier is to hack one of the data exchanges. Now you have dirt on everybody.

Besides being frightened by data exchanges, the cyber industry should marvel at their accomplishment. Individual security vendors have successfully built threat intelligence leveraging their customer install base and an ecosystem of researchers and partners. Yet broad collaborations of threat intel sharing have never taken off to the level of data exchanges.

Open Source Intelligence (OSINT) is often criticized for low quality and stale data. There are success stories only where the data sharing is monetized. Perhaps the best example of collaboration is FS-ISAC. The security industry should learn from data exchanges; their model involves data being traded or rented to other vendors, with these platforms taking a 20-30% commission.

Three predictions for the future

  1. It’s unlikely that internet citizens will one day wake up and organize to oppose these encroachments on their privacy. Trading personal details for free technology is too entrenched in modern American culture to do much about. At the same time, big business will drive government regulations to limit the threat from foreign information warfare and election meddling. Thus, these protections will be built in the interests of business, and not necessarily users.
  2. A new field will arise to complement digital forensics. This field will deal in evidence distributed across the web. It will work on cyber attribution, missing persons cases, homicides, data leakage, and insider threats. This new field would likely leverage data exchanges.
  3. A new threat intelligence giant will emerge by copying the data trading and rental model of data exchanges. The lack of incentive for OSINT sharing can be overcome by monetizing the swapping or renting of threat data.

The last word: The potential impact data exchanges could have on cybersecurity shouldn’t be underestimated.


Prior to becoming an independent analyst, Paul Shomo was one of the engineering and product leaders behind the forensics software EnCase. In addition to his work in the digital forensics and incident response (DFIR) space, he developed code for OSes that still power many of today’s IoT devices. He is the co-editor of an upcoming special issue of the Journal of the Association for Computing Machinery (ACM).

The opinions expressed in this blog are those of Paul Shomo and do not necessarily represent those of IDG Communications, Inc., its parent, subsidiary or affiliated companies.