The 5 worst big data privacy risks (and how to guard against them)

There are enormous benefits from Big Data analytics, but also massive potential for exposure that could result in anything from embarrassment to outright discrimination. Here's what to look out for, and how to protect yourself and your employees.

Big data, as its proponents have been saying for nearly a decade now, can bring big benefits: advertisements focused on what you actually want to buy, smart cars that can help you avoid collisions or call for an ambulance if you happen to get in one anyway, wearable or implantable devices that can monitor your health and notify your doctor if something is going wrong. 

It can also lead to big privacy problems. By now it is glaringly obvious that when people generate thousands of data points every day — where they go, who they communicate with, what they read and write, what they buy, what they eat, what they watch, how much they exercise, how much they sleep and more — they are vulnerable to exposure in ways unimaginable a generation ago. 

It is just as obvious that such detailed information, in the hands of marketers, financial institutions, employers and government, can affect everything from relationships to getting a job, and from qualifying for a loan to even getting on a plane. While there have been multiple expressions of concern from privacy advocates and government, there has been little action to improve privacy protections in the online, always-connected world.

It was more than five years ago that the Obama administration published a blueprint for what it termed a Consumer Privacy Bill of Rights (CPBR), in February 2012. That document declared that, “the consumer privacy data framework in the U.S. is, in fact, strong … (but it) lacks two elements: A clear statement of basic privacy principles that apply to the commercial world, and a sustained commitment of all stakeholders to address consumer data privacy issues as they arise from advances in technologies and business models.” 

Three years later, in February 2015, that blueprint became proposed legislation by the same name, but it was immediately attacked, both by industry groups, which said it would impose “burdensome” regulations, and by privacy advocates, who said it was riddled with loopholes. It never made it to a vote.

The CPBR declaration that the “consumer privacy data framework in the U.S. is, in fact, strong …” ironically came about a year before revelations by former NSA contractor Edward Snowden that the U.S. government was, in fact, spying on its citizens.

Beyond that, government hasn’t been able to agree on other privacy initiatives. The so-called broadband privacy rules issued by the Federal Communications Commission (FCC) just before the 2016 election, which would have limited data collection by Internet service providers (ISPs), were repealed by Congress in March 2017, before they took effect.

Susan Grant, director of consumer protection and privacy at the Consumer Federation of America (CFA), called it “a terrible setback,” and said it would allow ISPs “to spy on their customers and sell their data without consent.” Others, however, have argued that putting limits on ISPs would still leave other online giants like Google free to collect and sell data, and consumers would see few, if any, benefits.

Given all that, it should be no surprise that experts say privacy risks are even more intense, and the challenges of protecting privacy have become even more complicated. Organizations like the CFA, the Electronic Privacy Information Center (EPIC) and the Center for Democracy and Technology (CDT), along with individual advocates like Rebecca Herold, CEO of The Privacy Professor, have enumerated multiple ways that big data analytics, and the resulting automated decision-making, can invade the personal privacy of individuals. They include:

1. Discrimination

EPIC declared more than three years ago, in comments to the U.S. Office of Science and Technology Policy, that “The use of predictive analytics by the public and private sector … can now be used by the government and companies to make determinations about our ability to fly, to obtain a job, a clearance or a credit card. The use of our associations in predictive analytics to make decisions that have a negative impact on individuals directly inhibits freedom of association.”

Since then, things have gotten worse, privacy advocates say. While discrimination is illegal, automated decision-making makes it more difficult to prove. “Big data algorithms have matured significantly over the past several years, along with the increasing flood of data from the nascent internet of things, and the ability to analyze these data using variants of artificial intelligence,” says Edward McNicholas, global co-leader of the Privacy, Data Security, and Information Law Practice at Sidley Austin LLP. “But despite this technological growth, the legal protections have not advanced materially.”

“I think the discussion around big data has moved beyond mere accusations of discrimination to larger concerns about automated decision-making,” says Joseph Jerome, policy counsel at the CDT, who noted that it has been used, “to direct calls at call service centers, evaluate and fire teachers, and even predict recidivism.” 

Herold has been saying for years that big data analytics can make discrimination essentially “automated,” and therefore more difficult to detect or prove. She says that is true, “in more ways than ever” today. “Big data analytics coupled with internet of things (IoT) data will be — and has already been — able to identify health problems and genetic details of individuals that those individuals didn’t even know themselves,” she says.  

McNicholas believes, “the most significant risk is that it is used to conceal discrimination based on illicit criteria, and to justify the disparate impact of decisions on vulnerable populations.” 

2. An embarrassment of breaches

By now, public awareness of credit card fraud and identity theft is probably at an all-time high, after catastrophic data breaches at retailers like Target and Home Depot, restaurant chains like P.F. Chang’s, online marketplaces like eBay, universities, online services giants like Yahoo, and the federal Office of Personnel Management, whose breach exposed the personal information of 22 million current and former federal employees.

Unfortunately, the risks remain just as high, especially given that billions of IoT devices, in everything from household appliances to cars, remain rampantly insecure, as encryption and security guru Bruce Schneier, CTO at IBM Resilient, frequently observes in his personal blog.

3. Goodbye anonymity

It is increasingly difficult to do much of anything in modern life “without having your identity associated with it,” Herold says. She says even de-identified data does not necessarily remove privacy risks. “The standards used even just a year or two ago are no longer sufficient. Organizations that want to anonymize data to then use it for other purposes are going to find it increasingly difficult. It will soon become almost impossible to effectively anonymize data in a way that the associated individuals cannot be re-identified,” she says.
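One way to see why de-identified data can still point to individuals is to count how many records share the same combination of seemingly harmless attributes. The sketch below is purely illustrative and is not a method described by Herold: the records, the field names and the choice of ZIP code, birth year and sex as "quasi-identifiers" are all hypothetical assumptions. It computes a simple k-anonymity figure, where k is the size of the smallest group of records that look alike.

```python
from collections import Counter

# Hypothetical "de-identified" records: names are gone, but ZIP code,
# birth year and sex (assumed quasi-identifiers) are still present.
records = [
    {"zip": "02138", "birth_year": 1975, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02138", "birth_year": 1975, "sex": "F", "diagnosis": "flu"},
    {"zip": "02139", "birth_year": 1982, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "02140", "birth_year": 1990, "sex": "F", "diagnosis": "cancer"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def group_sizes(rows, keys=QUASI_IDENTIFIERS):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def k_anonymity(rows, keys=QUASI_IDENTIFIERS):
    """Return k, the size of the smallest group; k == 1 means someone is unique."""
    return min(group_sizes(rows, keys).values())


def unique_rows(rows, keys=QUASI_IDENTIFIERS):
    """Return the rows whose quasi-identifier combination appears exactly once."""
    sizes = group_sizes(rows, keys)
    return [row for row in rows if sizes[tuple(row[k] for k in keys)] == 1]


if __name__ == "__main__":
    print("k =", k_anonymity(records))
    for row in unique_rows(records):
        print("unique combination:", [row[k] for k in QUASI_IDENTIFIERS])
```

Any record flagged as unique could, in principle, be matched against an outside list that carries the same attributes alongside a name, which is the re-identification risk described above.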

Besides being vulnerable to breaches, IoT devices are a massive data collection engine of users’ most personal information. “Individuals are paying for smart devices, and the manufacturers can change their privacy terms at a moment's notice,” Jerome says. “It's one thing to tell a user to stop using a web service; it's another to tell them to unplug their smart TV or disconnect their connected car.”

4. Government exemptions

According to EPIC, “Americans are in more government databases than ever,” including that of the FBI, which collects personally identifiable information (PII) including name, any aliases, race, sex, date and place of birth, Social Security number, passport and driver’s license numbers, address, telephone numbers, photographs, fingerprints, financial information like bank accounts, and employment and business information. 

Yet, “incredibly, the agency has exempted itself from Privacy Act (of 1974) requirements that the FBI maintain only, ‘accurate, relevant, timely and complete’ personal records,” along with other safeguards of that information required by the Privacy Act, EPIC says. The NSA also opened a storage facility in Bluffdale, Utah, in 2014 that is reportedly capable of storing 12 zettabytes of data — a single zettabyte is the amount of information it would take 750 billion DVDs to store.  

While there have been assurances, including from former President Obama, that government is “not listening to your phone calls or reading your emails,” that obviously ducks the question of whether government is storing them. 

5. Your data gets brokered

Numerous companies collect and sell consumer data that are used to profile individuals, without much control or limits. There was the famous case of companies beginning to market products to a pregnant woman before she had told others in her family, thanks to automated decision-making. The same can be true of things like sexual orientation or an illness like cancer.

“Since 2014, data brokers have been having a field day in selling all the data they can scoop up from anywhere they can find it on the internet. And there are few (none explicit that I know of) legal protections for involved individuals,” Herold says. “This practice is going to increase, unfettered, until privacy laws restricting such use are enacted. There is also little or no accountability or even guarantees that the information is accurate.”

Where do we go from here?

Those are not the only risks, and there is no way to eliminate them. But there are ways to limit them. One, according to Jerome, is to use big data analytics for good — to expose problems. “In many respects, big data is helping us make better, fairer decisions,” he says, noting that it can be, “a powerful tool to empower users and to fight discrimination. More data can be used to show where something is being done in a discriminatory way. Traditionally, one of the biggest problems in uncovering discrimination is a lack of data,” he says. 
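As a rough illustration of how more data can surface a discriminatory pattern, the sketch below compares the rate of favorable automated decisions between two groups. It is a simplified example, not a tool Jerome describes: the decision lists are hypothetical, and the 0.8 threshold is borrowed from the "four-fifths" rule of thumb used in U.S. employment-selection guidelines, which is a screening heuristic rather than a legal conclusion.

```python
def selection_rate(outcomes):
    """Fraction of favorable (True) decisions in a list of booleans."""
    if not outcomes:
        raise ValueError("outcomes must not be empty")
    return sum(outcomes) / len(outcomes)


def disparate_impact_ratio(protected, reference):
    """Selection rate of the protected group divided by that of the reference group."""
    reference_rate = selection_rate(reference)
    if reference_rate == 0:
        raise ValueError("reference group has no favorable outcomes")
    return selection_rate(protected) / reference_rate


def flags_possible_disparity(protected, reference, threshold=0.8):
    """True when the ratio falls below the threshold (assumed here: four-fifths)."""
    return disparate_impact_ratio(protected, reference) < threshold


# Hypothetical automated loan decisions: True means approved.
reference_group = [True] * 6 + [False] * 4   # 6 of 10 approved
protected_group = [True] * 3 + [False] * 7   # 3 of 10 approved

if __name__ == "__main__":
    ratio = disparate_impact_ratio(protected_group, reference_group)
    print(f"ratio = {ratio:.2f}")
    print("possible disparity:", flags_possible_disparity(protected_group, reference_group))
```

A low ratio does not prove discrimination on its own, but it shows the kind of evidence that only becomes visible when outcome data is collected and examined.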

There is general agreement among advocates that Congress needs to pass a version of the CPBR, which called for consumer rights to include: 

  • Individual control over what personal data companies collect from them and how they use it.
  • Transparency, or easily understandable and accessible information about privacy and security practices.
  • The collection, use and disclosure of personal data to be done in ways that are consistent with the context in which consumers provide the data.
  • Security and responsible handling of personal data.
  • Access to their personal data in usable formats, with the power to correct errors.
  • Reasonable limits on the personal data that companies collect and retain. 

McNicholas says that “transparency” should include an overhaul of “privacy policies,” which are so dense and filled with legalese that almost nobody reads them. “Telling consumers to read privacy policies and exercise opt-out rights seems to be a solution better suited to last century,” he says. “Consumer privacy must shift to consumer-centric, where consumers have real control over their information.”

Jerome agrees. “I certainly don't think we can expect consumers to read privacy policies. That's madness. What we should expect are better and more controls. It's a good thing that users can review and delete their Echo recordings. It's great that Twitter allows users to toggle all sorts of personalization and see who has targeted them,” he says. “But ultimately, if individuals aren't given more options over collection and sharing, we're going to have serious issues about our personal autonomy.” 

Given the contentious atmosphere in Congress, there is little chance of something resembling the CPBR being passed anytime soon. That doesn’t mean consumers are defenseless, however. What can they do?

Jerome says even if users don’t read an entire policy, they should, “still take a moment before clicking ‘OK’ to consider why and with whom they're sharing their information. A recent study suggested that individuals would give up sensitive information about themselves in exchange for homemade cookies.” 

Herold offers several other individual measures to lower your privacy risks: 

  • Quit sharing so much on social media. “If you only have a few people you want to see photos or videos, then send directly to them instead of posting where many can access them,” she says.
  • Don’t provide information to businesses or other organizations that is not necessary for the purposes for which you’re doing business with them. Unless they really need your address and phone number, don’t give them out.
  • Use an anonymizing tool, such as the Hotspot Shield VPN or the Tor (The Onion Router) browser, when visiting sites that might yield information that could cause people to draw inaccurate conclusions about you.
  • Ask others not to share information online about you without your knowledge. “It may feel awkward, but you need to do it,” she says, adding that the hard truth is that consumers need to protect themselves because nobody else will be doing it for them. 

Regarding legislation, she says she has not heard about any other drafts of the CPBR in the works, “and I quite frankly do not expect to see anything in the next four years that will improve consumer privacy; indeed, I expect to see government protections deteriorate. I hope I am wrong,” she says.