Senior Staff Writer

18 million targeted voter records exposed by database error

Jan 04, 2016 | 10 mins
Cloud Security, Data and Information Security, Data Breach

There were 56 million voters in the database, and more than 18 million of them were further singled out with targeted profile data


Last week, Salted Hash reported on the massive voter records leak after a database was exposed due to a configuration error. The story generated a good deal of public attention, even though the database held records that some dismissed as inconsequential.

Interestingly enough, those dismissing the database because it held public records were seemingly ignoring the privacy and security risks associated with such a collection (not to mention various state laws protecting such data and controlling how it is accessed and shared).

Around the same time the first database was discovered, researcher Chris Vickery also found a second, smaller database. This second database contains voter profiles similar to those previously discovered, but it also includes records that hold targeted demographic information.

While the overall total of records is lower (56,722,986 compared to 191 million), it’s still a concerning figure. The discovery took a more serious turn once it became clear that more than 18 million of those records contained targeted profile information.

This second database has voter information from states whose names begin with the letters A through I, excluding Illinois and Iowa. The scattered information suggests the data was being added in stages, and that the exposed database wasn’t intended for public disclosure.

What’s in the database?

The second database contains the general voter profile, which includes a voter’s name, address, phone number, date of birth, voting record, etc. In fact, comparing records from both databases confirmed they are essentially the same, but the dates on the second database are newer (April 2015) and some of the field names are different – suggesting the core data came from the same source file.

This source file has been previously identified by political experts as Nation Builder Election Center data. This is further supported by the existence of an nbec_precinct_code and a voter ID code consisting of 32 letters and numbers separated by dashes.
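
For readers who want to check a data set for the same markers, here is a minimal sketch in Python. It assumes the voter ID follows the common 8-4-4-4-12 hexadecimal (UUID-style) layout; the article only establishes that the code has 32 letters and numbers separated by dashes, so the exact grouping is an assumption. The function names are hypothetical and used only for illustration.

```python
import re

# Assumption: the ID uses the canonical UUID layout (8-4-4-4-12 hex characters).
# The source only says "32 letters and numbers separated by dashes".
UUID_LIKE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def looks_like_voter_id(value: str) -> bool:
    """Return True if the value matches the assumed 32-character dashed layout."""
    return UUID_LIKE.fullmatch(value.strip()) is not None


def has_source_marker(column_names) -> bool:
    """Return True if the column list includes the nbec_precinct_code field."""
    return "nbec_precinct_code" in {name.lower() for name in column_names}
```

A match on both checks would only hint at a shared source file; it would not say who compiled or hosted the data.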

As mentioned in the first story, Nation Builder is under no obligation to identify customers, and once the data has been obtained, they cannot control what happens to it. 

While the previously discovered voter database contained more records, this second database, though smaller, contains more information. The standout issue is that these additional data points are targeted towards building an issues-based profile of the voter. While that might be fine for any number of election campaigns, having this data exposed to the public is a goldmine for criminals.

The second database contains several fields for custom text. Depending on the record, some of them have answers, while others do not. There are also fields that flag the profile as being copied from another data source, and those that determine if the voter has been contacted. In addition, there are fields for determining if the voter is active and if they’re a donor.

Other fields include email address, something that wasn’t part of the larger voter database covered last week, as well as records focused on health issues, gun ownership, household values (e.g., religion / social issues), fishing and hunting interests, auto racing interests, longitude and latitude of the voter, income level, and occupation.

When it comes to overlap and additions to the basic voter file, the additional fields in this second database look at gender identification, political party affiliation, political contributions, religious affiliation and if they’re a religious donor, a field denoting bible lifestyle, as well as how many robocall (auto dialed) campaigns they’ve been part of.

Who owns this database?

As was the case with the previous voter database, no one wants to claim ownership directly. However, there were several threads to follow in this case. Dissent, the admin of and researcher Chris Vickery discovered several possible connections, but at the time this story was written, no one would go on record to discuss the database.

Political experts, who spoke to Salted Hash on the condition that their name or organization not be used, said the basic details in the second database suggest it’s a voter file extract, one that’s being used specifically for fundraising efforts.

“It’s an oddly narrow selection of consumer variables, merged-in with voter file variables. The selection of variables tells me its likely donor profiling for a GOP effort (bible, farming, hunting, etc…),” the expert said.

Vickery contacted the FBI to report the exposed database. The agency later told him there was nothing it could do. Along with Salted Hash, Dissent contacted other agencies, including law enforcement and officials in California.

The second database contained a number of interesting markers that could be used for identification. The data collection points referring to religion and values-based issues suggested a GOP-focused organization created and maintained this database.

Two data fields (pioneer_status and pioneer_counter), two database users (Pioneer, Pioneer2), and a reference to Pioneer in the database schema lent further credibility to this assumption. URLs in the database itself referenced Pioneer Solutions Inc. and Let’s Vote America.

Let’s Vote America is actually a United in Purpose campaign, one “whose mission is to unite and equip like-minded conservative organizations to increase their reach, impact, and influence through the latest technology, research and marketing strategies for the purpose of bringing about a culture change in America based on Judeo-Christian principles.”

United in Purpose is a 501(c)(4), or social welfare organization. As such, it can engage in political action, but it is limited to spending no more than 50 percent of its money on political activities.

Bill Dallas, an ex-con and Tea Party activist, started United in Purpose in order to identify unregistered Christian voters. According to NPR, United in Purpose had compiled data on 120 million voters in 2012, largely by developing tools that allowed pastors to compare church membership with official voter registration files.

From NPR:

“The company buys lists to build a profile of each citizen, and then assigns points for certain characteristics. You get points if you’re on an anti-abortion list or a traditional marriage list. You get a point if you regularly attend church or home-school your kids. You get points if you like NASCAR or fishing.”

The data points referenced by NPR are the same basic data points included in the second database discovered by Vickery. In addition, the database contains fields for voter score, which Dallas told NPR determines how serious a person was about their faith.

In addition to the Nation Builder markers in the general voter file, other fields referenced a Pioneer survey as a data source, which is why Pioneer Solutions was considered a possible owner of this database.

But it still wasn’t clear if United in Purpose or Pioneer Solutions developed this database and placed it online. When Vickery registered for an account at Pioneer Solutions, the welcome email came from United in Purpose.

It’s also possible that neither of them is responsible. The owner could be one of the many organizations that have partnered with United in Purpose, such as Americans United for Life, Bound4Life, Concerned Women for America, the Family Policy Institute of Washington, the Liberty Institute, or iVoteValues.

United in Purpose’s full partner list includes dozens of organizations.

Dissent contacted Pioneer Solutions for comment, warning them about the second database being disclosed and the information it contained. Less than a day later, the CEO and Founder of Digital Smart Technologies, Tamas Cser, emailed to say:

“This is definitely data that we provide however we do not have the content keywords table that you mention. We work with many organization so we should talk to see where the vulnerability is, and plug it asap.”

After Dissent contacted Cser in order to share the IP address of the second database, he promised a return call to answer questions.

Days later, and after his request to withhold publication while he investigated the issue had been granted, Cser had still not answered follow-up questions or returned calls, aside from one brief conversation in which he told Dissent that “they” were investigating, without stating who “they” were.

What’s interesting is that shortly after Cser was contacted, the second database was secured and no one was claiming credit.

So while it’s good news that the data has been taken offline, plenty of questions remain. Was this a United in Purpose breach, or a Pioneer Solutions breach? If neither, does the blame belong to one of the many partner organizations? How long was the data exposed? Who accessed it?

Unfortunately, we may never see answers to those questions.

Why does the discovery of such a database even matter?

Big data can be used to solve problems or increase sales, but it’s also a pain to manage and secure. In other words, big data is both a blessing and a curse.

The more data that’s collected, the greater the effort needed to properly store and protect it. Once this database became exposed to the public, it turned from a voter-tracking tool into a massive repository of public and personal information.

True, voter data is public record for the most part, but each state has laws that govern how it is obtained, how it can be used, and how it can be shared. When you add additional data points, such as those discovered within the second database, you’re no longer talking about pure public record.

Speaking to Salted Hash, Khalil Sehnaoui, Information Security specialist and founder of Krypton Security, singled out Phishing as the primary threat represented by this data leak.

When it comes to Phishing, he said, the more information you have on a target, the better.

“This database is a Phishing crew’s dream come true, because there’s so much information here you can come up with a million ways to get the victim to do what you want,” Sehnaoui said.

Most of the additional details in the leaked data would allow an attacker to find people on social media (Facebook, Twitter, LinkedIn, etc…) and other places online.

“Once you have the victim’s social media account (or accounts), then it is easy to find people they associate with through intelligence gathering. Even if one social media account is locked through good privacy practices (which is practically never the case), there is always other social media one can gather information from. So once you find one or many people they associate with (whether its work related or personal) you can target these people too or use them as an identity spoof when phishing.”

“This data could also be used by large organizations to target people of interest (that never asked to be targeted) for let’s say political contributions (if they are known to contribute), same with religious organization, etc. Or to target [people] with negative campaigns as well,” Sehnaoui added.

Over the years, the public has become more aware of various Phishing and Social Engineering scams and crimes, particularly those that target financial data. However, while an email or phone call claiming to be from your bank might be ignored, one that addresses your core beliefs or political leanings might have a higher chance of success.

This is especially true if your leaked records state that you’re willing to receive communications related to those items.

The upside to this story is that the database has been secured. But like before, no one wants to claim ownership of the data. This story will be updated if additional details become available.