Researchers' tool uncovers website breaches

UCSD researchers' Tripwire tool uncovered website breaches, yet none of the sites disclosed the breaches to customers after being informed. The study is another harsh reminder about the dangers of password reuse.

Researchers from the University of California San Diego (UCSD) designed a prototype system to determine if websites were hacked. They conducted their study and monitored over 2,300 sites from January 2015 to February 2017.

In the end, the system detected that 19 sites, or roughly 1 percent, were compromised, “including what appears to be a plaintext password compromise at an Alexa top-500 site with more than 45 million active users.” None of the sites disclosed the breach to their customers.

Tripwire prototype detects compromised sites

We can’t seem to get away from password reuse and the problems people suffer after reusing the same password. The researchers designed a prototype system, dubbed Tripwire, that detects a site compromise when attackers take a password from the breached website and use it to log in to the email account that was used to register on that site.

First, a bot automatically crawled and registered accounts on about 2,300 sites. Each account was tied to a unique email address but deliberately reused its password: the password used to register on a website was also the password for the corresponding email account.

Then, to make sure any compromise actually stemmed from a hacked website, the researchers set aside 100,000 email accounts at the same email provider that were never used to register on any site. These acted as a control group: a login to one of them would point to a problem at the email provider rather than at a website. In the end, hackers did not access any of those accounts.
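The attribution logic described above can be sketched in a few lines. This is only an illustration, not the researchers' code: the function name, the `Report` structure and the example addresses are all hypothetical, and it assumes the email provider supplies a list of accounts that saw a successful login.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    """Hypothetical summary of where observed email logins point."""
    compromised_sites: set = field(default_factory=set)
    control_hits: set = field(default_factory=set)
    unknown: set = field(default_factory=set)


def attribute_logins(registrations, control_accounts, logged_in_accounts):
    """Attribute each observed email login to a site or to the control group.

    registrations: dict mapping email address -> the one site it was registered on
    control_accounts: set of email addresses never used to register anywhere
    logged_in_accounts: iterable of email addresses that saw a successful login
    """
    report = Report()
    for email in logged_in_accounts:
        if email in registrations:
            # The password was only ever shared with this one site.
            report.compromised_sites.add(registrations[email])
        elif email in control_accounts:
            # A control account was accessed: suspect the email provider.
            report.control_hits.add(email)
        else:
            report.unknown.add(email)
    return report


registrations = {
    "a1@mail.example": "shop.example",
    "a2@mail.example": "forum.example",
}
control_accounts = {"c1@mail.example"}
report = attribute_logins(registrations, control_accounts, ["a1@mail.example"])
print(report.compromised_sites)
```

Because every address is used on exactly one site, a login to that mailbox implicates that site alone, and an empty `control_hits` set is what lets the provider be ruled out.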

Additionally, in an attempt to determine if websites were storing passwords in plaintext or hashing them, the researchers created at least two accounts for each website. One account was secured with a seven-character password that wouldn’t be hard for hackers to guess, and one was secured with a random 10-character string. If both accounts were breached, it would indicate the site was storing passwords in plaintext. If only the account with the simple password was breached, then the site was likely hashing its passwords, since an easy password can be recovered from a hash by guessing while a long random one generally cannot.
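As a rough sketch, the two-account inference amounts to a small decision table. The function name and return strings below are made up for illustration, and the "only the hard password was used" case is not covered by the description above, so it is labeled inconclusive here as an assumption.

```python
def infer_password_storage(easy_accessed: bool, hard_accessed: bool) -> str:
    """Guess how a site stored passwords from which test accounts were accessed.

    easy_accessed: the account with the guessable seven-character password was used
    hard_accessed: the account with the random 10-character password was used
    """
    if easy_accessed and hard_accessed:
        # Even the random password leaked, so it was probably stored as-is.
        return "plaintext"
    if easy_accessed:
        # Only the guessable password was recovered, consistent with cracked hashes.
        return "hashed"
    if hard_accessed:
        # Not described in the article; treated as inconclusive (assumption).
        return "inconclusive"
    return "no compromise detected"


print(infer_password_storage(easy_accessed=True, hard_accessed=False))
```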

Using Tripwire, the researchers determined that 19 sites had been compromised — one was a popular startup with more than 45 million active users. Despite the researchers reaching out to the websites about the breaches, not one disclosed the breach to their customers.

According to the research paper, “Tripwire detected both plaintext and hashed-password breaches and has predominantly discovered breaches that have previously been undisclosed.”

The paper, “Tripwire: Inferring Internet Site Compromise” (pdf), written by Joe DeBlasio, Stefan Savage, Geoffrey M. Voelker and Alex C. Snoeren, was presented at the ACM Internet Measurement Conference, which was held in London in November. They also made the repository for the registration crawler available on GitHub.

Since the sites didn’t ask to be part of the study, the researchers decided not to name-blame-shame them and left any potential breach notification to each site. Of the email accounts that were breached, very few were used to send spam. Mostly, attackers just monitored the mail.

What it came down to was about 1 percent of the sites in the study being hacked. With over a billion sites online, some huge and some small, 1 percent really adds up. One example given: at that rate, 10 of the top 1,000 most-visited sites on the internet are likely to be hacked every year. In fact, DeBlasio, a Ph.D. student at the Jacobs School of Engineering at UCSD, said, “One percent of the really big shops getting owned is terrifying.”
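For anyone who wants to check the numbers, here is the back-of-the-envelope arithmetic. It assumes exactly 2,300 monitored sites (the study says "over 2,300," so the observed rate is approximate) and simply applies the rounded 1 percent figure to a list of 1,000 sites.

```python
sites_monitored = 2300   # assumption: the study reports "over 2,300" sites
sites_compromised = 19

observed_rate = sites_compromised / sites_monitored
print(f"Observed rate: {observed_rate:.2%}")  # Observed rate: 0.83%

rounded_rate = 0.01      # the "1 percent" quoted in the article
top_sites = 1000
expected_compromised = round(rounded_rate * top_sites)
print(expected_compromised)  # 10
```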

Naturally, the researchers urge people not to reuse passwords, to use a password manager, and not to disclose more information than is required.

“The truth of the matter is that your information is going to get out, and you’re not going to know that it got out,” DeBlasio said.

Tripwire registration crawler on GitHub

The repository for the registration crawler is available on GitHub, but DeBlasio said it worked better at registering for accounts on sites at the start of the study than it would now. To detect site compromise, he said you would also need a partnership with a big email provider to get good data. He added that it was probably best for me not to recommend trying to set up the crawler.

It says as much about the crawler source code on GitHub:

While we provide complete source for the crawler, I highly discourage you from actually trying to run it, and you do so at your own risk. If, however, you are interested in the heuristics that our crawler uses, or how the system works, the code is all here!

But really, if you've been tasked with getting this crawler running, turn back all ye who enter here. This code is very old, very fragile, and requires a lot of moving parts to get working well.

Net neutrality

Lastly, I beg of you to take action in the net neutrality fight if you have not done so because unless some miracle happens, tomorrow the FCC will wreck the internet.
