HTTP compression continues to put encrypted communications at risk

Researchers improve the BREACH attack to extract sensitive data from encrypted HTTPS connections faster

Many Web applications still vulnerable to the BREACH attack.
Credit: bottled_void

Security researchers have expanded and improved a 3-year-old attack that exploits the compression mechanism used to speed up browsing in order to recover sensitive information from encrypted Web traffic.

The attack, known as BREACH, takes advantage of the gzip/DEFLATE algorithm used by many Web servers to reduce latency when responding to HTTP requests. This compression mechanism leaks information about encrypted connections and allows man-in-the-middle attackers to recover authentication cookies and other sensitive information.

The BREACH (Browser Reconnaissance and Exfiltration via Adaptive Compression of Hypertext) attack was first presented at the Black Hat USA security conference in August 2013 by security researchers Angelo Prado, Neal Harris and Yoel Gluck. While it theoretically affects all SSL/TLS ciphers, their version of the attack was most effective against connections encrypted with stream ciphers, such as RC4.

Another team of researchers, Dimitris Karakostas from the National Technical University of Athens and Dionysis Zindros from the University of Athens, have since made improvements to BREACH that make it practical for attacking TLS block ciphers, like AES, that are more commonly used today than RC4.

Karakostas and Zindros presented their BREACH optimizations at the Black Hat Asia security conference last week and also released an open-source framework called Rupture that can be used to launch such compression-related attacks.

Their presentation included two proof-of-concept attacks against Gmail and Facebook Chat to demonstrate that many websites, including some of the most security-conscious ones, are vulnerable.

BREACH requires the attacker to be in a network position that allows the interception of a victim's Web traffic. This can be achieved on an wireless network, by compromising a router, or higher up in the Internet infrastructure by ISPs or intelligence agencies like the NSA.

The attacker will then have to find a vulnerable part of an application that accepts input through an URL parameter and reflects that input somewhere into the encrypted response.

In the case of Gmail, the researchers found that the search function on its mobile site allowed for such input reflection: a search string passed through an URL parameter was included in the response page, for example in a message saying that there were no results for that particular string. Also, if the request was made from an authenticated session, the response also included an authentication token identifying that session.

The way gzip compression works in HTTP is that, if there are multiple instances of the same string in a response, the first instance is kept and the rest will be replaced with short references to the first instance's location. This reduces the size of the response.

Therefore, in the Gmail case, if the user searches for the exact string that matches the authentication token -- or even a portion of it -- there would be two instances of the same sequence of characters in the response. Because of compression, the response would be smaller in size than other responses for a different search string.

With BREACH, the goal of the attacker is to trick the user's browser to send a large number of requests to a vulnerable application -- like the mobile search feature in Gmail -- with the goal of guessing the authentication token. The authentication token would be encrypted in the response, but every time the search string would match a bit of the authentication token, the response observed over the wire would be smaller.

This eventually allows the sequential guessing of every character in the authentication token by constantly modifying the search string in new requests to include the already discovered characters. It is essentially a brute-force attack on every character, with variations in HTTP compression serving as success indicators.

The Rupture framework allows the attacker to inject rogue code into every unencrypted HTTP connection opened by a user's browser. That code is designed to force the browser to make requests to a vulnerable HTTPS application in the background.

Unlike stream ciphers, block ciphers introduce noise into responses because they add dummy bits known as padding to data before encrypting it, so that it can be split into blocks of a specific size. Canceling out this noise and recovering the encrypted data using the BREACH technique requires executing a significantly larger number of requests than would be necessary had the same data been encrypted with a stream cipher.

At first glance this would appear to make the attack less practical. However, Karakostas and Zindros have devised a statistical-based method of bypassing the noise by calculating the mean response length of multiple responses sent for the same tested character. They also made other optimizations and introduced browser parallelization that drastically improve the original attack's speed against TLS connections that use block ciphers.

Three years later after BREACH was announced, RC4 is considered unsafe and most websites use the AES block cipher, the researchers said in their technical paper. "Some services, such as Facebook, also went on to incorporate mechanisms to prevent BREACH. However, the fundamental aspects of BREACH are still not mitigated and popular websites, including Facebook, continue support for vulnerable end-points."

"Our work demonstrates that BREACH can evolve to attack major web applications, confirming the fact that TLS traffic is still practically vulnerable," the researchers concluded.

A proposed Internet standard called first-party or same-site cookies could protect websites against the BREACH attack. If adopted by browsers, this mechanism would prevent cookies from being included in requests sent to a website if those requests were initiated by a different website.

That is, if code on site A instructs the browser to initiate a request to site B, that request will not include the user's authentication cookie for site B, even if the browser does have an active, authenticated session with site B.

This mechanism was primarily intended to protect against cross-site request forgery (CSRF) attacks, but breaks BREACH as well, because the attack relies on a similar method of initiating rogue cross-site requests.

Google Chrome will enable support for same-site cookies in version 51, which will reach stable status in May. However, unless the mechanism is implemented in all browsers there will be little incentive for website owners to start using the new "SameSite" flag for their cookies.

To comment on this article and other CSO content, visit our Facebook page or our Twitter stream.
Insider: Hacking the elections: myths and realities
Notice to our Readers
We're now using social media to take your comments and feedback. Learn more about this here.