TLBleed is a new side channel attack that has been proven to work on Intel CPU’s with Hyperthreading (generally Simultaneous Multi-threading, or SMT, or HT on Intel) enabled. It relies on concurrent access to the TLB, and it being shared between threads. We find that the L1dtlb and the STLB (L2 TLB) is shared between threads on Intel CPU cores.
This means that co-resident hyperthreads can get a certain amount of information on the data memory accesses of the other HT, without needing any shared cache. Whereas the cache can be partitioned to protect against cache attacks (such as in Cloak using TSX, or using CAT, or using page coloring), this is practically impossible for the TLB, and no such systems have been proposed. Thus in the presence of cache defenses, TLBleed remains a risk in this threat model.
Requirements (a.k.a. threat model)
The threat model is identical to Colin Pervical’s seminal 2005 work, Cache Missing for Fun and Profit, which arguably introduced practical cache side channels. Different from TLBleed, it requires concurrent access to the cache shared between Hyperthreads. TLBleed gets all information through the shared TLB, which can’t be partitioned between processes in software (either application or OS).
Impact & Coverage
As a result of seeing this work, OpenBSD decided to disable Hyperthreading by default. This has prompted some speculation that TLBleed is a spectre-like attack, but that is not the case. OpenBSD also realizes the exact impact of TLBleed. Red Hat has a very thoughtful piece here. There has been significant news coverage: TheRegister (and this one), ArsTechnica, ZDnet, Techrepublic, TechTarget, ITwire, tweakers, and a personal favorite, the SecurityNow Podcast episode 669 (mp3, show notes, youtube).
Overview of results
We briefly tour the results of our paper with some technical details that a technical audience might be interested in. Full technical details can be found in the paper, linked below.
libgcrypt ECC point multiplication
To demonstrate the effectiveness of TLBleed, we side-channel the non-side-channel resistant version of the libgcrypt EdDSA ECC scalar-point multiplication function. In later versions it has been hardened against this attack. We demonstrate that even if cache protections are in place, when this code would be safe, it would still be vulnerable from TLB leakage. The chart below shows the progression of analysis. The spy process acquires the TLB usage signal while the other hyperthread is executing the point-scalar multiplication using a secret key (this would happen e.g. when a signing operation takes place). The raw TLB signal looks noisy, but we are able to apply a machine learning technique to distinguish when a 0 or a 1 secret key bit is being processed.
After we capture the signal, to reconstruct the key sometimes we have to guess some values (sometimes the SVM classifier output makes no sense, sometimes it’s wrong in 1 or 2 bits), so we apply some heuristics and brute force until we guess the right key. We try this on 3 different microarchitectures and reach the following success rates, while analyzing the data from just a single TLB capture. We call it a ‘success’ if we can guess the key after a limited amount of brute force effort (see below).
This is the distribution of brute force effort needed for the 3 different cases.
We also apply our technique to an older implementation of RSA, using square-and-multiply, in libgcrypt, which was written to be side-channel-resistant. (Updates since that time have improved its side channel resistance.) Its side channel resistance stems from having a near-constant control flow. TLBleed however relies on the data access pattern and can see the difference between the multiply result being used (1-bit in exponent) or not (0-bit in exponent). The error rate for a 1024-bit key was much higher than for a 256-bit EdDSA key however, and we could not brute force our way to a full reliable key recovery. We do believe that with some advanced number theoretical techniques (see paper and previous work, e.g. Cachebleed), this number of unknown bits out of 1024 can lead to a reliable key recovery with just a single capture.
Also: defense bypass, covert channel
The paper contains more material: we implement a covert channel over the TLB, and demonstrate bypasses of cache protections. See the paper for full details.
Presentations & Full Paper
The authors would like to thank Yuval Yarom, Colin Percival, and Taylor `Riastradh’ Campbell for early feedback on this paper, and impact consideration.