GLitch

What is GLitch?

GLitch is one part of our series of Rowhammer attacks. We started by breaking the EDGE browser and the cloud. Then we moved towards Android devices showing how to root them with bit flips. This time we wanted to show that also mobile phones can be attacked remotely via the browser.
Meet GLitch: the first instance of a remote Rowhammer exploit on ARM Android devices. This makes it possible for an attacker who controls a malicious website to get remote code execution on a smartphone without relying on any software bug.
You want to know what makes this attack even cooler? It is carried out by the GPU. This is the first GPU-accelerated Rowhammer attack.

Wut? 🤔 How is it possible to trigger bit flips from the browser through the GPU?
The answer to this question is WebGL. WebGL is a graphic API that was designed with the purpose of providing developers with GPU acceleration for their graphics intensive applications. Unfortunately, as a byproduct of this API a new attack vector is introduced: the Grand Pwning Unit.

How does this sorcery work??
GLitch exploits a series of microarchitectural flaws of the system in order to leak and corrupt data.  The attack can be divided in two stages:

  1. In the first stage of the attack we take advantage of a timing side channel to gain a better understanding of the (physical) memory layout of the system.
  2. In the second stage we use the information extracted from the previous part to carry out a more reliable Rowhammer attack against the browser – in our case Firefox. For more details about the exploitation go down.

Ok! I got the gist of it… But then why from the GPU?
The reason is that Rowhammer requires uncached memory access. That is, it needs to bypass the processor caches to reach  DRAM.  While natively the attacker has more power and he is allowed to directly bypass the caches, this doesn’t apply to JavaScript. From JS this can be achieved only by means of cache evictions. And this technique was proven unfeasible on Android (ARM) platforms (check Drammer).  Therefore we needed a different attack vector. And here is where the GPU came into play. GPU caches were nicer and had deterministic behavior making it easier for us to build low-noise side channels and remote Rowhammer attacks.

FAQs

Sooo… Am I vulnerable to this exploit?
This question doesn’t have a yes or no answer. It depends on multiple factors. First of all your phone needs to be vulnerable to Rowhammer — very vulnerable since we need to implement eviction-based Rowhammer. If you’re unlucky and your DRAM cells are worse than a colander, then the other variable you need to take into account is the GPU architecture. Our PoC heavily relies on the insights we recovered by reverse engineering the architecture of our target systems: the Snapdragon 800 and 801. This means that our PoC works only on phones  such as the LG Nexus 5, HTC One M8 or LG G2.

Wait… You are making all this fuzz for phones that are 4 years old!?!? C’mon guys…
Well… Take it easy! First off we’re a university! We are not swimming in gold – feel free to donate your (old ) phone btw 🙃. Second, we implemented the attack on these phones cause we knew they were vulnerable to Rowhammer and we had multiple samples in the office to test against. You know… we still need to make scientific research somehow.
tl;dr  This doesn’t mean you’re not vulnerable. Different GPU architectures require different implementations of the attack which imply more reverse engineering effort. So we cannot tell you if your phone is vulnerable. We suspect it could be possible to port our attack on different architectures. But while on some of them may work even better on other may not work at all.

Ok… But I don’t use Firefox on Android (who does that anyway?) So no biggie!
Well some people actually use it (it has Adblock 😉). But this is a bit off topic…
Anyway, I have bad news again. The exploit serves only the purpose of a proof of concept. We wanted to show how we can get control over a phone remotely. But it has nothing to do with the core of the project:  microarchitectural attacks. So if you’re wondering if we can trigger bit flips on Chrome the answer is yes, we can. As a matter of fact most of our research was carried out on Chrome. We then switched to Firefox for the exploit just because we had prior knowledge of the platform and found more documentation (see credits).

If you’re interested in more details about the exploit or other techincal details we encourage you to read the paper or go down to the  technical walkthrough .

Demo

This demo shows GLitch in action on a LG Nexus 5 running Android 6.0.1. The video report an attack against Firefox 57. We extract the base address of libxul.so from procfs and we show how GLitch provides us with an arbitrary read/write primitive to leak such data.  The recording is taken from the Firefox Web Console — screen recording and GPU bit flips don’t get along that well   ¯\_(ツ)_/¯

Papers

[1] P. Frigo, C. Giuffrida, H. Bos, K. Razavi, Grand Pwning Unit: Accelerating Microarchitectural Attacks with the GPU, in: S&P, 2018.

Reception

We followed responsible disclosure and contacted the interested parties who acknowledge the issue. The vulnerability eventually got assigned CVE-2018-10229. The Dutch NCSC helped us throughout this (lengthy) process — thanks guys 🙃.

Current mitigations on the browser are tackling the timing channel introduced by the GPU.
Both Chrome and Firefox disabled the EXT_DISJOINT_TIMER_QUERY WebGL extension on the latest version of the browsers and redesigned (or are planning to in case of Firefox) the WebGLSync objects to avoid high precision timing as required by new WebGL specifications.

As of now there is no proposed mitigation to block the GPU-accelerated bit flips. Nonetheless, we’ve been directly discussing with Google (who has been very open during the disclosure process) possible options to solve the issue.

 

Technical details:

Here you can read more about the technicalities of the project. We first analyse the GPU and we then move forward to the exploit.

The Grand Pwning Unit

If you’re wondering how is it possible to program the GPU to trigger bit flips then you should read this.

The GPU  is a processor used to accelerate graphics rendering. The rendering pipeline consists of 2 main stages: geometry and rasterization. These stages are carried out by developer-provided programs called shaders. The vertex shader performs geometrical operations on vertices and the fragment shader fill the color of the pixels. These shaders are provided to the GPU at runtime. And this can be done also from JavaScript thanks to WebGL.

Let’s have a look at how the GPU aids the rendering pipeline. The CPU provides the GPU with vertices as inputs (Step 0). Then the GPU runs the vertex shader on every vertex (Step 1) producing polygons as outputs (Step 2). These polygons are composed of different fragments (≊pixels). Each of these fragments then gets  modified by the fragment shader (Step 3). The fragments usually get filled with colours extracted from textures . This process is known as texture sampling. [SPOILER]  textures are big and need to be stored in DRAM. The final step is to expose the outcome to the Framebuffer (Step 4).

Now we want to build a bug-free exploit. This means that we will rely simply on microarchitectural properties of the system. For this reason we need to share resources with the CPU who usually runs the sensitive “stuff”. As you may guess we can use texture sampling to gain access to DRAM, which is shared with the rest of the system. Even though we previously complained about the CPU caches also the GPU has caches. However, due to their deterministic behaviour they don’t pose much of a threat to our success and we are able to systematically bypass them. This means that through texture sampling we can access DRAM and can carry out the 2 stages of our attack.

The GLitch exploit

If you’re interested in the details of the exploit then you’re in the right place. Here we dive into the technicalities of the attack which, even if quite Firefox specific, are really interesting to understand how to compromise a device by relying on Rowhammer.
In this walkthrough however we will only describe the actual browser exploitation. We won’t dive into the details of the Flip Feng Shui technique used to get exploitable bit flips. But we DO explain which bits are actually exploitable. So let’s start from there. Let’s have a look to our main Rowhammer primitive: type flipping. 

Type flipping
S Exponent    Fraction
0 01111111111 01110000000000000000000000000000000000000000000000002 +20·(1 + 0.875)
1 11111111111 10001100010011000000011100000000000000000001111100002 NaN

The IEEE-754 specification stores double precision floats in 64 bits by using the exponential notation. That is, (-1)sign*(1.b{51}b{50}b{0})2*2(exp-1023)  where the MSB is the sign bit, the next 11 bits are the exponent and the final 52 bits are the mantissa as we show in the example above. The key concept behind type flipping is that the IEEE-754 specification treats any double having all the 11 exponent bits set to 1 (and the mantissa different from 0) as a peculiar value known as Not-a-Number (NaN). This means that all the 52 mantissa bits are completely useless for any mathematical computation when the exponent is set to all 1s. As a consequence, multiple JavaScript engines, among which SpiderMonkey (i.e., Firefox JS engine), have been using NaN in order to encode other values such as pointers in order to not waste these 252-1 unused values. SpiderMonkey uses two different encodings for this purpose depending on the architecture of the system: NuN-boxing for 32-bit systems and PuN-boxing for 64 bits. Since our attack targets 32-bit platforms let’s have a look at the NuN-boxing.

JSVAL_TAG_CLEAR  = 0xFFFFFF80,
JSVAL_TAG_INT32  = JSVAL_TAG_CLEAR | JSVAL_TYPE_INT32, // 0x01
JSVAL_TAG_STRING = JSVAL_TAG_CLEAR | JSVAL_TYPE_STRING, // 0x06
JSVAL_TAG_OBJECT = JSVAL_TAG_CLEAR | JSVAL_TYPE_OBJECT // 0x0c

1 11111111111 11111111111110000000000000000000000000000000000000002 //0xFFFFFF80

NuN-boxing uses the first 32 bits (one word) as a tag value to identify the type of the variable. Every tag smaller than JSVAL_TAG_CLEAR identifies a IEEE-754 double of 64 bits. Bigger tag values instead are used to identify object references. In this case the first word will contain details of the variable type, as represented in the snippet of /js/public/Value.h above . While the second word will contain the actual pointer.
Type flipping relies on the fact that any 1-to-0 bit flip in the first 25 bits of an IEEE-754 double can transform a pointer into a double while any 0-to-1 bit flip in the exponent bits can craft an arbitrary pointer. By exploiting this powerful property then we are able to gain two extremely powerful primitives, namely the ability to leak any pointer (arbitrary leak) and the ability to craft any pointer of our choosing (arbitrary craft).  For our attack, we will use both of them. This means that we actually need two bit flips. A 1-to-0 for arbitrary leak and a 0-to-1 for arbitrary craft.
We trigger these bit flips on normal JavaScript arrays’ slots. These in SpiderMonkey are internally known as ArrayObjects  and store data using the NuN-boxing technique. If we spray the memory with ArrayObjects we can simply fill them with marker values and then trigger the bit flips to identify the vulnerable slots. If arr[i] != MARKER after triggering the bit flip we have found our vulnerable slot.

Arbitrary read/write

We consider the attacker to be in possession of the two bit flips at two known locations within the ArrayObjects. Now we are only missing a JavaScript object that can be used to scan the memory. The obvious targets to obtain this primitive are ArrayBuffer objects.
ArrayBuffers allow an attacker to read (raw) binary data with byte granularity from memory. As a consequence we want to craft a fake ArrayBuffer which we can then reference (by using our arbitrary craft primitive).
The SpiderMonkey ArrayBufferObject class is represented as follows:

GCPtrObjectGroup group_;  // JSObject
GCPtrShape shape_;        // ShapedObject 
HeapSlots* slots_;        // NativeObject 
HeapSlots* elements_;     // NativeObject

// Slot offsets from ArrayBufferObject (two words each)
static const uint8_t DATA_SLOT = 0; 
static const uint8_t BYTE_LENGTH_SLOT = 1;
static const uint8_t FIRST_VIEW_SLOT = 2;
static const uint8_t FLAGS_SLOT = 3;

TheDATA_SLOT  field is what controls the memory referenced by the array buffer. Therefore, crafting a fake ArrayBufferObject header that we can then reference is the ultimate goal of our exploit. However, if we want to craft a fake object we are required to know the internal fields of the ArrayBuffer header (i.e., group_, shape_, …).
Therefore, we need to proceed in 3 steps:

  1. We need to leak a pointer in order to break ASLR.
  2. We need an arbitrary read to leak the unknown content.
  3. We neet to craft (and reference) our own fake ArrayBuffer.

So let’s go step by step.

1. Leak ASLR: Well this is pretty trivial. We simply store a reference of the object we want to leak in the 1-to-0 vulnerable slot of the ArrayObject. Then we trigger the bit flip, et voila. You got a used-to-be pointer that you can read as a double with a Float64Array. Now the question is what object do we want to leak. Again the answer is natural: ArrayBuffers. However, a specific type of ArrayBuffer which stores header and data inlined. Modern browsers in order to harden overflows exploitation have started allocating header and body of objects in different heaps so that an overflow doesn’t provide an attacker with control over the next object’s header. However, for performance reasons SpiderMonkey keeps header and data inlined for any ArrayBuffer with size smaller than 96 bytes. If we leak the pointer to an ArrayBuffer‘s header we can therefore derandomize also the location of its data (i.e, buff+sizeof(buff_header) whichis buff+0x30, as you can see from the snippet above). Knowledge of this relative offset makes it easier to craft our fake object.

2. Arbitrary Read: Now that we have derandomized the location of the ArrayBuffer‘s header we want to leak its header’s content. We do this by exploiting another extremely powerful JavaScript class: JSString. JSStrings are immutable, hence the read-only primitive. However, the UTF-16 standard provides us with an almost-arbitrary read. Firefox, defines different types of JSStrings. We exploit a sub class known as JSAtom. However, our only constraint is the non-inlined nature of the string.
Our fake JSAtom has the following structure:

class JSAtom: ... : JSString {
    struct Data {
         uint32_t    flags;     // 0x09 JSAtom
         uint32_t    length;    // sizeof(buff_header) 0x30
         char16_t*   string;    // *buff 
         } d;
}

We store the fake JSString at the beginning of the leaked ArrayBuffer. Which means it is located at address buff+0x30. Now we can craft a fake reference by building a fake double that after a 0-to-1 bit flip will reference an object of type String. This means we craft our double soon-to-be pointer as <0x???FFF86, buff+0x30> where the ??? depict the future 0xFFF. After the bit flip we can reference our fake JSString that points to the ArrayBuffer‘s header leaking its content.

3. Counterfeit ArrayBuffer: Now that we have the content of our ArrayBuffer‘s header we can craft our fake ArrayBuffer. We need to copy the 4 internal fields <group_,shape_,slots_,elements_> which are needed by the JS engine itself. Then we can set the DATA_SLOT field to any memory address mapped by the process and read/write at that location. That is, we have an arbitrary read/write. Now we can use this to gain control over the PC and gain remote code execution.

Credits (where credits is due):

The complete bibliography can be found on the paper but now we want to explicitly thank the authors of some very detailed walkthrough that helped us in developing the exploit.
The first person we want to thank is @argp for his awesome Phrack article which describes in details the internals of the SpiderMonkey JavaScript engine.
And second we want to thank the members of the Phonehex Team (in particular Samuel Groß) for their very thorough exploit walkthroughs.

Finally, on a separate note, we want to thank Rob Clark for his valuable inputs regarding the Adreno GPUs architecture.

 

Share on Facebook48Tweet about this on TwitterShare on Google+0Email this to someonePrint this page