Re: [PATCH RFC v2] Randomized slab caches for kmalloc()

From: Gong Ruiqi
Date: Wed May 31 2023 - 04:01:13 EST


Sorry for the late reply. I have been tied up with other in-house kernel
issues recently.

On 2023/05/17 3:34, Kees Cook wrote:
> For new CCs, the start of this thread is here[0].
>
> On Mon, May 08, 2023 at 03:55:07PM +0800, GONG, Ruiqi wrote:
>> When exploiting memory vulnerabilities, "heap spraying" is a common
>> technique targeting those related to dynamic memory allocation (i.e. the
>> "heap"), and it plays an important role in a successful exploitation.
>> Basically, it is to overwrite the memory area of vulnerable object by
>> triggering allocation in other subsystems or modules and therefore
>> getting a reference to the targeted memory location. It's usable on
>> various types of vulnerability, including use-after-free (UAF) and heap
>> out-of-bounds writes.
>
> I heartily agree we need some better approaches to deal with UAF, and
> by extension, heap spraying.

Thanks Kees :) Good to hear that!

>
>> There are (at least) two reasons why the heap can be sprayed: 1) generic
>> slab caches are shared among different subsystems and modules, and
>> 2) dedicated slab caches could be merged with the generic ones.
>> Currently these two factors cannot be prevented at a low cost: the first
>> one is a widely used memory allocation mechanism, and shutting down slab
>> merging completely via `slub_nomerge` would be overkill.
>>
>> To efficiently prevent heap spraying, we propose the following approach:
>> to create multiple copies of generic slab caches that will never be
>> merged, and to pick one of them at random for each allocation. The random
>> selection is based on the address of code that calls `kmalloc()`, which
>> means it is static at runtime (rather than dynamically determined at
>> each time of allocation, which could be bypassed by repeatedly spraying
>> in brute force). In this way, the vulnerable object and memory allocated
>> in other subsystems and modules will (most probably) be on different
>> slab caches, which prevents the object from being sprayed.
>
> This is a nice balance between the best option we have now
> ("slub_nomerge") and most invasive changes (type-based allocation
> segregation, which requires at least extensive compiler support),
> forcing some caches to be "out of reach".

Yes it is, and it's also cost-effective: it achieves a fairly satisfactory
mitigation with a small amount of code (only ~130 lines).
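
For anyone joining the thread here, the selection could be sketched in
userspace C roughly as follows. This is only an illustration under my own
assumptions, not the code in the patch: sketch_hash_64() mimics the
kernel's hash_64() using the same GOLDEN_RATIO_64 multiplier, while
pick_cache_copy(), NR_COPIES_LOG2 and the seed argument are hypothetical
names made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Multiplier used by the kernel's hash_64() (GOLDEN_RATIO_64). */
#define GOLDEN_RATIO_64 0x61C8864680B583EBull

/* Hypothetical: log2 of the number of cache copies (16 copies here). */
#define NR_COPIES_LOG2 4
#define NR_COPIES (1u << NR_COPIES_LOG2)

/* Userspace stand-in for hash_64(): keep the top `bits` bits of the product. */
static inline unsigned int sketch_hash_64(uint64_t val, unsigned int bits)
{
	assert(bits >= 1 && bits <= 32);
	return (unsigned int)((val * GOLDEN_RATIO_64) >> (64 - bits));
}

/*
 * Hypothetical helper: map a kmalloc() call site to one of the cache
 * copies. `caller` stands for the return address of the call site and
 * `seed` for a random value generated once per boot.
 */
static inline unsigned int pick_cache_copy(uint64_t caller, uint64_t seed)
{
	return sketch_hash_64(caller ^ seed, NR_COPIES_LOG2);
}
```

Since the caller address is fixed for a given call site and the seed is
fixed for a given boot, the same call site always lands in the same copy,
while two unrelated call sites would share a copy with a probability of
roughly 1/NR_COPIES.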

I also get this impression because (believe it or not) we did try to
implement something similar to the latter approach you mention, and it
turned out to be extremely complex, with a huge amount of work involved ...

>
>>
>> The performance overhead has been tested on a 40-core x86 server by
>> comparing the results of `perf bench all` between the kernels with and
>> without this patch based on the latest linux-next kernel, which shows
>> only minor differences. A subset of the benchmarks is listed below:
>>
>>                           control    experiment (avg of 3 samples)
>> sched/messaging (sec)     0.019      0.019
>> sched/pipe (sec)          5.253      5.340
>> syscall/basic (sec)       0.741      0.742
>> mem/memcpy (GB/sec)       15.258789  14.860495
>> mem/memset (GB/sec)       48.828125  50.431069
>>
>> The overhead of memory usage was measured by executing `free` after boot
>> on a QEMU VM with 1GB total memory, and as expected, it's positively
>> correlated with # of cache copies:
>>
>>             control    4 copies   8 copies   16 copies
>> total       969.8M     968.2M     968.2M     968.2M
>> used        20.0M      21.9M      24.1M      26.7M
>> free        936.9M     933.6M     931.4M     928.6M
>> available   932.2M     928.8M     926.6M     923.9M
>
> Great to see the impact: it's relatively tiny. Nice!
>
> Back when we looked at cache quarantines, Jann pointed out that it
> was still possible to perform heap spraying -- it just needed more
> allocations. In this case, I think that's addressed (probabilistically)
> by making it less likely that a cache where a UAF is reachable is merged
> with something with strong exploitation primitives (e.g. msgsnd).
>
> In light of all the UAF attack/defense breakdowns in Jann's blog
> post[1], I'm curious where this defense lands. It seems like the
> primitives described there (i.e. "upgrading" the heap spray into a
> page table "type confusion") would be addressed probabilistically
> just like any other style of attack. Jann, what do you think, and how
> does it compare to the KCTF work[2] you've been doing?

A gentle ping to Jann ;)

>
> In addition to this work, I'd like to see something like the kmalloc
> caches, but for kmem_cache_alloc(), where a dedicated cache of
> variably-sized allocations can be managed. With that, we can split off
> _dedicated_ caches where we know there are strong exploitation
> primitives (i.e. msgsnd, etc). Then we can carve off known weak heap
> allocation caches as well as make merging probabilistically harder.

Could you please explain more about why a similar mitigation mechanism
is needed for dedicated caches?

As far as I know, dedicated caches are usually considered more secure,
although it's still possible to spray them, e.g. by allocating and
freeing large numbers of slab objects to manipulate the heap at page
granularity. Nevertheless, in most cases they are still good enough,
since such spraying is (considered to be) hard to implement.

Meanwhile, that spraying technique can hardly be mitigated within the
slab allocator since it operates at the page level, and our randomization
idea cannot protect against it either. That also makes me inclined to
believe that applying randomization to dedicated caches is not meaningful.

> I imagine it would be possible to then split this series into two
> halves: one that creates the "make arbitrary-sized caches" API, and the
> second that applies that to kmalloc globally (as done here).
>
>>
>> Signed-off-by: GONG, Ruiqi <gongruiqi1@xxxxxxxxxx>
>> ---
>>
>> v2:
>> - Use hash_64() and a per-boot random seed to select kmalloc() caches.
>
> This is good: I was hoping there would be something to make it per-boot
> randomized beyond just compile-time.
>
> So, yes, I think this is worth it, but I'd like to see what design holes
> Jann can poke in it first. :)

Thanks again! I'm looking forward to receiving more comments from mm and
hardening developers.

>
> -Kees
>
> [0] https://lore.kernel.org/lkml/20230508075507.1720950-1-gongruiqi1@xxxxxxxxxx/
> [1] https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html
> [2] https://github.com/thejh/linux/commit/a87ad16046f6f7fd61080ebfb93753366466b761
>