Re: [RFC PATCH 11/14] mm/slub: allocate slabs from virtual memory

From: Kees Cook
Date: Fri Sep 15 2023 - 17:23:15 EST


On Fri, Sep 15, 2023 at 10:59:30AM +0000, Matteo Rizzo wrote:
> From: Jann Horn <jannh@xxxxxxxxxx>
>
> This is the main implementation of SLAB_VIRTUAL. With SLAB_VIRTUAL
> enabled, slab memory is not allocated from the linear map but from a
> dedicated region of virtual memory. The code ensures that once a range
> of virtual addresses is assigned to a slab cache, that virtual memory is
> never reused again except for other slabs in that same cache. This lets
> us mitigate some exploits for use-after-free vulnerabilities where the
> attacker makes SLUB release a slab page to the page allocator and then
> makes it reuse that same page for a different slab cache ("cross-cache
> attacks").
>
> With SLAB_VIRTUAL enabled struct slab no longer overlaps struct page but
> instead it is allocated from a dedicated region of virtual memory. This
> makes it possible to have references to slabs whose physical memory has
> been freed.
>
> SLAB_VIRTUAL has a small performance overhead, about 1-2% on kernel
> compilation time. We are using 4 KiB pages to map slab pages and slab
> metadata area, instead of the 2 MiB pages that the kernel uses to map
> the physmap. We experimented with a version of the patch that uses 2 MiB
> pages and we did see some performance improvement but the code also
> became much more complicated and ugly because we would need to allocate
> and free multiple slabs at once.

I think these hints about performance should also be noted in the
Kconfig help.

> In addition to the TLB contention, SLAB_VIRTUAL also adds new locks to
> the slow path of the allocator. Lock contention also contributes to the
> performance penalty to some extent, and this is more visible on machines
> with many CPUs.
>
> Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
> Co-developed-by: Matteo Rizzo <matteorizzo@xxxxxxxxxx>
> Signed-off-by: Matteo Rizzo <matteorizzo@xxxxxxxxxx>
> ---
> arch/x86/include/asm/page_64.h | 10 +
> arch/x86/include/asm/pgtable_64_types.h | 5 +
> arch/x86/mm/physaddr.c | 10 +
> include/linux/slab.h | 7 +
> init/main.c | 1 +
> mm/slab.h | 106 ++++++
> mm/slab_common.c | 4 +
> mm/slub.c | 439 +++++++++++++++++++++++-
> mm/usercopy.c | 12 +-
> 9 files changed, 587 insertions(+), 7 deletions(-)

Much of this needs review by MM people -- I can't speak well to the
specifics of the implementation. On coding style, I wonder if we can
reduce the amount of #ifdef code by either using "if
(IS_ENABLED(...)) { ... }" style code or, in the case of the allocation
function, splitting it out into two separate files, one for the standard
page allocator and one for the new virt allocator. But, again, MM
preferences reign. :)
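
To make the style suggestion concrete, here is a minimal userspace sketch
of the two forms side by side. It is not code from the patch: the symbol
name CONFIG_SLAB_VIRTUAL, the enum, the alloc_slab_from_*() stubs, and the
SKETCH_* helpers are all assumptions made up for illustration, and
IS_ENABLED() here is a simplified stand-in modeled on the kernel macro so
that the example builds outside the kernel tree.

```c
#include <assert.h>

/*
 * Userspace stand-in for the kernel's IS_ENABLED(): evaluates to 1 when
 * the option macro is defined as 1, and to 0 when it is undefined.
 * The SKETCH_* helper names are hypothetical, chosen for this example.
 */
#define SKETCH_PLACEHOLDER_1 0,
#define SKETCH_SECOND(ignored, val, ...) val
#define SKETCH_DEFINED3(arg1_or_junk) SKETCH_SECOND(arg1_or_junk 1, 0, 0)
#define SKETCH_DEFINED2(val) SKETCH_DEFINED3(SKETCH_PLACEHOLDER_##val)
#define SKETCH_DEFINED1(x) SKETCH_DEFINED2(x)
#define IS_ENABLED(option) SKETCH_DEFINED1(option)

/* Assumed Kconfig symbol name; remove this line to model a disabled option. */
#define CONFIG_SLAB_VIRTUAL 1

enum slab_backing { SLAB_BACKING_PAGES, SLAB_BACKING_VIRT };

/* Hypothetical stubs standing in for the two real allocation paths. */
static enum slab_backing alloc_slab_from_pages(unsigned int order)
{
	(void)order;
	return SLAB_BACKING_PAGES;
}

static enum slab_backing alloc_slab_from_virt(unsigned int order)
{
	(void)order;
	return SLAB_BACKING_VIRT;
}

/* Style 1: #ifdef. Only the selected branch is ever seen by the compiler. */
enum slab_backing alloc_slab_ifdef(unsigned int order)
{
#ifdef CONFIG_SLAB_VIRTUAL
	return alloc_slab_from_virt(order);
#else
	return alloc_slab_from_pages(order);
#endif
}

/*
 * Style 2: if (IS_ENABLED(...)). Both branches are parsed and type-checked;
 * the condition is a compile-time constant, so the dead one can be dropped.
 */
enum slab_backing alloc_slab_is_enabled(unsigned int order)
{
	if (IS_ENABLED(CONFIG_SLAB_VIRTUAL))
		return alloc_slab_from_virt(order);
	return alloc_slab_from_pages(order);
}

/* Both styles must pick the same path for any given configuration. */
void slab_style_selfcheck(void)
{
	for (unsigned int order = 0; order < 4; order++)
		assert(alloc_slab_ifdef(order) == alloc_slab_is_enabled(order));
}
```

The trade-off in the second form is that both helpers have to be declared
in every configuration, since the compiler sees both calls; in exchange,
the disabled path keeps getting build coverage.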

--
Kees Cook