Re: [PATCH bpf-next 1/3] mm/vmalloc: introduce vmalloc_exec which allocates RO+X memory

From: Peter Zijlstra
Date: Thu Jul 14 2022 - 06:16:44 EST


On Wed, Jul 13, 2022 at 09:20:55PM +0000, Song Liu wrote:
>
>
> > On Jul 13, 2022, at 1:26 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Wed, Jul 13, 2022 at 03:48:35PM +0000, Song Liu wrote:
> >
> >>> So how about instead we separate them? Then much of the problem goes
> >>> away, you don't need to track these 2M chunks at all.
> >>
> >> If we manage the memory in < 2MiB granularity, either 4kB or smaller,
> >> we still need some way to track which parts are being used, no? I mean
> >> the bitmap.
> >
> > I was thinking the vmalloc vmap_area tree could help out there.
>
> Interesting. vmap_area tree indeed keeps a lot of useful information.
>
> Currently, powerpc supports CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC,

Only PPC32, and there it's due to a constraint in their MMU vs page
protections.

> which leaves module_alloc just for module text. If this works, we get
> separation between RO+X and RW memory. What would it take to enable
> CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC for x86_64?

You need the VM_TOPDOWN_VMAP flag, and you need to ensure the data and
code regions never overlap. Once you have both, you can enable it.

Specifically the problem is that data needs to be in the s32 immediate
range just like code, so we're constrained to the module range. Given
that constraint, the easiest solution is to use the different ends of
that range.
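
As a rough illustration of the "different ends" idea, here is a minimal
userspace sketch, not kernel code. Everything in it is an assumption
made for the example: the struct, the helper names, and the RANGE_START
and RANGE_END values are hypothetical stand-ins for the module range.
Text grows up from the bottom, data grows down from the top, and an
allocation fails once the two would meet, so they can never overlap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical window standing in for the module range (illustrative). */
#define RANGE_START 0xffffffffa0000000ULL
#define RANGE_END   0xffffffffff000000ULL
#define PAGE_SZ     4096ULL

struct two_ended_range {
	uint64_t text_top;    /* next free address for text, grows upward */
	uint64_t data_bottom; /* lowest address given to data, grows downward */
};

static void range_init(struct two_ended_range *r)
{
	r->text_top = RANGE_START;
	r->data_bottom = RANGE_END;
}

/* Round size up to a whole number of pages (wraps to 0 on overflow). */
static uint64_t page_align(uint64_t size)
{
	return (size + PAGE_SZ - 1) & ~(PAGE_SZ - 1);
}

/* Allocate text from the low end; returns 0 if there is no room. */
static uint64_t alloc_text(struct two_ended_range *r, uint64_t size)
{
	uint64_t addr;

	size = page_align(size);
	if (size == 0 || size > r->data_bottom - r->text_top)
		return 0;
	addr = r->text_top;
	r->text_top += size;
	return addr;
}

/* Allocate data from the high end; returns 0 if there is no room. */
static uint64_t alloc_data(struct two_ended_range *r, uint64_t size)
{
	size = page_align(size);
	if (size == 0 || size > r->data_bottom - r->text_top)
		return 0;
	r->data_bottom -= size;
	return r->data_bottom;
}

/* True if addr can be encoded as a sign-extended 32-bit immediate. */
static bool fits_s32_imm(uint64_t addr)
{
	return addr <= 0x7fffffffULL || addr >= 0xffffffff80000000ULL;
}
```

Both ends hand out addresses from the same window, so anything the
sketch returns still passes fits_s32_imm(); the only thing that changes
relative to a single pool is which direction each kind of allocation
grows in.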