Re: [PATCH v8 0/4] Introduce mseal

From: Liam R. Howlett
Date: Wed Jan 31 2024 - 14:34:49 EST


Please add me to the Cc list of these patches.

* jeffxu@xxxxxxxxxxxx <jeffxu@xxxxxxxxxxxx> [240131 12:50]:
> From: Jeff Xu <jeffxu@xxxxxxxxxxxx>
>
> This patchset proposes a new mseal() syscall for the Linux kernel.
>
> In a nutshell, mseal() protects the VMAs of a given virtual memory
> range against modifications, such as changes to their permission bits.
>
> Modern CPUs support memory permissions, such as the read/write (RW)
> and no-execute (NX) bits. Linux has supported NX since the release of
> kernel version 2.6.8 in August 2004 [1]. The memory permission feature
> improves the security stance on memory corruption bugs, as an attacker
> cannot simply write to arbitrary memory and point the code to it. The
> memory must be marked with the X bit, or else an exception will occur.
> Internally, the kernel maintains the memory permissions in a data
> structure called VMA (vm_area_struct). mseal() additionally protects
> the VMA itself against modifications of the selected seal type.
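
For anyone following along, here is a minimal userspace sketch of the gap
the cover letter describes, using only mmap() and mprotect(). The function
name flip_ro_page_to_rw() is made up for illustration and is not part of the
patchset. It shows that, without sealing, nothing stops a later mprotect()
from rewriting a mapping's permission bits; as I read the cover letter, a
sealed range would be expected to refuse that step instead.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map one anonymous page read-only, then change it
 * to read-write with mprotect() and write to it.
 *
 * Returns 0 if the permission change and the write both went through,
 * -1 on any failure.
 */
int flip_ro_page_to_rw(void)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *p;
	int ret = -1;

	if (page <= 0)
		return -1;

	p = mmap(NULL, (size_t)page, PROT_READ,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* On an unsealed mapping this permission change is allowed. */
	if (mprotect(p, (size_t)page, PROT_READ | PROT_WRITE) == 0) {
		p[0] = 0x42;	/* the page is now writable */
		ret = (p[0] == 0x42) ? 0 : -1;
	}

	munmap(p, (size_t)page);
	return ret;
}
```

Calling flip_ro_page_to_rw() from a small main() and checking for 0 is
enough to see the behaviour on a current kernel.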

The v8 cut off Jonathan's email discussion [1], so instead of replying
there, I'm going to add my questions here.

The best plan to ensure it is a general safety measure for all of Linux
is to work with the community before it lands upstream. It's much
harder to change functionality provided to users after it is upstream.
I'm happy to hear Google is super excited about sharing this, but so
far, the community isn't as excited.

It seems Theo has a lot of experience trying to add a feature very close
to what you are doing and has real data on how this went [2]. Can we
see if there is a solution that is, at least, different enough from what
he tried to do to have a shot at success? Do we have anyone in the
toolchain groups that sees this working well? If this means Stephen
needs to do something, can we get that to happen please?

I mean, you specifically state that this is a 'very specific
requirement' in your cover letter. Does this mean even other browsers
have no use for it?

I am very concerned this feature will land and have to be maintained by
the core mm people for the one user it was specifically targeting.

Can we also get some benchmarking on the impact of this feature? I
believe my answer in v7 removed the worst offender, but since there is
no benchmarking we really are guessing (educated or not, hard data would
help). We still have an extra loop in madvise, mprotect_pkey, mremap_to
(and mremap syscall?).
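
To be concrete about the kind of number I mean, here is a rough userspace
sketch, not a proper benchmark. The function name avg_mprotect_ns() and its
parameters are made up for illustration. It keeps a mapping split into many
VMAs by giving every other page a different protection, then times repeated
single-page mprotect() calls. Running it on kernels with and without the
series would give at least a first-order idea of the per-call cost; the
first round also pays for the initial VMA splits, so use enough iterations
to amortize that.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: average cost, in nanoseconds, of one mprotect()
 * call on a single-page VMA inside a fragmented mapping.
 *
 * Even pages toggle between PROT_NONE and PROT_READ while odd pages stay
 * read-write, so neighbouring VMAs never have equal protections and
 * cannot merge back together.
 *
 * Returns the average in ns, or -1 on error.
 */
long long avg_mprotect_ns(size_t npages, unsigned int iters)
{
	long page = sysconf(_SC_PAGESIZE);
	struct timespec t0, t1;
	long long calls = 0, elapsed;
	size_t pg, len, i;
	unsigned int n;
	char *base;

	if (page <= 0 || npages == 0 || iters == 0)
		return -1;

	pg = (size_t)page;
	len = npages * pg;
	base = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return -1;

	if (clock_gettime(CLOCK_MONOTONIC, &t0))
		goto fail;

	for (n = 0; n < iters; n++) {
		int prot = (n & 1) ? PROT_READ : PROT_NONE;

		for (i = 0; i < npages; i += 2) {
			if (mprotect(base + i * pg, pg, prot))
				goto fail;
			calls++;
		}
	}

	if (clock_gettime(CLOCK_MONOTONIC, &t1))
		goto fail;

	munmap(base, len);
	elapsed = (t1.tv_sec - t0.tv_sec) * 1000000000LL +
		  (t1.tv_nsec - t0.tv_nsec);
	return elapsed / calls;

fail:
	munmap(base, len);
	return -1;
}
```

Something like avg_mprotect_ns(1024, 1000) stays well under the default
vm.max_map_count. The same pattern could be repeated for madvise() and
mremap(), which I have not sketched here.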

You also did not clean up the loop you copied from mlock, which I
pointed out [3]. Stating that your copy/paste is easier to review is
not sufficient to keep unneeded assignments around.

[1]. https://lore.kernel.org/linux-mm/87a5ong41h.fsf@xxxxxxxxxxxx/
[2]. https://lore.kernel.org/linux-mm/86181.1705962897@xxxxxxxxxxxxxxx/
[3]. https://lore.kernel.org/linux-mm/20240124200628.ti327diy7arb7byb@revolver/