Re: [RFC PATCH 1/7] mseal: expose interface to seal / unseal user memory ranges

From: Jeff Xu
Date: Fri Oct 04 2024 - 14:00:19 EST


Hi Fares,

Please add me to this series; I'm interested in everything related
to mseal :-)

I also added Kees, since mseal is a security feature and Kees is CCed
on security matters.

On Wed, Sep 25, 2024 at 8:25 AM Fares Mehanna <faresx@xxxxxxxxx> wrote:
>
> Hi,
>
> Thanks for taking a look and apologies for my delayed response.
>
> > It is not clear from the change log above or the cover letter as to why
> > you need to go this route instead of using the mmap lock.
>
> In the current form of the patches I use memfd_secret() to allocate the pages
> and remove them from kernel linear address. [1]
>
> This allocates pages, maps them in user virtual addresses and tracks them in a VMA.
>
> Before flipping the permissions on those pages to be used by the kernel, I need
> to make sure that those virtual addresses and this VMA are off-limits to the
> owning process.
>
> memfd_secret() pages are locked by default, so they won't swap out. I need to
> seal the VMA to make sure the owner process can't unmap/remap/... or change the
> protection of this VMA.
>
> So before changing the permissions on the secret pages, I make sure the pages
> are faulted in, locked and sealed. So userspace can't influence this mapping.
>
> > We can't use the mseal feature for this; it is supposed to be a one way
> > transition.
>
> For this approach, I need the unseal operation when releasing the memory range.
>
> The kernel can be done with the secret pages in one of two scenarios:
> 1. During lifecycle of the process.
> 2. When the process terminates.
>
> For the first case, I need to unmap the VMA so it can be reused by the owning
> process later, so I need the unseal operation. For the second case however we
> don't need that since the process mm is already destructed or just about to be
> destructed anyway, regardless of sealed/unsealed VMAs. [1]
>
> I didn't expose the unseal operation to userspace.
>
In general, we should avoid having do_unseal, even though the
operation is restricted to the kernel itself.

However, from what you have described, without looking at your code,
the case is closer to mseal, except that you need to unmap it within
the kernel code.

For this, there are two options I can think of for now, posted
here for discussion:

1> Add a new flag in vm_flags to allow unmap while sealed. However,
this will not prevent user space from unmapping the area.

2> Pass a flag to do_vmi_align_munmap() to skip the sealing checks for
your particular call. do_vmi_align_munmap() already takes a flag of
this kind, such as unlock.
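
To make option 2 concrete, here is a toy userspace model of the idea,
not kernel code. Everything in it (struct toy_vma, the TOY_VM_* bits,
toy_munmap() and its skip_seal_check argument) is a hypothetical
stand-in that I made up for illustration; the real change would be in
the do_vmi_align_munmap() path and would likely look different. The
only behavior it tries to show is that the user-triggered path keeps
failing with -EPERM on a sealed range while one in-kernel caller is
allowed through.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-ins; these are NOT the real kernel definitions. */
#define TOY_VM_SEALED (1u << 0) /* models a sealed VMA */
#define TOY_VM_MAPPED (1u << 1) /* models "range is still mapped" */

struct toy_vma {
	unsigned long start;
	unsigned long end;
	unsigned int flags;
};

/*
 * Models option 2: the unmap path takes an extra argument that lets a
 * trusted in-kernel caller bypass the sealing check. Requests that
 * originate from userspace always pass skip_seal_check = false.
 */
static int toy_munmap(struct toy_vma *vma, bool skip_seal_check)
{
	if (!(vma->flags & TOY_VM_MAPPED))
		return -EINVAL;
	if ((vma->flags & TOY_VM_SEALED) && !skip_seal_check)
		return -EPERM;
	/* The range goes away, and the seal goes with it. */
	vma->flags &= ~(TOY_VM_MAPPED | TOY_VM_SEALED);
	return 0;
}

/* Path taken by a userspace munmap() request. */
static int toy_user_munmap(struct toy_vma *vma)
{
	return toy_munmap(vma, false);
}

/* Path taken when the kernel is done with the secret range. */
static int toy_kernel_release(struct toy_vma *vma)
{
	return toy_munmap(vma, true);
}
```

With a sealed toy_vma, toy_user_munmap() returns -EPERM and leaves the
range mapped, while toy_kernel_release() returns 0 and drops it. Option
1 would instead put the exemption in the per-VMA flags, so in this
model any caller, including the userspace path, would be let through.
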

Will the above work for your case? Or did I misunderstand the requirement?

Thanks
-Jeff



> [1] https://lore.kernel.org/linux-arm-kernel/20240911143421.85612-3-faresx@xxxxxxxxx/
>
> Thanks!
> Fares.