Re: [PATCH v8 2/4] mseal: add mseal syscall

From: Jeff Xu
Date: Thu Feb 01 2024 - 23:04:23 EST


On Thu, Feb 1, 2024 at 7:54 PM Theo de Raadt <deraadt@xxxxxxxxxxx> wrote:
>
> Jeff Xu <jeffxu@xxxxxxxxxxxx> wrote:
>
> > On Thu, Feb 1, 2024 at 3:11 PM Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Jan 31, 2024 at 05:50:24PM +0000, jeffxu@xxxxxxxxxxxx wrote:
> > > > [PATCH v8 2/4] mseal: add mseal syscall
> > > [...]
> > > > +/*
> > > > + * The PROT_SEAL defines memory sealing in the prot argument of mmap().
> > > > + */
> > > > +#define PROT_SEAL 0x04000000 /* _BITUL(26) */
> > > > +
> > > > /* 0x01 - 0x03 are defined in linux/mman.h */
> > > > #define MAP_TYPE 0x0f /* Mask for type of mapping */
> > > > #define MAP_FIXED 0x10 /* Interpret addr exactly */
> > > > @@ -33,6 +38,9 @@
> > > > #define MAP_UNINITIALIZED 0x4000000 /* For anonymous mmap, memory could be
> > > > * uninitialized */
> > > >
> > > > +/* map is sealable */
> > > > +#define MAP_SEALABLE 0x8000000 /* _BITUL(27) */
> > >
> > > IMO this patch is misleading, as it claims to just be adding a new syscall, but
> > > it actually adds three new UAPIs, only one of which is the new syscall. The
> > > other two new UAPIs are new flags to the mmap syscall.
> > >
> > The description does include all three. I could update the patch title.
> >
> > > Based on recent discussions, it seems the usefulness of the new mmap flags has
> > > not yet been established. Note also that there are only a limited number of
> > > mmap flags remaining, so we should be careful about allocating them.
> > >
> > > Therefore, why not start by just adding the mseal syscall, without the new mmap
> > > flags alongside it?
> > >
> > > I'll also note that the existing PROT_* flags seem to be conventionally used for
> > > the CPU page protections, as opposed to kernel-specific properties of the VMA
> > > object. As such, PROT_SEAL feels a bit out of place anyway. If it's added at
> > > all it perhaps should be a MAP_* flag, not PROT_*. I'm not sure this aspect has
> > > been properly discussed yet, seeing as the patchset is presented as just adding
> > > sys_mseal(). Some reviewers may not have noticed or considered the new flags.
> > >
> > MAP_ flags is more used for type of mapping, such as MAP_FIXED_NOREPLACE.
> >
> > The PROT_SEAL might make more sense because sealing the protection bit
> > is the main functionality of the sealing at this moment.
>
> Jeff, please show a piece of software that needs to do PROT_SEAL as
> mprotect() or mmap() argument.
>
I didn't propose mprotect().

for mmap() here is a potential use case:

fs/binfmt_elf.c
if (current->personality & MMAP_PAGE_ZERO) {
/* Why this, you ask??? Well SVr4 maps page 0 as read-only,
and some applications "depend" upon this behavior.
Since we do not have the power to recompile these, we
emulate the SVr4 behavior. Sigh. */

error = vm_mmap(NULL, 0, PAGE_SIZE,
PROT_READ | PROT_EXEC, <-- add PROT_SEAL
MAP_FIXED | MAP_PRIVATE, 0);
}

I don't see the benefit of RWX page 0, which might make a null
pointers error to become executable for some code.


> Please don't write it as a vague essay.
>
> Instead, take a piece of existing code, write a diff, and show your work.
>
> Then explain that diff, justify why doing the PROT_SEAL as an argument
> of mprotect() or mmap() is a required improvement, and show your Linux
> developer peers that you can do computer science.
>
> I did the same work in OpenBSD, at least 25% time over 2 years, and I
> had to prove my work inside my development community. I had to prove
> that it worked system wide, not in 1 program, with hand-waving for the
> rest. If I had said "Looks, it works in ssh, trust me it works in other
> programs", it would not have gone further.
>
> glibc is the best example to demonstrate, but smaller examples might
> convince.