Re: [RFC PATCH RESEND v2 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
From: Lorenzo Stoakes
Date: Wed Jan 08 2025 - 16:11:13 EST
On Wed, Jan 08, 2025 at 08:43:38PM +0000, Lorenzo Stoakes wrote:
> On Mon, Jan 06, 2025 at 05:14:26PM -0800, Isaac Manjarres wrote:
> > On Fri, Jan 03, 2025 at 04:03:44PM +0100, Jann Horn wrote:
> > > On Fri, Jan 3, 2025 at 12:32 AM Isaac J. Manjarres
> > > <isaacmanjarres@xxxxxxxxxx> wrote:
> > > > Android currently uses the ashmem driver [1] for creating shared memory
> > > > regions between processes. Ashmem buffers can initially be mapped with
> > > > PROT_READ, PROT_WRITE, and PROT_EXEC. Processes can then use the
> > > > ASHMEM_SET_PROT_MASK ioctl command to restrict--never add--the
> > > > permissions that the buffer can be mapped with.
> > > >
> > > > Processes can remove the ability to map ashmem buffers as executable to
> > > > ensure that those buffers cannot be exploited to run unintended code.
> > >
> > > Is there really code out there that first maps an ashmem buffer with
> > > PROT_EXEC, then uses the ioctl to remove execute permission for future
> > > mappings? I don't see why anyone would do that.
> >
> > Hi Jann,
> >
> > Thanks for your feedback and for taking the time to review these
> > patches!
> >
> > Not that I'm aware of. The reason why I made this seal have semantics
> > where it prevents future executable mappings is because there are
> > existing applications that allocate an ashmem buffer (default
> > permissions are RWX), map the buffer as RW, and then restrict
> > the permissions to just R.
> >
> > When the buffer is mapped as RW, do_mmap() unconditionally sets
> > VM_MAYEXEC on the VMA for the mapping, which means that the mapping
> > could later be mapped as executable via mprotect(). Therefore, having
> > the semantics of the seal be that it prevents any executable mappings
> > would break existing code that has already been released. It would
> > make transitioning clients to memfd difficult, because to amend that,
> > the ashmem users would have to first restrict the permissions of the
> > buffer to be RW, then map it as RW, and then restrict the permissions
> > again to be just R, which also means an additional system call.
>
> You could do something similar to my adjustments to the F_SEAL_WRITE
> changes that clears VM_MAYEXEC in cases where do_mmap() maps an
> F_SEAL_EXEC'd without PROT_EXEC.
>
> Please note that F_SEAL_EXEC implies:
>
> F_SEAL_SHRINK
> F_SEAL_GROW
> F_SEAL_WRITE <- important, obviously
> F_SEAL_FUTURE_WRITE <- also important
>
> if (seals & F_SEAL_EXEC && inode->i_mode & 0111)
> seals |= F_SEAL_SHRINK|F_SEAL_GROW|F_SEAL_WRITE|F_SEAL_FUTURE_WRITE;
>
> Though interestingly only _after_ the mapping_deny_writable() call is
> performed which is... odd.
>
> Probably worth exploring F_SEAL_EXEC semantics in detail if you haven't
> already to see how viable this is.
>
OK please disregard this (+ other waffling on F_SEAL_EXEC), I dug in and this
flag is weirdly named and simply prevents chmod() changes to the mapping (...!).
For the semantics you need, you do very much appear to need something completely
new and what you suggest with F_SEAL_FUTURE_EXEC does appear to fit the bill
nicely.