Re: [PATCH v2 1/2] memfd: fix MFD_NOEXEC_SEAL to be non-sealable by default
From: Jeff Xu
Date: Fri Jun 07 2024 - 11:59:45 EST
Hi Barnabás
On Fri, May 31, 2024 at 11:56 AM Barnabás Pőcze <pobrn@xxxxxxxxxxxxxx> wrote:
>
> 2024. május 30., csütörtök 0:24 keltezéssel, Jeff Xu <jeffxu@xxxxxxxxxx> írta:
>
> > On Wed, May 29, 2024 at 2:46 PM Barnabás Pőcze <pobrn@xxxxxxxxxxxxxx> wrote:
> > >
> > > Hi
> > >
> > >
> > > 2024. május 29., szerda 23:30 keltezéssel, Jeff Xu <jeffxu@xxxxxxxxxx> írta:
> > >
> > > > Hi David and Barnabás
> > > >
> > > > On Fri, May 24, 2024 at 7:15 AM David Rheinsberg <david@xxxxxxxxxxxx> wrote:
> > > > >
> > > > > Hi
> > > > >
> > > > > On Fri, May 24, 2024, at 5:39 AM, jeffxu@xxxxxxxxxxxx wrote:
> > > > > > From: Jeff Xu <jeffxu@xxxxxxxxxx>
> > > > > >
> > > > > > By default, memfd_create() creates a non-sealable MFD, unless the
> > > > > > MFD_ALLOW_SEALING flag is set.
> > > > > >
> > > > > > When the MFD_NOEXEC_SEAL flag is initially introduced, the MFD created
> > > > > > with that flag is sealable, even though MFD_ALLOW_SEALING is not set.
> > > > > > This patch changes MFD_NOEXEC_SEAL to be non-sealable by default,
> > > > > > unless MFD_ALLOW_SEALING is explicitly set.
> > > > > >
> > > > > > This is a non-backward compatible change. However, as MFD_NOEXEC_SEAL
> > > > > > is new, we expect not many applications will rely on the nature of
> > > > > > MFD_NOEXEC_SEAL being sealable. In most cases, the application already
> > > > > > sets MFD_ALLOW_SEALING if they need a sealable MFD.
> > > > >
> > > > > This does not really reflect the effort that went into this. Shouldn't this be something along the lines of:
> > > > >
> > > > > This is a non-backward compatible change. However, MFD_NOEXEC_SEAL
> > > > > was only recently introduced and a codesearch revealed no breaking
> > > > > users apart from dbus-broker unit-tests (which have a patch pending
> > > > > and explicitly support this change).
> > > > >
> > > > Actually, I think we might need to hold on to this change. With debian
> > > > code search, I found more codes that already use MFD_NOEXEC_SEAL
> > > > without MFD_ALLOW_SEALING. e.g. systemd [1], [2] [3]
> > >
> > > Yes, I have looked at those as well, and as far as I could tell,
> > > they are not affected. Have I missed something?
> > >
> > In the example, the MFD was created then passed into somewhere else
> > (safe_fork_full, open_serialization_fd, etc.), the scope and usage of
> > mfd isn't that clear to me, you might have checked all the user cases.
> > In addition, MFD_NOEXEC_SEAL exists in libc and rust and go lib. I
> > don't know if debian code search is sufficient to cover enough apps .
> > There is a certain risk.
> >
> > Fundamentally, I'm not convinced that making MFD default-non-sealable
> > has meaningful benefit, especially when MFD_NOEXEC_SEAL is new.
>
> Certainly, there is always a risk, I did not mean to imply that there isn't.
> However, I believe this risk is low enough to at least warrant an attempt at
> eliminating this inconsistency. It can always be reverted if it turns out that
> the effects have been vastly underestimated by me.
>
> So I would still like to see this change merged.
>
The MFD_NOEXEC_SEAL is a new flag, technically, ABI is not broken.
The sysctl vm.memfd_noexec=1 or 2, is meant to help
migration/enforcement of MFD_NOEXEC_SEAL, so it will break application
if it is used pre-maturely, that is by-design.
I think the main problem here is lack of documentation, instead of a code bug.
ABI change shouldn't be treated lightly, given the risk, I would like
to keep the API the same and add the documentation instead. I think
that is the best route forward.
Best Regards,
-Jeff
>
> Regards,
> Barnabás Pőcze
>
>
> >
> >
> > >
> > > Regards,
> > > Barnabás
> > >
> > >
> > > >
> > > > I'm not sure if this will break more applications not-knowingly that
> > > > have started relying on MFD_NOEXEC_SEAL being sealable. The feature
> > > > has been out for more than a year.
> > > >
> > > > Would you consider my augments in [4] to make MFD to be sealable by default ?
> > > >
> > > > At this moment, I'm willing to add a document to clarify that
> > > > MFD_NOEXEC_SEAL is sealable by default, and that an app that needs
> > > > non-sealable MFD can set SEAL_SEAL. Because both MFD_NOEXEC_SEAL
> > > > and vm.memfd_noexec are new, I don't think it breaks the existing
> > > > ABI, and vm.memfd_noexec=0 is there for backward compatibility
> > > > reasons. Besides, I honestly think there is little reason that MFD
> > > > needs to be non-sealable by default. There might be few rare cases,
> > > > but the majority of apps don't need that. On the flip side, the fact
> > > > that MFD is set up to be sealable by default is a nice bonus for an
> > > > app - it makes it easier for apps to use the sealing feature.
> > > >
> > > > What do you think ?
> > > >
> > > > Thanks
> > > > -Jeff
> > > >
> > > > [1] https://codesearch.debian.net/search?q=MFD_NOEXEC_SEAL
> > > > [2] https://codesearch.debian.net/show?file=systemd_256~rc3-5%2Fsrc%2Fhome%2Fhomed-home.c&line=1274
> > > > [3] https://sources.debian.org/src/elogind/255.5-1debian1/src/shared/serialize.c/?hl=558#L558
> > > > [4] https://lore.kernel.org/lkml/CALmYWFuPBEM2DE97mQvB2eEgSO9Dvt=uO9OewMhGfhGCY66Hbw@xxxxxxxxxxxxxx/
> > > >
> > > >
> > > > > > Additionally, this enhances the useability of pid namespace sysctl
> > > > > > vm.memfd_noexec. When vm.memfd_noexec equals 1 or 2, the kernel will
> > > > > > add MFD_NOEXEC_SEAL if mfd_create does not specify MFD_EXEC or
> > > > > > MFD_NOEXEC_SEAL, and the addition of MFD_NOEXEC_SEAL enables the MFD
> > > > > > to be sealable. This means, any application that does not desire this
> > > > > > behavior will be unable to utilize vm.memfd_noexec = 1 or 2 to
> > > > > > migrate/enforce non-executable MFD. This adjustment ensures that
> > > > > > applications can anticipate that the sealable characteristic will
> > > > > > remain unmodified by vm.memfd_noexec.
> > > > > >
> > > > > > This patch was initially developed by Barnabás Pőcze, and Barnabás
> > > > > > used Debian Code Search and GitHub to try to find potential breakages
> > > > > > and could only find a single one. Dbus-broker's memfd_create() wrapper
> > > > > > is aware of this implicit `MFD_ALLOW_SEALING` behavior, and tries to
> > > > > > work around it [1]. This workaround will break. Luckily, this only
> > > > > > affects the test suite, it does not affect
> > > > > > the normal operations of dbus-broker. There is a PR with a fix[2]. In
> > > > > > addition, David Rheinsberg also raised similar fix in [3]
> > > > > >
> > > > > > [1]:
> > > > > > https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114
> > > > > > [2]: https://github.com/bus1/dbus-broker/pull/366
> > > > > > [3]:
> > > > > > https://lore.kernel.org/lkml/20230714114753.170814-1-david@xxxxxxxxxxxx/
> > > > > >
> > > > > > Cc: stable@xxxxxxxxxxxxxxx
> > > > > > Fixes: 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
> > > > > > Signed-off-by: Barnabás Pőcze <pobrn@xxxxxxxxxxxxxx>
> > > > > > Signed-off-by: Jeff Xu <jeffxu@xxxxxxxxxx>
> > > > > > Reviewed-by: David Rheinsberg <david@xxxxxxxxxxxx>
> > > > >
> > > > > Looks good! Thanks!
> > > > > David
> > > >