Re: [PATCH] userns: Default to 'yes' when CONFIG_MEMCG option is enabled

From: Christian Brauner
Date: Fri May 24 2024 - 09:19:37 EST


On Fri, May 24, 2024 at 02:33:19PM +0200, Javier Martinez Canillas wrote:
> Christian Brauner <brauner@xxxxxxxxxx> writes:
>
> Hello Christian,
>
> Thanks a lot for your feedback.
>
> > On Fri, May 24, 2024 at 10:24:16AM +0200, Javier Martinez Canillas wrote:
> >> The default value for the CONFIG_USER_NS Kconfig symbol changed over time.
> >>
> >> When first was introduced by commit acce292c82d4 ("user namespace: add the
> >> framework"), the default was 'no'. But then it was changed to 'yes' if the
> >> CONFIG_NAMESPACES option was enabled, by commit 17a6d4411a4d ("namespaces:
> >> default all the namespaces to 'yes' when CONFIG_NAMESPACES is selected").
> >>
> >> Then, commit 5673a94c1457 ("userns: Add a Kconfig option to enforce strict
> >> kuid and kgid type checks") changed the default to 'no' again and selected
> >> the (now defunct) UIDGID_STRICT_TYPE_CHECKS option.
> >>
> >> This selected option was removed by commit 261000a56b63 ("userns: Remove
> >> UIDGID_STRICT_TYPE_CHECKS"), but CONFIG_USER_NS default was left to 'no'.
> >>
> >> Finally, the commit e11f0ae388f2 ("userns: Recommend use of memory control
> >> groups") added to the Kconfig symbol's help text a recommendation that the
> >> memory control groups should be used, to limit the amount of memory that a
> >> user who can create user namespaces can consume.
> >>
> >> Looking at the changes' history, a default to 'yes' when the CONFIG_MEMCG
> >> option is enabled seems like a sane thing to do. Specially since systemd
> >> requires user namespaces support for services that use the PrivateUsers=
> >> property in their unit files (e.g: the UPower daemon).
> >
> > Fyi, user namespaces are an entirely optional feature in systemd and it
> > gracefully falls back if they are not available with PrivateUsers= set.
> > If that isn't the case then it's a bug in systemd with PrivateUsers=
> > handling and should be reported.
> >
>
> Interesting, it definitely failed for me:
>
> $ systemctl status upower
> ● upower.service - Daemon for power management
> Loaded: loaded (/lib/systemd/system/upower.service; disabled; vendor preset: enabled)
> Active: failed (Result: exit-code) since Fri 2024-05-24 12:23:49 UTC; 34s ago
> Docs: man:upowerd(8)
> Process: 390 ExecStart=/usr/libexec/upowerd (code=exited, status=217/USER)
> Main PID: 390 (code=exited, status=217/USER)
> CPU: 122ms
>
> May 24 12:23:49 igep systemd[1]: upower.service: Scheduled restart job, restart counter is at 5.
> May 24 12:23:49 igep systemd[1]: Stopped Daemon for power management.
> May 24 12:23:49 igep systemd[1]: upower.service: Start request repeated too quickly.
> May 24 12:23:49 igep systemd[1]: upower.service: Failed with result 'exit-code'.
> May 24 12:23:49 igep systemd[1]: Failed to start Daemon for power management.
>
> $ journalctl -u upower
> May 24 12:23:49 igep systemd[1]: Starting Daemon for power management...
> May 24 12:23:49 igep systemd[404]: upower.service: Failed to set up user namespacing: Invalid argument
> May 24 12:23:49 igep systemd[404]: upower.service: Failed at step USER spawning /usr/libexec/upowerd: Invalid argument
> May 24 12:23:49 igep systemd[1]: upower.service: Main process exited, code=exited, status=217/USER
> May 24 12:23:49 igep systemd[1]: upower.service: Failed with result 'exit-code'.
> May 24 12:23:49 igep systemd[1]: Failed to start Daemon for power management.
> May 24 12:23:49 igep systemd[1]: upower.service: Scheduled restart job, restart counter is at 1.
> May 24 12:23:49 igep systemd[1]: Stopped Daemon for power management.
>
> That lead me to https://gitlab.freedesktop.org/upower/upower/-/issues/104
> and finally to systemd's README:
>
> https://github.com/systemd/systemd/blob/main/README#L89C22-L89C34
>
> But I'll investigate more if is upower or systemd to be blamed here...
>
> > But specifically to you change, afair CONFIG_MEMCG and userns are
> > unrelated so tying them together like this in the kconfig seems
> > misguided.
> >
>
> Yes, but the config USER_NS help text already tieds them toghether:
>
> help
> This allows containers, i.e. vservers, to use user namespaces
> to provide different user info for different servers.
>
> When user namespaces are enabled in the kernel it is
> recommended that the MEMCG option also be enabled and that

But then the patch your patch is the wrong way around and you should
make CONFIG_USER_NS select CONFIG_MEMCG. IOW, if you do userns, do memcg
but _not_ if you do memcg, do userns.

> user-space use the memory control groups to limit the amount
> of memory a memory unprivileged users can use.
>
> And as mentioned in the commit message, it seems to be the reason why the
> default for this Kconfig symbol is no. Maybe I misunderstood though or do
> you think that could be switched unconditionally to 'default y' ?

No, definitely not.

>
> Or is there a reason to be the only namespace to be default to no instead
> of yes? Specially since important system services are trying to use it.

Yes, it has a lot more security implications than all of the other ones
and makes them available to unprivileged users by default. So that
definitely requires a conscious choice.

>
> --
> Best regards,
>
> Javier Martinez Canillas
> Core Platforms
> Red Hat
>