Re: 6.15-rc1/regression/bisected - commit 474f7825d533 is broke systemd-nspawn on my system

From: Mikhail Gavrilov
Date: Wed Apr 09 2025 - 10:31:31 EST


On Wed, Apr 9, 2025 at 1:36 PM Christian Brauner <brauner@xxxxxxxxxx> wrote:
>
> Ok, I see the bug. It's caused by pecularity in systemd that a specific
> implementation detail of mount_setattr() papered over.
>
> Basically, I added a shortcut to mount_settatr():
>
> /* Don't bother walking through the mounts if this is a nop. */
> if (attr.attr_set == 0 &&
> attr.attr_clr == 0 &&
> attr.propagation == 0)
> return 0;
>
> So that we:
>
> * don't pointlessly do path lookup
> * don't pointlessly walk the mount tree and hold the namespace semaphore etc.
>
> When I added copy_mount_setattr() this cycle this optimization got
> broken because I moved it into this helper and we now do path lookup and
> walk the mount tree even if there's no mount properties to change at
> all.
>
> That's just a performance thing, not a correctness thing though.
>
> systemd has the following code:
>
> int make_fsmount(
> int error_log_level,
> const char *what,
> const char *type,
> unsigned long flags,
> const char *options,
> int userns_fd) {
>
> <snip>
>
> mnt_fd = fsmount(fs_fd, FSMOUNT_CLOEXEC, 0);
> if (mnt_fd < 0)
> return log_full_errno(error_log_level, errno, "Failed to create mount fd for \"%s\" (\"%s\"): %m", what, type);
>
> if (mount_setattr(mnt_fd, "", AT_EMPTY_PATH|AT_RECURSIVE,
> &(struct mount_attr) {
> .attr_set = ms_flags_to_mount_attr(f) | (userns_fd >= 0 ? MOUNT_ATTR_IDMAP : 0),
> .userns_fd = userns_fd,
> }, MOUNT_ATTR_SIZE_VER0) < 0)
>
> <snip>
>
> So if userns_fd is greater or equal than zero MOUNT_ATTR_IDMAP will be
> raised otherwise not.
>
> Later in the code we find this function used in nspawn during:
>
> static int get_fuse_version(uint32_t *ret_major, uint32_t *ret_minor) {
>
> <snip>
> /* Get a FUSE handle. */
> fuse_fd = open("/dev/fuse", O_CLOEXEC|O_RDWR);
> <snip>
> mnt_fd = make_fsmount(LOG_DEBUG, "nspawn-fuse", "fuse.nspawn", 0, opts, -EBADF);
>
> This will cause the aforementioned mount_setattr() call to be called
> with:
>
> if (mount_setattr(mnt_fd, "", AT_EMPTY_PATH|AT_RECURSIVE,
> &(struct mount_attr) {
> .attr_set = 0,
> .userns_fd = -EBADF,
> }, MOUNT_ATTR_SIZE_VER0) < 0)
>
> This means:
>
> attr_set == 0 && attr_clear == 0 and propagation == 0 and we'd thus
> never trigger a path lookup on older kernels. But now we do thus causing
> the hang.
>
> I've restored the old behavior. Can you please test?:
>
> https://web.git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/log/?h=work.mount.fixes

Thanks,
I can confirm systemd-nspawn working as expected on the kernel built
from the branch work.mount.fixes.

Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@xxxxxxxxx>

--
Best Regards,
Mike Gavrilov.

Attachment: 6.15.0-rc1-work.mount.fixes.zip
Description: Zip archive