Re: [PATCH RFC v2 18/23] fs: add umh argument to struct kernel_clone_args

From: Christian Brauner

Date: Tue Mar 10 2026 - 08:59:32 EST


On Mon, Mar 09, 2026 at 05:06:24PM +0100, Jann Horn wrote:
> On Fri, Mar 6, 2026 at 12:31 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> > Add a umh field to struct kernel_clone_args. When set, copy_fs() copies
> > from pid 1's fs_struct instead of the kthread's fs_struct. This ensures
> > usermodehelper threads always get init's filesystem state regardless of
> > their parent's (kthreadd's) fs.
> >
> > Usermodehelper threads are not allowed to create mount namespaces
> > (CLONE_NEWNS), share filesystem state (CLONE_FS), or be started from
> > a non-initial mount namespace. No usermodehelper currently does this so
> > we don't need to worry about this restriction.
> >
> > Set .umh = 1 in user_mode_thread(). At this stage pid 1's fs points to
> > rootfs which is the same as kthreadd's fs, so this is functionally
> > equivalent.
> [...]
> > -static int copy_fs(u64 clone_flags, struct task_struct *tsk)
> > +static int copy_fs(u64 clone_flags, struct task_struct *tsk, bool umh)
> > {
> > - struct fs_struct *fs = current->fs;
> > + struct fs_struct *fs;
> > +
> > + /*
> > + * Usermodehelper may use userspace_init_fs filesystem state but
> > + * they don't get to create mount namespaces, share the
> > + * filesystem state, or be started from a non-initial mount
> > + * namespace.
> > + */
> > + if (umh) {
> > + if (clone_flags & (CLONE_NEWNS | CLONE_FS))
> > + return -EINVAL;
> > + if (current->nsproxy->mnt_ns != &init_mnt_ns)
> > + return -EINVAL;
> > + fs = userspace_init_fs;
> > + } else {
> > + fs = current->fs;
> > + }
> >
> > VFS_WARN_ON_ONCE(current->fs != current->real_fs);
>
> Should that VFS_WARN_ON_ONCE() be in the else {} block?

I think in spirit the placement you suggest makes more sense.

> I don't know if it could happen that a VFS operation that happens with
> overwritten current->fs calls back into firmware loading or such, or
> if that is anyway impossible for locking reasons or such...

Usermodehelper are terrible hybrids that always go through workqueue
dispatch. So let's say somehow someone ends up triggering a
usermodehelper under scoped_with_init_fs() - no matter if regular task
or kthread - what would actually happen is:

INIT_WORK(&sub_info->work, call_usermodehelper_exec_work)

which means all usermodehelper creations are done from some other
kthread and never by the caller. IOW, even if the caller has overridden
current->fs the usermodehelper would be created from a pristine kthread
context.