Re: [syzbot] general protection fault in kernfs_get_inode
From: Christian Brauner
Date: Sun Oct 09 2022 - 04:43:13 EST
On Sat, Oct 08, 2022 at 08:29:40PM +0200, Christian A. Ehrhardt wrote:
>
> Hi (from another Christian),
>
> On Fri, Oct 07, 2022 at 11:35:49AM -1000, Tejun Heo wrote:
> > (cc'ing Christian and quoting whole body)
> >
> > Christian, I can't repro it here but think what we need is something like the
> > following. The problem is that cgroup_is_dead() check in the fork path isn't
> > enough as the perm check depends on cgrp->procs_file being available but
> > that is cleared while DYING before DEAD. So, depending on the timing, we can
> > end up trying to deref NULL pointer in may_write.
> >
> > diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> > index 8ad2c267ff471..603b7167450a1 100644
> > --- a/kernel/cgroup/cgroup.c
> > +++ b/kernel/cgroup/cgroup.c
> > @@ -4934,6 +4934,10 @@ static int cgroup_may_write(const struct cgroup *cgrp, struct super_block *sb)
> >
> >  	lockdep_assert_held(&cgroup_mutex);
> >
> > +	/* if @cgrp is being removed, procs_file may already be gone */
> > +	if (!cgrp->procs_file.kn)
> > +		return -ENODEV;
> > +
> >  	inode = kernfs_get_inode(sb, cgrp->procs_file.kn);
> >  	if (!inode)
> >  		return -ENOMEM;
>
> I had syzbot hit the same crash here and as can be seen from the
> reproducer the root cause is that the target cgroup (specified
> via CLONE_INTO_CGROUP) is a version 1 cgroup. These cgroups just
> don't initialize ->procs_file.kn.
>
> This is a regression from v6.0 caused by
>
> f3a2aebdd6 ("cgroup: enable cgroup_get_from_file() on cgroup1")

Yeah, this patch is wrong in its simple form and definitely breaks
CLONE_INTO_CGROUP. CLONE_INTO_CGROUP can only work with cgroup2 fds. It
absolutely cannot work with cgroup1 fds. The semantics would be terrible
as controllers can be mounted into separate hierarchies.

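To illustrate the cgroup2-only requirement from the userspace side, here is a
minimal sketch of a pre-check a caller could run on a directory fd before
handing it to clone3() with CLONE_INTO_CGROUP. It assumes that comparing the
f_type reported by fstatfs() against CGROUP2_SUPER_MAGIC is a good enough
test. The helper names fd_is_cgroup2() and path_is_cgroup2() are hypothetical
and not part of any patch in this thread.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/vfs.h>
#include <unistd.h>

/* Value from the uapi header linux/magic.h; defined here so the sketch
 * does not depend on that header being installed. */
#ifndef CGROUP2_SUPER_MAGIC
#define CGROUP2_SUPER_MAGIC 0x63677270
#endif

/*
 * Hypothetical helper: returns 1 if @fd lives on a cgroup2 filesystem,
 * 0 if it lives on any other filesystem (including cgroup1), and -1 if
 * fstatfs() fails (errno is set by fstatfs()).
 */
static int fd_is_cgroup2(int fd)
{
	struct statfs fs;

	if (fstatfs(fd, &fs) < 0)
		return -1;

	return fs.f_type == CGROUP2_SUPER_MAGIC;
}

/*
 * Hypothetical convenience wrapper: opens @path as a directory and runs
 * the same check. Returns -1 if the directory cannot be opened.
 */
static int path_is_cgroup2(const char *path)
{
	int fd, ret;

	fd = open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
	if (fd < 0)
		return -1;

	ret = fd_is_cgroup2(fd);
	close(fd);
	return ret;
}
```

A caller would only pass the fd on when fd_is_cgroup2() returns 1. This is
purely a userspace convenience; the kernel still has to reject cgroup1 fds
itself, which is what the check proposed below does.
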
>
> Unless we want to revert this patch the correct fix might be
> something like this:
>
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index f487c54a0087..67474b1ae087 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -6249,6 +6249,11 @@ static int cgroup_css_set_fork(struct kernel_clone_args *kargs)
>  		goto err;
>  	}
>
> +	if (!cgroup_on_dfl(dst_cgrp)) {
> +		ret = -EBADF;
> +		goto err;
> +	}

That seems like a good enough patch to me.