Re: [GIT PULL] sched/urgent for 5.17-rc4

From: Peter Zijlstra
Date: Mon Feb 14 2022 - 15:55:52 EST


On Mon, Feb 14, 2022 at 09:17:12AM -1000, Tejun Heo wrote:
> Hello, Peter.
>
> On Mon, Feb 14, 2022 at 10:16:57AM +0100, Peter Zijlstra wrote:
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index d75a528f7b21..05faebafe2b5 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -2266,6 +2266,13 @@ static __latent_entropy struct task_struct *copy_process(
> > if (retval)
> > goto bad_fork_put_pidfd;
> >
> > + /*
> > + * Now that the cgroups are pinned, re-clone the parent cgroup and put
> > + * the new task on the correct runqueue. All this *before* the task
> > + * becomes visible.
> > + */
> > + sched_cgroup_fork(p, args);
>
> Would it be less confusing to comment that this isn't ->can_fork() because
> scheduler task_group needs to be initialized for autogroup even when cgroup
> is disabled and maybe name it sched_cgroup_can_fork() even if it always
> succeeds?

So there's two things that need doing; the re-cloning of the task_group
thing, but also calling of __set_task_cpu() which sets up the proper
runqueue links.

The first is CGroup only, and *could* in theory be done in ->can_fork(),
but the second needs to be done unconditionally, and it doesn't make
much sense to split this up.

I actually tried, but it made the patch bigger/uglier -- but maybe I
didn't try hard enough.

> > +void sched_cgroup_fork(struct task_struct *p, struct kernel_clone_args *kargs)
> > {
> > unsigned long flags;
> > -#ifdef CONFIG_CGROUP_SCHED
> > - struct task_group *tg;
> > -#endif
> >
> > + /*
> > + * Because we're not yet on the pid-hash, p->pi_lock isn't strictly
> > + * required yet, but lockdep gets upset if rules are violated.
> > + */
> > raw_spin_lock_irqsave(&p->pi_lock, flags);
> > #ifdef CONFIG_CGROUP_SCHED
> > - tg = container_of(kargs->cset->subsys[cpu_cgrp_id],
> > - struct task_group, css);
> > - p->sched_task_group = autogroup_task_group(p, tg);
> > + if (1) {
> > + struct task_group *tg;
> > + tg = container_of(kargs->cset->subsys[cpu_cgrp_id],
> > + struct task_group, css);
> > + tg = autogroup_task_group(p, tg);
> > + p->sched_task_group = autogroup_task_group(p, tg);
> > + }
>
> I suppose the double autogroup_task_group() call is unintentional?

Yeah, that's a silly fail. Will ammend.

> Otherwise, looks good to me. The only requirement from cgroup side is that
> the membership should be initialized between ->can_fork() and ->fork()
> inclusively, and sans autogroup this would have been done as a part of
> ->can_fork() so the proposed change makes sense to me.

Thanks! I suppose I should go write me a Changelog then... assuming it
actually works :-)