Re: task_group unthrottling and removal race (was Re: [PATCH] sched/fair: Use rq->lock when checking cfs_rq list presence)

From: Mathias Krause
Date: Wed Nov 03 2021 - 06:51:18 EST


Heh, sometimes a good night's sleep helps untangle the knot in the head!

On 03.11.21 at 10:51, Mathias Krause wrote:
> [snip]
>
> We tried the below patch which, unfortunately, doesn't fix the issue. So
> there must be something else. :(
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 978460f891a1..afee07e9faf9 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -9506,13 +9506,17 @@ void sched_offline_group(struct task_group *tg)
>  {
>          unsigned long flags;
> 
> -        /* End participation in shares distribution: */
> -        unregister_fair_sched_group(tg);
> -
> +        /*
> +         * Unlink first, to avoid walk_tg_tree_from() from finding us
> +         * (via sched_cfs_period_timer()).
> +         */
>          spin_lock_irqsave(&task_group_lock, flags);
>          list_del_rcu(&tg->list);
>          list_del_rcu(&tg->siblings);
>          spin_unlock_irqrestore(&task_group_lock, flags);
> +
> +        /* End participation in shares distribution: */

Adding synchronize_rcu() here will ensure all concurrent RCU "readers"
(like the unthrottle path walking the task group tree via
walk_tg_tree_from()) have finished what they're doing before
unregister_fair_sched_group() tears down the group's cfs_rqs. That was,
apparently, the missing piece; see the sketch below the quote.

> +        unregister_fair_sched_group(tg);
>  }
>
> static void sched_change_group(struct task_struct *tsk, int type)
>
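
For reference, with the synchronize_rcu() in place, sched_offline_group()
would end up looking roughly like this (an untested sketch on top of the
quoted patch, nothing more):

void sched_offline_group(struct task_group *tg)
{
        unsigned long flags;

        /*
         * Unlink first, to avoid walk_tg_tree_from() from finding us
         * (via sched_cfs_period_timer()).
         */
        spin_lock_irqsave(&task_group_lock, flags);
        list_del_rcu(&tg->list);
        list_del_rcu(&tg->siblings);
        spin_unlock_irqrestore(&task_group_lock, flags);

        /*
         * Wait for all concurrent RCU readers, e.g. the unthrottle path
         * walking the task group tree, to finish before tearing down the
         * group's cfs_rqs.
         */
        synchronize_rcu();

        /* End participation in shares distribution: */
        unregister_fair_sched_group(tg);
}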

Now, synchronize_rcu() is quite a heavy hammer, as it blocks the caller
for a full grace period. Using an RCU callback instead should be more
appropriate. I'll hack up something and post a proper patch, if you don't
beat me to it.
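
Something like the below, maybe. It's a rough, untested sketch only:
unregister_fair_sched_group_rcu() is a made-up name, and whether tg->rcu
can be reused here without colliding with the later call_rcu() in
sched_destroy_group() still needs checking.

static void unregister_fair_sched_group_rcu(struct rcu_head *rhp)
{
        struct task_group *tg = container_of(rhp, struct task_group, rcu);

        /* End participation in shares distribution: */
        unregister_fair_sched_group(tg);
}

void sched_offline_group(struct task_group *tg)
{
        unsigned long flags;

        /* Unlink first, so walk_tg_tree_from() can no longer find us. */
        spin_lock_irqsave(&task_group_lock, flags);
        list_del_rcu(&tg->list);
        list_del_rcu(&tg->siblings);
        spin_unlock_irqrestore(&task_group_lock, flags);

        /* Defer the cfs_rq teardown until all current RCU readers are done. */
        call_rcu(&tg->rcu, unregister_fair_sched_group_rcu);
}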

Mathias