Re: [PATCH] sched/fair: Fix detection of per-CPU kthreads waking a task
From: Vincent Guittot
Date: Fri Nov 26 2021 - 10:01:54 EST
On Fri, 26 Nov 2021 at 14:32, Valentin Schneider
<Valentin.Schneider@xxxxxxx> wrote:
>
> On 26/11/21 09:23, Vincent Guittot wrote:
> > On Thu, 25 Nov 2021 at 16:30, Valentin Schneider
> > <Valentin.Schneider@xxxxxxx> wrote:
> >> On 25/11/21 14:23, Vincent Guittot wrote:
> >> > If we want to filter wakeups
> >> > generated from interrupt context while a per-CPU kthread is running, it
> >> > would be better to fix all cases and test the running context like
> >> > this
> >> >
> >>
> >> I think that could make sense - though can the idle task issue wakeups in
> >> process context? If so that won't be sufficient. A quick audit tells me:
> >>
> >> o rcu_nocb_flush_deferred_wakeup() happens before calling into cpuidle
> >> o I didn't see any wakeup issued from the cpu_pm_notifier call chain
> >> o I'm not entirely sure about flush_smp_call_function_from_idle(). I found
> >> this thing in RCU:
> >>
> >>   smp_call_function_single(cpu, rcu_exp_handler)
> >>
> >>   rcu_exp_handler()
> >>     rcu_report_exp_rdp()
> >>       rcu_report_exp_cpu_mult()
> >>         __rcu_report_exp_rnp()
> >>           swake_up_one()
> >>
> >> IIUC if set_nr_if_polling() then the smp_call won't send an IPI and should be
> >> handled in that flush_foo_from_idle() call.
> >
> > Aren't all of these expected to wake up on the local CPU? If so, I don't
> > see any real problem there.
> >
>
> Hm so other than boot time oddities I think that does end up with threads
> of an !UNBOUND (so pcpu) workqueue...
>
> >>
> >> I'd be tempted to stick your and VincentD's conditions together, just to be
> >> safe...
> >
> > Rather than just being safe, I would prefer that we fix the actual root
> > cause instead of hiding it
> >
>
> I did play around a bit to see if this could be true when evaluating that
> is_per_cpu_kthread() condition:
>
> is_idle_task(current) && in_task() && p->nr_cpus_allowed > 1
>
> but no luck so far. An in_task() check would appear sufficient, but how's
> this?
>
> ---
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 884f29d07963..f45806b7f47a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6390,14 +6390,18 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
>  		return prev;
>
>  	/*
> -	 * Allow a per-cpu kthread to stack with the wakee if the
> -	 * kworker thread and the tasks previous CPUs are the same.
> -	 * The assumption is that the wakee queued work for the
> -	 * per-cpu kthread that is now complete and the wakeup is
> -	 * essentially a sync wakeup. An obvious example of this
> +	 * Allow a per-cpu kthread to stack with the wakee if the kworker thread
> +	 * and the tasks previous CPUs are the same. The assumption is that the
> +	 * wakee queued work for the per-cpu kthread that is now complete and
> +	 * the wakeup is essentially a sync wakeup. An obvious example of this
>  	 * pattern is IO completions.
> +	 *
> +	 * Ensure the wakeup is issued by the kthread itself, and don't match
> +	 * against the idle task because that could override the
> +	 * available_idle_cpu(target) check done higher up.
>  	 */
> -	if (is_per_cpu_kthread(current) &&
> +	if (is_per_cpu_kthread(current) && !is_idle_task(current) &&
I still don't see the need for !is_idle_task(current)
> +	    in_task() &&
>  	    prev == smp_processor_id() &&
>  	    this_rq()->nr_running <= 1) {
>  		return prev;
>