Re: Endless soft-lockups for compiling workload since next-20200519

From: Peter Zijlstra
Date: Mon May 25 2020 - 10:39:21 EST


On Mon, May 25, 2020 at 04:05:49PM +0200, Frederic Weisbecker wrote:
> On Mon, May 25, 2020 at 03:21:05PM +0200, Peter Zijlstra wrote:
> > @@ -2320,7 +2304,7 @@ static void ttwu_queue_remote(struct task_struct *p, int cpu, int wake_flags)
> >
> >  	if (llist_add(&p->wake_entry, &rq->wake_list)) {
> >  		if (!set_nr_if_polling(rq->idle))
> > -			smp_call_function_single_async(cpu, &rq->wake_csd);
> > +			smp_call_function_single_async(cpu, &p->wake_csd);
> >  		else
> >  			trace_sched_wake_idle_without_ipi(cpu);
>
> Ok that's of course very unlikely but could it be possible to have the
> following:
>
> CPU 0                         CPU 1                                   CPU 2
> -----
>
> //Wake up A
> ttwu_queue(TASK A, CPU 1)     idle_loop {
>                                   ttwu_queue_pending {
>                                       ....
>                                       raw_spin_unlock_irqrestore(rq)
>                                       # VMEXIT (with IPI still pending)
>                                                                       //task A migrates here
>                                                                       wait_event(....)
>                                                                       //sleep
>
> //Wake up A
> ttwu_queue(TASK A, CPU 2) {
>     //IPI on CPU 2 ignored
>     // due to csd->flags == CSD_LOCK
>

Right you are.

Bah!

More thinking....
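
For reference, the scenario above can be modelled in user space roughly
as follows. This is only a sketch of the hazard, not kernel code: the
model_* names are made up for illustration, the flag handling is a
simplification of what smp_call_function_single_async() does with a csd
that is still locked, and the CPUs are replayed sequentially rather than
run concurrently.

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_CSD_FLAG_LOCK 0x01u

/* Hypothetical stand-in for a call_single_data_t embedded in the task. */
struct model_csd {
	unsigned int flags;
};

/* Hypothetical stand-in for the bits of task_struct that matter here. */
struct model_task {
	struct model_csd wake_csd;
	bool on_wake_list;
};

/*
 * Assumed behaviour: a csd that is still locked cannot be queued again,
 * so the caller gets -EBUSY and no new IPI is sent.
 */
static int model_call_single_async(struct model_csd *csd)
{
	if (csd->flags & MODEL_CSD_FLAG_LOCK)
		return -EBUSY;

	csd->flags |= MODEL_CSD_FLAG_LOCK;
	return 0;		/* IPI "sent" */
}

/* The IPI handler on the target CPU releases the csd when it runs. */
static void model_ipi_handler(struct model_csd *csd)
{
	csd->flags &= ~MODEL_CSD_FLAG_LOCK;
}

/*
 * Replays the interleaving from the diagram.
 * Returns 1 if the second wakeup is queued but never gets its IPI.
 */
static int model_lost_wakeup(void)
{
	struct model_task a = { .wake_csd = { 0 }, .on_wake_list = false };
	int err;

	/* CPU 0: wake A on CPU 1; the csd is now locked, IPI in flight. */
	a.on_wake_list = true;
	err = model_call_single_async(&a.wake_csd);
	if (err)
		return 0;

	/* CPU 1: idle loop drains the wake list before taking the IPI. */
	a.on_wake_list = false;

	/*
	 * CPU 1 takes a VMEXIT with the IPI still pending, so
	 * model_ipi_handler() has not run and the csd stays locked.
	 * Meanwhile A runs, migrates to CPU 2 and goes to sleep.
	 */

	/* CPU 0: wake A on CPU 2; A is queued but the IPI is refused. */
	a.on_wake_list = true;
	err = model_call_single_async(&a.wake_csd);

	return a.on_wake_list && err == -EBUSY;
}
```

In this model model_lost_wakeup() returns 1: the task sits on the wake
list of CPU 2 while the only csd that could kick that CPU is still owned
by the stale IPI to CPU 1. Calling model_ipi_handler() before the second
wakeup makes the problem disappear, which is why the window is so small
in practice.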