Re: [BUG] "sched: Remove rq->lock from the first half of ttwu()" locks up on ARM

From: Marc Zyngier
Date: Thu May 26 2011 - 10:55:47 EST


On Thu, 2011-05-26 at 14:21 +0200, Peter Zijlstra wrote:
> On Thu, 2011-05-26 at 13:32 +0200, Peter Zijlstra wrote:
> >
> > The bad news is of course that I've got a little more head-scratching to
> > do, will keep you informed.
>
> OK, that wasn't too hard.. (/me crosses fingers and prays Marc doesn't
> find more funnies ;-).
>
> Does the below cure all woes?

So far so good. The box just went through its first two iterations of
kernel building without breaking a sweat, carried on, and still feels
snappy enough.

Thanks for having fixed that quickly!

> ---
> Subject: sched: Fix ttwu() for __ARCH_WANT_INTERRUPTS_ON_CTXSW
> From: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> Date: Thu May 26 14:21:33 CEST 2011
>
> Marc reported that e4a52bcb9 (sched: Remove rq->lock from the first
> half of ttwu()) broke his ARM-SMP machine. Now ARM is one of the few
> __ARCH_WANT_INTERRUPTS_ON_CTXSW users, so that exception in the ttwu()
> code was suspect.
>
> Yong found that the interrupt could hit after context_switch() changes
> current but before it clears p->on_cpu. If that interrupt were to
> attempt a wake-up of p, we would indeed find ourselves spinning in IRQ
> context.
>
> Sort this by reverting to the old behaviour for this situation and
> perform a full remote wake-up.
>
> Cc: Frank Rowand <frank.rowand@xxxxxxxxxxx>
> Cc: Yong Zhang <yong.zhang0@xxxxxxxxx>
> Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
> Reported-by: Marc Zyngier <Marc.Zyngier@xxxxxxx>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
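
For anyone following along, the race described in the changelog above can
be modelled roughly as below. This is only a toy sketch, not the patch and
not kernel code: struct toy_task, toy_spin_would_deadlock() and
toy_choose_wake_path() are hypothetical names made up for illustration, and
the irqs_on_ctxsw flag merely stands in for __ARCH_WANT_INTERRUPTS_ON_CTXSW.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for per-task scheduler state. */
struct toy_task {
    bool on_cpu; /* stays set until the switch-out of this task completes */
    int  cpu;    /* CPU on which the task is being switched out */
};

/* The ways a wake-up of p can proceed in this toy model. */
enum toy_wake_path {
    TOY_WAKE_FAST,   /* p is fully off its CPU, no waiting needed */
    TOY_WAKE_SPIN,   /* busy-wait until p->on_cpu is cleared */
    TOY_WAKE_REMOTE  /* fall back to a full remote wake-up */
};

/*
 * Spinning on p->on_cpu can never finish if the waker is an interrupt
 * that preempted the very context switch that would clear p->on_cpu.
 */
static bool toy_spin_would_deadlock(const struct toy_task *p,
                                    int waker_cpu, bool waker_in_irq)
{
    return p->on_cpu && waker_in_irq && waker_cpu == p->cpu;
}

/*
 * Pick a wake-up path. When interrupts may be enabled during the
 * context switch (irqs_on_ctxsw), never spin; do a remote wake-up.
 */
static enum toy_wake_path toy_choose_wake_path(const struct toy_task *p,
                                               bool irqs_on_ctxsw)
{
    if (!p->on_cpu)
        return TOY_WAKE_FAST;
    if (irqs_on_ctxsw)
        return TOY_WAKE_REMOTE;
    return TOY_WAKE_SPIN;
}
```

In this model, a task with on_cpu still set that is woken from an interrupt
on its own CPU is the deadlocking case, and passing irqs_on_ctxsw = true
steers that wake-up to the remote path instead of the spin.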

Tested-by: Marc Zyngier <marc.zyngier@xxxxxxx>

M.
--
Reality is an implementation detail.

