Re: sched: softlockups in multi_cpu_stop

From: Linus Torvalds
Date: Fri Mar 06 2015 - 14:05:35 EST


On Fri, Mar 6, 2015 at 10:57 AM, Jason Low <jason.low2@xxxxxx> wrote:
>
> Right, the can_spin_on_owner() was originally added to the mutex
> spinning code for optimization purposes, particularly so that we can
> avoid adding the spinner to the OSQ only to find that it doesn't need to
> spin. This function needing to return a correct value should really only
> affect performance, so yes, lockups due to this seems surprising.

Well, softlockups aren't about "correct behavior". They are about
certain things not happening in a timely manner.

Clearly the mutex code now tries to hold on to the CPU too aggressively.

At some point people need to admit that busy-looping isn't always a
good idea. Especially if

(a) we could idle the core instead

(b) the tuning has been done based on some special-purpose benchmark
that is likely not realistic

(c) we get reports from people that it causes problems.

In other words: Let's just undo that excessive busy-looping. The
performance numbers were dubious to begin with. Real scalability comes
from fixing the locking, not from trying to play games with the locks
themselves. Particularly games that then cause problems.

Linus