Re: [PATCH/RFC] mutex: Fix optimistic spinning vs. BKL
From: Peter Zijlstra
Date: Tue May 11 2010 - 14:20:00 EST
On Tue, 2010-05-11 at 11:06 -0700, Linus Torvalds wrote:
>
> On Mon, 10 May 2010, Peter Zijlstra wrote:
> >
> > As to the 2 jiffy spin timeout, I guess we should add a lockdep warning
> > for that, because anybody holding a mutex for longer than 2 jiffies and
> > not sleeping does need fixing anyway.
>
> I really hate the jiffies thing, but looking at the optimistic spinning, I
> do wonder about two things..
>
> First - we check "need_resched()" only if owner is NULL. That sounds
> wrong. If we need to reschedule, we need to stop spinning _regardless_ of
> whether the owner may have been preempted before setting the owner field.

There is a second need_resched() in the inner spin loop in
kernel/sched.c:mutex_spin_on_owner().

> Second: we allow "owner" to change, and we'll continue spinning. This is
> how you can end up spinning for a long time - not because anybody holds
> the mutex for longer than 2 jiffies, but because a lot of other threads
> _together_ hold the mutex for longer than 2 jiffies.

Granted.

> Now, I think we do want some limited "continue spinning even if somebody
> else ended up getting it instead", but I think we should at least limit
> it. Otherwise we end up being potentially rather unfair, since we don't
> have any fair queueing logic for the optimistic spinning phase.
>
> Now, we could just count the number of times "owner" has changed, and I
> suspect that would be sufficient. Now, that trivial counting scheme would
> fail if "owner" stays the same (ie the same process re-takes the lock over
> and over again, possibly due to hot cacheline things being very unfair
> to the person who already owns it), but quite frankly, I don't think we
> can get into that kind of situation.
>
> Why? Mutexes may end up being very heavily contended, but they can't be
> contended by just _one_ thread. So if we're really in a starvation issue,
> the thread that is waiting _will_ see multiple different owners.
>
> So once you have seen X number of other owners, you just say "screw it,
> this spinning thing isn't working for me, I'll go to the sleeping case".

Right, so basically count the number of mutex_spin_on_owner() calls and
bail when >N.

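A rough userspace sketch of that counting idea, not kernel code: every
name below (spin_with_owner_limit, enum spin_result, max_owners) is made
up for illustration, the sequence of observed owners is passed in as an
array instead of being read from a live mutex, and running out of polls
stands in for need_resched() becoming true.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; the kernel has nothing by these names. */
enum spin_result { SPIN_ACQUIRED, SPIN_GO_TO_SLEEP };

/*
 * owners[i] is the owner value seen on the i-th poll of the lock;
 * NULL means the lock looked free. max_owners plays the role of N:
 * once more than N distinct consecutive owners have been seen, the
 * spinner gives up and would fall back to the sleeping slow path.
 */
enum spin_result spin_with_owner_limit(const void *const *owners,
                                       size_t npolls,
                                       unsigned int max_owners)
{
    const void *last = NULL;
    unsigned int seen = 0;
    size_t i;

    assert(owners != NULL || npolls == 0);

    for (i = 0; i < npolls; i++) {
        const void *owner = owners[i];

        /* Lock looks free: real code would try to take it here. */
        if (owner == NULL)
            return SPIN_ACQUIRED;

        /* A new owner corresponds to one more spin-on-owner round. */
        if (owner != last) {
            last = owner;
            if (++seen > max_owners)
                return SPIN_GO_TO_SLEEP;
        }
    }

    /* Out of polls: stand-in for need_resched() ending the spin. */
    return SPIN_GO_TO_SLEEP;
}
```

As in the quoted text, an owner that re-takes the lock over and over is
counted only once here, so this bounds the "many different owners" case
and nothing else.
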
> Of course, it's quite possible that as long as "need_resched()" isn't set,
> spinning really _is_ the right thing to do. Maybe it causes horrible CPU
> load on some odd "everybody synchronize" loads, but maybe that really is
> the best we can do.

Ben's argument was that spinning for a long time wrecks power usage.
That said, I'd still like a counter/event/warning to see if someone
actually manages to hold onto a mutex for long (2 jiffies) without
scheduling at all. If we ever run into something like that, that needs
to get fixed regardless.
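
Something along these lines is what I have in mind for the counter, as
a userspace sketch only. struct hold_stats, hold_begin(), hold_end()
and HOLD_LIMIT_TICKS are hypothetical names, and the tick count and the
context-switch count are passed in by the caller instead of coming from
jiffies and the scheduler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed limit: the 2 jiffies discussed in this thread. */
#define HOLD_LIMIT_TICKS 2UL

/* Hypothetical per-lock bookkeeping, not an existing kernel struct. */
struct hold_stats {
    unsigned long acquired_at;         /* tick count at lock time */
    unsigned long switches_at_acquire; /* holder's switch count at lock time */
    unsigned long long_holds;          /* number of offending hold periods */
};

/* Record the state at the moment the lock is taken. */
void hold_begin(struct hold_stats *s, unsigned long now,
                unsigned long nswitches)
{
    assert(s != NULL);
    s->acquired_at = now;
    s->switches_at_acquire = nswitches;
}

/*
 * Called at unlock. Returns true and bumps the counter if the lock was
 * held for more than HOLD_LIMIT_TICKS without the holder scheduling.
 * The unsigned subtraction stays correct across tick-counter wraparound.
 */
bool hold_end(struct hold_stats *s, unsigned long now,
              unsigned long nswitches)
{
    bool too_long;

    assert(s != NULL);
    too_long = (now - s->acquired_at) > HOLD_LIMIT_TICKS &&
               nswitches == s->switches_at_acquire;
    if (too_long)
        s->long_holds++;
    return too_long;
}
```

Whether the real thing should warn, count, or emit an event is left
open; the sketch only counts.
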