Re: [PATCH 0/4] locks: avoid thundering-herd wake-ups

From: J. Bruce Fields
Date: Thu Aug 09 2018 - 09:00:05 EST


On Wed, Aug 08, 2018 at 06:50:06PM -0400, Jeff Layton wrote:
> That seems like a legit problem.
>
> One possible fix might be to have the waiter on (1,2) walk down the
> entire subtree and wake up any waiter that is waiting on a lock that
> doesn't conflict with the lock on which it's waiting.
>
> So, before the task waiting on 1,2 goes back to sleep to wait on 2,2, it
> could walk down its entire fl_blocked subtree and wake up anything
> waiting on a lock that doesn't conflict with (2,2).
>
> That's potentially an expensive operation, but:
>
> a) the task is going back to sleep anyway, so letting it do a little
> extra work before that should be no big deal

I don't understand why CPU time used by a process that's about to sleep
is any cheaper than CPU time used in any other situation.

> b) it's probably still cheaper than waking up the whole herd

Yeah, I'd like to understand this.

I feel like Neil's addressing two different performance costs:

- the cost of waking up all the waiters
- the cost of walking the list of waiters

Are they equally important?

If we only cared about the former, and only in simple cases, we could
walk the entire list and wake every waiter except those whose locks
conflict with the first one we wake. We wouldn't need the tree.
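
To make that concrete, here's a rough userspace sketch in Python (not
kernel code). It assumes every lock is exclusive and covers an inclusive
byte range, so two locks conflict exactly when their ranges overlap.
Waiter, conflicts() and split_waiters() are names made up for the
illustration; none of them exist in fs/locks.c.

```python
from typing import List, NamedTuple, Tuple


class Waiter(NamedTuple):
    """Hypothetical stand-in for a blocked lock request."""
    name: str
    start: int
    end: int  # inclusive end of the byte range


def conflicts(a: Waiter, b: Waiter) -> bool:
    # Assumption: all locks are exclusive, so any overlap is a conflict.
    return a.start <= b.end and b.start <= a.end


def split_waiters(waiters: List[Waiter]) -> Tuple[List[Waiter], List[Waiter]]:
    """Walk the flat waiter list once.

    The first waiter is always woken. Every later waiter is woken too,
    unless its lock conflicts with that first one, in which case it is
    left asleep.
    """
    if not waiters:
        return [], []
    first = waiters[0]
    woken = [first]
    still_blocked = []
    for w in waiters[1:]:
        if conflicts(w, first):
            still_blocked.append(w)
        else:
            woken.append(w)
    return woken, still_blocked


# Example: A waits on (1,2), B on (2,2), C on (3,4), D on (1,1).
queue = [Waiter("A", 1, 2), Waiter("B", 2, 2),
         Waiter("C", 3, 4), Waiter("D", 1, 1)]
woken, still_blocked = split_waiters(queue)
print([w.name for w in woken])          # A and C are woken
print([w.name for w in still_blocked])  # B and D stay asleep
```

That's still one full pass over the list on every unlock, so it only
addresses the first cost, not the second. What happens to the waiters
left asleep (for example, requeueing them behind the first woken lock)
is left out of the sketch.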

--b.