RE: [PATCH 0/4] locks: avoid thundering-herd wake-ups

From: Frank Filz
Date: Wed Aug 08 2018 - 19:34:36 EST


> On Wed, 2018-08-08 at 17:28 -0400, J. Bruce Fields wrote:
> > On Wed, Aug 08, 2018 at 04:09:12PM -0400, J. Bruce Fields wrote:
> > > On Wed, Aug 08, 2018 at 03:54:45PM -0400, J. Bruce Fields wrote:
> > > > On Wed, Aug 08, 2018 at 11:51:07AM +1000, NeilBrown wrote:
> > > > > If you have a many-core machine, and have many threads all
> > > > > > wanting to briefly lock a given file (udev is known to do this),
> > > > > you can get quite poor performance.
> > > > >
> > > > > When one thread releases a lock, it wakes up all other threads
> > > > > that are waiting (classic thundering-herd) - one will get the
> > > > > lock and the others go to sleep.
> > > > > When you have few cores, this is not very noticeable: by the
> > > > > time the 4th or 5th thread gets enough CPU time to try to claim
> > > > > > the lock, the earlier threads have claimed it, done what was
> > > > > > needed, and released.
> > > > > With 50+ cores, the contention can easily be measured.
> > > > >
> > > > > > This patchset creates a tree of pending lock requests in which
> > > > > > siblings don't conflict and each lock request does conflict with its parent.
> > > > > > When a lock is released, only requests which don't conflict with
> > > > > > each other are woken.
> > > >
> > > > Are you sure you aren't depending on the (incorrect) assumption
> > > > that "X blocks Y" is a transitive relation?
> > > >
> > > > OK I should be able to answer that question myself, my patience
> > > > for code-reading is at a real low this afternoon....
> > >
> > > In other words, is there the possibility of a tree of, say,
> > > exclusive locks with (offset, length) like:
> > >
> > > (0, 2) waiting on (1, 2) waiting on (2, 2) waiting on (0, 4)
> > >
> > > and when waking (0, 4) you could wake up (2, 2) but not (0, 2),
> > > leaving a process waiting without there being an actual conflict.
> >
> > After batting it back and forth with Jeff on IRC.... So do I
> > understand right that when we wake a waiter, we leave its own tree of
> > waiters intact, and when it wakes if it finds a conflict it just adds
> > > its lock (with tree of waiters) into the tree of the conflicting lock?
> >
> > If so then yes I think that depends on the transitivity
> > assumption--you're assuming that finding a conflict between the root
> > of the tree and a lock proves that all the other members of the tree
> > also conflict.
> >
> > So maybe this example works. (All locks are exclusive and written
> > (offset, length), X->Y means X is waiting on Y.)
> >
> > > 1st process acquires (0,3)
> > 2nd process requests (1,2), is put to sleep.
> > 3rd process requests (0,2), is put to sleep.
> >
> > The tree of waiters now looks like (0,2)->(1,2)->(0,3)
> >
> > (0,3) is unlocked.
> > A 4th process races in and locks (2,2).
> > The 2nd process wakes up, sees this new conflict, and waits on
> > (2,2). Now the tree looks like (0,2)->(1,2)->(2,2), and (0,2)
> > is waiting for no reason.
> >
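
To make the non-transitivity in the example above concrete, here is a
minimal userspace sketch. It is a toy model, not kernel code: struct
range and ranges_conflict() are names made up for illustration, and it
assumes all locks are exclusive, so two locks conflict exactly when
their byte ranges overlap.

```c
#include <stdbool.h>

/* Hypothetical toy model: an exclusive byte-range lock, (offset, length). */
struct range {
	long start;
	long len;
};

/*
 * Two exclusive locks conflict iff their half-open byte ranges
 * [start, start + len) overlap.
 */
static bool ranges_conflict(struct range a, struct range b)
{
	return a.start < b.start + b.len && b.start < a.start + a.len;
}
```

With a = (0,2), b = (1,2) and c = (2,2), ranges_conflict(a, b) and
ranges_conflict(b, c) both hold, but ranges_conflict(a, c) does not.
So finding that (1,2) conflicts with (2,2) says nothing about (0,2).
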
>
> That seems like a legit problem.
>
> One possible fix might be to have the waiter on (1,2) walk down the entire
> subtree and wake up any waiter that is waiting on a lock that doesn't conflict
> with the lock on which it's waiting.
>
> So, before the task waiting on 1,2 goes back to sleep to wait on 2,2, it could
> walk down its entire fl_blocked subtree and wake up anything waiting on a lock
> that doesn't conflict with (2,2).
>
> That's potentially an expensive operation, but:
>
> a) the task is going back to sleep anyway, so letting it do a little extra work
> before that should be no big deal
>
> b) it's probably still cheaper than waking up the whole herd

Yea, I think so.
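
For what it's worth, the walk could look roughly like the sketch below.
This is a userspace toy model under stated assumptions, not the
patchset's code: struct waiter, its fields and wake_nonconflicting()
are all hypothetical names, every lock is exclusive, and "waking" is
just setting a flag. It also assumes a woken waiter keeps its own
subtree and re-checks it when it retries, so the walk does not descend
below a waiter it wakes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model: an exclusive byte-range lock, (offset, length). */
struct range {
	long start;
	long len;
};

static bool ranges_conflict(struct range a, struct range b)
{
	return a.start < b.start + b.len && b.start < a.start + a.len;
}

/* Hypothetical pending request; children are the requests blocked on it. */
struct waiter {
	struct range want;
	struct waiter *first_child;
	struct waiter *next_sibling;
	bool woken;
};

/*
 * Before w goes back to sleep behind "blocker", walk w's subtree and
 * wake every waiter whose request does not conflict with blocker.
 * A woken waiter is unlinked together with its own subtree.
 * Returns the number of waiters woken.
 */
static int wake_nonconflicting(struct waiter *w, struct range blocker)
{
	struct waiter **link = &w->first_child;
	int woken = 0;

	while (*link) {
		struct waiter *child = *link;

		if (!ranges_conflict(child->want, blocker)) {
			/* No real conflict: unlink and let it retry. */
			*link = child->next_sibling;
			child->next_sibling = NULL;
			child->woken = true;
			woken++;
		} else {
			/* Still blocked, but its own waiters may not be. */
			woken += wake_nonconflicting(child, blocker);
			link = &child->next_sibling;
		}
	}
	return woken;
}
```

In the example from the thread, calling this on the (1,2) waiter with
blocker (2,2) wakes the (0,2) waiter and leaves (1,2) with no children.
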

Now here's another question... How does this new logic play with Open
File Description Locks? It should still be OK, since there's a thread
waiting on each of those.

Frank