Re: [PATCH 0/5 - V2] locks: avoid thundering-herd wake-ups

From: Jeff Layton
Date: Sat Aug 11 2018 - 09:15:32 EST


On Sat, 2018-08-11 at 08:21 -0400, J. Bruce Fields wrote:
> On Sat, Aug 11, 2018 at 07:51:13AM -0400, Jeff Layton wrote:
> > On Thu, 2018-08-09 at 20:29 -0400, J. Bruce Fields wrote:
> > > On Fri, Aug 10, 2018 at 08:12:43AM +1000, NeilBrown wrote:
> > > > On Thu, Aug 09 2018, J. Bruce Fields wrote:
> > > >
> > > > > I think there's also a problem with multiple tasks sharing the same
> > > > > lock owner.
> > > > >
> > > > > So, all locks are exclusive locks for the same range. We have four
> > > > > tasks. Tasks 1 and 4 share the same owner, the others' owners are
> > > > > distinct.
> > > > >
> > > > > - Task 1 gets a lock.
> > > > > - Task 2 gets a conflicting lock.
> > > > > - Task 3 gets another conflicting lock. So now we the tree is
> > > > > 3->2->1.
> > > > > - Task 1's lock is released.
> > > > > - Before task 2 is scheduled, task 4 acquires a new lock.
> > > > > - Task 2 waits on task 4's lock, we now have
> > > > > 3->2->4.
> > > > >
> > > > > Task 3 shouldn't be waiting--the lock it's requesting has the same owner
> > > > > as the lock task 4 holds--but we fail to wake up task 3.
> > > >
> > > > So task 1 and task 4 are threads in the one process - OK.
> > > > Tasks 2 and 3 are threads in two other processes.
> > > >
> > > > So 2 and 3 conflict with either 1 or 4 equally - why should task 3 be
> > > > woken?
> > > >
> > > > I suspect you got the numbers bit mixed up,
> > >
> > > Whoops.
> > >
> > > > but in any case, the "conflict()" function that is passed around takes
> > > > ownership into account when assessing if one lock conflicts with
> > > > another.
> > >
> > > Right, I know, but, let me try again:
> > >
> > > All locks are exclusive locks for the same range. Only tasks 3 and 4
> > > share the the same owner.
> > >
> > > - Task 1 gets a lock.
> > > - Task 2 requests a conflicting lock, so we have 2->1.
> > > - Task 3 requests a conflicting lock, so we have 3->2->1.
> > > - Task 1 unlocks. We wake up task 2, but it isn't scheduled yet.
> > > - Task 4 gets a new lock.
> > > - Task 2 runs, discovers the conflict, and waits. Now we have:
> > > 3->2->4.
> > >
> > > There is no conflict between the lock 3 requested and the lock 4 holds,
> > > but 3 is not woken up.
> > >
> > > This is another version of the first problem: there's information we
> > > need (the owners of the waiting locks in the tree) that we can't
> > > determine just from looking at the root of the tree.
> > >
> > > I'm not sure what to do about that.
> > >
> >
> > Is this still a problem in the v2 set?
> >
> > wake_non_conflicts walks the whole tree of requests that were blocked on
> > it,
>
> Not in the FL_TRANSITIVE_CONFLICT case, which is the case here.
>

Got it. Yeah, I'm not sure that the idea of FL_TRANSITIVE_CONFLICT
really works. I think you could fix this by just getting rid of that and
folding it into the FL_CONFLICT case.

In more complex situations (like the one you describe), you may end up
cycling through several blocked requests before you hit one that can get
the lock. It may slow things down some for those cases. In the more
common locking scenarios (whole file locks, different owners), I think
this will be much more efficient by avoiding so many wakeups.

--
Jeff Layton <jlayton@xxxxxxxxxx>