Re: [PATCH 0/5 v2] locks: avoid thundering-herd wake-ups
From: Jeff Layton
Date: Tue Aug 14 2018 - 15:12:38 EST
On Tue, 2018-08-14 at 14:41 -0400, J. Bruce Fields wrote:
> This version looks correct to me, and simpler. I'll be curious to hear
> whatever you learn from testing!
>
> --b.
>
Agreed. I'll go ahead and put this in linux-next with an eye toward
merging in v4.20 if we don't hit any major problems with it.
Thanks again and nice work, Neil!
> On Tue, Aug 14, 2018 at 01:56:51PM +1000, NeilBrown wrote:
> >
> > V2, which added wake_non_conflicts(), was more broken than V1: as
> > Bruce explained, there is no transitivity in the blocking relation
> > between locks.
> > So this series takes a simpler approach.
> > It still attaches waiters beneath other waiters as necessary to ensure
> > that:
> > - a waiter is blocked by its parent (fl->fl_blocker) and all further
> > ancestors, and
> > - the waiters on fl_blocked are mutually non-conflicting.
> >
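The two invariants above can be illustrated with a minimal userspace sketch. This is not the code from the series: struct req, conflicts() and insert_waiter() are hypothetical names, and the conflict test is a simplified model of byte-range locks (different owner, overlapping range, at least one side exclusive).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified lock request; NOT the kernel's struct file_lock. */
struct req {
	int owner;            /* requests from the same owner never conflict */
	long start, end;      /* inclusive byte range */
	bool exclusive;       /* true for a write lock, false for a read lock */
	struct req *blocker;  /* parent in the tree (cf. fl_blocker) */
	struct req *child;    /* first request blocked on this one (cf. fl_blocked) */
	struct req *sibling;  /* next request blocked on the same parent */
};

/* Simplified conflict rule, assumed for this sketch only. */
static bool conflicts(const struct req *a, const struct req *b)
{
	if (a->owner == b->owner)
		return false;
	if (a->end < b->start || b->end < a->start)
		return false;
	return a->exclusive || b->exclusive;
}

/*
 * Attach waiter somewhere beneath blocker. The caller guarantees that the
 * two conflict. If waiter also conflicts with an existing child, descend
 * into that child, so that siblings never conflict with each other and
 * waiter conflicts with every ancestor it ends up under.
 */
static void insert_waiter(struct req *blocker, struct req *waiter)
{
	struct req **link;

	assert(conflicts(blocker, waiter));
again:
	for (link = &blocker->child; *link; link = &(*link)->sibling) {
		if (conflicts(*link, waiter)) {
			blocker = *link;
			goto again;
		}
	}
	/* No child conflicts with waiter: append it as a new sibling. */
	waiter->blocker = blocker;
	waiter->sibling = NULL;
	*link = waiter;
}
```

With an exclusive root lock held, two read requests on the same range become siblings, while a later write request overlapping the first read is queued beneath that read rather than next to it.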
> > When a lock (the root of a tree of requests) is released, only its
> > immediate children (fl_blocked) are woken.
> > When any lock is woken (either because its fl_blocker was released
> > or due to a signal or similar) it will either:
> > - be granted
> > - be aborted
> > - be re-queued beneath some other lock.
> >
> > In the first case the tree of blocked locks is moved across to the newly
> > created lock, and the invariants still hold.
> > In the other two cases, the tree of blocked waiters is detached
> > and all of them are woken.
> >
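The wake-up rules above can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the code from the series: struct waiter, wake_children() and move_waiters() are hypothetical names, and setting a flag stands in for a real wake-up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified waiter node; NOT the kernel's struct file_lock. */
struct waiter {
	struct waiter *blocker;  /* parent in the tree (cf. fl_blocker) */
	struct waiter *child;    /* first request blocked on this one (cf. fl_blocked) */
	struct waiter *sibling;  /* next request blocked on the same parent */
	bool woken;              /* stands in for an actual wake-up */
};

/*
 * Detach and wake only the immediate children of 'released'.
 * Each child keeps its own subtree. Returns the number woken.
 */
static int wake_children(struct waiter *released)
{
	int n = 0;

	while (released->child) {
		struct waiter *w = released->child;

		released->child = w->sibling;
		w->sibling = NULL;
		w->blocker = NULL;
		w->woken = true;
		n++;
	}
	return n;
}

/*
 * Granted case: hand the waiters of the request 'from' over to the newly
 * created lock 'to', which is assumed to have no waiters of its own yet.
 */
static void move_waiters(struct waiter *from, struct waiter *to)
{
	struct waiter *w;

	assert(to->child == NULL);
	to->child = from->child;
	from->child = NULL;
	for (w = to->child; w; w = w->sibling)
		w->blocker = to;
}
```

In this sketch, releasing a root with two children and one grandchild wakes exactly two requests; the grandchild stays asleep beneath its parent. The aborted and re-queued cases can reuse wake_children() on the woken request itself, so that its own waiters get a chance to retry.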
> > Note that this series has not received much testing yet.
> >
> > Original description:
> > If you have a many-core machine, and have many threads all wanting to
> > briefly lock a given file (udev is known to do this), you can get quite
> > poor performance.
> >
> > When one thread releases a lock, it wakes up all other threads that
> > are waiting (classic thundering-herd) - one will get the lock and the
> > others go to sleep.
> > When you have few cores, this is not very noticeable: by the time the
> > 4th or 5th thread gets enough CPU time to try to claim the lock, the
> > earlier threads have claimed it, done what was needed, and released.
> > With 50+ cores, the contention can easily be measured.
> >
> > This patchset creates a tree of pending lock requests in which siblings
> > don't conflict and each lock request does conflict with its parent.
> > When a lock is released, only requests which don't conflict with each
> > other are woken.
> >
> > Testing shows that lock-acquisitions-per-second is now fairly stable even
> > as the number of contending processes goes to 1000. Without this patch,
> > locks-per-second drops off steeply after a few tens of processes.
> >
> > There is a small cost to this extra complexity.
> > At 20 processes running a particular test on 72 cores, lock
> > acquisitions per second drop from 1.8 million to 1.4 million with
> > this patch. For 100 processes, this patch still provides 1.4 million
> > while without this patch there are about 700,000.
> >
> > NeilBrown
> >
> > ---
> >
> > NeilBrown (5):
> >       fs/locks: rename some lists and pointers.
> >       fs/locks: split out __locks_wake_up_blocks().
> >       fs/locks: allow a lock request to block other requests.
> >       fs/locks: change all *_conflict() functions to return bool.
> >       fs/locks: create a tree of dependent requests.
> >
> >
> >  fs/cifs/file.c                  |    2 -
> >  fs/locks.c                      |  156 ++++++++++++++++++++++++++-------------
> >  include/linux/fs.h              |    7 +-
> >  include/trace/events/filelock.h |   16 ++--
> >  4 files changed, 119 insertions(+), 62 deletions(-)
> >
> > --
> > Signature
--
Jeff Layton <jlayton@xxxxxxxxxx>