Re: [PATCH] locking/mutex: Avoid spinner vs waiter starvation

From: Jason Low
Date: Wed Feb 03 2016 - 20:37:24 EST


On Mon, 2016-02-01 at 11:08 +0100, Peter Zijlstra wrote:
> On Sat, Jan 30, 2016 at 09:18:44AM +0800, Ding Tianhong wrote:
> > On 2016/1/29 17:53, Peter Zijlstra wrote:
> > > On Sun, Jan 24, 2016 at 04:03:50PM +0800, Ding Tianhong wrote:
> > >
> > >> looks good to me, I will try this solution and report the result, thanks everyone.
> > >
> > > Did you get a chance to run with this?
> > >
> >
> > I backported this patch to the 3.10 LTS kernel without changing any
> > logic. So far the patch works fine for me and nothing needs to
> > change, so I think the patch is fine. Could you formally release it
> > for the latest kernel? :)
>
> Thanks for testing, I've queued the below patch.
>
> ---
> Subject: locking/mutex: Avoid spinner vs waiter starvation
> From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Date: Fri, 22 Jan 2016 12:06:53 +0100
>
> Ding Tianhong reported that under his load the optimistic spinners
> would totally starve a task that ended up on the wait list.
>
> Fix this by ensuring the top waiter also partakes in the optimistic
> spin queue.
>
> There are a few subtle differences between the assumed state of
> regular optimistic spinners and those already on the wait list, which
> result in the @acquired complication of the acquire path.
>
> Most notable are:
>
> - waiters are on the wait list and need to be taken off
> - mutex_optimistic_spin() sets the lock->count to 0 on acquire
> even though there might be more tasks on the wait list.
>
> Cc: Jason Low <jason.low2@xxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Waiman Long <waiman.long@xxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> Cc: Davidlohr Bueso <dave@xxxxxxxxxxxx>
> Cc: Will Deacon <Will.Deacon@xxxxxxx>
> Reported-by: Ding Tianhong <dingtianhong@xxxxxxxxxx>
> Tested-by: Ding Tianhong <dingtianhong@xxxxxxxxxx>
> Tested-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
> Suggested-by: Waiman Long <Waiman.Long@xxxxxx>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Link: http://lkml.kernel.org/r/20160122110653.GF6375@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

I tested this patch with some of the AIM7 workloads and found that it
reduced throughput by about 10%. The reduction in throughput is
expected, since spinning as a waiter is less efficient.

I also observed that, under high contention, the top waiter would often
need to reschedule before it could acquire the lock by spinning. A
waiter can get stuck in a cycle of spin -> reschedule -> spin ->
reschedule. So although the chance of starvation is reduced, this patch
doesn't fully address the issue of waiter starvation.
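
To make the spin -> reschedule cycle concrete, here is a toy user-space
model. It is only a sketch and not kernel code: the function, the struct,
and every parameter (spin_ticks, sleep_ticks, release_period,
release_offset, max_ticks) are made up for illustration. It assumes the
lock is only obtainable at fixed release instants, and that the top
waiter gets it only if it happens to be spinning at such an instant.

```c
#include <stdbool.h>

/* Toy model only; every name below is hypothetical, not a kernel API. */
struct waiter_stats {
	int spin_rounds;   /* times the top waiter started spinning */
	int reschedules;   /* times it gave up the CPU without the lock */
	bool acquired;     /* did it get the lock within max_ticks? */
	int acquired_at;   /* tick of acquisition, or -1 */
};

/*
 * The waiter spins for spin_ticks, then is off the CPU for sleep_ticks,
 * and repeats. The lock is assumed to be obtainable only at ticks where
 * t % release_period == release_offset; at any other tick some other
 * spinner owns it. The waiter wins a release only while it is spinning.
 */
struct waiter_stats simulate_top_waiter(int spin_ticks, int sleep_ticks,
					int release_period,
					int release_offset, int max_ticks)
{
	struct waiter_stats st = {
		.spin_rounds = 0,
		.reschedules = 0,
		.acquired = false,
		.acquired_at = -1,
	};
	int cycle;

	/* Reject parameters that make the model meaningless. */
	if (spin_ticks <= 0 || sleep_ticks < 0 || release_period <= 0)
		return st;

	cycle = spin_ticks + sleep_ticks;

	for (int t = 0; t < max_ticks; t++) {
		int phase = t % cycle;
		bool spinning = phase < spin_ticks;
		bool released = (t % release_period) == release_offset;

		if (phase == 0)
			st.spin_rounds++;	/* new spin attempt begins */
		if (sleep_ticks > 0 && phase == spin_ticks)
			st.reschedules++;	/* spin budget ran out */

		if (spinning && released) {
			st.acquired = true;
			st.acquired_at = t;
			break;
		}
	}
	return st;
}
```

With simulate_top_waiter(3, 5, 7, 4, 100) the waiter reschedules twice
and gets the lock on its third spin round. With
simulate_top_waiter(3, 5, 8, 4, 100) every release falls inside a window
where the waiter is off the CPU, so it keeps cycling and never acquires
the lock within the simulated time. The real behavior depends on
scheduling and contention, which this model does not attempt to capture.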

Jason