Re: [RFC][PATCH 5/5] sched: Reduce ttwu rq->lock contention

From: Frederic Weisbecker
Date: Thu Dec 16 2010 - 10:31:17 EST


On Thu, Dec 16, 2010 at 03:56:07PM +0100, Peter Zijlstra wrote:
> Reduce rq->lock contention on try_to_wake_up() by changing the task
> state using a cmpxchg loop.
>
> Once the task is set to TASK_WAKING we're guaranteed to be the only one
> poking at it, and can then proceed to pick a new cpu without holding the
> rq->lock (XXX this opens some races).
>
> Then instead of locking the remote rq and activating the task, place
> the task on a remote queue, again using cmpxchg, and notify the remote
> cpu per IPI if this queue was empty to start processing its wakeups.
>
> This avoids (in most cases) having to lock the remote runqueue (and
> therefore the exclusive cacheline transfer thereof) but also touching
> all the remote runqueue data structures needed for the actual
> activation.
>
> As measured using: http://oss.oracle.com/~mason/sembench.c
>
> $ echo 4096 32000 64 128 > /proc/sys/kernel/sem
> $ ./sembench -t 2048 -w 1900 -o 0
>
> unpatched: run time 30 seconds 537953 worker burns per second
> patched: run time 30 seconds 657336 worker burns per second
>
> Still need to sort out all the races marked XXX (non-trivial), and it's
> x86 only for the moment.
>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> ---
> arch/x86/kernel/smp.c | 1
> include/linux/sched.h | 7 -
> kernel/sched.c | 241 ++++++++++++++++++++++++++++++++++--------------
> kernel/sched_fair.c | 5
> kernel/sched_features.h | 3
> kernel/sched_idletask.c | 2
> kernel/sched_rt.c | 4
> kernel/sched_stoptask.c | 3
> 8 files changed, 190 insertions(+), 76 deletions(-)
>
> Index: linux-2.6/arch/x86/kernel/smp.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/smp.c
> +++ linux-2.6/arch/x86/kernel/smp.c
> @@ -205,6 +205,7 @@ void smp_reschedule_interrupt(struct pt_
>  	/*
>  	 * KVM uses this interrupt to force a cpu out of guest mode
>  	 */
> +	sched_ttwu_pending();
>  }

Great, this should both simplify the remote tick restart I'm doing on
wakeup for the nohz task work and lower its overhead.
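
For readers following the quoted description, here is a minimal userspace
sketch of the two lock-free steps it mentions: claiming a task with a
cmpxchg loop, and pushing it onto a remote per-cpu wakeup list, again with
cmpxchg, where the waker only needs to notify the remote cpu if the list
was empty. This is not the patch's code. It uses C11 atomics in place of
the kernel's cmpxchg(), and every name in it (struct task, struct
cpu_queue, the ST_* states, claim_for_wakeup(), queue_remote_wakeup(),
process_pending_wakeups()) is hypothetical and chosen for illustration.
The races marked XXX in the patch are not modelled here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for task states; the values are illustrative. */
enum { ST_RUNNING = 0, ST_SLEEPING = 1, ST_WAKING = 2 };

struct task {
    _Atomic int state;
    struct task *wake_next;            /* link in a pending-wakeup list */
};

struct cpu_queue {
    _Atomic(struct task *) wake_list;  /* NULL when nothing is pending */
};

/*
 * Claim the task for wakeup. Only one waker can move it from
 * ST_SLEEPING to ST_WAKING; everyone else sees a different state
 * and backs off.
 */
static bool claim_for_wakeup(struct task *t)
{
    int old = atomic_load(&t->state);

    do {
        if (old != ST_SLEEPING)
            return false;  /* running already, or claimed by another waker */
    } while (!atomic_compare_exchange_weak(&t->state, &old, ST_WAKING));

    return true;
}

/*
 * Push the task onto the remote cpu's pending list without taking a lock.
 * Returns true if the list was empty before the push, which is the case
 * where the caller would notify the remote cpu (an IPI in the patch).
 */
static bool queue_remote_wakeup(struct cpu_queue *q, struct task *t)
{
    struct task *head = atomic_load(&q->wake_list);

    do {
        t->wake_next = head;  /* head is refreshed on each failed attempt */
    } while (!atomic_compare_exchange_weak(&q->wake_list, &head, t));

    return head == NULL;
}

/*
 * Remote side: detach the whole list with one exchange, then mark each
 * task runnable. Returns the number of tasks processed.
 */
static int process_pending_wakeups(struct cpu_queue *q)
{
    struct task *t = atomic_exchange(&q->wake_list, (struct task *)NULL);
    int n = 0;

    while (t) {
        struct task *next = t->wake_next;

        atomic_store(&t->state, ST_RUNNING);
        t = next;
        n++;
    }
    return n;
}
```

In this sketch a waker would call claim_for_wakeup() first, and only on
success call queue_remote_wakeup(); a true return from the latter is the
point where the real code would send the IPI. Because the consumer detaches
the whole list at once, a second waker that finds the list non-empty can
skip the notification, since a pass over the list is already pending.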