Re: [PATCH] perf/bench-futex: Avoid worker cacheline bouncing

From: Sebastian Andrzej Siewior
Date: Wed Oct 19 2016 - 14:13:54 EST


On 2016-10-19 10:59:33 [-0700], Davidlohr Bueso wrote:
> Sebastian noted that the accounting of worker thread ops (throughput)
> was causing 'perf' itself to appear in the profiles, consuming a
> non-trivial (i.e. 13%) amount of CPU. The cause is cacheline bouncing
> from the increment of w->ops. We can easily fix this by working on a
> local copy and updating the actual worker only once it is done
> running and we are ready to show the program summary. Nothing else
> accesses the worker concurrently, so we can trust that no stale
> value is being seen by another thread.
>
> Reported-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
Acked-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>

> --- a/tools/perf/bench/futex-hash.c
> +++ b/tools/perf/bench/futex-hash.c
> @@ -63,8 +63,9 @@ static const char * const bench_futex_hash_usage[] = {
>  static void *workerfn(void *arg)
>  {
>  	int ret;
> -	unsigned int i;
>  	struct worker *w = (struct worker *) arg;
> +	unsigned int i;
> +	unsigned long ops = w->ops; /* avoid cacheline bouncing */

We start at 0, so there is probably no need to initialize it with w->ops.
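
For illustration, a minimal standalone sketch of the pattern with the
counter starting at 0. This is not the futex-hash.c code: the cut-down
struct worker, the fixed `iterations` count and run_workers() are made up
for the example, and the loop body only stands in for a futex operation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the benchmark's per-thread state. */
struct worker {
	pthread_t thread;
	unsigned long ops;	/* published once, when the worker finishes */
};

/* Assumption: fixed work per thread, set before any thread is created. */
static unsigned long iterations;

static void *workerfn(void *arg)
{
	struct worker *w = arg;
	unsigned long ops = 0;	/* local copy, starts at 0 */
	unsigned long i;

	for (i = 0; i < iterations; i++)
		ops++;		/* stands in for one futex operation */

	w->ops = ops;		/* single store to the shared worker array */
	return NULL;
}

/* Run nthreads workers (nthreads > 0) and return the summed op count. */
static unsigned long run_workers(unsigned int nthreads, unsigned long iters)
{
	struct worker *workers = calloc(nthreads, sizeof(*workers));
	unsigned long total = 0;
	unsigned int t;
	int rc = 0;

	assert(workers != NULL);
	iterations = iters;

	for (t = 0; t < nthreads; t++) {
		rc = pthread_create(&workers[t].thread, NULL, workerfn,
				    &workers[t]);
		assert(rc == 0);
	}
	for (t = 0; t < nthreads; t++) {
		rc = pthread_join(workers[t].thread, NULL);
		assert(rc == 0);
		total += workers[t].ops;	/* read only after the join */
	}
	(void)rc;	/* silence unused warnings when built with NDEBUG */
	free(workers);
	return total;
}
```

Calling run_workers(4, 100000) from a main() and building with -pthread
should sum to 400000. Each thread touches its slot in the shared array
only once, at the end, so adjacent workers sharing a cacheline no longer
fight over it in the hot loop.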

Sebastian