Re: [PATCH 1/3] perf bench futex: benchmark only online CPUs

From: Arnaldo Carvalho de Melo
Date: Fri Nov 24 2017 - 14:08:55 EST


On Fri, Nov 24, 2017 at 07:32:49AM -0800, Davidlohr Bueso wrote:
> On Thu, 23 Nov 2017, Arnaldo Carvalho de Melo wrote:
>
> > On Thu, Nov 23, 2017 at 12:09:48PM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Wed, Nov 22, 2017 at 06:25:28PM -0600, Kim Phillips wrote:
> > > > From: James Yang <james.yang@xxxxxxx>
> > > >
> > > > The "perf bench futex" benchmarks have a problem when not all CPUs in
> > > > the system are online: perf assumes the CPUs that are online are
> > > > contiguously numbered and assigns processor affinity to the threads by
> > > > modulo striping. When the online CPUs are not contiguously numbered,
> > > > perf errors out with:
>
> Good catch. Non-contiguously numbered CPUs were certainly not something I had
> considered.
>
> > > >
> > > > $ echo 0 | sudo tee /sys/devices/system/cpu/cpu3/online
> > > > 0
> > > > $ ./oldperf bench futex all
> > > > perf: pthread_create: Operation not permitted
> > > > Run summary [PID 14934]: 7 threads, each operating on 1024 [private] futexes for 10 secs.
> > > >
> > > > $
> > > >
> > > > This patch makes perf not assume all CPUs configured are online, and
> > > > adds a mapping table to stripe affinity across the CPUs that are
> > > > online.
> > >
> > > So, have you looked at tools/perf/util/cpumap.c? I think you can use:
> > >
> > > 	int i;
> > > 	struct cpu_map *cpus = cpu_map__new(NULL);
> > >
> > > 	for (i = 0; i < cpus->nr; ++i) {
> > > 		int cpu = cpus->map[i];
> > > 		...
> > > 	}
> > >
> > > No?
>
> Ah, I was just thinking there should be something like this in perf already :)
>
> >
> > But then, this can be done later, as it will probably result in more
> > changes. I'm continuing to review the other patches.
>
> Unsure if you're implying otherwise, but I would strongly prefer that a v2
> be sent that uses perf's cpumap.c.

I try not to ask too much of people: their work already improves the
current situation for their use case, so it could be applied as is. But
yeah, it would be better if James (or somebody else) is willing to try
the perf cpumap infrastructure, which would reduce the bloat and further
validate that code. James?
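
For illustration only, here is a minimal standalone sketch of the idea
under discussion: stripe threads over a table of online CPUs instead of
over raw CPU numbers. It is neither the patch nor perf's cpumap code.
parse_cpu_list() and cpu_for_thread() are hypothetical names, and the
input is assumed to be a CPU list in the "0-2,4-7" form that
/sys/devices/system/cpu/online exposes.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: expand a CPU list such as "0-2,4-7" into map[].
 * Returns the number of CPUs found, or -1 on malformed input or overflow.
 */
static int parse_cpu_list(const char *s, int *map, int max)
{
	int nr = 0;

	while (*s && *s != '\n') {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi = lo;

		if (end == s || lo < 0)
			return -1;
		if (*end == '-') {		/* range "lo-hi" */
			s = end + 1;
			hi = strtol(s, &end, 10);
			if (end == s || hi < lo)
				return -1;
		}
		for (long cpu = lo; cpu <= hi; cpu++) {
			if (nr >= max)
				return -1;
			map[nr++] = (int)cpu;
		}
		if (*end == ',')
			end++;
		else if (*end && *end != '\n')
			return -1;
		s = end;
	}
	return nr;
}

/*
 * Hypothetical helper: modulo striping over the table of online CPUs,
 * so an offline CPU number is never handed to a thread.
 */
static int cpu_for_thread(const int *map, int nr, unsigned int thread)
{
	assert(nr > 0);
	return map[thread % nr];
}
```

With the online list "0-2,4-7" (cpu3 offline, as in the report above),
thread 3 is placed on CPU 4 and thread 7 wraps around to CPU 0. The
returned CPU number is what would then be set in the affinity mask
before pthread_create().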

- Arnaldo