Re: [PATCH 1/3] tools/perf: Fix the nrcpus in perf bench futex to enable the run when all CPU's are not online

From: Athira Rajeev
Date: Wed Jun 12 2024 - 08:12:11 EST




> On 10 Jun 2024, at 7:52 PM, Disha Goel <disgoel@xxxxxxxxxxxxx> wrote:
>
> On 07/06/24 10:13 am, Athira Rajeev wrote:
>
>> Perf bench futex fails as below when run on a powerpc system:
>>
>> ./perf bench futex all
>> Running futex/hash benchmark...
>> Run summary [PID 626307]: 80 threads, each operating on 1024 [private] futexes for 10 secs.
>>
>> perf: pthread_create: No such file or directory
>>
>> The only difference in the setup where this perf bench was run is
>> that the partition had 640 CPUs, but only 80 of them were online.
>> While blocking the threads with futex_wait, the code sets the
>> affinity using a cpumask. The cpumask size used is 80, which is
>> picked from "nrcpus = perf_cpu_map__nr(cpu)". The benchmark fails
>> while setting affinity for a CPU number that is 80 or higher,
>> because it attempts to set a bit position which is not allocated
>> in the cpumask. Fix this by sizing the cpumask for the number of
>> possible CPUs rather than the number of online CPUs.
>>
>> Signed-off-by: Athira Rajeev <atrajeev@xxxxxxxxxxxxxxxxxx>
>
> Thanks for the fix patches, Athira.
> I have tested all three patches on a power machine (both small and max config),
> and the perf bench futex and epoll tests run fine.
>
> For the series:
> Tested-by: Disha Goel <disgoel@xxxxxxxxxxxxx>

Thanks Disha for testing the patchset.

Athira
>
>> ---
>> tools/perf/bench/futex-hash.c | 2 +-
>> tools/perf/bench/futex-lock-pi.c | 2 +-
>> tools/perf/bench/futex-requeue.c | 2 +-
>> tools/perf/bench/futex-wake-parallel.c | 2 +-
>> tools/perf/bench/futex-wake.c | 2 +-
>> 5 files changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
>> index 0c69d20efa32..b472eded521b 100644
>> --- a/tools/perf/bench/futex-hash.c
>> +++ b/tools/perf/bench/futex-hash.c
>> @@ -174,7 +174,7 @@ int bench_futex_hash(int argc, const char **argv)
>> pthread_attr_init(&thread_attr);
>> gettimeofday(&bench__start, NULL);
>> - nrcpus = perf_cpu_map__nr(cpu);
>> + nrcpus = cpu__max_cpu().cpu;
>> cpuset = CPU_ALLOC(nrcpus);
>> BUG_ON(!cpuset);
>> size = CPU_ALLOC_SIZE(nrcpus);
>> diff --git a/tools/perf/bench/futex-lock-pi.c b/tools/perf/bench/futex-lock-pi.c
>> index 7a4973346180..0416120c091b 100644
>> --- a/tools/perf/bench/futex-lock-pi.c
>> +++ b/tools/perf/bench/futex-lock-pi.c
>> @@ -122,7 +122,7 @@ static void create_threads(struct worker *w, struct perf_cpu_map *cpu)
>> {
>> cpu_set_t *cpuset;
>> unsigned int i;
>> - int nrcpus = perf_cpu_map__nr(cpu);
>> + int nrcpus = cpu__max_cpu().cpu;
>> size_t size;
>> threads_starting = params.nthreads;
>> diff --git a/tools/perf/bench/futex-requeue.c b/tools/perf/bench/futex-requeue.c
>> index d9ad736c1a3e..aad5bfc4fe18 100644
>> --- a/tools/perf/bench/futex-requeue.c
>> +++ b/tools/perf/bench/futex-requeue.c
>> @@ -125,7 +125,7 @@ static void block_threads(pthread_t *w, struct perf_cpu_map *cpu)
>> {
>> cpu_set_t *cpuset;
>> unsigned int i;
>> - int nrcpus = perf_cpu_map__nr(cpu);
>> + int nrcpus = cpu__max_cpu().cpu;
>> size_t size;
>> threads_starting = params.nthreads;
>> diff --git a/tools/perf/bench/futex-wake-parallel.c b/tools/perf/bench/futex-wake-parallel.c
>> index b66df553e561..90a5b91bf139 100644
>> --- a/tools/perf/bench/futex-wake-parallel.c
>> +++ b/tools/perf/bench/futex-wake-parallel.c
>> @@ -149,7 +149,7 @@ static void block_threads(pthread_t *w, struct perf_cpu_map *cpu)
>> {
>> cpu_set_t *cpuset;
>> unsigned int i;
>> - int nrcpus = perf_cpu_map__nr(cpu);
>> + int nrcpus = cpu__max_cpu().cpu;
>> size_t size;
>> threads_starting = params.nthreads;
>> diff --git a/tools/perf/bench/futex-wake.c b/tools/perf/bench/futex-wake.c
>> index 690fd6d3da13..49b3c89b0b35 100644
>> --- a/tools/perf/bench/futex-wake.c
>> +++ b/tools/perf/bench/futex-wake.c
>> @@ -100,7 +100,7 @@ static void block_threads(pthread_t *w, struct perf_cpu_map *cpu)
>> cpu_set_t *cpuset;
>> unsigned int i;
>> size_t size;
>> - int nrcpus = perf_cpu_map__nr(cpu);
>> + int nrcpus = cpu__max_cpu().cpu;
>> threads_starting = params.nthreads;
>> cpuset = CPU_ALLOC(nrcpus);