[BUG] v6.3-rc2 regresses sched_getaffinity() for arm64

From: Ryan Roberts
Date: Tue Mar 14 2023 - 18:41:53 EST


Hi Linus,

I need to report a regression in v6.3-rc2 where sched_getaffinity() returns an
incorrect cpu_set, at least when running on arm64. Git bisect shows this patch
as the culprit, authored by you:

596ff4a09b89 cpumask: re-introduce constant-sized cpumask optimizations

Apologies if this is the wrong channel for reporting this - I couldn't find a
suitable mail on the list for this patch to reply to. Happy to direct it
somewhere else if appropriate.


Details:

I'm running a v6.3-rc2 kernel in a VM on an Ampere Altra (arm64) system. The VM
is assigned 8 vCPUs. The kernel is built from defconfig and I'm booting into an
Ubuntu user space. `nproc` returns a value that fluctuates from call to call in
the range ~80-100. If I run with v6.2, nproc always returns 8, as expected.

nproc calls sched_getaffinity() with a 1024-entry cpu_set mask, then adds up
all the set bits to find the number of CPUs. I wrote a test program and can see
that the first 8 bits are always correctly set and most of the other bits are
always correctly 0. But bits ~64-224 are randomly set or clear from call to
call.
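
For reference, the counting described above can be sketched roughly as
follows. This is not nproc's actual source, only an approximation of the
behaviour described; the helper names count_affinity_cpus() and
count_bits_in_range() are made up for illustration, and the second helper is
only there to make it easy to count bits outside the first 8:

```c
#define _GNU_SOURCE /* See feature_test_macros(7) */
#include <sched.h>
#include <stddef.h>

#define SET_SIZE 1024 /* number of CPU bits requested, as in the report */

/* Hypothetical helper: count the set bits whose index is in [lo, hi). */
static int count_bits_in_range(cpu_set_t *set, size_t size, int lo, int hi)
{
	int cpu, n = 0;

	for (cpu = lo; cpu < hi; cpu++)
		if (CPU_ISSET_S(cpu, size, set))
			n++;
	return n;
}

/*
 * Hypothetical helper: fetch the calling thread's affinity mask into a
 * zeroed 1024-bit set and return the number of set bits, or -1 on failure.
 */
static int count_affinity_cpus(void)
{
	size_t size = CPU_ALLOC_SIZE(SET_SIZE);
	cpu_set_t *set = CPU_ALLOC(SET_SIZE);
	int n;

	if (!set)
		return -1;
	CPU_ZERO_S(size, set);
	if (sched_getaffinity(0, size, set) != 0)
		n = -1;
	else
		n = CPU_COUNT_S(size, set);
	CPU_FREE(set);
	return n;
}
```

With 8 vCPUs, count_bits_in_range(set, size, 8, SET_SIZE) should be 0 for a
correct mask; any non-zero result means bits beyond the real CPUs were set.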


Test program:

#define _GNU_SOURCE /* See feature_test_macros(7) */
#include <sched.h>
#include <stdio.h>

#define SET_SIZE 1024

/* Print the set-bit count, then every bit: 64 per row, in groups of 8. */
static void print_cpu_set(cpu_set_t *cpu_set, size_t size)
{
	int i, j, k;

	printf("cpu_count=%d\n", CPU_COUNT_S(size, cpu_set));
	for (i = 0; i < SET_SIZE;) {
		printf("[%03d]: ", i);
		for (k = 0; k < 8; k++) {
			for (j = 0; j < 8; j++, i++)
				printf("%d", CPU_ISSET_S(i, size, cpu_set) ? 1 : 0);
			printf(" ");
		}
		printf("\n");
	}
}

int main(void)
{
	int ret;
	cpu_set_t *cpu_set;
	size_t size;

	/* Dynamically sized set, so use the *_S macro variants throughout. */
	cpu_set = CPU_ALLOC(SET_SIZE);
	if (!cpu_set) {
		perror("CPU_ALLOC");
		return 1;
	}
	size = CPU_ALLOC_SIZE(SET_SIZE);
	CPU_ZERO_S(size, cpu_set);

	printf("before:\n");
	print_cpu_set(cpu_set, size);
	ret = sched_getaffinity(0, size, cpu_set);
	printf("ret=%d\n", ret);
	printf("after:\n");
	print_cpu_set(cpu_set, size);

	CPU_FREE(cpu_set);
	return 0;
}
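
Since the symptom is a mask that changes from call to call, a smaller check
that only compares successive results may also be handy. This is a sketch
under my own assumptions: affinity_is_stable() is a made-up helper name, and
it assumes nothing else changes the thread's affinity while it runs (which
would also make the masks differ, legitimately):

```c
#define _GNU_SOURCE /* See feature_test_macros(7) */
#include <sched.h>
#include <stddef.h>

#define SET_SIZE 1024

/*
 * Hypothetical helper: call sched_getaffinity() 'iterations' times, each
 * time into a zeroed buffer, and compare every later mask with the first.
 * Returns 1 if all masks matched, 0 if any differed, -1 on error.
 */
static int affinity_is_stable(int iterations)
{
	size_t size = CPU_ALLOC_SIZE(SET_SIZE);
	cpu_set_t *first = CPU_ALLOC(SET_SIZE);
	cpu_set_t *cur = CPU_ALLOC(SET_SIZE);
	int i, result = 1;

	if (!first || !cur) {
		result = -1;
		goto out;
	}
	CPU_ZERO_S(size, first);
	if (sched_getaffinity(0, size, first) != 0) {
		result = -1;
		goto out;
	}
	/* The call above counts as the first iteration. */
	for (i = 1; i < iterations; i++) {
		CPU_ZERO_S(size, cur);
		if (sched_getaffinity(0, size, cur) != 0) {
			result = -1;
			break;
		}
		if (!CPU_EQUAL_S(size, first, cur)) {
			result = 0;
			break;
		}
	}
out:
	if (first)
		CPU_FREE(first);
	if (cur)
		CPU_FREE(cur);
	return result;
}
```

Given the outputs below, I would expect affinity_is_stable(100) to return 1
on v6.2 and, most of the time, 0 on v6.3-rc2, but I have not run this exact
helper on either kernel.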


Broken output on v6.3-rc2:

before:
cpu_count=0
[000]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[064]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[128]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[192]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[256]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[320]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[384]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[448]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[512]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[576]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[640]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[704]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[768]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[832]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[896]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[960]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ret=0
after:
cpu_count=82
[000]: 11111111 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[064]: 00000100 10110111 00110010 01101001 11111111 11111111 00000000 00000000
[128]: 00010101 00001101 11011111 10001110 11110001 10100101 11111111 11111111
[192]: 00000000 00001000 00000000 00000100 00000000 00000000 00000000 00000000
[256]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[320]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[384]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[448]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[512]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[576]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[640]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[704]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[768]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[832]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[896]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[960]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000


Correct output in v6.2:

before:
cpu_count=0
[000]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[064]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[128]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[192]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[256]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[320]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[384]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[448]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[512]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[576]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[640]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[704]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[768]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[832]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[896]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[960]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ret=0
after:
cpu_count=8
[000]: 11111111 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[064]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[128]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[192]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[256]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[320]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[384]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[448]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[512]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[576]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[640]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[704]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[768]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[832]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[896]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[960]: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000


Thanks,
Ryan