Re: [PATCH 2/3] Linux: Use rseq in sched_getcpu if available (v9)
From: Florian Weimer
Date: Mon Jul 06 2020 - 09:59:50 EST
* Mathieu Desnoyers:
> When available, use the cpu_id field from __rseq_abi on Linux to
> implement sched_getcpu(). Fall-back on the vgetcpu vDSO if
> unavailable.
I've pushed this to glibc master, but unfortunately it looks like this
exposes a kernel bug related to affinity mask changes.
After building and testing glibc, this
for x in {1..2000} ; do posix/tst-affinity-static & done
produces some "error:" lines for me:
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 138, expected 0
error: Unexpected CPU 138, expected 0
error: Unexpected CPU 138, expected 0
error: Unexpected CPU 138, expected 0
"expected 0" is a result of how the test has been written: it bails out
on the first failure, which happens with CPU ID 0.
Smaller systems can use a smaller count than 2000 to reproduce this. It
also happens sporadically when running the glibc test suite itself
(which is why it took further testing to reveal this issue).
I can reproduce this with the Debian 4.19.118-2+deb10u1 kernel, the
Fedora 5.6.19-300.fc32 kernel, and the Red Hat Enterprise Linux kernel
4.18.0-193.el8 (all x86_64).
As to the cause, I'd guess that the exit path in the sched_setaffinity
system call fails to update the rseq area, so that userspace can observe
the outdated CPU ID there.
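For illustration, the kind of check that trips here can be sketched
roughly as follows. This is not the actual tst-affinity source, only a
minimal approximation: the function name count_cpu_mismatches is made up
for this example, and it assumes the machine's CPU IDs fit in a fixed-size
cpu_set_t (otherwise it returns -1). It pins the calling thread to one CPU
at a time with sched_setaffinity and compares that CPU against what
sched_getcpu reports immediately afterwards.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/* Hypothetical helper, not taken from the glibc test suite.
   Pin the calling thread to each CPU of its initial affinity mask in
   turn and check that sched_getcpu () reports that CPU.  Returns the
   number of mismatches, or -1 if the initial mask cannot be read
   (for example, when the system has more CPUs than cpu_set_t holds).  */
int
count_cpu_mismatches (void)
{
  cpu_set_t initial;
  if (sched_getaffinity (0, sizeof (initial), &initial) != 0)
    return -1;

  int mismatches = 0;
  for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
    {
      if (!CPU_ISSET (cpu, &initial))
        continue;

      cpu_set_t one;
      CPU_ZERO (&one);
      CPU_SET (cpu, &one);
      if (sched_setaffinity (0, sizeof (one), &one) != 0)
        continue;               /* CPU not usable right now; skip it.  */

      /* After a successful sched_setaffinity, the thread should be
         running on the requested CPU.  A stale cached CPU ID would
         show up as a mismatch here.  */
      int seen = sched_getcpu ();
      if (seen != cpu)
        {
          printf ("error: Unexpected CPU %d, expected %d\n", seen, cpu);
          ++mismatches;
        }
    }

  /* Restore the original affinity mask (best effort).  */
  sched_setaffinity (0, sizeof (initial), &initial);
  return mismatches;
}
```

Unlike the real test, this sketch keeps going after the first mismatch
and just counts them. Calling it from main in many concurrent processes,
as in the loop above, would be the closest equivalent to the reproducer.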
Thanks,
Florian