Re: [patch 00/12] futex: Cure robust/PI futex exit races
From: Florian Weimer
Date: Mon Nov 11 2019 - 04:49:07 EST
* Thomas Gleixner:
> pthread_create() returns EAGAIN while the underlying problem is ENOMEM
> which causes this bonkers output:
>
> error: pthread_create for thread 253 failed: Resource temporarily unavailable
>
> There is nothing temporarily. The process has its address space exhausted.

Thanks for analyzing the failure. I thought we had already covered
that. I've fixed the test locally and will submit the changes. The
fixed test passes, as expected.

I had expected that we had fixed all such occurrences of per-CPU thread
creation, but apparently not. 8-(

> That test's output is anyway strange:
>
> info: Detected CPU set size (in bits): 225
> info: Maximum test CPU: 255
>
> Interesting how it fits 256 CPUs into a cpuset with a size of 225 bits.

That's an unfortunate side effect of how the CPU set allocation works in
userspace. The allocation uses a size measured in bits (which is
rounded up, out of necessity), but the kernel interface is byte-based.
The kernel does not know that some bits are padding, and happily
writes result data there. So we get bits back for which no space had
been allocated explicitly.
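
To make the rounding concrete, here is a minimal sketch, assuming glibc
on Linux and its CPU_ALLOC_SIZE macro from <sched.h>. The helper name
cpu_set_capacity_bits is made up for illustration and is not taken from
the test in question.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper: number of CPU bits that actually fit into the
   allocation made for a request of REQUESTED_BITS bits.
   CPU_ALLOC_SIZE rounds the request up to whole unsigned long words
   and returns the size in bytes, which is all the kernel gets to see.  */
size_t
cpu_set_capacity_bits (size_t requested_bits)
{
  return CPU_ALLOC_SIZE (requested_bits) * 8;
}
```

For a request of 225 bits this returns 256, so the kernel can report
CPU 255 in a set that nominally has room for only 225 CPUs.
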
I hesitated to clean this up because the story on the kernel side was
equally mystifying. getaffinity requires space in the mask for CPUs
that are currently not present and whose affinity bits are not set. Due
to firmware bugs, this means that we can cross the magic 1024 bits
boundary (corresponding to the 128 byte legacy mask size), and some
applications will refuse to start. 8-( There was considerable
controversy on the kernel side the last time this came up IIRC.
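
For illustration only: an application that does not want to depend on
the 128 byte legacy mask could grow its mask until sched_getaffinity
stops failing with EINVAL. The sketch below assumes glibc on Linux; the
function name get_affinity_dynamic, the starting size, and the upper
bound are arbitrary choices made for this example, not anything
mandated by the kernel or glibc.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper: fetch the affinity mask of the calling thread
   into a dynamically sized CPU set.  On success, returns the set (to
   be released with CPU_FREE) and stores its size in bytes in
   *SIZE_OUT.  On failure, returns NULL with errno set.  */
cpu_set_t *
get_affinity_dynamic (size_t *size_out)
{
  /* Start at the legacy 1024 bits and double on EINVAL, which the
     kernel uses to signal that the supplied mask is too small.  */
  for (size_t bits = 1024; bits <= ((size_t) 1 << 20); bits *= 2)
    {
      cpu_set_t *set = CPU_ALLOC (bits);
      if (set == NULL)
        return NULL;
      size_t size = CPU_ALLOC_SIZE (bits);
      if (sched_getaffinity (0, size, set) == 0)
        {
          *size_out = size;
          return set;
        }
      int saved_errno = errno;
      CPU_FREE (set);
      if (saved_errno != EINVAL)
        {
          errno = saved_errno;
          return NULL;
        }
    }
  errno = EINVAL;
  return NULL;
}
```

The result has to be examined with the _S variants of the macros, for
example CPU_COUNT_S (size, set), because the set is no longer a fixed
size cpu_set_t.
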
Thanks,
Florian