Re: Help requested: futex(..., FUTEX_WAIT_PRIVATE, ...) returns EPERM

From: Mikael Pettersson
Date: Wed Nov 13 2019 - 08:30:03 EST


On Tue, Nov 12, 2019 at 6:43 PM Harris, Robert
<robert.harris@xxxxxxxxxxxxxx> wrote:
>
> I am investigating an issue on 4.9.184 in which futex() returns EPERM
> intermittently for
>
> futex(uaddr, FUTEX_WAIT_PRIVATE, val, &timeout, NULL, 0)
>
> The failure affects an application in an AWS lambda; traditional
> debugging approaches vary from difficult to impossible. I cannot
> reproduce the problem at will, instrument the kernel, install a new
> kernel or get an application core dump.
>
> Understanding the circumstances under which EPERM can be returned for
> FUTEX_WAIT_PRIVATE would be useful but it is not a documented failure
> mode. I have spent some time looking through futex.c but have not
> found anything yet. I would be grateful for a hint from someone more
> knowledgeable.


I just wanted to add that a colleague of mine reported the exact same
issue to me two days ago: a highly threaded application (the Erlang
VM) running in AWS lambda, futex wait calls occasionally failing with
EPERM. I don't have more specifics than that yet; I've asked for the
kernel version and the exact parameters of the failed futex call.
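
In case it helps anyone else collect the same data without a debugger
or core dump, here is a minimal sketch of a wrapper that logs the
kernel release and the call parameters when a FUTEX_WAIT_PRIVATE call
fails with an unexpected errno. The name futex_wait_logged() and the
log format are made up for illustration, and it assumes a Linux target
where SYS_futex takes a struct timespec (e.g. x86-64) and that stderr
ends up somewhere readable.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/utsname.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from any library): issues the same call as
 * futex(uaddr, FUTEX_WAIT_PRIVATE, val, timeout, NULL, 0) and, if it
 * fails with an errno other than the usual EAGAIN/ETIMEDOUT/EINTR,
 * logs the kernel release and the arguments to stderr.
 * Returns the raw syscall result; errno is preserved for the caller.
 */
static int futex_wait_logged(uint32_t *uaddr, uint32_t val,
                             const struct timespec *timeout)
{
    assert(uaddr != NULL);

    long rc = syscall(SYS_futex, uaddr, FUTEX_WAIT_PRIVATE, val,
                      timeout, NULL, 0);
    if (rc == -1) {
        int saved = errno;

        if (saved != EAGAIN && saved != ETIMEDOUT && saved != EINTR) {
            struct utsname u;
            const char *release = (uname(&u) == 0) ? u.release : "unknown";
            /* Snapshot of the futex word at logging time (may have changed). */
            uint32_t cur = __atomic_load_n(uaddr, __ATOMIC_RELAXED);

            fprintf(stderr,
                    "futex wait failed: errno=%d (%s) kernel=%s "
                    "uaddr=%p *uaddr=%u val=%u ",
                    saved, strerror(saved), release,
                    (void *)uaddr, (unsigned)cur, (unsigned)val);
            if (timeout != NULL)
                fprintf(stderr, "timeout=%lld.%09ld\n",
                        (long long)timeout->tv_sec, (long)timeout->tv_nsec);
            else
                fprintf(stderr, "timeout=NULL\n");
        }
        errno = saved; /* fprintf/uname may have clobbered it */
    }
    return (int)rc;
}
```

Calling futex_wait_logged(&word, expected, &ts) in place of the raw
syscall leaves the normal outcomes silent, so only the unexpected
failures (such as the EPERM above) produce a log line.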

(Third attempt, really sorry about the noise, gmail's UI sucks.)