Documenting execve() and EAGAIN

From: Michael Kerrisk (man-pages)
Date: Wed May 21 2014 - 14:12:43 EST


Vasily (and Motohiro),

Some time ago, Motohiro raised a documentation bug
( https://bugzilla.kernel.org/show_bug.cgi?id=42704 ) which
relates to your commit 72fa59970f8698023045ab0713d66f3f4f96945c
("move RLIMIT_NPROC check from set_user() to do_execve_common()").

I have attempted to document this, and I would like to ask you
(and Motohiro) if you would review the text proposed below for
the execve(2) man page.

Thank you,

Michael


ERRORS
EAGAIN (since Linux 3.1)
Having changed its real UID using one of the set*uid()
calls, the caller was (and is now still) above its
RLIMIT_NPROC resource limit (see setrlimit(2)). For a
more detailed explanation of this error, see NOTES.

NOTES
execve() and EAGAIN
A more detailed explanation of the EAGAIN error that can occur
(since Linux 3.1) when calling execve() is as follows.

The EAGAIN error can occur when a preceding call to setuid(2),
setreuid(2), or setresuid(2) caused the real user ID of the
process to change, and that change caused the process to
exceed its RLIMIT_NPROC resource limit (i.e., the number of
processes belonging to the new real UID exceeds the resource
limit). In Linux 3.0 and earlier, this caused the set*uid()
call to fail.

Since Linux 3.1, the scenario just described no longer causes
the set*uid() call to fail, because that failure too often led
to security holes: buggy applications didn't check the return
status and assumed that (if the caller had root privileges) the
call would always succeed. Instead, the set*uid() calls now
successfully change the real UID, but the kernel sets an
internal flag, named PF_NPROC_EXCEEDED, to note that the
RLIMIT_NPROC resource limit has been exceeded. If the resource
limit is still exceeded at the time of a subsequent execve()
call, that call fails with the error EAGAIN. This kernel logic
ensures that the RLIMIT_NPROC resource limit is still enforced
for the common privileged daemon workflow, namely fork(2) +
set*uid() + execve(2).
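
For illustration only (not part of the proposed man-page text), the
fork(2) + set*uid() + execve(2) workflow might be sketched as follows.
The helper name run_as() and the exit codes 126 and 127 are
assumptions made for this sketch; the point is simply that both the
setuid() and the execve() results are checked, and that EAGAIN from
execve() is reported separately.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: fork a child, switch it to `uid`, then exec
 * `path`. Returns the child's exit status, or -1 if fork() or
 * waitpid() failed or the child did not exit normally. */
static int run_as(uid_t uid, const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: never assume that setuid() succeeded. */
        if (setuid(uid) == -1) {
            fprintf(stderr, "setuid: %s\n", strerror(errno));
            _exit(126);     /* exit code chosen for this sketch */
        }

        execve(path, argv, environ);

        /* Reached only if execve() failed. Since Linux 3.1, EAGAIN
         * here can mean that RLIMIT_NPROC is exceeded for the new
         * real UID. */
        if (errno == EAGAIN)
            fprintf(stderr, "execve: EAGAIN for UID %ld\n", (long) uid);
        else
            fprintf(stderr, "execve: %s\n", strerror(errno));
        _exit(127);         /* exit code chosen for this sketch */
    }

    /* Parent: collect the child's exit status. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A privileged daemon would pass the target user's UID; an unprivileged
caller can only pass a UID that it is already permitted to switch to,
such as its own.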

If the resource limit is no longer exceeded at the time of
the execve() call (because other processes belonging to this
real UID terminated between the set*uid() call and the
execve() call), then the execve() call succeeds and the kernel
clears the PF_NPROC_EXCEEDED process flag. The flag is also
cleared if a subsequent call to fork(2) by this process
succeeds.
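
Again for illustration only: because the EAGAIN condition can clear
when other processes belonging to the real UID terminate, a caller
could in principle retry execve() a bounded number of times. The
helper name execve_retry() and its attempts/delay parameters are
hypothetical, and whether retrying is appropriate at all depends on
the application; this is a sketch, not a recommendation.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: call execve(), retrying only when it fails
 * with EAGAIN. On success execve() does not return. On failure this
 * returns -1 with errno set by the last execve() attempt. */
static int execve_retry(const char *path, char *const argv[],
                        char *const envp[], int attempts,
                        unsigned int delay_sec)
{
    for (int i = 0; i < attempts; i++) {
        execve(path, argv, envp);

        /* Any error other than EAGAIN will not be fixed by waiting. */
        if (errno != EAGAIN)
            return -1;

        /* Give other processes of this UID a chance to terminate. */
        if (i + 1 < attempts) {
            sleep(delay_sec);
            errno = EAGAIN;   /* keep the reported error meaningful */
        }
    }
    return -1;
}
```

A caller would typically treat a final -1 with errno set to EAGAIN as
"resource limit still exceeded" and exit with a diagnostic.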

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/