Re: a Linux select() bug

From: Pádraig Brady
Date: Sun Sep 18 2011 - 13:15:21 EST


On 09/18/2011 05:47 PM, Pádraig Brady wrote:
> On 09/18/2011 03:58 PM, Bruno Haible wrote:
>> Hi Jim, Pádraig,
>>
>> To whom best to report this Linux kernel bug?
>>
>> ==================================== bug.c ====================================
>> /* A POSIX compliance bug in select() (and pselect() also).
>> <http://pubs.opengroup.org/onlinepubs/9699919799/functions/select.html>
>> says:
>>
>> "pselect() and select() shall fail and set errno to:
>> [EBADF]
>> One or more of the file descriptor sets specified a file descriptor
>> that is not a valid open file descriptor."
>> */
>>
>> #include <errno.h>
>> #include <stdio.h>
>> #include <sys/select.h>
>> #include <sys/time.h>
>>
>> static void
>> test (int fd)
>> {
>>   struct timeval tv0;
>>   fd_set rfds, wfds, xfds;
>>   int r;
>>
>>   tv0.tv_sec = 0;
>>   tv0.tv_usec = 0;
>>   FD_ZERO (&rfds);
>>   FD_ZERO (&wfds);
>>   FD_ZERO (&xfds);
>>   FD_SET (fd, &rfds);
>>   r = select (fd + 1, &rfds, &wfds, &xfds, &tv0);
>>   if (r < 0 && errno == EBADF)
>>     printf ("fd=%d: OK, POSIX compliant\n", fd);
>>   else
>>     printf ("fd=%d: r=%d, bug\n", fd, r);
>> }
>>
>> int
>> main (void)
>> {
>>   test (49);
>> #if 0 /* This test succeeds when "grep FDSize /proc/self/status" is 256
>>          but fails when it is 64. */
>>   test (99);
>> #endif
>>   test (399);
>>
>>   return 0;
>> }
>> /*
>> Expected result:
>> fd=49: OK, POSIX compliant
>> fd=399: OK, POSIX compliant
>>
>> Actual result on Linux 2.6.37.6:
>> fd=49: OK, POSIX compliant
>> fd=399: r=0, bug
>> */
>> ===============================================================================
>>
>> I found this while extending tests/test-select.c to detect EBADF.
>> I randomly used fd = 99. Interestingly, the test failed 100% of the
>> time under "make":
>>
>> $ make check TESTS=test-select
>> make check-recursive
>> make[1]: Entering directory `/tmp/testdir3/gltests'
>> Making check in .
>> make[2]: Entering directory `/tmp/testdir3/gltests'
>> make check-TESTS
>> make[3]: Entering directory `/tmp/testdir3/gltests'
>> Invalid fd test... failed (select returned 0)
>> failed (invalid fd among rfds)
>> failed (select returned 0)
>> failed (invalid fd among wfds)
>> failed (select returned 0)
>> failed (invalid fd among xfds)
>> Unconnected socket test... passed
>> Connected sockets test... passed
>> General socket test with fork... passed
>> Pipe test... passed
>> FAIL: test-select
>> ==================
>> 1 of 1 test failed
>> ==================
>> make[3]: *** [check-TESTS] Error 1
>> make[3]: Leaving directory `/tmp/testdir3/gltests'
>> make[2]: *** [check-am] Error 2
>> make[2]: Leaving directory `/tmp/testdir3/gltests'
>> make[1]: *** [check-recursive] Error 1
>> make[1]: Leaving directory `/tmp/testdir3/gltests'
>> make: *** [check] Error 2
>>
>> Whereas it passed 100% of the time when invoked directly from the command line:
>>
>> $ ./test-select
>> Invalid fd test... passed
>> Unconnected socket test... passed
>> Connected sockets test... passed
>> General socket test with fork... passed
>> Pipe test... passed
>> $ echo $?
>> 0
>>
>> The reason is that the bug occurs only for 'fd' number >= fdt->max_fds,
>> where fdt is the process' file descriptor table in the kernel. This is the
>> value you get through "grep FDSize /proc/$pid/status". This number is 256,
>> _except_ that it is only 64 in processes created by 'make' through the vfork()
>> system call and their children and offspring.
>>
>> The issue is the same whether 'make' is built as a 32-bit or 64-bit binary.
>>
>> Bruno
>
> Hrm, perhaps this is a performance optimization that also
> accommodates old code, i.e. the kernel increases FDSize as the number
> of open file descriptors increases, so that only that number of
> descriptors is ever inspected by select.
> I've seen lots of code do select (FD_SETSIZE, ...)
> so for the performance tweak to work, select() would have
> to effectively use MIN (nfds, FDSize)?
> Now performant code should be using an appropriate nfds value,
> so I would be a bit surprised if Linux made lazy user code
> faster at the cost of erroneous descriptors not giving an error.

So I had a look at the code:

https://github.com/mirrors/linux-2.6/blame/master/fs/select.c

which led to:

https://github.com/mirrors/linux-2.6/commit/bbea9f6
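
A rough userspace model of the behaviour being discussed, assuming
select() effectively examines only MIN (nfds, max_fds) descriptors.
This is not the kernel code; struct model_fdtable and
model_select_check() are made-up names for illustration only.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code: descriptors in [0, max_fds) can be
   looked up in the table, and open_fds[i] says whether fd i is open.  */
struct model_fdtable
{
  int max_fds;
  const bool *open_fds;   /* at least max_fds entries */
};

/* requested[i] is true if fd i is a member of one of the fd_sets
   (at least nfds entries).  Returns -EBADF if an examined descriptor
   is not open, 0 otherwise.  */
static int
model_select_check (const struct model_fdtable *fdt, int nfds,
                    const bool *requested)
{
  int n = nfds;

  /* The assumed clamp: descriptors >= max_fds are never examined,
     so they can never produce EBADF.  */
  if (n > fdt->max_fds)
    n = fdt->max_fds;

  for (int fd = 0; fd < n; fd++)
    if (requested[fd] && !fdt->open_fds[fd])
      return -EBADF;

  return 0;
}
```

With max_fds = 256, a request for fd 49 (not open) yields -EBADF in this
model, while a request for fd 399 is silently ignored and yields 0, which
matches the output Bruno reported.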

So it does look like a performance tweak.
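
For callers that need the POSIX behaviour regardless, one possible
userspace workaround is to probe each requested descriptor with
fcntl (fd, F_GETFD) before calling select(). A minimal sketch follows;
first_bad_fd() and select_checked() are hypothetical names, nfds is
assumed to be <= FD_SETSIZE, and the check is inherently racy if other
threads open or close descriptors concurrently.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: return the lowest fd below nfds that is a member
   of any of the given sets but is not open, or -1 if there is none.  */
static int
first_bad_fd (int nfds, fd_set *rfds, fd_set *wfds, fd_set *xfds)
{
  for (int fd = 0; fd < nfds; fd++)
    {
      int wanted = ((rfds && FD_ISSET (fd, rfds))
                    || (wfds && FD_ISSET (fd, wfds))
                    || (xfds && FD_ISSET (fd, xfds)));

      /* F_GETFD fails with EBADF exactly when fd is not open.  */
      if (wanted && fcntl (fd, F_GETFD) < 0 && errno == EBADF)
        return fd;
    }
  return -1;
}

/* Hypothetical wrapper: like select(), but fails with EBADF if any
   requested descriptor is not open, even one beyond the fd table size.  */
static int
select_checked (int nfds, fd_set *rfds, fd_set *wfds, fd_set *xfds,
                struct timeval *timeout)
{
  if (first_bad_fd (nfds, rfds, wfds, xfds) >= 0)
    {
      errno = EBADF;
      return -1;
    }
  return select (nfds, rfds, wfds, xfds, timeout);
}
```

The extra fcntl() call per requested descriptor costs a system call each,
so this only makes sense where detecting EBADF matters more than speed.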

cheers,
Pádraig.