Re: select() not returning though pipe became readable

From: Andrew Morton
Date: Thu Mar 24 2005 - 20:48:16 EST


Lutz Vieweg <lutz.vieweg@xxxxxxxxxxxxxxx> wrote:
>
> I'm currently investigating the following problem, which seems to indicate
> a misbehaviour of the kernel:
>
> A server software we implemented is sporadically "hanging" in a select()
> call since we upgraded from kernel 2.4 to (currently) 2.6.9 (we have to wait
> for 2.6.12 before we can upgrade again due to the shared-mem-not-dumped-into-
> core-files problem addressed there).
>
> What's suspicious is that whenever we attach with gdb to such a hanging process,
> we can see that a pipe, whose file-descriptor is definitely included in the
> fd_set "readfds" (and "n" is also high enough) has a byte in it available for
> reading - and just leaving gdb again is enough to let the server continue just
> fine.
>
> We are using that pipe, which is known only to this one process, to cause
> select() to return immediately if a signal (SIGUSR1) has been delivered to the
> process (by another process). A signal handler is installed that does
> nothing but a (non-blocking) write of 1 byte to the writing end of the pipe.
>
> This mechanism worked fine before kernel 2.6, and it is still working in 99.99% of
> the cases, but under heavy load, every few hours, we'll see the hanging select()
> as mentioned above.
>
> I noticed a recent thread on lkml about poll() and pipes, but that seems to address a
> different issue, where more events are reported than occurred. What we
> see is quite the opposite: we want select() to return on that pipe becoming readable...
>
> Any ideas?
> Any hints on what to do to investigate the problem further?

Could you at least test 2.6.12-rc1? Otherwise we might be looking for a
bug which isn't there.
