Re: waitid breaks telnet

From: Joe Korty
Date: Wed Dec 01 2004 - 08:33:38 EST


At 08:27 PM 11/30/04 -0800, Andrew Morton wrote:
> Joe Korty <kortyads@xxxxxxxxxxxxxx> wrote:
> >
> > [ 2nd send, this one from my home email account...]
> >
> > telnet no longer works:
> >
> > # chkconfig telnet on
> > # telnet localhost
> > Trying 127.0.0.1...
> > Connected to localhost (127.0.0.1).
> > Escape character is '^]'.
> > Red Hat Enterprise Linux WS release 3 (Taroon Update 2)
> > Kernel 2.6.10-rc2 on an i686
> > Connection closed by foreign host.
> >
> > A bsearch placed the bug between 2.6.9-rc1-bk[78], another
> > bsearch on the changesets showed the problem is caused
> > by this patch:
> >
> > roland@xxxxxxxxxx[torvalds]|ChangeSet|20040831173525|30767
> > [PATCH] waitid system call
> >
> > My guess is, something about the new wait4(2) wrapper
> > is causing the telnet daemon to declare success before
> > its child, /bin/login, exits.
>
> I can reproduce this on 2.6.10-rc2, but it seems to have been fixed in more
> recent kernels. However I cannot think of anything which we did which
> would have fixed this.

I was able to reproduce it with the day-before-yesterday's bitkeeper tree.

My boss sees broken kernels work once in a while. I myself have
never been able to get a broken kernel to work. The problem may
be a race.
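
One way to probe the wait4(2) guess outside of the telnet daemon might be
a small user-space test along the following lines. This is only a sketch
under stated assumptions: check_wait4_once() and run_wait4_checks() are
hypothetical names made up for illustration, the child is just a stand-in
for /bin/login, and nothing here is taken from telnetd or from the waitid
patch. It forks a child that lingers briefly, then checks that wait4()
returns only once that child has really exited with the expected status.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for wait4() and usleep() on glibc */
#endif
#include <errno.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: fork one child that exits with exit_code after
 * delay_us microseconds.  Returns 0 if wait4() reported that child as
 * exited with that code, -1 otherwise.
 */
static int check_wait4_once(int exit_code, unsigned int delay_us)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: stand-in for /bin/login; linger briefly, then exit. */
        usleep(delay_us);
        _exit(exit_code);
    }

    /* Parent: stand-in for the daemon waiting on its child. */
    pid_t got = wait4(pid, &status, 0, NULL);
    if (got != pid) {
        fprintf(stderr, "wait4 returned %ld, expected %ld\n",
                (long)got, (long)pid);
        return -1;
    }
    if (!WIFEXITED(status) || WEXITSTATUS(status) != exit_code) {
        fprintf(stderr, "unexpected wait status 0x%x\n", status);
        return -1;
    }

    /* The child has been reaped, so a second wait must fail with ECHILD. */
    errno = 0;
    if (wait4(pid, &status, WNOHANG, NULL) != -1 || errno != ECHILD) {
        fprintf(stderr, "child %ld was still waitable after wait4\n",
                (long)pid);
        return -1;
    }
    return 0;
}

/* Hypothetical driver: repeat the check with varying delays and exit
 * codes, and return the number of rounds that failed. */
static int run_wait4_checks(int rounds)
{
    int failures = 0;

    for (int i = 0; i < rounds; i++) {
        if (check_wait4_once(i % 100, (unsigned int)(i % 5) * 1000u) != 0)
            failures++;
    }
    return failures;
}
```

Calling run_wait4_checks() from main() with a large round count on a
suspect kernel, and comparing against a known-good one, might show whether
wait4() itself misbehaves. A clean run would not rule out a race elsewhere.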

Joe

