Re: [bug] stuck localhost TCP connections, v2.6.26-rc3+

From: Ilpo Järvinen
Date: Mon May 26 2008 - 17:20:26 EST


...Added Pavel & Denis.

On Mon, 26 May 2008, Ingo Molnar wrote:
> * Ingo Molnar <mingo@xxxxxxx> wrote:
>
> > > It's well possible that e.g., net namespaces have some bug in
> > > handling of orphaned tcp.
> >
> > yes, that would match the symptoms i think. I half-assumed that it's a
> > state machine problem so i didnt even check what the reader does - and
> > in this case it appears to not exist at all anymore ;-)

I keep wondering why distcc has a connection orphaned in the first place
before it has read all the data from it; I guess it should not yet be
speculatively executing anything or so :-). Anyway, there should be an RST
because of that data, since nobody is going to read it after the process is
gone. But since the state is still ESTABLISHED, I doubt that tcp_close() was
ever executed, which adds some strangeness.
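
To illustrate the RST expectation from userspace, here is a minimal sketch,
assuming the usual Linux behaviour that closing a socket with unread data in
its receive queue resets the connection instead of sending a plain FIN. The
function name and the sleep intervals are made up for this illustration; it
says nothing about what distcc itself does.

```python
import socket
import time


def close_with_unread_data():
    """Return "rst" if the peer sees a reset, "eof" if it sees a clean FIN."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    try:
        client.sendall(b"never read")
        time.sleep(0.2)      # assumption: enough time for the data to be queued
        server_side.close()  # close while the receive queue still holds data
        time.sleep(0.2)      # assumption: enough time for the RST/FIN to arrive
        try:
            return "eof" if client.recv(16) == b"" else "data"
        except ConnectionResetError:
            return "rst"
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    print(close_with_unread_data())
```

On a Linux box I would expect this to report a reset, which is why a socket
that still sits in ESTABLISHED with queued data suggests the close path never
ran.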

> after ~7 hours of uptime the networking code produced this assertion:
>
> [25441.140000] KERNEL: assertion (!sk->sk_ack_backlog) failed at
> net/ipv4/inet_connection_sock.c (642)

...I checked the /proc/net/tcp output you gave earlier, and indeed, it shows
1 in sk->sk_ack_backlog of the LISTENing socket at that time.
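
For anyone wanting to repeat that check, a rough sketch follows. It assumes
that for sockets in LISTEN state (st == 0A) the rx_queue column of
/proc/net/tcp carries sk->sk_ack_backlog. The SAMPLE text is made up for
illustration (it is not the output from this report), and the helper name
listen_backlogs is hypothetical.

```python
TCP_LISTEN = 0x0A  # value of the "st" column for listening sockets

# Made-up sample in /proc/net/tcp layout, NOT the output from the bug report.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:0E30 00000000:0000 0A 00000000:00000001 00:00000000 00000000     0        0 1001
   1: 0100007F:0E30 0100007F:8D2C 01 00000000:00000000 00:00000000 00000000     0        0 1002
"""


def listen_backlogs(proc_net_tcp_text):
    """Return (local_port, rx_queue) for every LISTEN socket in the text."""
    result = []
    for line in proc_net_tcp_text.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a numbered socket row.
        if len(fields) < 5 or not fields[0].rstrip(":").isdigit():
            continue
        if int(fields[3], 16) != TCP_LISTEN:
            continue
        port = int(fields[1].rsplit(":", 1)[1], 16)
        rx_queue = int(fields[4].split(":")[1], 16)  # field is "tx_queue:rx_queue"
        result.append((port, rx_queue))
    return result


if __name__ == "__main__":
    print(listen_backlogs(SAMPLE))  # [(3632, 1)]
```

On a live system the same function can be fed open("/proc/net/tcp").read();
a non-zero second value for a listener with no pending connections would
match the failed assertion above.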

> no further information in the logs. I kept the failing system booted up
> for 8 hours and will reboot it now.



--
i.