tcp_keepalive crash - more data, speculations - Apache tie?

From: Whit Blauvelt (whit@transpect.com)
Date: Mon Feb 28 2000 - 11:17:45 EST


Hi guys,

This may not mean anything yet, since the system in question has sometimes
run more than 24 hours without crashing, although recently it more often
than not hasn't. Yesterday morning I set
/proc/sys/net/ipv4/tcp_keepalive_time to 30 days (2592000 seconds) rather
than 2 hours, and so far no crash.
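
For anyone wanting to check the arithmetic or confirm what value a box is
actually running with, here is a minimal sketch in C. The helper names
(days_to_seconds, hours_to_seconds, read_sysctl_long) are made up for this
example and are not from any kernel or library source; the only thing taken
from above is the /proc path and the two values.

```c
#include <stdio.h>

/* Convert whole days to seconds (24 * 60 * 60 seconds per day). */
long days_to_seconds(long days)
{
    return days * 24L * 60L * 60L;
}

/* Convert whole hours to seconds. */
long hours_to_seconds(long hours)
{
    return hours * 60L * 60L;
}

/*
 * Read a single integer from a sysctl-style file such as
 * /proc/sys/net/ipv4/tcp_keepalive_time.
 * Returns 0 on success and stores the value in *out; returns -1 if the
 * file cannot be opened or does not begin with an integer.
 */
int read_sysctl_long(const char *path, long *out)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

Calling days_to_seconds(30) gives the 2592000 written to the file, and
hours_to_seconds(2) gives the 7200 it replaced. Passing the /proc path above
to read_sysctl_long should report whichever of the two is currently in
effect.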

If this does turn out to stop the crashes, then having changed this value
means that the condition in tcp_timer.c of

if (elapsed >= sysctl_tcp_keepalive_time)

will never be met in the real world, with the implication that whatever
runs when that condition is met is what triggers the seizure.
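
To make that reasoning concrete, here is a small user-space model of the
comparison. It is only a sketch, not the kernel's code: keepalive_due and
count_due are names invented for the example, and the idle times are
assumed to be in seconds like the value in /proc.

```c
#include <stddef.h>

/*
 * User-space model of the quoted test, NOT the code in tcp_timer.c.
 * Returns nonzero when a connection idle for `elapsed` seconds would
 * satisfy "elapsed >= keepalive_time".
 */
int keepalive_due(unsigned long elapsed, unsigned long keepalive_time)
{
    return elapsed >= keepalive_time;
}

/* Count how many of the n idle times in idle[] would meet the condition. */
size_t count_due(const unsigned long *idle, size_t n,
                 unsigned long keepalive_time)
{
    size_t i, due = 0;

    for (i = 0; i < n; i++) {
        if (keepalive_due(idle[i], keepalive_time))
            due++;
    }
    return due;
}
```

With a threshold of 7200, any connection idle for two hours or more meets
the condition; with 2592000, a connection would have to sit idle for a full
30 days, which is why the branch should effectively never be taken now.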

On the other hand, if I run "netstat -tc", I'm not seeing any stale tcp
sockets sticking around anyhow. I wasn't checking this before, so I can't
say whether there were any before. But if there weren't, and this condition
were still being met, then could the code have been trying to kill a socket
that wasn't there? Could this be what "Kernel panic: Attempted
to kill the idle task:" is referring to, attempting to kill an idle socket
that wasn't there to kill?

Probably unrelated, but I am running Apache 1.3.9 with KeepAlive on.
That's set with a KeepAliveTimeout of 15 seconds, though, and in netstat
it looks like that's working just fine. But the majority of the traffic
this server handles is Web, and the problem has increased in frequency
roughly in proportion to Web traffic, so if there were some manner in which
these fairly different keepalive functions could interfere (e.g., could they
both be playing with the same stack at the same time without locking?)....
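
For reference, the two mechanisms sit at different layers: Apache's
KeepAlive holds an HTTP connection open for further requests, while the
kernel's keepalive timer is a TCP feature that, as far as I understand it,
only applies to sockets that have SO_KEEPALIVE enabled. A minimal sketch of
how a program turns that option on and checks it follows. The helper names
set_keepalive and get_keepalive are invented for the example, and whether
Apache sets this option on its sockets is not something established here.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Enable (on != 0) or disable (on == 0) TCP keepalive probing on fd.
 * Returns 0 on success, -1 on error (errno set by setsockopt).
 */
int set_keepalive(int fd, int on)
{
    int val = on ? 1 : 0;

    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, sizeof(val));
}

/*
 * Report whether SO_KEEPALIVE is enabled on fd.
 * Returns 1 if enabled, 0 if disabled, -1 on error.
 */
int get_keepalive(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

A freshly created TCP socket has the option off, so get_keepalive on it
should return 0 until set_keepalive(fd, 1) is called. If the server's
sockets never have it set, the kernel timer and Apache's 15 second timeout
would not be touching the same connections in the same way.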

Anyhow, you can be sure I'll let you know if the turkey goes belly-up
again, so if you don't hear from me again in a day or so, confidence should
approach unity that increasing the tcp_keepalive_time to an absurd value
prevents the bug from being triggered.

Whit



