Re: Oops - hard crash in 2.2.15 - tcp_keepalive (again!!)

From: Whit Blauvelt (whit@transpect.com)
Date: Tue May 16 2000 - 23:59:52 EST


On Wed, 17 May 2000 01:09:35 +0100 (BST), Alan Cox
<alan@lxorguk.ukuu.org.uk> wrote:
>
>> Andrea's delack-timer-5 to it this time around (had a prior crash in
>> vanilla 2.2.15 without that though).

> The delack crash is fairly specific - you see the machine hang solidly rather
> than an oops.

All the crashes hang solidly. Only saw one without an oops message on the
hung screen though - blank like the delack reports, so thought I'd try
that. Oh well.

>> In the past, setting tcp_keepalive real frequent crashed it quicker. Haven't
>> tried it on this present example.
>
> That would make sense. And yes I'm sure this is software but the network hackers
> haven't yet managed to pin this one down. Normally it's only hitting very very
> busy sites (think big porn sites, search engines etc). I wonder why you see
> it so often.

The system's only moderately busy, although there are a dozen Web domains
on it, and it's handling a moderate amount of mail at the same time (the
better part of which gets forwarded elsewhere immediately). The busiest Web
domain sees a bit less than 200 visitors a day. The rest, even on their
busiest day, don't handle that much again all together. The _only_ sign
these crashes ever leave in the logs is that there will often be a string
of @@@'s in the middle of a line in one of the Apache access logs at the
time of the crash - but not every time.
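
For anyone wanting to check their own access logs for the same sign, here
is a minimal sketch in Python. It is only an illustration under stated
assumptions: the function name find_marker_runs is made up, a "string of
@@@'s" is taken to mean three or more in a row, and NUL bytes are matched
as well on the guess that some viewers display them as ^@.

```python
import re

# Assumption: a "string of @@@'s" is three or more consecutive '@'
# characters, or NUL bytes (which some viewers display as ^@).
RUN = re.compile(r"[@\x00]{3,}")


def find_marker_runs(lines):
    """Return (line_number, text) for every line that contains a run."""
    hits = []
    for number, text in enumerate(lines, start=1):
        if RUN.search(text):
            hits.append((number, text.rstrip("\n")))
    return hits


# Hypothetical usage (the path is a placeholder, not from this thread):
#   with open("/path/to/access_log", encoding="latin-1") as log:
#       for number, text in find_marker_runs(log):
#           print(number, repr(text))
```

A single '@', as in a mail address inside a request, is deliberately not
counted, so ordinary log lines should not show up as hits.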

The only way I can see that it might have something in common with a
heavy-access site is that this all goes through a 384 Kbps SDSL line, which is
solidly connected at that speed and generally more than enough, but
sometimes sees brief periods of congestion. Could it be some condition that
occurs when the remote connection negotiations get a bit backed up, which
porn sites would also be seeing relative to their fatter pipes (metaphor
unintended)?

On the other hand, the crashes seem a bit more common when the system is
least busy - so congestion may be an unlikely cause - except, come to think
of it, it gets a fair amount of Web traffic from the far ends of the Earth.
It's mostly hosting a jazz site, and there are a few Net-connected fans in
about every country who find it. The small hours of the morning here are
when they visit. But then I'm at a loss to imagine what special problem the
most distant Web client would introduce, since there's not really all that
much to negotiate in http.

Whit
