I have scoured the list archives for several hours seeing several
references over the past year about instances of rampant "dst cache
overflow" messages. There are posts from January and June relating
difficulties that individuals have had with this messages, including
replies in January that say the problem is fixed in "current" - which is
presumably not as current as my 2.2.16-3 as built by RedHat for 6.2.
The problem:
An x86 clone that does a constant 4-5mbps of web traffic using apache
1.3.12 with perl 5.6.0 for cgi. The card is a 3c905b and what happens is
that about once every two weeks we see these messages in the kernel
(example: )
Sep 9 08:24:07 cne kernel: dst cache overflow
Sep 9 08:24:12 cne kernel: NET: 500 messages suppressed.
Sep 9 08:24:12 cne kernel: dst cache overflow
Sep 9 08:24:17 cne kernel: NET: 378 messages suppressed.
Sep 9 08:24:17 cne kernel: dst cache overflow
Sep 9 08:24:22 cne kernel: NET: 469 messages suppressed.
Sep 9 08:24:22 cne kernel: dst cache overflow
Sep 9 08:24:27 cne kernel: NET: 421 messages suppressed.
Sep 9 08:24:27 cne kernel: dst cache overflow
Sep 9 08:24:32 cne kernel: NET: 457 messages suppressed.
Sep 9 08:24:32 cne kernel: dst cache overflow
Sep 9 08:24:37 cne kernel: NET: 540 messages suppressed.
Sep 9 08:24:37 cne kernel: dst cache overflow
(dot dot dot)
Until such time as the machine simply gives up. At that time if you are
on the console you can do an /sbin/ifconfig and the interface appears to
be happy, but there are no packets being handled. If you subsequently do
a "/sbin/ifconfig eth0 down; /sbin/ifconfig eth0 up" the interface will
respond happily again... for a few hours. The problem will then occur
almost immediately. A reboot allows for another week or two of
uninterrupted operation. Adding a second card allows us to ssh in and
init 6 the machine (or just ifconfig it back up) but isn't a real fix.
Any ideas? I do not want to update to a development kernel, but I am
willing to play with the source on the 2.2.x series. I don't believe I am
the only person with this issue from looking at the list. I do believe
that a contributing factor is using the very cheapest of available
hardware. Not much I can do there - multiple NICs of varying chipsets
have been tried to no avail. It appears to be a problem with routing
entries not being cleared, but my abilities at kernel hacking end there.
Thanks,
Mike Litherland
PS If you don't mind, can you CC replies to this post to
mike@raetek.com? I do read the list, but I am notorious for missing
things... thanks.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Fri Sep 15 2000 - 21:00:15 EST