Re: 2.4.20 instability on bigmem systems?

From: William Lee Irwin III (wli@holomorphy.com)
Date: Thu Mar 13 2003 - 19:42:56 EST


On Thu, Mar 13, 2003 at 04:27:22PM -0800, Gregory K. Ruiz-Ade wrote:
> The primary problem: Whenever any process (or set of processes) initiates
> intensive disk I/O, the system grinds to a halt, kswapd and kupdated
> consume upwards of 40% to 60% CPU each, and system load averages can jump
> upwards of 21.00. The problem can be replicated with a simple find command
> ("find / -print" seems to do it nicely).
> I have had two rather painful nights dealing with this (Monday and Tuesday
> nights). Luckily, I have a serial null-modem cable rigged up between the
> troubled server and another server, and was able to capture all the info
> from the Magic Sysrq commands that I could.
> Full details are at http://castandcrew.com/~gregory/lkmlstuff/burpr/2.4.20

Hmm, /proc/slabinfo would be very helpful, as well as /proc/meminfo.
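
For anyone collecting the meminfo side of this, here is a minimal sketch of
pulling the lowmem/highmem fields out of /proc/meminfo. It assumes the
"Name:  123 kB" line format and the LowTotal/LowFree/HighTotal/HighFree
fields that a highmem-enabled kernel exports; the function names
parse_meminfo and lowmem_report are made up for illustration and are not
part of any existing tool.

```python
def parse_meminfo(text):
    """Return {field: kilobytes} for 'Name:  123 kB' style lines."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip column headers and anything whose first value is not a number.
        if not sep or not parts or not parts[0].isdigit():
            continue
        # Keep only "<number> kB" rows; this drops the multi-column
        # byte summary rows ("Mem:", "Swap:") that 2.4 prints first.
        if len(parts) == 2 and parts[1] == "kB":
            fields[key.strip()] = int(parts[0])
    return fields


def lowmem_report(fields):
    """Pick out the fields that matter when lowmem exhaustion is suspected."""
    wanted = ("MemTotal", "MemFree", "LowTotal", "LowFree",
              "HighTotal", "HighFree")
    return {name: fields[name] for name in wanted if name in fields}


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        report = lowmem_report(parse_meminfo(f.read()))
    for name, kb in report.items():
        print(f"{name:10s} {kb:>10d} kB")
```

Running it once while idle and once during the "find / -print" load should
show whether LowFree collapses while HighFree stays large.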

On Thu, Mar 13, 2003 at 04:27:22PM -0800, Gregory K. Ruiz-Ade wrote:
> I've included the kernel config, the kernel and initrd images, the system
> map file, output from "ps auxfww" and a couple screen scrapings from top,
> and captures from magic sysrq commands from both crashes.
> I had problems like this with 2.4.19, and was directed to apply a patch to
> inode.c, which appears to be part of a patch set for 2.4.19pre9aa2. I've
> archived it at:
> http://castandcrew.com/~gregory/lkmlstuff/burpr/2.4.19/patches/10_inode-highmem-2
> For 2.4.19, this solves _most_ of the stability issues, but I still have to
> work with the LVM people and possibly whomever is responsible for the VM in
> 2.4.19/2.4.20 to track down some kernel oopses (possibly a separate
> problem.)
> I will happily provide whatever other information is needed, though my
> opportunities to test things on the machine in question are limited by the
> fact that it's a production server.

You might need the bh (buffer_head) stuff (memclass-related or something
like it) if it's general disk I/O. Can't be too sure until slabinfo +
meminfo materialize.
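
To see which slab caches are eating memory once slabinfo is captured, a
rough sketch like the one below can rank them by footprint. It assumes the
2.4-style column order (name, active objects, total objects, object size,
active slabs, total slabs, pages per slab) and a 4096-byte page; both are
assumptions to check against the actual file, and the helper names are
hypothetical.

```python
PAGE_SIZE = 4096  # assumption: i386 page size in bytes


def parse_slabinfo(text):
    """Parse slabinfo rows into dicts, assuming the 2.4 column order."""
    caches = []
    for line in text.splitlines():
        parts = line.split()
        # Skip the version header and any row without six numeric columns.
        if len(parts) < 7 or not all(p.isdigit() for p in parts[1:7]):
            continue
        (active_objs, total_objs, obj_size,
         active_slabs, total_slabs, pages_per_slab) = map(int, parts[1:7])
        caches.append({
            "name": parts[0],
            "active_objs": active_objs,
            "total_objs": total_objs,
            "obj_size": obj_size,
            # Memory held by the cache, counted in whole slabs.
            "kb": total_slabs * pages_per_slab * PAGE_SIZE // 1024,
        })
    return caches


def top_caches(caches, n=10):
    """Return the n caches holding the most memory, largest first."""
    return sorted(caches, key=lambda c: c["kb"], reverse=True)[:n]


if __name__ == "__main__":
    with open("/proc/slabinfo") as f:
        for c in top_caches(parse_slabinfo(f.read())):
            print(f"{c['name']:24s} {c['kb']:>10d} kB "
                  f"({c['active_objs']}/{c['total_objs']} objs)")
```

If buffer_head or inode_cache sits at the top of that list during the
stall, that narrows down which fix is worth trying.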

-- wli