2.6.32 swapper allocation failure with plenty of memory available
From: Mikael Abrahamsson
Date: Tue Aug 10 2010 - 02:06:17 EST
Hi.
Yesterday my Ubuntu 10.04 machine, running their 2.6.32 (amd64) kernel,
stopped responding while under heavy disk IO and network load. I thought
it had frozen completely, but ~2 hours later it came back to life.
When I logged in I saw a lot of "swapper allocation failure" messages and
r8169 timeouts in dmesg (this is the first time I've seen the problem
cause network instability like this, but it's also the first motherboard
I've tested that has an r8169 NIC).
I've had this problem before with older kernels on other hardware
<https://bugs.launchpad.net/ubuntu/+source/linux/+bug/296275>, and it
seems related to having a lot of TCP sessions up moving data, in
conjunction with pretty aggressive TCP tuning for a large bandwidth-delay
product (4-8 megs of tcp memory settings with sysctl).
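
For context on why buffers of that size come up at all: the
bandwidth-delay product is the link bandwidth multiplied by the round-trip
time, and it is roughly the amount of data a single TCP session needs in
flight to fill the path. A minimal sketch of the arithmetic follows; the
1 Gbit/s and 50 ms figures are assumed for illustration only and are not
measurements from this machine.

```python
def bdp_bytes(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes for a given link speed and RTT."""
    # bits/s * ms / 1000 gives bits in flight; divide by 8 for bytes.
    return bandwidth_bits_per_s * rtt_ms // (8 * 1000)


# Hypothetical path: 1 Gbit/s with a 50 ms round-trip time.
print(bdp_bytes(1_000_000_000, 50))
```

With those assumed numbers the result is a little over 6 million bytes per
session, which is the same order as the 4-8 meg settings mentioned above.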
The machine has 8 gigs of RAM (core i5 + P7H57D-V EVO motherboard) and was
running programs that were using ~2 gigs of memory, so most of the memory
was used for buffers and disk cache.
Unless this has been fixed since 2.6.32, I suspect it's still a problem
even in newer kernels because the behaviour seems to have been present
since at least 2.6.24. Generally, tuning down the TCP wmem and rmem etc to
~1 megabyte makes the problem go away.
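
To check whether a machine is running with limits above that ~1 megabyte
mark, something like the following could be used. It is only a sketch: it
assumes the usual "min default max" layout of
/proc/sys/net/ipv4/tcp_rmem and tcp_wmem, and the helper names and the
1 MiB threshold are mine, chosen for illustration.

```python
from pathlib import Path
from typing import Tuple

ONE_MIB = 1024 * 1024  # threshold taken from the ~1 megabyte workaround


def parse_tcp_mem_triple(text: str) -> Tuple[int, int, int]:
    """Parse the 'min default max' byte values of tcp_rmem / tcp_wmem."""
    low, default, high = (int(field) for field in text.split())
    return low, default, high


def exceeds_limit(text: str, limit: int = ONE_MIB) -> bool:
    """Return True if the max value of the triple is above the limit."""
    return parse_tcp_mem_triple(text)[2] > limit


if __name__ == "__main__":
    for name in ("tcp_rmem", "tcp_wmem"):
        path = Path("/proc/sys/net/ipv4") / name
        if path.exists():  # only present on Linux
            value = path.read_text()
            print(name, parse_tcp_mem_triple(value), exceeds_limit(value))
```

This only reports the current settings; it does not change them.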
Please see attached dmesg file for more information.
--
Mikael Abrahamsson email: swmike@xxxxxxxxx
Attachment: dmesg.100809-2.txt.gz
Description: Binary data