Re: Free memory never fully used, swapping

From: KOSAKI Motohiro
Date: Mon Nov 29 2010 - 04:31:16 EST


Hi

> On Tue, Nov 23, 2010 at 12:35:31AM -0800, Dave Hansen wrote:
>
> > I wish. :) The best thing to do is to watch stuff like /proc/vmstat
> > along with its friends like /proc/{buddy,meminfo,slabinfo}. Could you
> > post some samples of those with some indication of where the bad
> > behavior was seen?
> >
> > I've definitely seen swapping in the face of lots of free memory, but
> > only in cases where I was being a bit unfair about the numbers of
> > hugetlbfs pages I was trying to reserve.
>
> So, Dave and I spent quite some time today figuring out what was going on
> here. Once load picked up during the day, kswapd actually never slept
> until late in the afternoon. During the evening now, it's still waking
> up in bursts, and still keeping way too much memory free:
>
> http://0x.ca/sim/ref/2.6.36/memory_tonight.png
>
> (NOTE: we did swapoff -a to keep /dev/sda from overloading)
>
> We have a much better idea on what is happening here, but more questions.
>
> This x86_64 box has 4 GB of RAM; zones are set up as follows:
>
> [ 0.000000] Zone PFN ranges:
> [ 0.000000] DMA 0x00000001 -> 0x00001000
> [ 0.000000] DMA32 0x00001000 -> 0x00100000
> [ 0.000000] Normal 0x00100000 -> 0x00130000
> ...
> [ 0.000000] On node 0 totalpages: 1047279
> [ 0.000000] DMA zone: 56 pages used for memmap
> [ 0.000000] DMA zone: 0 pages reserved
> [ 0.000000] DMA zone: 3943 pages, LIFO batch:0
> [ 0.000000] DMA32 zone: 14280 pages used for memmap
> [ 0.000000] DMA32 zone: 832392 pages, LIFO batch:31
> [ 0.000000] Normal zone: 2688 pages used for memmap
> [ 0.000000] Normal zone: 193920 pages, LIFO batch:31

This machine's zone sizes are roughly:

DMA32: 3250MB
NORMAL: 750MB
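
For reference, the figures above can be reproduced from the page counts in the
quoted boot log. This is only a minimal sketch: it assumes 4 KiB pages (the
usual x86_64 base page size), and the helper name pages_to_mib is made up for
illustration, not something from the kernel.

```python
# Convert the per-zone page counts from the quoted boot log into MiB.
# Assumption: 4 KiB base pages, as is usual on x86_64.
PAGE_SIZE = 4096  # bytes


def pages_to_mib(pages: int, page_size: int = PAGE_SIZE) -> float:
    """Return the size in MiB of `pages` pages (hypothetical helper)."""
    return pages * page_size / (1024 * 1024)


# Page counts copied from the "On node 0 totalpages" section of the log.
zone_pages = {"DMA": 3943, "DMA32": 832392, "Normal": 193920}

zone_mib = {name: pages_to_mib(count) for name, count in zone_pages.items()}

for name, mib in zone_mib.items():
    print(f"{name}: {mib:.1f} MiB")
```

The DMA32 and Normal results land close to the rounded 3250MB and 750MB
numbers, so DMA32 is more than four times the size of Normal.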

This imbalance in zone sizes is one of the root causes of the strange swapping
issue. I'm sure we certainly need to fix our VM heuristics. However,
no heuristic is perfect in the real world, and we can't make one that is.
Also, I guess the bug reporter needs a practical workaround.

So I wrote the following patch.

If you pass the following boot parameter, the zone division changes to
dma32=1G + normal=3G.

In grub.conf:

kernel /boot/vmlinuz ro root=foobar .... zone_dma32_size=1G


I bet this will reduce your headache a lot. Can you please try it?
Of course, this is only a workaround, not a true fix.