Re: [2.4] heavy-load under swap space shortage

From: Andrew Morton
Date: Mon Mar 15 2004 - 13:37:15 EST


Andrea Arcangeli <andrea@xxxxxxx> wrote:
>
> On Tue, Mar 16, 2004 at 01:37:04AM +1100, Nick Piggin wrote:
> > This case I think is well worth the unfairness it causes, because it
> > means your zone's pages can be freed quickly and without freeing pages
> > from other zones.
>
> freeing pages from other zones is perfectly fine, the classzone design
> gets it right, you have to free memory from the other zones too or you
> have no way to work on a 1G machine. you call the thing "unfair" when it
> has nothing to do with fairness, your unfairness is the slowdown I
> pointed out,

This "slowdown" is purely theoretical and has never been demonstrated.

One could just as easily point at the fact that on a 32GB machine with a
single LRU we have to send 64 highmem pages to the wrong end of the LRU for
each scanned lowmem page, thus utterly destroying any concept of it being
an LRU in the first place. But this is also theoretical, and has never
been demonstrated and is thus uninteresting.

What _is_ interesting is the way in which the single LRU collapses when
there are a huge number of highmem pages on the tail and then there
is a surge in lowmem demand. This was demonstrated, and is what prompted
the per-zone LRU.




Begin forwarded message:

Date: Sun, 04 Aug 2002 01:35:22 -0700
From: Andrew Morton <akpm@xxxxxxxxxx>
To: "linux-mm@xxxxxxxxx" <linux-mm@xxxxxxxxx>
Subject: how not to write a search algorithm


Worked out why my box is going into a 3-5 minute coma with one test.
Think what the LRUs look like when the test first hits page reclaim
on this 2.5G ia32 box:

                head                       tail
active_list:    <800M of ZONE_NORMAL>      <200M of ZONE_HIGHMEM>
inactive_list:  <1.5G of ZONE_HIGHMEM>

now, somebody does a GFP_KERNEL allocation.

uh-oh.

VM calls refill_inactive. That moves 25 ZONE_HIGHMEM pages onto
the inactive list. It then scans 5000 pages, achieving nothing.

VM calls refill_inactive. That moves 25 ZONE_HIGHMEM pages onto
the inactive list. It then scans about 10000 pages, achieving nothing.

VM calls refill_inactive. That moves 25 ZONE_HIGHMEM pages onto
the inactive list. It then scans about 20000 pages, achieving nothing.

VM calls refill_inactive. That moves 25 ZONE_HIGHMEM pages onto
the inactive list. It then scans about 40000 pages, achieving nothing.

VM calls refill_inactive. That moves 25 ZONE_HIGHMEM pages onto
the inactive list. It then scans about 80000 pages, achieving nothing.

VM calls refill_inactive. That moves 25 ZONE_HIGHMEM pages onto
the inactive list. It then scans about 160000 pages, achieving nothing.

VM calls refill_inactive. That moves 25 ZONE_HIGHMEM pages onto
the inactive list. It then scans about 320000 pages, achieving nothing.

The page allocation fails. So __alloc_pages tries it all again.
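
To put a number on the sequence above, here is a rough model of one failed
allocation attempt. It is only a sketch: the constants are read straight off
the figures in this message (25 pages per refill_inactive call, a scan that
starts at 5000 pages and roughly doubles over seven passes), and the function
name and structure are made up for illustration, not taken from the kernel.

```python
# Illustrative model only. The constants come from the figures quoted in
# the message, not from kernel source; the names are hypothetical.
PAGES_MOVED_PER_REFILL = 25   # ZONE_HIGHMEM pages moved per refill_inactive call
INITIAL_SCAN = 5000           # pages scanned on the first pass
PASSES = 7                    # passes before the allocation fails


def simulate_reclaim_attempt(initial_scan=INITIAL_SCAN, passes=PASSES):
    """Return (pages_scanned, pages_refilled, lowmem_pages_freed)."""
    pages_scanned = 0
    pages_refilled = 0
    lowmem_pages_freed = 0    # the inactive list holds only ZONE_HIGHMEM pages
    scan = initial_scan
    for _ in range(passes):
        pages_refilled += PAGES_MOVED_PER_REFILL
        pages_scanned += scan  # every scanned page is in the wrong zone
        scan *= 2              # assumed: scan size doubles on each pass
    return pages_scanned, pages_refilled, lowmem_pages_freed


if __name__ == "__main__":
    scanned, refilled, freed = simulate_reclaim_attempt()
    print(f"scanned={scanned} refilled={refilled} lowmem_freed={freed}")
```

Under those assumptions a single attempt scans 635000 pages and moves 175
pages between lists without freeing one ZONE_NORMAL page, and the retry in
__alloc_pages starts the whole thing over.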


This all gets rather boring.


Per-zone LRUs will fix it up. We need that anyway, because a ZONE_NORMAL
request will bogusly refile, on average, memory_size/800M pages to the
head of the inactive list, thus wrecking page aging.
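
As a rough illustration of the memory_size/800M estimate, the ratio for the
2.5G box in this message can be worked out as below. The helper name and the
choice of megabytes as the unit are assumptions for the sketch; 800M is the
ZONE_NORMAL figure used above.

```python
# Illustrative arithmetic only; the helper name is hypothetical.
LOWMEM_MB = 800  # ZONE_NORMAL size used in the message


def refiled_per_lowmem_request(memory_mb, lowmem_mb=LOWMEM_MB):
    """Average pages refiled per ZONE_NORMAL request: memory_size / lowmem."""
    return memory_mb / lowmem_mb


if __name__ == "__main__":
    # 2.5G box from the message, taken as 2560 MB
    print(refiled_per_lowmem_request(2560))
```

For this box that comes to roughly three pages refiled per lowmem request,
and the figure grows linearly with total memory.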

Alan's kernel has a nice-looking implementation. I'll lift that out
next week unless someone beats me to it.