Re: [PATCH v2 2/2] vmscan: shrink_slab() require number of lru_pages, not page order

From: Andrew Morton
Date: Thu Jul 08 2010 - 16:33:12 EST


On Thu, 8 Jul 2010 09:04:18 -0500 (CDT)
Christoph Lameter <cl@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Thu, 8 Jul 2010, KOSAKI Motohiro wrote:
>
> > Fix a simple argument error. 'order' is usually a much smaller value than
> > lru_pages, so passing it can cause unnecessary icache dropping.
>
> AFAICT this is not an argument error; rather, someone changed the name of
> the parameter.

It's been there since day zero:

: commit 2a16e3f4b0c408b9e50297d2ec27e295d490267a
: Author: Christoph Lameter <clameter@xxxxxxxxxxxx>
: AuthorDate: Wed Feb 1 03:05:35 2006 -0800
: Commit: Linus Torvalds <torvalds@xxxxxxxxxxx>
: CommitDate: Wed Feb 1 08:53:16 2006 -0800
:
: [PATCH] Reclaim slab during zone reclaim
:
: If large amounts of zone memory are used by empty slabs then zone_reclaim
: becomes uneffective. This patch shakes the slab a bit.
:
: The problem with this patch is that the slab reclaim is not containable to a
: zone. Thus slab reclaim may affect the whole system and be extremely slow.
: This also means that we cannot determine how many pages were freed in this
: zone. Thus we need to go off node for at least one allocation.
:
: The functionality is disabled by default.
:
: We could modify the shrinkers to take a zone parameter but that would be quite
: invasive. Better ideas are welcome.
:
: Signed-off-by: Christoph Lameter <clameter@xxxxxxx>
: Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
: Signed-off-by: Linus Torvalds <torvalds@xxxxxxxx>
:
: diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
: index 4bca2a3..a46c10f 100644
: --- a/Documentation/sysctl/vm.txt
: +++ b/Documentation/sysctl/vm.txt
: @@ -137,6 +137,7 @@ This is value ORed together of
: 1 = Zone reclaim on
: 2 = Zone reclaim writes dirty pages out
: 4 = Zone reclaim swaps pages
: +8 = Also do a global slab reclaim pass
:
: zone_reclaim_mode is set during bootup to 1 if it is determined that pages
: from remote zones will cause a measurable performance reduction. The
: @@ -160,6 +161,11 @@ Allowing regular swap effectively restricts allocations to the local
: node unless explicitly overridden by memory policies or cpuset
: configurations.
:
: +It may be advisable to allow slab reclaim if the system makes heavy
: +use of files and builds up large slab caches. However, the slab
: +shrink operation is global, may take a long time and free slabs
: +in all nodes of the system.
: +
: ================================================================
:
: zone_reclaim_interval:
: diff --git a/mm/vmscan.c b/mm/vmscan.c
: index 9e2ef36..aa4b80d 100644
: --- a/mm/vmscan.c
: +++ b/mm/vmscan.c
: @@ -1596,6 +1596,7 @@ int zone_reclaim_mode __read_mostly;
: #define RECLAIM_ZONE (1<<0) /* Run shrink_cache on the zone */
: #define RECLAIM_WRITE (1<<1) /* Writeout pages during reclaim */
: #define RECLAIM_SWAP (1<<2) /* Swap pages out during reclaim */
: +#define RECLAIM_SLAB (1<<3) /* Do a global slab shrink if the zone is out of memory */
:
: /*
: * Mininum time between zone reclaim scans
: @@ -1666,6 +1667,19 @@ int zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
:
:  	} while (sc.nr_reclaimed < nr_pages && sc.priority > 0);
:
: +	if (sc.nr_reclaimed < nr_pages && (zone_reclaim_mode & RECLAIM_SLAB)) {
: +		/*
: +		 * shrink_slab does not currently allow us to determine
: +		 * how many pages were freed in the zone. So we just
: +		 * shake the slab and then go offnode for a single allocation.
: +		 *
: +		 * shrink_slab will free memory on all zones and may take
: +		 * a long time.
: +		 */
: +		shrink_slab(sc.nr_scanned, gfp_mask, order);
: +		sc.nr_reclaimed = 1;    /* Avoid getting the off node timeout */
: +	}
: +
:  	p->reclaim_state = NULL;
:  	current->flags &= ~PF_MEMALLOC;

> The "lru_pages" parameter is really a division factor affecting
> the number of pages scanned. This patch increases this division factor
> significantly and therefore reduces the number of items scanned during
> zone_reclaim.
>

And for that reason I won't apply the patch. I'd be crazy to do so.
It tosses away four years of testing, replacing it with something which
could have a large effect on reclaim behaviour, with no indication
whether that effect is good or bad.
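
To put rough numbers on the "division factor" point quoted above, here is a
minimal stand-alone sketch. It assumes that shrink_slab() of this era sizes
each shrinker's scan as (4 * scanned / seeks) * max_pass / (lru_pages + 1);
that formula is an assumption made for illustration, not something stated in
this thread. The helper name slab_scan_delta() and the sample values are
hypothetical, and this is a model of the arithmetic, not kernel code.

```c
/*
 * Hypothetical user-space model of the per-shrinker scan count.
 * Assumed formula (not taken from this thread):
 *     delta = (4 * scanned / seeks) * max_pass / (lru_pages + 1)
 *
 * scanned   - LRU pages scanned by the caller
 * max_pass  - number of objects the shrinker reports as freeable
 * lru_pages - third argument to shrink_slab(); acts as the divisor
 * seeks     - relative cost of recreating an object; must be non-zero
 */
static unsigned long long slab_scan_delta(unsigned long scanned,
                                          unsigned long max_pass,
                                          unsigned long lru_pages,
                                          int seeks)
{
	unsigned long long delta;

	delta = (4ULL * scanned) / (unsigned long long)seeks;
	delta *= max_pass;
	/* The "+ 1" keeps the divisor non-zero when lru_pages is 0. */
	delta /= (unsigned long long)lru_pages + 1;

	return delta;
}
```

Under those assumptions, with scanned = 1000, max_pass = 10000 and seeks = 2,
passing an allocation order of 0 as the third argument gives a delta of
20000000, while passing an lru_pages count of 100000 gives 199. That gap of
roughly five orders of magnitude is the kind of behavioural change at issue.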
