Re: [PATCH] mm: swap prefetch improvements

From: Con Kolivas
Date: Sat May 12 2007 - 01:14:19 EST


On Saturday 12 May 2007 15:03, Paul Jackson wrote:
> > Swap prefetch is not cpuset aware so make the config option depend on
> > !CPUSETS.
>
> Ok.
>
> Could you explain what it means to say "swap prefetch is not cpuset aware",
> or could you give a rough idea of what it would take to make it cpuset
> aware?

Hmm, I'm not really sure what it would take to make it cpuset aware; it was
Nick who pointed out that it was not. I'm still going off your original
recommendation that there was no need to make it cpuset aware, but that it
should at least honour node placement (see below).

> I wouldn't go so far as to say that no one would ever want to prefetch and
> use cpusets at the same time, but I will grant that it's not a sufficiently
> important need that it should block a useful prefetch implementation on
> non-cpuset systems.

Thank you for agreeing with me there :)

> One case that would be useful, however, is to handle prefetch in the case
> that cpusets are configured into one's kernel, but one is not making any
> real use of them ('number_of_cpusets' <= 1). That will actually be the
> most common case for the major distribution(s) that enable cpusets by
> default in their builds, for most arch's including the arch's popular
> on desktops.
>
> So what would it take to allow CONFIG'ing both prefetch and cpusets on,
> but having prefetch dynamically adapt to the presence of active cpuset
> usage, perhaps by basically shutting down if it can't easily do any
> better? I could certainly entertain requests to callout to some
> prefetch routine from the cpuset code, at the critical points that
> cpusets transitioned in or out of active use.

It would be absolutely trivial to add a check for 'number_of_cpusets' <= 1 in
the prefetch_enabled() function. Would you like that?
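
Something like this, say (just a sketch; the body around the check is
illustrative, though number_of_cpusets is the real kernel symbol exported
from kernel/cpuset.c):

#include <linux/cpuset.h>	/* number_of_cpusets */

static inline int prefetch_enabled(void)
{
#ifdef CONFIG_CPUSETS
	/*
	 * number_of_cpusets <= 1 means only the root cpuset exists,
	 * i.e. cpusets are compiled in but not in real use, so
	 * prefetch cannot violate anyone's placement constraints.
	 */
	if (number_of_cpusets > 1)
		return 0;
#endif
	return 1;	/* the existing enablement checks would go here */
}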

> Semi-separate issue -- is it just cpusets that aren't prefetch friendly,
> or is it also mm/mempolicy (mbind, set_mempolicy) as well?
>
> For that matter, even if neither mm/mempolicy nor cpusets are used, on
> systems with multiple memory nodes (not all memory equally distant from
> all CPUs, aka NUMA), could prefetch cause some sort of shuffling of
> memory placement, which might harm the performance of an HPC (High
> Performance Computing) application with carefully tuned memory
> placement. Granted, this -is- getting to be a corner case. Most HPC
> apps running on NUMA hardware are making at least some use of
> mm/mempolicy or cpusets.

It is NUMA aware to some degree. It stores the node id of each swapped page,
and when it starts prefetching it only prefetches to nodes that are suitable
targets (based on a number of arbitrary freeness heuristics I invented). It
allocates the page back on the node it originally came from via:
alloc_pages_node(node, GFP_HIGHUSER & ~__GFP_WAIT, 0);
where "node" is the original node the swapped page came from.

Thanks for comments.

--
-ck