Re: [PATCH 2.6.9-rc2-mm1 0/2] mm: memory policy for page cacheallocation

From: Paul Jackson
Date: Mon Sep 20 2004 - 15:43:03 EST


Nits ...

1) better change this line in mempolicy.h

#define MPOL_MAX MPOL_INTERLEAVE

to be instead

#define MPOL_MAX MPOL_ROUNDROBIN

2) Why change the alloc_page_vma() code structure for
MPOL_INTERLEAVE from starting with:

if (unlikely(pol->policy == MPOL_INTERLEAVE)) {

to starting with:

switch (pol->policy) {
case MPOL_INTERLEAVE:

Doesn't the original way work just as well (and keep the patch
smaller)? Just add another 'if(...MPOL_ROUNDROBIN)' section
following the MPOL_INTERLEAVE section. And the other switch
statements in this file don't indent the case lines a tab further.

3) The following line looks like it could trigger after a
hotplug node removal (not that I know how to do that yet):

BUG_ON(!test_bit(nid, (const volatile void *) &node_online_map));

Should the '(const volatile void *)' cast be a nodes_addr() wrapper?

Can this entire line be dropped, or turned into a test:

if (!node_isset(nid, node_online_map))
continue;

4) Patches done with the 'diff -p' option are slightly easier to
read, as they show the procedure the diff seems to appear in.

5) I see no need for the 'else' in:

- if (pol->policy == MPOL_INTERLEAVE)
+ if (pol->policy == MPOL_INTERLEAVE) {
return alloc_page_interleave(gfp, order, interleave_nodes(pol));
+ } else if (pol->policy == MPOL_ROUNDROBIN) {
+ return alloc_page_roundrobin(gfp, order, pol);
+ }

Couldn't one just have this less intrusive patch instead:

if (pol->policy == MPOL_INTERLEAVE)
return alloc_page_interleave(gfp, order, interleave_nodes(pol));
+ if (pol->policy == MPOL_ROUNDROBIN)
+ return alloc_page_roundrobin(gfp, order, pol);
return __alloc_pages(gfp, order, zonelist_policy(gfp, pol));
}

6) Can the added rr_next in task_struct be shared with il_next?

7) Why doesn't alloc_page_roundrobin() have its own accounting, like
alloc_page_interleave() does?

8) Could you explain the for loop in alloc_page_roundrobin()? Won't
the first call to __alloc_pages() within the loop search down all
the nodes in the system, in a numa friendly order (closer nodes
first)? Why then pick another node to search from, in the next
pass of the for loop, which will again search down the same nodes,
using a differently sorted zonelist. Obviously I'm missing something
here.

9) Too bad there's not some pseudo-random value floating around somewhere,
such as a per-node clock or something, that could be used to drive a
pseudo-uniform distribution, without any need for the additional rr_next
state?

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@xxxxxxx> 1.650.933.1373
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/