Re: [patch 5/6] mempolicy: add MPOL_F_RELATIVE_NODES flag

From: David Rientjes
Date: Tue Feb 26 2008 - 20:17:52 EST


On Mon, 25 Feb 2008, David Rientjes wrote:

> Adds another optional mode flag, MPOL_F_RELATIVE_NODES, that specifies
> nodemasks passed via set_mempolicy() or mbind() should be considered
> relative to the current task's mems_allowed.
>

Here are some examples of the functional differences between the default
behavior of the various mempolicy modes and the new behavior with
MPOL_F_STATIC_NODES or MPOL_F_RELATIVE_NODES.

Read each table from the left-most column to the right-most; the columns
are in chronological order:

- "mems" is the task's mems_allowed as constrained by its attached
cpuset,

- "nodemask" is the mask passed with the set_mempolicy() or mbind() call
for that particular policy,

- the first "result" is the nodemask that the policy takes effect over,

- "rebind" is the nodemask of a subsequent change to the cpuset's mems,
and

- the second "result" is the nodemask that the policy takes effect over
after that change.

MPOL_INTERLEAVE
---------------
mems      nodemask  result    rebind    result
1-3       0-2       1-2[*]    4-6       4-5
1-3       1-2       1-2       0-2       0-1
1-3       1-3       1-3       4-7       4-6
1-3       2-4       2-3       0-2       1-2
1-3       2-6       2-3       4-7       5-6
1-3       4-7       EINVAL
1-3       0-7       1-3       4-7       4-6

MPOL_PREFERRED
--------------
mems      nodemask  result    rebind    result
1-3       0         EINVAL
1-3       2         2         4-7       5
1-3       5         EINVAL

MPOL_BIND
---------
mems      nodemask  result    rebind    result
1-3       0-2       1-2       0-2       0-1
1-3       1-2       1-2       2-7       2-3
1-3       1-3       1-3       0-1       0-1
1-3       2-4       2-3       3-6       4-5
1-3       2-6       2-3       5         5
1-3       4-7       EINVAL
1-3       0-7       1-3       1-3       1-3

[*] Note that, in all of these examples, the nodemask is intersected
with mems_allowed when the mempolicy is created. This is the current
behavior, via contextualize_policy(), and is identical to the initial
result in the MPOL_F_STATIC_NODES case.

Perhaps it would make more sense to also remap the nodemask when the
policy is created in the non-MPOL_F_STATIC_NODES case. In the row marked
[*], for example, the "result" would then be 1-3 instead.

That is a departure from what is currently implemented in HEAD (and,
thus, can be used as ample justification for the above behavior) but
makes more sense. Thoughts?
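
For reference, the default (no mode flag) rows above can be reproduced
with a small user-space model. This is only a sketch: parse(),
remap_by_ordinal() and default_policy() are hypothetical helper names,
and the remapping rule (the n-th node of the old mems maps to the n-th
node of the new mems, wrapping around) is inferred from the tables rather
than copied from the kernel source.

```python
def parse(spec):
    """Parse a node list such as "0-2,5" into a set of node ids."""
    nodes = set()
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        nodes.update(range(int(lo), int(hi or lo) + 1))
    return nodes


def remap_by_ordinal(src, old, new):
    """Map each node in src from its position in old to that position in new."""
    old_sorted, new_sorted = sorted(old), sorted(new)
    out = set()
    for node in src:
        if node not in old or not new_sorted:
            out.add(node)  # nothing to map against: keep the node as-is
        else:
            ordinal = old_sorted.index(node)
            out.add(new_sorted[ordinal % len(new_sorted)])  # wrap if new is smaller
    return out


def default_policy(mems, nodemask, rebind):
    """Return (result at creation, result after the cpuset mems change)."""
    allowed = parse(mems)
    created = parse(nodemask) & allowed  # intersect with mems_allowed
    if not created:
        raise ValueError("EINVAL")
    return created, remap_by_ordinal(created, allowed, parse(rebind))
```

For example, default_policy("1-3", "0-2", "4-6") models the first
MPOL_INTERLEAVE row: the policy starts on nodes 1-2 and moves to 4-5.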

MPOL_INTERLEAVE | MPOL_F_STATIC_NODES
-------------------------------------
mems      nodemask  result    rebind    result
1-3       0-2       1-2       4-6       nil
1-3       1-2       1-2       0-2       1-2
1-3       1-3       1-3       4-7       nil
1-3       2-4       2-3       0-2       2
1-3       2-6       2-3       4-7       4-6
1-3       4-7       EINVAL
1-3       0-7       1-3       4-7       4-7

MPOL_PREFERRED | MPOL_F_STATIC_NODES
------------------------------------
mems      nodemask  result    rebind    result
1-3       0         EINVAL
1-3       2         2         4-7       -1[**]
1-3       5         EINVAL

[**] Upon further rebind with a nodemask of 2, the preferred node would
again be 2.

MPOL_BIND | MPOL_F_STATIC_NODES
-------------------------------
mems      nodemask  result    rebind    result
1-3       0-2       1-2       0-2       0-2
1-3       1-2       1-2       2-7       2
1-3       1-3       1-3       0-1       1
1-3       2-4       2-3       3-6       3-4
1-3       2-6       2-3       5         5
1-3       4-7       EINVAL
1-3       0-7       1-3       1-3       1-3
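
The MPOL_F_STATIC_NODES rows above all follow one rule: the nodemask the
user passed is remembered unchanged and intersected with whatever mems
the cpuset currently allows. A minimal sketch of that rule follows;
parse(), static_policy() and static_preferred() are hypothetical names
used only for illustration, and -1 stands for "no preferred node" as in
the MPOL_PREFERRED table.

```python
def parse(spec):
    """Parse a node list such as "0-2,5" into a set of node ids."""
    nodes = set()
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        nodes.update(range(int(lo), int(hi or lo) + 1))
    return nodes


def static_policy(mems, nodemask, rebind):
    """Return (result at creation, result after the cpuset mems change)."""
    requested = parse(nodemask)          # the user's mask is never remapped
    created = requested & parse(mems)
    if not created:
        raise ValueError("EINVAL")
    # On rebind, intersect the original request with the new mems.
    # An empty set corresponds to "nil" in the tables.
    return created, requested & parse(rebind)


def static_preferred(node, mems):
    """Preferred node under the current mems, or -1 if it is not allowed."""
    return node if node in parse(mems) else -1
```

Because the original request is kept, static_preferred(2, "4-7") gives
-1 while a later static_preferred(2, "2") gives 2 again, which is the
behavior described in [**].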

MPOL_INTERLEAVE | MPOL_F_RELATIVE_NODES
---------------------------------------
mems      nodemask  result    rebind    result
1-3       0-2       1-3       4-6       4-6
1-3       1-2       2-3       0-2       1-2
1-3       1-3       1-3       4-7       5-7
1-3       2-4       1-3       0-2       0-2
1-3       2-6       1-3       4-7       4-7
1-3       4-7       1-3       0-1,5     0-1,5
1-3       0-7       1-3       4-7       4-7

MPOL_PREFERRED | MPOL_F_RELATIVE_NODES
--------------------------------------
mems      nodemask  result    rebind    result[***]
1-3       0         1         0         1
1-3       2         3         4-7       3
1-3       5         3         0-7       3

[***] All of these results are wrong and will be corrected in the next
posting of the patchset. They change the preferred node in some
cases to be a node that is expressly excluded from being accessed
by the cpuset mems change.

MPOL_BIND | MPOL_F_RELATIVE_NODES
---------------------------------
mems      nodemask  result    rebind    result
1-3       0-2       1-3       0-2       0-2
1-3       1-2       2-3       2-7       3-4
1-3       1-3       1-3       0-1       0-1
1-3       2-4       1-3       3-6       3,5-6
1-3       2-6       1-3       5         5
1-3       4-7       1-3       0-3,6     0-2,6
1-3       0-7       1-3       1-3       1-3
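
The MPOL_INTERLEAVE and MPOL_BIND rows for MPOL_F_RELATIVE_NODES are
consistent with a two-step rule: fold the user's nodemask modulo the
number of allowed nodes, then treat each remaining bit n as "the n-th
allowed node". The sketch below models that rule in user space. The
names parse(), fold(), onto() and relative_nodes() are illustrative
assumptions, the rule is inferred from the tables, and mems is assumed
to be non-empty. The MPOL_PREFERRED rows flagged as wrong in [***] are
not modelled.

```python
def parse(spec):
    """Parse a node list such as "0-2,5" into a set of node ids."""
    nodes = set()
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        nodes.update(range(int(lo), int(hi or lo) + 1))
    return nodes


def fold(nodes, size):
    """Wrap every node id into the range 0..size-1."""
    return {n % size for n in nodes}


def onto(relative, target):
    """Map relative position n to the n-th lowest node in target."""
    ordered = sorted(target)
    return {ordered[n] for n in relative if n < len(ordered)}


def relative_nodes(nodemask, mems):
    """Effective nodes for a relative nodemask under the given mems."""
    allowed = parse(mems)  # assumed non-empty
    return onto(fold(parse(nodemask), len(allowed)), allowed)
```

The same function covers both the creation and the rebind columns, since
the user's nodemask is simply re-evaluated against the new mems: for
instance, relative_nodes("2-4", "1-3") and then
relative_nodes("2-4", "3-6") model the fourth MPOL_BIND row.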