Re: [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING

From: Huang, Ying
Date: Wed Dec 02 2020 - 20:50:05 EST


Mel Gorman <mgorman@xxxxxxx> writes:

> On Wed, Dec 02, 2020 at 04:42:33PM +0800, Huang Ying wrote:
>> Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
>> ---
>> man2/set_mempolicy.2 | 9 +++++++++
>> 1 file changed, 9 insertions(+)
>>
>> diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2
>> index 68011eecb..3754b3e12 100644
>> --- a/man2/set_mempolicy.2
>> +++ b/man2/set_mempolicy.2
>> @@ -113,6 +113,12 @@ A nonempty
>> .I nodemask
>> specifies node IDs that are relative to the set of
>> node IDs allowed by the process's current cpuset.
>> +.TP
>> +.BR MPOL_F_NUMA_BALANCING " (since Linux 5.11)"
>> +Enable the Linux kernel NUMA balancing for the task if it is supported
>> +by kernel.
>> +If the flag isn't supported by Linux kernel, return -1 and errno is
>> +set to EINVAL.
>> .PP
>> .I nodemask
>> points to a bit mask of node IDs that contains up to
>> @@ -293,6 +299,9 @@ argument specified both
>
> Should this be expanded more to clarify it applies to MPOL_BIND
> specifically?
>
> Maybe the first patch should be expanded more and explicitly fail if
> MPOL_F_NUMA_BALANCING is used with anything other than MPOL_BIND?

For MPOL_PREFERRED, why shouldn't we use NUMA balancing to migrate pages
to the accessing local node when that node is the same as the preferred
node? We already have a way to turn NUMA balancing off, so why not
provide a way to enable it when that is what the user intends?

Even for MPOL_INTERLEAVE, if the target node is the same as the
accessing local node, couldn't we use NUMA balancing to migrate pages?

So I would prefer to define MPOL_F_NUMA_BALANCING as:

  Optimize with NUMA balancing if possible; more optimizations may be
  added in the future.

Do you agree?

Best Regards,
Huang, Ying

>> .B MPOL_F_STATIC_NODES
>> and
>> .BR MPOL_F_RELATIVE_NODES .
>> +Or, the
>> +.B MPOL_F_NUMA_BALANCING
>> +isn't supported by the Linux kernel.
>
> This will be difficult for an app to distinguish but we can't go back in
> time and make this ENOSYS :(
>
> The linux-api people might have more guidance but it may go to the
> extent of including a small test program in the manual page for a
> sequence that tests whether MPOL_F_NUMA_BALANCING works. They might have
> a better recommendation on how it should be handled.
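
A minimal sketch of the kind of test sequence mentioned above, not a
proposed man-page example. It first applies plain MPOL_BIND to node 0 as
a baseline, then retries with MPOL_F_NUMA_BALANCING; EINVAL on the second
call only then points at the flag. The function name, the PROBE_ prefixed
constants and the PROBE_STANDALONE guard are invented for this sketch.
The numeric values are taken from the upstream <linux/mempolicy.h>, and
the sketch assumes node 0 is allowed by the current cpuset.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values as in the kernel's <linux/mempolicy.h>, spelled out here under
 * invented PROBE_ names so the sketch builds with older headers too. */
#define PROBE_MPOL_DEFAULT          0
#define PROBE_MPOL_BIND             2
#define PROBE_MPOL_F_NUMA_BALANCING (1 << 13)

/* Thin wrapper around the raw system call (no libnuma dependency). */
static long probe_set_mempolicy(int mode, const unsigned long *mask,
                                unsigned long maxnode)
{
    return syscall(SYS_set_mempolicy, mode, mask, maxnode);
}

/*
 * Returns  1 if MPOL_BIND | MPOL_F_NUMA_BALANCING is accepted,
 *          0 if it is rejected with EINVAL while plain MPOL_BIND works,
 *         -1 if the result is indeterminate (baseline failed, no NUMA
 *            support, call filtered by a sandbox, ...).
 * Side effect: the calling thread's policy is reset to MPOL_DEFAULT.
 */
int probe_mpol_f_numa_balancing(void)
{
    unsigned long mask = 1UL;                 /* node 0 only (assumption) */
    unsigned long maxnode = 8 * sizeof(mask);
    int result;

    /* Baseline: the same mode and nodemask without the new flag. */
    if (probe_set_mempolicy(PROBE_MPOL_BIND, &mask, maxnode) == -1)
        return -1;

    if (probe_set_mempolicy(PROBE_MPOL_BIND | PROBE_MPOL_F_NUMA_BALANCING,
                            &mask, maxnode) == 0)
        result = 1;
    else if (errno == EINVAL)
        result = 0;
    else
        result = -1;

    /* Undo the probe; this does not restore an earlier non-default policy. */
    probe_set_mempolicy(PROBE_MPOL_DEFAULT, NULL, 0);
    return result;
}

#ifdef PROBE_STANDALONE
int main(void)
{
    int r = probe_mpol_f_numa_balancing();

    printf("MPOL_F_NUMA_BALANCING: %s\n",
           r == 1 ? "supported" : r == 0 ? "not supported" : "unknown");
    return r == 1 ? 0 : 1;
}
#endif
```

Building with -DPROBE_STANDALONE gives a small command-line test. An
application that already runs under a non-default policy would need to
save it with get_mempolicy() first and restore it afterwards instead of
resetting to MPOL_DEFAULT.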