Re: [External Mail] [RFC PATCH] mm/mempolicy: Weighted interleave auto-tuning

From: Huang, Ying
Date: Thu Dec 26 2024 - 20:59:57 EST


Gregory Price <gourry@xxxxxxxxxx> writes:

> On Thu, Dec 26, 2024 at 09:35:32AM +0800, Huang, Ying wrote:
>> > Having two files for each node (nodeN, defaultN) seems a bit too
>> > cluttered for the user perspective. Making the nodeN interfaces serve
>> > multiple purposes (i.e. echo -1 into the nodes will output the default
>> > value for that node) also seems a bit too complicated, in my
>> > opinion. Maybe having a file 'weight_tables' that contains a table of
>> > default/user/effective weights (as have been used in these conversations)
>> > might be useful for the user? (Or maybe just the defaults)
>> >
>> > Then a workflow for the user may be as such:
>> >
>> > $ cat /sys/kernel/mm/mempolicy/weighted_interleave/weight_tables
>> > default values: [4,7,2]
>> > user values: [-,-,-]
>> > effective: [4,7,2]
>>
>> AFAIK, this breaks the sysfs attribute format rule as follows.
>>
>> https://docs.kernel.org/filesystems/sysfs.html#attributes
>>
>> It's hard to use array sysfs attribute here too. Because the node ID
>> may be non-consecutive. This makes it hard to read.
>>
>
> Would generally agree. I think essentially a
> use_defaults => (0 | 1)
> interface is probably the best we can do.
>
> Setting any node changes use_defaults from 1 => 0
> echoing 1 into use_defaults clears user_values
>
> This still allows 0 to be a manual "reset specific node to default"
> mechanism for a specific node, and gives us a clean override.

The difficulty is that users don't know the default value when they
reset a node's weight, because we have no interface to show it to them.
So I suggest disabling the "reset specific node to default"
functionality. Users can still use "echo 1 > use_defaults" to reset all
nodes to their defaults.
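
To make the suggested semantics concrete, here is a minimal userspace
sketch in Python. It is only a model of the rules discussed in this
thread, not kernel code. The function names (new_state, write_node,
write_use_defaults, effective) are hypothetical, and rejecting values
other than 1 for use_defaults is my assumption, since the thread does
not define that case.

```python
def new_state(defaults):
    """Model state: defaults keyed by node ID (IDs may be non-consecutive)."""
    return {"defaults": dict(defaults), "user": None}  # user=None <=> use_defaults == 1


def write_node(state, node, weight):
    """Model of 'echo <weight> > nodeN' with the per-node reset disabled."""
    if weight == 0:
        # Suggested change: 0 no longer means "reset this node to default".
        raise ValueError("per-node reset disabled; write 1 to use_defaults")
    if node not in state["defaults"]:
        raise KeyError(node)
    if state["user"] is None:
        # First user write flips use_defaults 1 => 0 and snapshots the defaults.
        state["user"] = dict(state["defaults"])
    state["user"][node] = weight


def write_use_defaults(state, value):
    """Model of 'echo 1 > use_defaults': clears all user values."""
    if value != 1:
        # Assumption: other values are not modelled here.
        raise ValueError("only 1 is modelled")
    state["user"] = None


def effective(state):
    """Weights that interleaving would use in this model."""
    table = state["defaults"] if state["user"] is None else state["user"]
    return dict(table)
```

With this shape the user never needs to know a single node's default:
either every node follows the defaults, or the user owns the whole
table.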

> The only question is a matter of hotplug behavior
>
> nodes_online: 0,1
> default_values: [5,3]
> user_values : [-,-]
>
> event: node1 is taken offline
> default_values: [5,3] <-- nothing happens
>
> event: node1 comes back online with different bandwidth attribute
> default_values: [6,5] <-- reweight has occurred silently
>
> event: user sets a custom value (node1 <= 2)
> default_values: [6,5]
> user_values: [6,2] <= note, *no reduction*
>
> event: node1 is taken offline
> default_values: [6,5]
> user_values: [6,2] <= value still present but not used
>
> event: node1 comes back online with different bandwidth attribute
> default_values: [5,3] <-- default reweight has occurred silently
> user_values : [6,2] <-- user responsible for triggering re-weight
>
> The user has the option of
>
> echo 1 > /sys/.../weighted_interleave/use_defaults
> result
> default_values: [5,3]
> user_values : [-,-]
> or
> echo 0 > /sys/.../weighted_interleave/node1
> result
> default_values: [5,3]
> user_values : [6,3] <= only node1 is updated, no re-weight
>
> Basically, if the user ever sets any value, we never automatically pull
> new values in, and the admin is responsible for triggering a re-weight
> (use_defaults) or manually reweighting *all* nodes - because changing
> values implies a change in the bandwidth distribution anyway.
>
> I think this makes the most sense.
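
For reference, the hotplug sequence quoted above can be replayed with a
small userspace model in Python. This is a sketch of the described
behavior only, not an implementation. The class WeightTable and its
method names are hypothetical, and the default weights are passed in by
hand rather than derived from bandwidth attributes. It keeps the quoted
"write 0 to a node resets that node" behavior so both recovery options
can be shown.

```python
class WeightTable:
    """Userspace model of default/user weights; not kernel code."""

    def __init__(self, defaults):
        self.defaults = dict(defaults)  # node ID -> default weight
        self.user = None                # None <=> use_defaults == 1

    def set_defaults(self, defaults):
        # Hotplug or bandwidth update: defaults are recomputed silently,
        # user values are never touched.
        self.defaults = dict(defaults)

    def write_node(self, node, weight):
        if self.user is None:
            # First user write snapshots the current defaults (no reduction).
            self.user = dict(self.defaults)
        # 0 resets only this node to its current default, without a re-weight.
        self.user[node] = self.defaults[node] if weight == 0 else weight

    def write_use_defaults(self, value):
        if value != 1:
            raise ValueError("only 1 is modelled")  # assumption
        self.user = None

    def effective(self):
        return dict(self.defaults if self.user is None else self.user)


t = WeightTable({0: 5, 1: 3})
t.set_defaults({0: 6, 1: 5})   # node1 back online, new bandwidth
t.write_node(1, 2)             # user_values become [6,2]
t.set_defaults({0: 5, 1: 3})   # node1 cycles again; user_values stay [6,2]
print(t.defaults, t.effective())
```

From that final state, t.write_use_defaults(1) gives [5,3] with no user
values, while t.write_node(1, 0) gives user values [6,3].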

---
Best Regards,
Huang, Ying