Re: [External Mail] Re: [External Mail] [RFC PATCH] mm/mempolicy: Weighted interleave auto-tuning
From: Gregory Price
Date: Thu Jan 09 2025 - 10:56:40 EST
On Wed, Jan 08, 2025 at 10:19:19AM +0900, Hyeonggon Yoo wrote:
> Hi, hope you all had a nice year-end holiday :)
>
... snip ...
> Please let me know if there's any point we discussed that I am missing.
>
> Additionally I would like to mention that within an internal discussion
> my colleague Honggyu suggested introducing 'mode' parameter which can be
> either 'manual' or 'auto' instead of 'use_defaults' to provide a more
> intuitive interface.
>
> With Honggyu's suggestion and the points we've discussed,
> I think the interface could be:
>
> # At booting, the mode is 'auto' where the kernel can automatically
> # update any weights.
>
> mode auto # User hasn't specified any weight yet.
> effective [2, 1, -, -] # Using system defaults for node 0-1,
> # and node 2-3 not populated yet.
>
> # When a new NUMA node is added (e.g. via hotplug) in the 'auto' mode,
> # all weights are re-calculated based on ACPI HMAT table, including the
> # weight of the new node.
>
> mode auto # User hasn't specified weights yet.
> effective [2, 1, 1, -] # Using system defaults for node 0-2,
> # and node 3 not populated yet.
>
> # When the user sets at least one weight value, the mode changes to
> # 'manual', where the kernel does not update any weights automatically
> # without the user's consent.
>
> mode manual # User changed the weight of node 0 to 4,
> # changing the mode to manual config mode.
> effective [4, 1, 1, -]
>
>
> # When a new NUMA node is added (e.g. via hotplug) in the manual mode,
> # the new node's weight is zero because it's in manual mode and user
> # did not specify the weight for the new node yet.
>
> mode manual
> effective [4, 1, 1, 0]
>
0s cannot show up in the effective list - the allocators can never
perceive a 0, as there are (race) conditions where that may cause a div0.
The actual content of the list may be 0, but the allocator will see '1'.
IIRC this was due to lock/sleep limitations in the allocator paths when
accessing this RCU-protected memory. If someone wants to take another
look at the allocator paths and characterize the risk more explicitly,
that would be helpful.
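
To make the stored-vs-seen distinction concrete, here is a minimal
userspace sketch. It is not the mm/mempolicy.c code; MAX_NODES,
stored_weights, effective_weight() and total_effective_weight() are
hypothetical names used only for illustration, and the plain array
stands in for the RCU-protected table.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 4 /* hypothetical node count, for illustration only */

/* Stored weights; 0 means "no weight set". Plain array standing in for
 * the RCU-protected table (assumption, not the real data structure). */
static uint8_t stored_weights[MAX_NODES] = { 4, 1, 1, 0 };

/* What an allocator would see: a stored 0 is read back as 1, so a
 * weight used as a divisor or loop count can never be 0. */
static uint8_t effective_weight(int node)
{
	assert(node >= 0 && node < MAX_NODES);
	uint8_t w = stored_weights[node];
	return w ? w : 1;
}

/* Sum of effective weights over all nodes; at least MAX_NODES, so it
 * is always safe to divide by. */
static unsigned int total_effective_weight(void)
{
	unsigned int sum = 0;

	for (int n = 0; n < MAX_NODES; n++)
		sum += effective_weight(n);
	return sum;
}
```

With the stored list [4, 1, 1, 0], a reader going through
effective_weight() would observe [4, 1, 1, 1].
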
> # When user changes the mode to 'auto', all weights are changed to
> # system defaults based on the ACPI HMAT table.
>
> mode auto
> effective [2, 1, 1, 1] # system defaults
>
> In the example I did not distinguish 'default weights' and 'user
> weights' because it's not important where the weight values came from --
> but it's important to know 1) what the effective weights are now and
> 2) whether the kernel can update them.
>
> Any thoughts?
>
> ---
> Best,
> Hyeonggon
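
For reference, the proposed mode transitions can be modelled as a small
state machine. This is only a userspace sketch of the semantics quoted
above, not a patch: struct wi_state, enum wi_mode and the wi_*()
helpers are hypothetical names, and the HMAT-derived defaults are
simply passed in by the caller rather than computed.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 4 /* hypothetical node count, for illustration only */

enum wi_mode { WI_MODE_AUTO, WI_MODE_MANUAL };

struct wi_state {
	enum wi_mode mode;
	bool online[MAX_NODES];
	unsigned char weight[MAX_NODES]; /* stored weight; 0 = unset */
};

/* Copy caller-supplied defaults (stand-in for HMAT-derived values)
 * into every online node; offline nodes stay unset. */
static void wi_apply_defaults(struct wi_state *s, const unsigned char *defaults)
{
	for (int n = 0; n < MAX_NODES; n++)
		s->weight[n] = s->online[n] ? defaults[n] : 0;
}

/* Node hotplug: in auto mode all weights are recalculated; in manual
 * mode the new node's stored weight stays 0 until the user sets it. */
static void wi_node_online(struct wi_state *s, int node,
			   const unsigned char *defaults)
{
	assert(node >= 0 && node < MAX_NODES);
	s->online[node] = true;
	if (s->mode == WI_MODE_AUTO)
		wi_apply_defaults(s, defaults);
}

/* Any user write of a weight switches the mode to manual. */
static void wi_set_weight(struct wi_state *s, int node, unsigned char w)
{
	assert(node >= 0 && node < MAX_NODES);
	s->weight[node] = w;
	s->mode = WI_MODE_MANUAL;
}

/* Switching back to auto discards user weights in favor of defaults. */
static void wi_set_auto(struct wi_state *s, const unsigned char *defaults)
{
	s->mode = WI_MODE_AUTO;
	wi_apply_defaults(s, defaults);
}
```

Walking the quoted example through these helpers gives stored weights
[2, 1, 0, 0], then [2, 1, 1, 0] after node 2 comes online, [4, 1, 1, 0]
after the user write, [4, 1, 1, 0] with node 3 online in manual mode,
and [2, 1, 1, 1] once the mode is set back to auto.
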