Re: [PATCH v2] mm/slub: allocate sheaves on local memory nodes

From: Vlastimil Babka (SUSE)

Date: Tue Jun 09 2026 - 06:23:43 EST


On 6/9/26 05:41, Harry Yoo wrote:
>
>
> On 6/3/26 1:26 PM, Hao Li wrote:
>> On Mon, Jun 01, 2026 at 08:28:16PM +0900, Harry Yoo wrote:
>>> On 6/1/26 6:56 PM, Hao Li wrote:
>>>> Sheaf structs are exchanged through node-local barns. Since barn structs
>>>> are already allocated from their local NUMA node, this patch aims to
>>>> allocate sheaf structs from their local memory nodes as well.
>>>>
>>>> To achieve this, the obvious choice would be using cpu_to_mem().
>>>> However, init_percpu_sheaves() and bootstrap_cache_sheaves() iterate
>>>> through possible CPUs, whereas cpu_to_mem() is only initialized for
>>>> online CPUs. Therefore, we cannot use cpu_to_mem() and instead need to
>>>> use local_memory_node(cpu_to_node(cpu)), similar to what
>>>> __build_all_zonelists() does.
>>>>
>>>> The primary goal of this patch is to improve NUMA node locality.
>>>> Although the actual performance impact is minor, it still yields a ~1%
>>>> improvement on a 192-core, 8-NUMA-node system when testing with the
>>>> will-it-scale mmap test case.
>>>
>>> Oh, nice :)
>>>
>>> I have a question though...
>>>
>>> I wonder if would be better to handle this by e.g.) not returning empty
>>> sheaves back to barn and freeing them if the node id doesn't match and
>>> it's not a memoryless node.
>>>
>>> init_percpu_sheaves() and bootstrap_cache_sheaves() are not the only
>>> places that can allocate sheaves from remote nodes; sheaves allocation
>>> could fall back to other nodes and then SLUB could keep reusing those
>>> sheaves from remote nodes even after memory is reclaimed.
>>
>> This is a good catch. In addition to the fallback mechanism, task migration
>> between CPUs in __pcs_replace_empty_main() and __pcs_replace_full_main() can
>> also mix up sheaf structs across different barns. So yeah, changing allocation
>> locality is not a silver bullet.
>>
>>> If this works well, we probably don't need to handle it in
>>> init_percpu_sheaves() and bootstrap_cache_sheaves() at all as they will
>>> eventually be freed, while covering the other case too?
>>
>> freeing the empty sheaf if the NUMA node mismatches instead of putting it back
>> into the barn is indeed a good idea. I like it. But unfortunately, my testing
>> didn't show a clear performance improvement, though there was no noticeable
>> degradation either. :-(
>
> Hmm... the idea still makes sense to me, but yeah, it's not late to fix
> it when we have data to back up.

I think if that approach doesn't complicate things unreasonably, it makes
sense to do it even if there are no clear wins (as long as there are no
regressions).