Re: [PATCH v2] mm/slub: allocate sheaves on local memory nodes
From: Harry Yoo
Date: Mon Jun 08 2026 - 23:44:17 EST
On 6/3/26 1:26 PM, Hao Li wrote:
> On Mon, Jun 01, 2026 at 08:28:16PM +0900, Harry Yoo wrote:
>> On 6/1/26 6:56 PM, Hao Li wrote:
>>> Sheaf structs are exchanged through node-local barns. Since barn structs
>>> are already allocated from their local NUMA node, this patch aims to
>>> allocate sheaf structs from their local memory nodes as well.
>>>
>>> To achieve this, the obvious choice would be using cpu_to_mem().
>>> However, init_percpu_sheaves() and bootstrap_cache_sheaves() iterate
>>> through possible CPUs, whereas cpu_to_mem() is only initialized for
>>> online CPUs. Therefore, we cannot use cpu_to_mem() and instead need to
>>> use local_memory_node(cpu_to_node(cpu)), similar to what
>>> __build_all_zonelists() does.
>>>
>>> The primary goal of this patch is to improve NUMA node locality.
>>> Although the actual performance impact is minor, it still yields a ~1%
>>> improvement on a 192-core, 8-NUMA-node system when testing with the
>>> will-it-scale mmap test case.
>>
>> Oh, nice :)
>>
>> I have a question though...
>>
>> I wonder if would be better to handle this by e.g.) not returning empty
>> sheaves back to barn and freeing them if the node id doesn't match and
>> it's not a memoryless node.
>>
>> init_percpu_sheaves() and bootstrap_cache_sheaves() are not the only
>> places that can allocate sheaves from remote nodes; sheaves allocation
>> could fall back to other nodes and then SLUB could keep reusing those
>> sheaves from remote nodes even after memory is reclaimed.
>
> This is a good catch. In addition to the fallback mechanism, task migration
> between CPUs in __pcs_replace_empty_main() and __pcs_replace_full_main() can
> also mix up sheaf structs across different barns. So yeah, changing allocation
> locality is not a silver bullet.
>
>> If this works well, we probably don't need to handle it in
>> init_percpu_sheaves() and bootstrap_cache_sheaves() at all as they will
>> eventually be freed, while covering the other case too?
>
> freeing the empty sheaf if the NUMA node mismatches instead of putting it back
> into the barn is indeed a good idea. I like it. But unfortunately, my testing
> didn't show a clear performance improvement, though there was no noticeable
> degradation either. :-(
Hmm... the idea still makes sense to me, but yeah, it's not late to fix
it when we have data to back up.
> I also did some more testing on my patch too. Under CONFIG_PREEMPT_LAZY, the
> improvement is only about 0.5% (sometimes 1%). And when switching to
> CONFIG_PREEMPT, the patch doesn't seem to yield statistically significant
> benefits, likely because sheaves get mixed during task migration.
>
> So, perhaps the performance gain just isn't worth the extra complexity. It's a
> bit frustrating, but maybe we should just abandon this direction and keep
> things as they are... :(
Ouch. Don't let it get you down, it's completely normal for experiments
not to go as planned.... ;)
> Thanks for the feedback anyway!
No problem!
--
Cheers,
Harry / Hyeonggon
Attachment:
OpenPGP_signature.asc
Description: OpenPGP digital signature