Re: [PATCH v2] mm/slub: allocate sheaves on local memory nodes

From: Hao Li

Date: Tue Jun 09 2026 - 22:52:43 EST


On Wed, Jun 10, 2026 at 10:49:26AM +0900, Harry Yoo wrote:
>
>
> On 6/9/26 7:14 PM, Vlastimil Babka (SUSE) wrote:
> > On 6/9/26 05:41, Harry Yoo wrote:
> >> On 6/3/26 1:26 PM, Hao Li wrote:
> >>> On Mon, Jun 01, 2026 at 08:28:16PM +0900, Harry Yoo wrote:
> >>>> On 6/1/26 6:56 PM, Hao Li wrote:
> >>>>> Sheaf structs are exchanged through node-local barns. Since barn structs
> >>>>> are already allocated from their local NUMA node, this patch aims to
> >>>>> allocate sheaf structs from their local memory nodes as well.
> >>>>>
> >>>>> To achieve this, the obvious choice would be using cpu_to_mem().
> >>>>> However, init_percpu_sheaves() and bootstrap_cache_sheaves() iterate
> >>>>> through possible CPUs, whereas cpu_to_mem() is only initialized for
> >>>>> online CPUs. Therefore, we cannot use cpu_to_mem() and instead need to
> >>>>> use local_memory_node(cpu_to_node(cpu)), similar to what
> >>>>> __build_all_zonelists() does.
> >>>>>
> >>>>> The primary goal of this patch is to improve NUMA node locality.
> >>>>> Although the actual performance impact is minor, it still yields a ~1%
> >>>>> improvement on a 192-core, 8-NUMA-node system when testing with the
> >>>>> will-it-scale mmap test case.
> >>>>
> >>>> Oh, nice :)
> >>>>
> >>>> I have a question though...
> >>>>
> >>>> I wonder if would be better to handle this by e.g.) not returning empty
> >>>> sheaves back to barn and freeing them if the node id doesn't match and
> >>>> it's not a memoryless node.
> >>>>
> >>>> init_percpu_sheaves() and bootstrap_cache_sheaves() are not the only
> >>>> places that can allocate sheaves from remote nodes; sheaves allocation
> >>>> could fall back to other nodes and then SLUB could keep reusing those
> >>>> sheaves from remote nodes even after memory is reclaimed.
> >>>
> >>> This is a good catch. In addition to the fallback mechanism, task migration
> >>> between CPUs in __pcs_replace_empty_main() and __pcs_replace_full_main() can
> >>> also mix up sheaf structs across different barns. So yeah, changing allocation
> >>> locality is not a silver bullet.
> >>>
> >>>> If this works well, we probably don't need to handle it in
> >>>> init_percpu_sheaves() and bootstrap_cache_sheaves() at all as they will
> >>>> eventually be freed, while covering the other case too?
> >>>
> >>> freeing the empty sheaf if the NUMA node mismatches instead of putting it back
> >>> into the barn is indeed a good idea. I like it. But unfortunately, my testing
> >>> didn't show a clear performance improvement, though there was no noticeable
> >>> degradation either. :-(
> >>
> >> Hmm... the idea still makes sense to me, but yeah, it's not late to fix
> >> it when we have data to back up.
> >
> > I think if that approach doesn't complicate things unreasonably, it makes
> > sense to do it even if there are no clear wins (as long as there are no
> > regressions).
>
> I'm fine with that unless it introduces regressions or the
> implementation complicates things too much :)
>
> What do you say, Hao?
>

Sure thing, let me test it thoroughly!

--
Thanks,
Hao