Re: [PATCH v2] net: skbuff: set FLAG_SKB_NO_MERGE for skbuff_fclone_cache

From: Eric Dumazet
Date: Thu Feb 29 2024 - 12:08:35 EST


On Thu, Feb 29, 2024 at 6:01 PM Christoph Lameter (Ampere) <cl@xxxxxxxxx> wrote:
>
> On Wed, 28 Feb 2024, Shijie Huang wrote:
>
> >>
> >> Using SLAB_NO_MERGE does not help; I am still seeing wrong allocations
> >> on a dual socket host with plenty of available memory
> >> (either sk_buff or skb->head being allocated on the other node).
> >
> > Do you mean you can still see the wrong fclone after using SLAB_NO_MERGE?
> >
> > If so, I guess there is a bug in SLUB.
>
> Merging has nothing to do with memory locality.
>
> >> fclones might be allocated from a cpu running on node A, and freed
> >> from a cpu running on node B.
> >> Maybe SLUB is not properly handling this case?
> >
> > Maybe.
>
> Basic functionality is broken??? Really?

It seems so.

>
> >> I think we need help from mm/slub experts, instead of trying to 'fix'
> >> networking stacks.
> >
> > @Christoph
> >
> > Any idea about this?
>
>
> If you want to force a local allocation then use GFP_THISNODE as a flag.
>
> If you do not specify a node or GFP_THISNODE, then the slub allocator
> will sporadically and opportunistically allocate from other nodes to
> avoid fragmentation of slabs. The page allocator will also sporadically
> go off node in order to avoid reclaim. The page allocator may go off
> node extensively if there is an imbalance of allocations between nodes.
> The page allocator has knobs to tune off-node allocation vs. reclaim.
> Doing more reclaim will slow things down but give you local data.

Maybe, maybe not.

Going back to CONFIG_SLAB=y removes all mismatches, without having to
use GFP_THISNODE at all, on hosts with plenty of available memory on
all nodes.

I think that is some kind of evidence that something is broken in SLUB land.
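
To make "mismatch" concrete: an allocation counts as mismatched when the
page backing the object sits on a different node than the CPU that asked
for it. Below is a minimal userspace sketch of that count. It is not the
instrumentation used for the numbers above; struct alloc_sample and
count_node_mismatches() are hypothetical names made up for illustration.
In the kernel, the two node ids would presumably come from numa_node_id()
and page_to_nid(virt_to_page(ptr)).

```c
#include <stddef.h>

/*
 * Hypothetical record of one allocation (illustration only):
 * where the caller ran, and where the backing page actually lives.
 */
struct alloc_sample {
	int cpu_node;	/* node of the CPU that requested the memory */
	int page_node;	/* node of the page backing the object */
};

/* Count the samples whose backing page is on a different node. */
static size_t count_node_mismatches(const struct alloc_sample *s, size_t n)
{
	size_t i, mismatches = 0;

	for (i = 0; i < n; i++)
		if (s[i].cpu_node != s[i].page_node)
			mismatches++;

	return mismatches;
}
```

With samples taken on an otherwise idle dual socket host, a nonzero
result without any memory pressure is the behavior being questioned here.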