Re: [net-next v4 PATCH] page_pool: handle page recycle for NUMA_NO_NODE condition

From: Ilias Apalodimas
Date: Thu Dec 19 2019 - 10:28:48 EST


On Thu, Dec 19, 2019 at 03:52:06PM +0100, Michal Hocko wrote:
> On Thu 19-12-19 14:35:35, Jesper Dangaard Brouer wrote:
> > On Thu, 19 Dec 2019 13:09:25 +0100
> > Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >
> > > On Wed 18-12-19 09:01:35, Jesper Dangaard Brouer wrote:
> > > [...]
> > > > For the NUMA_NO_NODE case, when a NIC IRQ is moved to another NUMA
> > > > node, then ptr_ring will be emptied in 65 (PP_ALLOC_CACHE_REFILL+1)
> > > > chunks per allocation and allocation fall-through to the real
> > > > page-allocator with the new nid derived from numa_mem_id(). We accept
> > > > that transitioning the alloc cache doesn't happen immediately.
> >
> > Oh, I just realized that the drivers usually refill several RX
> > packet-pages at once, this means that this is called N times, meaning
> > during a NUMA change this will result in N * 65 pages returned.
> >
> >
> > > Could you explain what is the expected semantic of NUMA_NO_NODE in this
> > > case? Does it imply always the preferred locality? See my other email[1] to
> > > this matter.
> >
> > I do think we want NUMA_NO_NODE to mean preferred locality.
>

Why? Wouldn't it be clearer if it meant "this is not NUMA aware"?
The way I see it, drivers that sit on specific SoCs, like the ti one
or the netsec one, can declare 'NUMA_NO_NODE' since they know beforehand
what hardware they'll be sitting on.
On PCI/USB pluggable interfaces, the mlx5 example should be followed.
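
To make the distinction concrete, here is a rough userspace sketch of the
"not NUMA aware" reading. This is not kernel code: the struct and helper
names are made up for illustration, and only NUMA_NO_NODE (-1) matches the
kernel. It only models which nid a driver would declare and how recycling
would then be decided under that reading.

```c
#include <stdbool.h>

#define NUMA_NO_NODE (-1)	/* same value the kernel uses */

/* Hypothetical stand-in for the nid field a driver hands to page_pool. */
struct pool_params_model {
	int nid;
};

/* Fixed-SoC driver (ti, netsec style): the hardware is known up front,
 * so the driver declares that it does not care about NUMA placement. */
static struct pool_params_model soc_driver_params(void)
{
	struct pool_params_model p = { .nid = NUMA_NO_NODE };
	return p;
}

/* Pluggable PCI/USB driver: pass an explicit node, e.g. the node of the
 * CPU servicing the RX queue (assumed to be supplied by the caller). */
static struct pool_params_model pluggable_driver_params(int rx_node)
{
	struct pool_params_model p = { .nid = rx_node };
	return p;
}

/* Recycle decision under the "not NUMA aware" reading:
 * NUMA_NO_NODE never rejects a page on locality grounds,
 * an explicit nid only recycles pages from that node. */
static bool page_recyclable_model(const struct pool_params_model *p,
				  int page_nid)
{
	if (p->nid == NUMA_NO_NODE)
		return true;
	return page_nid == p->nid;
}
```

With the other reading (NUMA_NO_NODE means preferred locality), the first
branch would instead compare page_nid against the node of the CPU doing the
recycling, which is exactly the implicit behaviour being questioned here.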

> I obviously have no saying here because I am not really familiar with
> the users of this API but I would note that if there is such an implicit
> assumption then you make it impossible to use the numa agnostic page
> pool allocator (aka fast reallocation). This might be not important here
> but future extension would be harder (you can still hack it around aka
> NUMA_REALLY_NO_NODE). My experience tells me that people are quite
> creative and usually require (or worse assume) semantics that you
> thought were not useful.
>
> That being said, if the NUMA_NO_NODE really should have a special
> locality meaning then document it explicitly at least.

Agreed. If we treat it like this, we have to document it explicitly.

> --
> Michal Hocko
> SUSE Labs

Thanks
/Ilias