Re: [RFC net-next v2 1/9] net: napi: Add napi_storage

From: Jakub Kicinski
Date: Tue Sep 10 2024 - 10:52:31 EST


On Tue, 10 Sep 2024 08:13:51 +0200 Joe Damato wrote:
> On Mon, Sep 09, 2024 at 04:40:39PM -0700, Jakub Kicinski wrote:
> > On Sun, 8 Sep 2024 16:06:35 +0000 Joe Damato wrote:
> > > Add a persistent NAPI storage area for NAPI configuration to the core.
> > > Drivers opt-in to setting the storage for a NAPI by passing an index
> > > when calling netif_napi_add_storage.
> > >
> > > napi_storage is allocated in alloc_netdev_mqs, freed in free_netdev
> > > (after the NAPIs are deleted), and set to 0 when napi_enable is called.
> >
> > > enum {
> > > @@ -2009,6 +2019,9 @@ enum netdev_reg_state {
> > >  *	@dpll_pin: Pointer to the SyncE source pin of a DPLL subsystem,
> > >  *		where the clock is recovered.
> > >  *
> > > + *	@napi_storage: An array of napi_storage structures containing per-NAPI
> > > + *		settings.
> >
> > FWIW you can use inline kdoc, with the size of the struct it's easier
> > to find it. Also this doesn't need to be accessed from fastpath so you
> > can move it down.
>
> OK. I figured since it was being deref'd in napi_complete_done
> (where we previously read napi_defer_hard_irqs and
> gro_flush_timeout) it needed to be in the fast path.
>
> I'll move it down for the next RFC.

Hm, fair point. In my mind I expected we'd still add the fast path
fields to the NAPI instances, and the storage would only be there to
stash that information for the period when the real NAPI instances are
not present (napi_disable() -> napi_enable() cycles).

But looking at napi_struct, all the cachelines seem full anyway, so we
may as well split the info. No strong preference, feel free to keep it
as is, then. But maybe rename it from napi_storage to napi_config or
some such?
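Roughly the save/restore shape I have in mind, modeled as a standalone
userspace sketch (completely untested; the struct and field names here are
placeholders, not the real kernel definitions):

```c
#include <string.h>

/* Hypothetical persistent per-NAPI config ("napi_config"); survives
 * napi_disable() -> napi_enable() cycles while the instance does not. */
struct napi_config {
	unsigned int defer_hard_irqs;
	unsigned long gro_flush_timeout;
	unsigned int napi_id;
};

/* Cut-down stand-in for struct napi_struct. */
struct napi_struct_sketch {
	unsigned int defer_hard_irqs;
	unsigned long gro_flush_timeout;
	unsigned int napi_id;
	struct napi_config *config;	/* NULL if the driver did not opt in */
};

/* napi_enable() path: restore the settings (and the NAPI ID) from
 * persistent storage into the live instance. */
static void napi_restore_config(struct napi_struct_sketch *n)
{
	if (!n->config)
		return;
	n->defer_hard_irqs = n->config->defer_hard_irqs;
	n->gro_flush_timeout = n->config->gro_flush_timeout;
	n->napi_id = n->config->napi_id; /* 0 would mean "allocate a new one" */
}

/* napi_disable() path: stash the live settings back into storage. */
static void napi_save_config(struct napi_struct_sketch *n)
{
	if (!n->config)
		return;
	n->config->defer_hard_irqs = n->defer_hard_irqs;
	n->config->gro_flush_timeout = n->gro_flush_timeout;
	n->config->napi_id = n->napi_id;
}
```

i.e. the storage is write-through at disable time and read back at enable
time, instead of being memset to zero on every enable.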

> > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > index 22c3f14d9287..ca90e8cab121 100644
> > > --- a/net/core/dev.c
> > > +++ b/net/core/dev.c
> > > @@ -6719,6 +6719,9 @@ void napi_enable(struct napi_struct *n)
> > >  		if (n->dev->threaded && n->thread)
> > >  			new |= NAPIF_STATE_THREADED;
> > >  	} while (!try_cmpxchg(&n->state, &val, new));
> > > +
> > > +	if (n->napi_storage)
> > > +		memset(n->napi_storage, 0, sizeof(*n->napi_storage));
>
> OK, your comments below will probably make more sense to me after I
> try implementing it, but I'll definitely have some questions.
>
> > And here inherit the settings and the NAPI ID from storage, then call
> > napi_hash_add(). napi_hash_add() will need a minor diff to use the
> > existing ID if already assigned.
>
> I don't think I realized we settled on the NAPI ID being persistent.
> I'm not opposed to that, I just think I missed that part in the
> previous conversation.
>
> I'll give it a shot and see what the next RFC looks like.

The main reason to try to make the NAPI ID persistent from the start is
that if it works we don't have to add the index to the uAPI. I don't
feel strongly about it, though, if you or anyone else has arguments
against it / reasons it won't work.
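The "minor diff" to napi_hash_add() I was alluding to is basically this
(sketched as standalone userspace C, untested; MIN_NAPI_ID and the
counter are simplified stand-ins for the real allocation logic):

```c
/* Simplified model of NAPI ID assignment: if the instance already
 * carries a valid ID (e.g. restored from persistent storage in
 * napi_enable()), keep it; otherwise allocate the next free one.
 * The kernel treats IDs below a minimum as unset/invalid; 1 is just
 * a placeholder for that threshold here. */
#define MIN_NAPI_ID_SKETCH 1U

static unsigned int napi_gen_id = MIN_NAPI_ID_SKETCH;

static unsigned int napi_hash_add_sketch(unsigned int current_id)
{
	if (current_id >= MIN_NAPI_ID_SKETCH)
		return current_id;	/* persistent ID: reuse as-is */
	return napi_gen_id++;		/* first enable: allocate a new ID */
}
```

So a NAPI that went through a disable/enable cycle comes back with the
same ID, and only a brand-new instance (ID still 0) gets a fresh one.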

> > And the inverse of that has to happen in napi_disable() (unhash, save
> > settings to storage), and __netif_napi_del() (don't unhash if it has
> > index).
> >
> > I think that should work?
>
> Only one way to find out ;)
>
> Separately: your comment about documenting rings to NAPIs... I am
> not following that bit.
>
> Is that a thing you meant should be documented for driver writers to
> follow, to reduce churn?

Which comment?