Re: [RFC net-next v2 1/9] net: napi: Add napi_storage

From: Joe Damato
Date: Wed Sep 11 2024 - 03:47:38 EST


On Tue, Sep 10, 2024 at 05:10:48PM -0700, Jakub Kicinski wrote:
> On Tue, 10 Sep 2024 18:10:42 +0200 Joe Damato wrote:
> > On Tue, Sep 10, 2024 at 07:52:17AM -0700, Jakub Kicinski wrote:
> > > Hm, fair point. In my mind I expected we still add the fast path fields
> > > to NAPI instances. And the storage would only be there to stash that
> > > information for the period of time when real NAPI instances are not
> > > present (napi_disable() -> napi_enable() cycles).
> >
> > I see, I hadn't understood that part. It sounds like you were
> > thinking we'd stash the values in between whereas I thought we were
> > reading from the struct directly (hence the implementation of the
> > static inline wrappers).
> >
> > I don't really have a preference one way or the other.
>
> Me neither. I suspect having the fields in napi_struct to be slightly
> more optimal. No need for indirect accesses via napi->storage->$field,
> and conditions to check if napi->storage is set. We can make sure we
> populate the fields in NAPIs when they are created and when sysfs writes
> happen. So slightly better fastpath locality at the expense of more
> code on the control path keeping it in sync.
>
> FWIW the more we discuss this the less I like the word "storage" :)
> If you could sed -i 's/storage/config/' on the patches that'd great :)

Yup, I actually did that yesterday. I also like config more.

Right now, I have:
- struct napi_config
- in the napi_struct its a struct napi_config *config
- in the net_device its a struct napi_config *napi_config

I figured in the second case you are already de-refing through a
"napi_struct" so writing "napi->napi_config->..." seemed a bit wordy
compared to "napi->config->...".

> > > > I don't think I realized we settled on the NAPI ID being persistent.
> > > > I'm not opposed to that, I just think I missed that part in the
> > > > previous conversation.
> > > >
> > > > I'll give it a shot and see what the next RFC looks like.
> > >
> > > The main reason to try to make NAPI ID persistent from the start is that
> > > if it works we don't have to add index to the uAPI. I don't feel
> > > strongly about it, if you or anyone else has arguments against / why
> > > it won't work.
> >
> > Yea, I think not exposing the index in the uAPI is probably a good
> > idea? Making the NAPI IDs persistent let's us avoid that so I can
> > give that a shot because it's easier from the user app perspective,
> > IMO.
>
> Right, basically we can always add it later. Removing it later won't
> be possible :S

Yup, I totally get that.

> > > > Only one way to find out ;)
> > > >
> > > > Separately: your comment about documenting rings to NAPIs... I am
> > > > not following that bit.
> > > >
> > > > Is that a thing you meant should be documented for driver writers to
> > > > follow to reduce churn ?
> > >
> > > Which comment?
> >
> > In this message:
> >
> > https://lore.kernel.org/netdev/20240903124008.4793c087@xxxxxxxxxx/
> >
> > You mentioned this, which I interpreted as a thing that core needs
> > to solve for, but perhaps this intended as advice for drivers?
> >
> > Maybe it's enough to document how rings are distributed to NAPIs?
> >
> > First set of NAPIs should get allocated to the combined channels,
> > then for remaining rx- and tx-only NAPIs they should be interleaved
> > starting with rx?
> >
> > Example, asymmetric config: combined + some extra tx:
> >
> > combined tx
> > [0..#combined-1] [#combined..#combined+#tx-1]
> >
> > Split rx / tx - interleave:
> >
> > [0 rx0] [1 tx0] [2 rx1] [3 tx1] [4 rx2] [5 tx2] ...
> >
> > This would limit the churn when changing channel counts.
>
> Oh, yes. I'm not sure _where_ to document it. But if the driver supports
> asymmetric ring count or dedicated Rx and Tx NAPIs - this is the
> recommended way to distributing the rings to NAPIs, to, as you say,
> limit the config churn / mismatch after ring count changes.

OK, so that seems like it's related but separate from the changes I
am proposing via RFC, so I don't think I need to include that in my
next RFC.