Re: [PATCH net-next 5/5] netdev-genl: Support setting per-NAPI config values

From: Joe Damato
Date: Thu Sep 05 2024 - 05:21:17 EST


On Tue, Sep 03, 2024 at 02:58:14PM -0700, Samiullah Khawaja wrote:
> On Tue, Sep 3, 2024 at 12:40 PM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
> >
> > On Tue, 3 Sep 2024 12:04:52 -0700 Samiullah Khawaja wrote:
> > > Do we need a queue to napi association to set/persist napi
> > > configurations?
> >
> > I'm afraid zero-copy schemes will make multiple queues per NAPI more
> > and more common, so pretending the NAPI params (related to polling)
> > are per queue will soon become highly problematic.
> Agreed.
> >
> > > Can a new index param be added to the netif_napi_add
> > > and persist the configurations in napi_storage.
> >
> > That'd be my (weak) preference.
> >
> > > I guess the problem would be the size of napi_storage.
> >
> > I don't think so, we're talking about 16B per NAPI,
> > struct netdev_queue is 320B, struct netdev_rx_queue is 192B.
> > NAPI storage is rounding error next to those :S
> Oh, I am sorry I was actually referring to the problem of figuring out
> the count of the napi_storage array.
> >
> > > Also wondering if for some use case persistence would be problematic
> > > when the napis are recreated, since the new napi instances might not
> > > represent the same context? For example, if I resize the dev from 16
> > > rx/tx to 8 rx/tx queues and the napi index that was used by TX queue,
> > > now polls RX queue.
> >
> > We can clear the config when NAPI is activated (ethtool -L /
> > set-channels). That seems like a good idea.
> That sounds good.
> >
> > The distinction between Rx and Tx NAPIs is a bit more tricky, tho.
> > When^w If we can dynamically create Rx queues one day, a NAPI may
> > start out as a Tx NAPI and become a combined one when Rx queue is
> > added to it.
> >
> > Maybe it's enough to document how rings are distributed to NAPIs?
> >
> > First set of NAPIs should get allocated to the combined channels,
> > then for remaining rx- and tx-only NAPIs they should be interleaved
> > starting with rx?
> >
> > Example, asymmetric config: combined + some extra tx:
> >
> > combined tx
> > [0..#combined-1] [#combined..#combined+#tx-1]
> >
> > Split rx / tx - interleave:
> >
> > [0 rx0] [1 tx0] [2 rx1] [3 tx1] [4 rx2] [5 tx2] ...
> >
> > This would limit the churn when changing channel counts.
> I think this is good. The queue-get dump netlink does provide details
> of all the queues in a dev. It also provides a napi-id if the driver
> has set it (only a few drivers set this).

This is true, but several drivers do set it and IMHO existing
drivers can be extended to support this. I have been adding "nits"
to my reviews of new drivers asking the author(s) to consider
adding support for the API.

Not sure which driver you are using, but I can help you add support
for the API if it is needed.

> So basically a busy poll application would look at the queue type
> and apply configurations on the relevant napi based on the
> documentation above (if napi-id is not set on the queue)?

That was my plan for my user app based on the conversation so far.
At startup, the app reads a config listing the ifindexes it will
bind to for incoming connections, fetches the NAPI IDs via netlink,
and then sets the per-NAPI params via netlink as well.
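
A rough sketch of that flow, to make it concrete. It only covers the
bookkeeping, not the netlink I/O: the sample dump is made-up data
shaped like queue-get output (id, ifindex, type, napi-id), the helper
names are hypothetical, and the parameter names and values in the
request are illustrative assumptions, not a statement of the final
uAPI.

```python
from collections import defaultdict


def group_queues_by_napi(queues):
    """Split a queue dump into {napi_id: [(type, queue_id)]} and unmapped queues."""
    by_napi = defaultdict(list)
    unmapped = []
    for q in queues:
        napi_id = q.get("napi-id")
        if napi_id is None:
            # Driver did not link this queue to a NAPI; the app would have to
            # fall back to a documented ring layout for these.
            unmapped.append((q["type"], q["id"]))
        else:
            by_napi[napi_id].append((q["type"], q["id"]))
    return dict(by_napi), unmapped


def plan_napi_settings(by_napi, params):
    """Build one per-NAPI request per NAPI ID (sending it is out of scope here)."""
    return [{"id": napi_id, **params} for napi_id in sorted(by_napi)]


# Made-up dump for a single ifindex; values are assumptions for illustration.
sample_dump = [
    {"id": 0, "ifindex": 2, "type": "rx", "napi-id": 8193},
    {"id": 0, "ifindex": 2, "type": "tx", "napi-id": 8193},
    {"id": 1, "ifindex": 2, "type": "rx", "napi-id": 8194},
    {"id": 1, "ifindex": 2, "type": "tx"},  # no napi-id reported
]

by_napi, unmapped = group_queues_by_napi(sample_dump)
requests = plan_napi_settings(
    by_napi, {"defer-hard-irqs": 100, "gro-flush-timeout": 50000}
)
```

Queues that end up in `unmapped` are the case Samiullah raised: no
napi-id on the queue, so the app needs some other rule to find the
right NAPI.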

Haven't implemented this yet in the user app, but that's the
direction I am planning to go with this all.
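
For the fallback case, the ring distribution Jakub outlined above
could be modelled roughly as below. This is only my reading of the
proposal: combined channels take the first NAPI indexes, then the
remaining rx-only and tx-only rings are interleaved starting with rx.
Two things are assumptions on my part and not something the thread
settles: that ring numbering for the extra rings continues after the
combined ones, and that when the rx-only and tx-only counts differ the
leftovers are simply appended. `napi_layout` is a hypothetical helper
name.

```python
def napi_layout(combined, rx_only, tx_only):
    """Return a list where entry N holds the (type, ring) pairs polled by NAPI index N."""
    # Combined channels come first: NAPI i serves both rx i and tx i.
    layout = [[("rx", i), ("tx", i)] for i in range(combined)]
    # Remaining single-direction rings are interleaved, rx first.
    # Assumption: ring numbers continue after the combined ones.
    for i in range(max(rx_only, tx_only)):
        if i < rx_only:
            layout.append([("rx", combined + i)])
        if i < tx_only:
            layout.append([("tx", combined + i)])
    return layout


# Split rx / tx, matching "[0 rx0] [1 tx0] [2 rx1] [3 tx1] ..." above.
split = napi_layout(combined=0, rx_only=3, tx_only=3)

# Asymmetric config: two combined channels plus two extra tx rings.
asymmetric = napi_layout(combined=2, rx_only=0, tx_only=2)
```

With a rule like this, shrinking or growing the channel count leaves
the low NAPI indexes pointing at the same kind of ring, which is the
"limit the churn" property mentioned in the quoted text.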