Re: [PATCH net-next 5/5] netdev-genl: Support setting per-NAPI config values

From: Jakub Kicinski
Date: Mon Sep 02 2024 - 21:02:31 EST


On Mon, 2 Sep 2024 18:56:07 +0200 Joe Damato wrote:
> > How do you feel about making this configuration opt-in / require driver
> > changes? What I'm thinking is that having the new "netif_napi_add()"
> > variant (or perhaps extending netif_napi_set_irq()) to take an extra
> > "index" parameter would make the whole thing much simpler.
>
> What about extending netif_queue_set_napi instead? That function
> takes a napi and a queue index.

This may become problematic, since there may be more queues than NAPIs,
either with multiple combined queues mapped to a single NAPI, or with
separate Rx/Tx NAPIs.
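
E.g. with combined channels a driver typically registers both queue
types against the same instance (qid / napi being whatever the driver
has at hand):

	netif_queue_set_napi(dev, qid, NETDEV_QUEUE_TYPE_RX, napi);
	netif_queue_set_napi(dev, qid, NETDEV_QUEUE_TYPE_TX, napi);

and nothing stops it from pointing several queue ids at one NAPI, so
a queue index alone doesn't uniquely identify a config slot.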

No strong preference on which function we modify (netif_napi_add vs the
IRQ setting helper) but I think we need to take the index as an
explicit param, rather than try to guess it based on another ID.
The queue ID will match 95% of the time, today. But Intel was
interested in "remapping" queues. And we are all thinking about a queue API...
So using queue ID is going to cause problems down the road.
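
Something along these lines, purely to illustrate (netif_napi_add_config(),
napi->config and dev->napi_config are made-up names here, not existing
APIs):

	static inline void
	netif_napi_add_config(struct net_device *dev, struct napi_struct *napi,
			      int (*poll)(struct napi_struct *, int),
			      unsigned int index)
	{
		/* bind this NAPI to a persistent per-index config slot */
		napi->config = &dev->napi_config[index];
		netif_napi_add(dev, napi, poll);
	}

Driver passes the same index every time it recreates its NAPIs, so the
stored settings get re-applied across close()/open().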

> Locally I kind of hacked up something simple that:
> - Allocates napi_storage in net_device in alloc_netdev_mqs
> - Modifies netif_queue_set_napi to:
>       if (napi)
>               napi->storage = &dev->napi_storage[queue_index];
>
> I think I'm still missing the bit about the
> max(rx_queues,tx_queues), though :(

You mean whether it's enough to do

napi_cnt = max(txqs, rxqs)

? Or some other question?

AFAIU mlx5 can only have as many NAPIs as it has IRQs (256?).
Look at ip -d link to see how many queues it has.
We could do txqs + rxqs to be safe, if you prefer, but it will
waste a bit of memory.
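
IOW the allocation in alloc_netdev_mqs() would be roughly (napi_config
is a made-up field name):

	/* one persistent config slot per potential NAPI */
	dev->napi_config = kvcalloc(max(txqs, rxqs),
				    sizeof(*dev->napi_config), GFP_KERNEL);
	if (!dev->napi_config)
		goto free_all;	/* same error path the queue allocs use */

A couple of ints per slot, so even txqs + rxqs wouldn't be terrible,
but max() should do.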

> > Index would basically be an integer 0..n, where n is the number of
> > IRQs configured for the driver. The index of a NAPI instance would
> > likely match the queue ID of the queue the NAPI serves.
>
> Hmmm. I'm hesitant about the "number of IRQs" part. What if there
> are NAPIs for which no IRQ is allocated ~someday~ ?

A device NAPI without an IRQ? Whoever does something that silly will
have to deal with the consequences. I think it's unlikely.

> It seems (I could totally be wrong) that netif_queue_set_napi
> can be called and work and create the association even without an
> IRQ allocated.
>
> I guess the issue is mostly the queue index question above: combined
> rx/tx vs drivers having different numbers of rx and tx queues.

Yes, and in the future the ability to allocate more queues than NAPIs.
netif_queue_set_napi() was expected to cover such cases.

> > We can then allocate an array of "napi_configs" in net_device,
> > like we allocate queues; the array size would be max(num_rx_queues,
> > num_tx_queues). We just need to store a couple of ints, so it will
> > be tiny compared to the queue structs anyway.
> >
> > The NAPI_SET netlink op can then work based on the NAPI index rather
> > than the ephemeral NAPI ID. It can apply the config to all live
> > NAPI instances with that index (of which there really should only
> > be one, unless the driver is mid-reconfiguration somehow, but even
> > that won't cause issues; we can give multiple instances the same
> > settings) and also store the user config in the array in net_device.
> >
> > When a new NAPI instance is associated with a NAPI index it should
> > get all the config associated with that index applied.
> >
> > Thoughts? Does that make sense, and if so do you think it's an
> > over-complication?
>
> I think what you are proposing seems fine; I'm just working out the
> implementation details and making sure I understand before sending
> another revision.

If you get stuck feel free to send a link to a GitHub branch or post
an RFC. I'm adding extra asks, so whatever form of review helps.. :)