Re: [net-next 2/2] ena: Link queues to NAPIs

From: Joe Damato
Date: Tue Oct 01 2024 - 14:07:58 EST


On Tue, Oct 01, 2024 at 04:40:32PM +0000, Arinzon, David wrote:
> > > > >
> > > > > Thank you for uploading this patch.
> > > >
> > > > Can you please let me know (explicitly) if you want me to send a
> > > > second revision (when net-next allows for it) to remove the 'struct
> > > > napi_struct' and add a comment as described above?
> > >
> > > Yes, I would appreciate that.
> > > I guess the `struct napi_struct` is OK, this way both functions
> > > will look the same.
> > > Regarding the comment, yes please, something like /* This API is
> > > supported for non-XDP queues only */ in both places.
> > > I also added a small request to preserve reverse christmas tree
> > > order on patch 1/2 in the series.
> >
> > Thanks for mentioning the nit about reverse christmas tree order; I missed
> > that.
> >
> > I will:
> > - Fix the ordering of the variables in 1/2
> > - Add 2 comments in 2/2
> >
> > I'll have to wait the ~48hr timeout before I can post the v2, but I'll be sure to
> > CC you on the next revision.
>
> Isn't it at least a 24hr timeout?
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/maintainer-netdev.rst#n394

Ah, looks like you are right. For some reason I had 48hr in my head;
I think I usually wait a bit longer for larger / more complicated
stuff, but in this case 24hr seems OK.

> >
> > > Thank you again for the patches in the driver.
> >
> > No worries, thanks for the review.
> >
> > BTW: Since neither of the changes you've asked me to make are functional
> > changes, would you mind testing the driver changes on your side just to
> > make sure? I tested them myself on an ec2 instance with an ENA driver, but I
> > am not an expert on ENA :)
> >
> > - Joe
>
> I picked up the patch and got the same results that you did when running on an EC2 instance.
> Thank you for sharing the commands in the commit messages, it was really helpful.
> Correct me if I'm wrong, but there's no functional impact to these changes except the ability to
> view the mappings through netlink.

This doesn't change anything about how the driver processes packets
or handles data. It's a "control plane" sort of change: it allows the
mapping between IRQs, queues, and NAPI IDs to be exposed via netlink
by the core networking code (see net/core/netdev-genl.c and
net/core/netdev-genl-gen.c).
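
As a rough example (the exact invocation and output depend on your
tree and the netdev netlink spec, so treat this as a sketch), the
per-queue NAPI IDs can be dumped with the in-tree YNL CLI, something
like:

  $ ./tools/net/ynl/cli.py \
        --spec Documentation/netlink/specs/netdev.yaml \
        --dump queue-get --json='{"ifindex": 3}'

which lists, for each RX/TX queue of ifindex 3, its id, its type, and
the NAPI ID servicing it.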

The exposed mapping can be useful if an app uses, for example,
SO_INCOMING_NAPI_ID (typically used for busy polling, though busy
polling is not required to make use of it).

A user app might build some logic like this (just making this up as
an example):

A new fd from accept() has NAPI ID 123, which corresponds to ifindex
3, so thread X should handle the connection, because thread X is
pinned to a CPU that is optimal for ifindex 3 (e.g. NUMA-local,
softirq-local, or whatever the user app wants).
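
The SO_INCOMING_NAPI_ID part of that is just a getsockopt(). A
minimal userspace sketch (hypothetical helper name; assumes kernel
headers that define SO_INCOMING_NAPI_ID):

  #include <sys/socket.h>

  /* Return the NAPI ID that last handled packets for this socket,
   * or 0 if the kernel has not recorded one yet.
   */
  static unsigned int get_napi_id(int fd)
  {
          unsigned int napi_id = 0;
          socklen_t len = sizeof(napi_id);

          if (getsockopt(fd, SOL_SOCKET, SO_INCOMING_NAPI_ID,
                         &napi_id, &len))
                  return 0;

          return napi_id;
  }

The returned ID can then be matched against the netlink queue/NAPI
dumps to recover the ifindex, queue, and IRQ.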

Without this change, user apps can get the NAPI ID of an fd (as
above) but have no way to know which ifindex or queue it is
associated with.

It also allows user apps to more easily determine which IRQ maps to
which queue, without having to parse, say, /proc/interrupts.
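
Again just as a sketch (attribute names come from the netdev netlink
spec), dumping the NAPI instances of an interface shows the IRQ
attached to each one:

  $ ./tools/net/ynl/cli.py \
        --spec Documentation/netlink/specs/netdev.yaml \
        --dump napi-get --json='{"ifindex": 3}'

Each entry carries the NAPI ID, the ifindex, and the irq number, so
there's no need to scrape /proc/interrupts.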

This can be used for monitoring/alerting/observability purposes by
user apps trying to track down the source of hardware IRQ generation.
Say you are using RSS to direct specific types of flows to specific
queues; knowing which queues are associated with which IRQs (via
their NAPI IDs) can help narrow down where the IRQ generation is
coming from.

Overall: there are a lot of use cases where exposing this mapping to
userland can be very helpful.