Re: [PATCH net v2] net: mana: Optimize irq affinity for low vcpu configs

From: Shradha Gupta

Date: Tue May 26 2026 - 09:05:35 EST


On Mon, May 18, 2026 at 12:04:31AM -0700, Erni Sri Satya Vennela wrote:
> > > But one observation I had was that " irq_set_affinity_and_hint(*irqs++,
> > > NULL);" is essentially a no-op and we end up relying on the initial
> > > placement from pci_alloc_irq_vectors().
> >
> > Yes you are, assuming you're not binding them before in your call chain.
> >
> > > Even though in these tests we
> > > were not able to reproduce it, but with this distribution there is a
> > > chance we end up clustering the mana queue IRQs, while other vCPUs are
> > > not running any network load.
> >
> > That sounds like an IRQ balancer bug which you're unable to reproduce.
> >
> > > It's because the placement depends on
> > > system-wide IRQ state at allocation time.
> >
> > I don't understand this point. The
> >
> > irq_set_affinity_and_hint(*irqs++, NULL);
> >
> > simply means: I trust system IRQ balancer to pick the best CPU for my
> > IRQ at runtime. It doesn't refer any "IRQ state at allocation time".
> >
> > > The linear approach however gaurantees each queue IRQ lands on a
> > > distinct vCPU regardless of system state. Even after stressing the cpus
> > > using stress-ng, we did not observe any significant throughput drop.
> >
> > If you just do nothing, it would lead to the same numbers, right? What
> > does that "non-significant throughput drop" mean? It sounds like the
> > linear approach is slightly worse.
>
> The numbers are not worse, they almost same in both the cases.
> >
> > --
> >
> > So, as you can't demonstrate solid benefit for the 'linear' IRQ placement,
> > I would just stick to the no-affinity logic.
>
> Thankyou Yury,
> We are investigating on more test scenarios and trying to
> capture numbers with both, your proposed change and the one from this
> patch. We will keep you updated about the results.
>
>
> - Vennela

Hi Yury,

Vennela and I ran several more tests and were able to reproduce the
MANA IRQ clustering issue we discussed earlier with the suggested
approach (setting the affinity and hint to NULL).
In these tests, additional IRQs (apart from MANA) were allocated, which
disturbed the MANA IRQ distribution.

ENV details:
Azure SKU (Standard_L4als_v5), 4 vCPUs (2 cores), 5 MANA IRQs (1 HWC + 4
Queue)

"Affinity set to NULL" approach
========================================
MANA IRQ distribution           vCPU
========================================
IRQ0    HWC                     0
IRQ1    mana_q1                 2
IRQ3    mana_q2                 3
IRQ4    mana_q3                 2
IRQ5    mana_q4                 3


"Affinity set linearly" approach
========================================
MANA IRQ distribution           vCPU
========================================
IRQ0    HWC                     0
IRQ1    mana_q1                 1
IRQ3    mana_q2                 2
IRQ4    mana_q3                 3
IRQ5    mana_q4                 0
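
For reference, the placement in the table above is what a plain
round-robin walk over the online vCPUs would produce. The following is
only a userspace sketch of that idea, not the code from the patch: the
helper names linear_irq_cpu() and linear_irq_layout() are hypothetical,
and it ignores NUMA and sibling topology. In the driver, the chosen CPU
would presumably be passed on via something like
irq_set_affinity_and_hint(irq, cpumask_of(cpu)).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not taken from the patch): pick the vCPU for the
 * i-th MANA IRQ by walking the CPUs in order and wrapping around.
 */
static unsigned int linear_irq_cpu(unsigned int irq_index, unsigned int nr_cpus)
{
	assert(nr_cpus > 0);
	return irq_index % nr_cpus;
}

/* Fill cpus[i] with the vCPU chosen for IRQ i (HWC first, then queues). */
static void linear_irq_layout(unsigned int *cpus, size_t nr_irqs,
			      unsigned int nr_cpus)
{
	for (size_t i = 0; i < nr_irqs; i++)
		cpus[i] = linear_irq_cpu((unsigned int)i, nr_cpus);
}
```

With 5 IRQs and 4 vCPUs, linear_irq_layout(cpus, 5, 4) fills cpus with
0, 1, 2, 3, 0, which matches the distribution listed above: every queue
IRQ gets its own vCPU and only the HWC IRQ shares one.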


Throughput (Gbps) with a high number of TCP connections
========================================
connections   affinity NULL   Linear
20480         5.25            13.49
10240         5.77            13.48
8192          7.16            13.48
6144          9.33            13.53
4096          13.50           13.50


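With NULL affinity, the four queue IRQs end up sharing only vCPUs 2 and
3, and throughput drops from 13.50 Gbps at 4096 connections to 5.25 Gbps
at 20480 connections. With linear placement it stays at about 13.5 Gbps
across all connection counts.

Considering these results, we would like to proceed with the linear
approach that was proposed by this patch.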


Regards,
Shradha