Re: [PATCH net] atl1c: move tx cleanup processing out of interrupt
From: Eric Dumazet
Date: Thu Apr 01 2021 - 17:46:46 EST
On 4/1/21 7:32 PM, Gatis Peisenieks wrote:
> Tx queue cleanup happens in interrupt handler on same core as rx queue processing.
> Both can take considerable amount of processing in high packet-per-second scenarios.
>
> Sending big amounts of packets can stall the rx processing which is unfair
> and also can lead to to out-of-memory condition since __dev_kfree_skb_irq
> queues the skbs for later kfree in softirq which is not allowed to happen
> with heavy load in interrupt handler.
>
> This puts tx cleanup in its own napi to be executed on different core than rx
> cleanup which solves the mentioned problems and increases the throughput.
>
> The ability to sustain equal amounts of tx/rx traffic increased:
> from 280Kpps to 1130Kpps on Threadripper 3960X with upcoming Mikrotik 10/25G NIC,
> from 520Kpps to 850Kpps on Intel i3-3320 with Mikrotik RB44Ge adapter.
>
> Signed-off-by: Gatis Peisenieks <gatis@xxxxxxxxxxxx>
> }
> }
> - if (status & ISR_TX_PKT)
> - atl1c_clean_tx_irq(adapter, atl1c_trans_normal);
> + if (status & ISR_TX_PKT) {
> + if (napi_schedule_prep(&adapter->tx_napi)) {
> + int tx_cpu = (smp_processor_id() + 1) %
> + num_online_cpus();
Ouch. Please do not burry in a driver such hard-coded facility.
There is no guarantee tx_cpu is an online cpu .
> + spin_lock(&adapter->irq_mask_lock);
> + hw->intr_mask &= ~ISR_TX_PKT;
> + AT_WRITE_REG(hw, REG_IMR, hw->intr_mask);
> + spin_unlock(&adapter->irq_mask_lock);
> + smp_call_function_single_async(tx_cpu,
> + &adapter->csd);
> + }
> + }
>
Have you tried using a second NAPI, but no csd ?
We now have kthread napi [1]
By simply enabling kthread NAPI on your NIC, you should get same nice behavior.
As a bonus, on moderate load you would use a single cpu instead of two.
[1]
cb038357937ee4f589aab2469ec3896dce90f317 net: fix race between napi kthread mode and busy poll
5fdd2f0e5c64846bf3066689b73fc3b8dddd1c74 net: add sysfs attribute to control napi threaded mode
29863d41bb6e1d969c62fdb15b0961806942960e net: implement threaded-able napi poll loop support