Re: [PATCH net-next v2 13/14] net: macb: use context swapping in .set_ringparam()
From: Théo Lebrun
Date: Thu Apr 16 2026 - 04:55:36 EST
Hello Jakub,
On Tue Apr 14, 2026 at 2:50 AM CEST, Jakub Kicinski wrote:
> On Fri, 10 Apr 2026 21:52:01 +0200 Théo Lebrun wrote:
>> ethtool_ops.set_ringparam() is implemented using the primitive close /
>> update ring size / reopen sequence. Under memory pressure this does not
>> fly: we free our buffers at close and cannot reallocate new ones at
>> open. Also, it triggers a slow PHY reinit.
>>
>> Instead, exploit the new context mechanism and improve our sequence to:
>> - allocate a new context (including buffers) first
>> - if it fails, early return without any impact to the interface
>> - stop interface
>> - update global state (bp, netdev, etc)
>> - pass buffer pointers to the hardware
>> - start interface
>> - free old context.
>>
>> The HW disable sequence is inspired by macb_reset_hw() but avoids
>> (1) setting NCR bit CLRSTAT and (2) clearing register PBUFRXCUT.
>>
>> The HW re-enable sequence is inspired by macb_mac_link_up(), skipping
>> over register writes which would be redundant (because values have not
>> changed).
>>
>> The generic context swapping parts are isolated into helper functions
>> macb_context_swap_start|end(), reusable by other operations (change_mtu,
>> set_channels, etc).
>
>> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
>> index 81beb67b206a..340ae7d881c6 100644
>> --- a/drivers/net/ethernet/cadence/macb_main.c
>> +++ b/drivers/net/ethernet/cadence/macb_main.c
>> @@ -3081,6 +3081,89 @@ static void macb_configure_dma(struct macb *bp)
>> }
>> }
>>
>> +static void macb_context_swap_start(struct macb *bp)
>> +{
>> + struct macb_queue *queue;
>> + unsigned long flags;
>> + unsigned int q;
>> + u32 ctrl;
>> +
>> + /* Disable software Tx, disable HW Tx/Rx and disable NAPI. */
>> +
>> + netif_tx_disable(bp->netdev);
>
> AFAIR netif_tx_disable() just stops all the queues, if the NAPIs and
> whatever else may wake queues is still running the queues may get
> restarted right away.
Your memory appears correct (unsurprisingly). Ordering was wrong, it
must be (1) NAPI disabling followed by (2) disabling of Tx queues.
The tx queue wakeup is possible in NAPI poll function through this call
stack: netif_wake_subqueue() <- macb_tx_complete() <- macb_tx_poll().
There is also macb_tx_error_task() that disables Tx queues at start and
re-enables them at the end. Meaning we need to disable Tx queues after
we disabled queue->tx_error_task. (Note that tx_error_task probably
races with NAPI, but that is outside our scope.)
Thanks,
--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com