Re: [PATCH net-next v5 2/5] net: cadence: macb: implement EEE TX LPI support

From: Théo Lebrun

Date: Tue Mar 03 2026 - 03:16:35 EST

On Tue Mar 3, 2026 at 3:15 AM CET, Jakub Kicinski wrote:
> On Fri, 27 Feb 2026 16:06:07 +0100 Nicolai Buchwitz wrote:
>> +static bool macb_tx_lpi_set(struct macb *bp, bool enable)
>> +{
>> + unsigned long flags;
>> + u32 old, ncr;
>> +
>> + spin_lock_irqsave(&bp->lock, flags);
>
> we should optimize this function for the past path caller.
> xmit path does:
>
> + macb_tx_lpi_wake(bp);
> +
> spin_lock(&bp->lock);
>
> So it immediately takes that lock again, can we move the lpi_wake()
> call under the spin_lock, and make sure other callers also take that
> lock? I think you can add a lockdep assert to make sure spin lock is
> held
>
>> + ncr = macb_readl(bp, NCR);
>> + old = ncr;
>> + if (enable)
>> + ncr |= GEM_BIT(TXLPIEN);
>> + else
>> + ncr &= ~GEM_BIT(TXLPIEN);
>> + if (old != ncr)
>> + macb_writel(bp, NCR, ncr);
>> + spin_unlock_irqrestore(&bp->lock, flags);
>> +
>> + return old != ncr;
>> +}
>> +
>> +static bool macb_tx_all_queues_idle(struct macb *bp)
>> +{
>> + unsigned int q;
>> +
>> + for (q = 0; q < bp->num_queues; q++) {
>> + struct macb_queue *queue = &bp->queues[q];
>> +
>> + if (queue->tx_head != queue->tx_tail)
>
> Does not not need tx_ptr_lock technically?
>
>> + return false;
>> + }
>> + return true;
>> +}
>> +
>> +static void macb_tx_lpi_work_fn(struct work_struct *work)
>> +{
>> + struct macb *bp = container_of(work, struct macb, tx_lpi_work.work);
>> +
>> + if (bp->eee_active && macb_tx_all_queues_idle(bp))
>> + macb_tx_lpi_set(bp, true);
>> +}
>> +
>> +static void macb_tx_lpi_schedule(struct macb *bp)
>> +{
>> + if (bp->eee_active)
>> + mod_delayed_work(system_wq, &bp->tx_lpi_work,
>> + usecs_to_jiffies(bp->tx_lpi_timer));
>> +}
>> +
>> +/* Wake from LPI before transmitting. The MAC must deassert TXLPIEN
>> + * and wait for the PHY to exit LPI before any frame can be sent.
>> + * IEEE 802.3az Tw_sys is ~17us for 1000BASE-T, ~30us for 100BASE-TX;
>> + * we use a conservative 50us.
>> + */
>> +static void macb_tx_lpi_wake(struct macb *bp)
>> +{
>> + if (!macb_tx_lpi_set(bp, false))
>
> Does this lpi_set() not have a relatively high cost, even if eee_active
> is disabled? Reading registers is usually pretty slow. Can we add
> a eee_active check here as well to short cut the lpi check?
> If we do we probably want to make sure that the code paths setting
> eee_active are also under bp->lock, otherwise this new check will be
> racy.

Funny how this discussion keeps coming up! I made the same remark on V3:
https://lore.kernel.org/netdev/DGOXXGNSSMYK.2XNU9AQ6E077P@xxxxxxxxxxx/

And it had been discussed in a previous iteration before.

In theory I agree, in practice the optimization was statistically
insignificant on my platform. The total time spent in macb_start_xmit()
is tiny, so any optimization inside of it is even more so.

Thanks,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com