Re: [PATCH iwl-next v2 5/9] igc: Add support for frame preemption verification
From: Vladimir Oltean
Date: Thu Feb 06 2025 - 10:04:42 EST
On Thu, Feb 06, 2025 at 10:40:11PM +0800, Abdul Rahim, Faizal wrote:
>
> Hi Vladimir,
>
> Thanks for the quick review, appreciate your help.
>
> On 6/2/2025 1:12 am, Vladimir Oltean wrote:
> > On Wed, Feb 05, 2025 at 05:05:20AM -0500, Faizal Rahim wrote:
> > > This patch implements the "ethtool --set-mm" callback to trigger the
> > > frame preemption verification handshake.
> > >
> > > Uses the MAC Merge Software Verification (mmsv) mechanism in ethtool
> > > to perform the verification handshake for igc.
> > > The structure fpe.mmsv is set by mmsv in ethtool and should remain
> > > read-only for the driver.
> > >
> > > igc does not use two mmsv callbacks:
> > > a) configure_tx()
> > > - igc lacks registers to configure FPE in the transmit direction.
> >
> > Yes, maybe, but it's still important to handle this. It tells you when
> > the preemptible traffic classes should be sent as preemptible on the wire
> > (i.e. when the verification is either disabled, or it succeeded).
> >
> > There is a selftest called manual_failed_verification() which supposedly
> > tests this exact condition: if verification fails, then packets sent to
> > TC0 are supposed to bump the eMAC's TX counters, even though TC0 is
> > configured as preemptible. Otherwise stated: even if the tc program says
> > that a certain traffic class is preemptible, you don't want to actually
> > send preemptible packets if you haven't verified the link partner can
> > handle them, since it will likely drop them on RX otherwise.
>
> Even though fpe in tx direction isn't set in igc, it still checks
> ethtool_mmsv_is_tx_active() before setting a queue as preemptible.
>
> This is done in :
> igc_tsn_enable_offload(struct igc_adapter *adapter) {
> {
> ....
> if (ethtool_mmsv_is_tx_active(&adapter->fpe.mmsv) &&
> ring->preemptible)
> txqctl |= IGC_TXQCTL_PREEMPTIBLE;
>
>
> Wouldn't this handle the situation mentioned ?
> Sorry if I miss something here.
And what if tx_active becomes true after you had already configured the
queues with tc (and the above check caused IGC_TXQCTL_PREEMPTIBLE to not
be set)? Shouldn't you set IGC_TXQCTL_PREEMPTIBLE now? Isn't
ethtool_mmsv_configure_tx() exactly the function that notifies you of
changes to tx_active, and hence, aren't you interested in setting up a
callback for it?
Or is igc_tsn_reset() -> igc_tsn_enable_offload() called through some
other path, after verification succeeds, that I'm not seeing? I don't
think so. Maybe coincidentally, but not guaranteed.
> I briefly checked the driver code and the i226 SW User Manual.
>
> The code calls igc_reset_task() → igc_reset() → igc_reinit_locked() →
> igc_down() → igc_reset() → igc_power_down_phy_copper_base().
>
> I suspect igc_power_down_phy_copper_base() contributes to the link loss, but
> there are likely other factors as well.
>
> The SW User Manual states that a software reset (CTRL.DEV_RST) affects
> several components, including "Port Configuration Autoload from NVM, Data
> Path, On-die Memories, MAC, PCS, Auto Negotiation and other Port-related
> Logic, Function Queue Enable, Function Interrupt and Statistics registers,
> Wake-up Management Registers, and Memory Configuration Registers."
>
> Given this, right now, I’m unsure about the feasibility of making this
> change, though I see the benefits mentioned.
> Validating this would require significant exploration—i.e., investigating
> the code, running experiments, reviewing commit history, confirming hardware
> expectations from the SW manual, and consulting other hardware SMEs.
>
> Resetting the adapter is a common mechanism in igc that relies on shared
> functions, which can be triggered in various scenarios. Modifying this
> behavior (if feasible) could introduce some risks and would likely require
> additional testing across those scenarios. Focusing on this right now might
> delay the current series, which is primarily aimed at enabling Qbu on
> i225/6.
>
> Would it be alright if I explore this separately from this Qbu series?
Ok - as I said, it's not as if I couldn't eventually tolerate the workaround
you chose. We'd be putting a dependency of this feature on some unrelated
thing with a high degree of uncertainty. For example, I asked if this is
fiber or a twisted pair PHY, because while for the copper PHY the issue
might be more tractable (clause 28 autoneg on the media side is completely
independent of what the host side NIC is doing - it may reset, it may do
whatever - unless the MAC provides a clock that is important for PHY
operation), I do wonder if anything at all could be done to avoid the
link partner seeing a link loss over fiber, because there, the connection
is direct PCS-to-PCS. If you have to reset, you have to reset, and then
pick up the pieces, I understand.
It's good you're committing to look into this in more depth, though.