[BUG] SFP I2C timeout forces link down with PHY_ERROR
From: Sean Anderson
Date: Tue May 28 2024 - 12:57:42 EST
Hi,
I saw the following warning [1] twice when testing 1000Base-T SFP
modules:
[ 1481.682501] cdns-i2c ff030000.i2c: timeout waiting on completion
[ 1481.692010] Marvell 88E1111 i2c:sfp-ge3:16: Master/Slave resolution failed
[ 1481.699910] ------------[ cut here ]------------
[ 1481.705459] phy_check_link_status+0x0/0xe8: returned: -67
[ 1481.711448] WARNING: CPU: 2 PID: 67 at drivers/net/phy/phy.c:1233 phy_state_machine+0xac/0x2ec
<snip>
[ 1481.904544] macb ff0c0000.ethernet net1: Link is Down
and a second time with some other errors too:
[ 64.972751] cdns-i2c ff030000.i2c: xfer_size reg rollover. xfer aborted!
[ 64.979478] cdns-i2c ff030000.i2c: xfer_size reg rollover. xfer aborted!
[ 65.998108] cdns-i2c ff030000.i2c: timeout waiting on completion
[ 66.010558] Marvell 88E1111 i2c:sfp-ge3:16: Master/Slave resolution failed
[ 66.017856] ------------[ cut here ]------------
[ 66.022786] phy_check_link_status+0x0/0xcc: returned: -67
[ 66.028255] WARNING: CPU: 0 PID: 70 at drivers/net/phy/phy.c:1233 phy_state_machine+0xa4/0x2b8
<snip>
[ 66.339533] macb ff0c0000.ethernet net1: Link is Down
The chain of events is:
- The I2C transaction times out for some reason (in the latter case due
to a known hardware bug).
- mdio-i2c converts the error response to a 0xffff return value.
- genphy_read_lpa sees that LPA_1000MSFAIL is set in MII_STAT1000 and
returns -ENOLINK. This propagates up the call stack.
- phy_check_link_status returns -ENOLINK
- phy_error_precise forces the link down with state = PHY_ERROR.
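To make the chain above concrete, here is a small userspace model of the first three steps. It is only a sketch, not the kernel code: the model_* functions are hypothetical stand-ins for the cdns-i2c transfer, the mdio-i2c read and genphy_read_lpa, and the 0x3c00 "healthy" register value is an arbitrary assumption. The MII_STAT1000 and LPA_1000MSFAIL values are the ones from include/uapi/linux/mii.h.

```c
#include <errno.h>
#include <stdbool.h>

/* Register number and bit as defined in include/uapi/linux/mii.h */
#define MII_STAT1000   0x0a
#define LPA_1000MSFAIL 0x8000

/* Hypothetical stand-in for the I2C transfer: either times out or
 * returns an arbitrary value with the master/slave fault bit clear. */
static int model_i2c_xfer(bool fail, int reg, unsigned short *val)
{
	(void)reg;
	if (fail)
		return -ETIMEDOUT;
	*val = 0x3c00; /* assumed healthy value, LPA_1000MSFAIL not set */
	return 0;
}

/* Models the mdio-i2c behaviour described above: a failed transfer
 * is reported as a successful read of 0xffff. */
static int model_mdio_read(bool fail, int reg)
{
	unsigned short val;

	if (model_i2c_xfer(fail, reg, &val) < 0)
		return 0xffff;
	return val;
}

/* Models the relevant check in genphy_read_lpa: every bit is set in
 * 0xffff, so LPA_1000MSFAIL looks set and -ENOLINK is returned. */
static int model_read_lpa(bool fail)
{
	int lpagb = model_mdio_read(fail, MII_STAT1000);

	if (lpagb < 0)
		return lpagb;
	if (lpagb & LPA_1000MSFAIL)
		return -ENOLINK;
	return 0;
}
```

With fail set, model_read_lpa() returns -ENOLINK, which is the -67 in the traces above. The original -ETIMEDOUT is no longer visible to the caller at that point.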
The problem with this is that although the register read fails due to a
temporary condition, the link goes down permanently (or at least until
the admin cycles the interface state).
I think some part of the stack should implement a retry mechanism, but
I'm not sure which part. One idea could be to have mdio-i2c propagate
negative errors instead of converting them to successful reads of
0xffff. But we would still need to handle that in the phy driver or in
phy_state_machine.
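As a rough illustration of what a retry could look like if mdio-i2c did propagate negative errors, here is a userspace sketch. Everything in it is hypothetical: read_with_retry, reg_read_fn, struct flaky_bus and flaky_read are made-up names for illustration, and treating -ETIMEDOUT and -EAGAIN as the transient errors is an assumption, not an existing kernel policy.

```c
#include <errno.h>

/* Hypothetical register read callback: returns the register value
 * (>= 0) or a negative errno. */
typedef int (*reg_read_fn)(void *ctx, int reg);

/* Retry a read up to max_tries times, but only for errors assumed
 * to be transient. Any other result is returned immediately. */
static int read_with_retry(reg_read_fn read, void *ctx, int reg,
			   int max_tries)
{
	int ret = -EINVAL;

	for (int i = 0; i < max_tries; i++) {
		ret = read(ctx, reg);
		if (ret != -ETIMEDOUT && ret != -EAGAIN)
			return ret;
	}
	return ret;
}

/* Test double: a bus that times out a fixed number of times and
 * then returns an arbitrary register value. */
struct flaky_bus {
	int failures_left;
};

static int flaky_read(void *ctx, int reg)
{
	struct flaky_bus *bus = ctx;

	(void)reg;
	if (bus->failures_left > 0) {
		bus->failures_left--;
		return -ETIMEDOUT;
	}
	return 0x3c00;
}
```

A bus that fails twice succeeds on the third try, while one that fails three times in a row still reports -ETIMEDOUT when max_tries is 3. This does not answer which layer should own the loop, it only shows that the loop needs a real error code to key on.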
- Are I2C bus drivers supposed to be flaky like this? That is, are callers of
i2c_transfer expected to handle the occasional spurious error?
- Similarly, are MDIO bus drivers allowed to be flaky?
- Is ETIMEDOUT even supposed to be recoverable? Maybe we should have
cdns-i2c return EAGAIN instead so it gets retried by the bus
arbitration logic in __i2c_transfer.
- ENOLINK really seems like something which we could recover from by
resetting the phy (or even just waiting a bit). Maybe we should have
the phy state machine just switch to PHY_NOLINK?
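For the last idea, the policy I have in mind is roughly the following. This is a userspace sketch only: the MODEL_* enum and model_next_state are hypothetical names that loosely mirror the kernel's phy states, and this is not how phy_state_machine is structured today.

```c
#include <errno.h>

/* Hypothetical, reduced model of the phy states involved. */
enum model_phy_state {
	MODEL_PHY_RUNNING,
	MODEL_PHY_NOLINK,
	MODEL_PHY_ERROR,
};

/* Map the result of a link status check to the next state, treating
 * -ENOLINK as recoverable and every other error as fatal. */
static enum model_phy_state model_next_state(int err)
{
	if (err == 0)
		return MODEL_PHY_RUNNING;
	if (err == -ENOLINK)
		return MODEL_PHY_NOLINK; /* keep polling, link may return */
	return MODEL_PHY_ERROR;          /* needs admin intervention */
}
```

The point is just that -ENOLINK would leave the phy in a state that is still polled, so a later successful read could bring the link back up without cycling the interface.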
Of course, the best option would be to fix cdns-i2c to not be buggy, but
the hardware itself is buggy in at least one of the above cases so that
may not be practical.
--Sean