[REGRESSION] stmmac: Random DMA reset failure on RK3399 since v6.18
From: Jensen Huang
Date: Wed Apr 29 2026 - 09:02:54 EST
Hi,
I'm reporting a regression on RK3399 (stmmac) observed in v6.18.24.
When a network cable is connected during boot, the DMA reset
occasionally fails with the error message: "Failed to reset the dma".
This appears to be a timing issue related to the EEE RX clock-stop
logic. While investigating with the RTL8211E PHY, I monitored the PHY
register PS1R (MMD device 3, address 0x01) and observed a value of
0x0f40, which indicates that the PHY is in LPI mode and that the RX
clock may already have stopped.
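For reference, here is a minimal userspace sketch of how the 0x0f40
reading can be decoded. It assumes the RTL8211E PS1R follows the
IEEE 802.3 Clause 45 PCS status 1 layout (register 3.1); the macro and
function names are made up for this sketch and are not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Bit positions assumed from IEEE 802.3 Clause 45, register 3.1
 * (PCS status 1). The names are local to this sketch.
 */
#define PS1R_TX_LPI_RCVD	(1u << 11)	/* Tx LPI received (latched) */
#define PS1R_RX_LPI_RCVD	(1u << 10)	/* Rx LPI received (latched) */
#define PS1R_TX_LPI_IND		(1u << 9)	/* Tx PCS currently in LPI */
#define PS1R_RX_LPI_IND		(1u << 8)	/* Rx PCS currently in LPI */
#define PS1R_CLKSTOP_CAP	(1u << 6)	/* PHY may stop RXC during LPI */

/* True if the receive path currently reports LPI. */
bool ps1r_rx_lpi_active(uint16_t val)
{
	return (val & PS1R_RX_LPI_IND) != 0;
}

/* True if the PHY advertises that it can stop the RX clock in LPI. */
bool ps1r_clk_stop_capable(uint16_t val)
{
	return (val & PS1R_CLKSTOP_CAP) != 0;
}

/* Print one line per decoded bit for a raw PS1R value. */
void ps1r_dump(uint16_t val)
{
	printf("PS1R = 0x%04x\n", (unsigned int)val);
	printf("  tx_lpi_received  : %d\n", (val & PS1R_TX_LPI_RCVD) != 0);
	printf("  rx_lpi_received  : %d\n", (val & PS1R_RX_LPI_RCVD) != 0);
	printf("  tx_lpi_indication: %d\n", (val & PS1R_TX_LPI_IND) != 0);
	printf("  rx_lpi_indication: %d\n", (val & PS1R_RX_LPI_IND) != 0);
	printf("  clock_stop_cap   : %d\n", ps1r_clk_stop_capable(val));
}

/* Worked check for the value reported above. */
void ps1r_check_reported_value(void)
{
	const uint16_t reported = 0x0f40;

	assert(ps1r_rx_lpi_active(reported));
	assert(ps1r_clk_stop_capable(reported));
}
```

Under that layout, 0x0f40 has all four LPI bits (11:8) and the
clock-stop capable bit (6) set, which is consistent with the RX clock
possibly being stopped at the time of the read.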
While commit dd557266cf5f ("net: stmmac: block PHY RXC clock-stop")
ensures the clock is running before the DMA reset, my tests suggest
that the phylink_rx_clk_stop_block() call might not provide a
sufficiently stable RX clock in time for the immediate DMA reset that
follows.
Since stmmac already sets mac_requires_rxc = true, I modified
phylink_bringup_phy() to honor this flag. This avoids toggling the
PHY's clk_stop_enable during the initialization sequence, ensuring the
RX clock remains active and stable throughout.
With the change below, I achieved 200/200 successful reboots with the
cable connected (previously ~50% failure rate).
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -2171,7 +2171,7 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy,
 	/* Allow the MAC to stop its clock if the PHY has the capability */
 	pl->mac_tx_clk_stop = phy_eee_tx_clock_stop_capable(phy) > 0;

-	if (pl->mac_supports_eee_ops) {
+	if (pl->mac_supports_eee_ops && !pl->config->mac_requires_rxc) {
 		/* Explicitly configure whether the PHY is allowed to stop it's
 		 * receive clock.
 		 */
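
To make the intent of the one-line condition change in the hunk above
explicit, here is a simplified standalone model of the gating logic. The
struct and function names are hypothetical stand-ins, not the kernel's
definitions, and the model assumes stmmac has both flags set, as this
report implies.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two flags the condition tests. */
struct clk_stop_flags {
	bool mac_supports_eee_ops;	/* MAC provides the phylink EEE ops */
	bool mac_requires_rxc;		/* MAC needs a running RX clock */
};

/* Before the change: phylink configures PHY RX clock-stop whenever the
 * MAC supports the EEE ops.
 */
bool configures_rx_clk_stop_before(const struct clk_stop_flags *f)
{
	return f->mac_supports_eee_ops;
}

/* After the change: the configuration is skipped for MACs that declare
 * they require the RX clock.
 */
bool configures_rx_clk_stop_after(const struct clk_stop_flags *f)
{
	return f->mac_supports_eee_ops && !f->mac_requires_rxc;
}
```

In this model, a MAC with both flags set goes from "configured" to
"skipped", while a MAC with mac_requires_rxc = false behaves the same
before and after.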
Any feedback/testing on this would be appreciated.
Best regards,
Jensen Huang