Re: [REGRESSION] stmmac: Random DMA reset failure on RK3399 since v6.18
From: Maxime Chevallier
Date: Thu May 07 2026 - 09:22:42 EST
Hi,
On 07/05/2026 14:49, Jensen Huang wrote:
> On Tue, May 5, 2026 at 4:26 PM Thorsten Leemhuis
> <regressions@xxxxxxxxxxxxx> wrote:
>>
>> [Jumping in here, as there are no replies yet]
>>
>> BTW, Russell, just in case you missed this: looks like this regression
>> is caused by a change of yours.
I think Russell is dealing with unpleasant personal stuff, let's see if we
can figure this out while he's away.
>>
>> On 4/29/26 14:53, Jensen Huang wrote:
>>>
>>> I'm reporting a regression on RK3399 (stmmac) observed in v6.18.24.
>>> When a network cable is connected during boot, the DMA reset
>>> occasionally fails with the error message: "Failed to reset the dma".
>>>
>>> This appears to be a timing issue related to the EEE RX clock-stop
>>> logic. Based on my investigation with the RTL8211E PHY, I monitored
>>> the PHY register PS1R (MMD device 3, address 0x01) and observed a
>>> value of 0x0f40. This indicates that the PHY is in LPI mode and the RX
>>> clock may have already stopped.
From what I understand, your current hypothesis is that it takes a while for
that clock to stabilize, and therefore we're accessing the DMA registers too
soon? Can you confirm that by adding a small delay?
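
For reference, here is a rough sketch of how the reported 0x0f40 decodes,
assuming the RTL8211E follows the standard IEEE 802.3 Clause 45 layout for
PCS status 1 (MMD 3, address 0x01). The PS1R_* names and the ps1r_decode()
helper are made up for this sketch; they are not kernel identifiers. Under
that assumption the value has both LPI indication bits and both latched
"LPI received" bits set, with the link-status and fault bits clear. Note that
this register only tells us the receive path is in LPI. Whether RXC is
actually stopped is, as far as I know, governed by the clock-stop enable bit
in PCS control 1, which this value does not show.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed Clause 45 PCS status 1 bit layout; names are local to this sketch. */
#define PS1R_LINK_STATUS  0x0004u /* bit 2:  PCS receive link status */
#define PS1R_CLKSTOP_CAP  0x0040u /* bit 6:  clock stop capable */
#define PS1R_FAULT        0x0080u /* bit 7:  fault condition detected */
#define PS1R_RX_LPI_IND   0x0100u /* bit 8:  RX currently receiving LPI */
#define PS1R_TX_LPI_IND   0x0200u /* bit 9:  TX currently receiving LPI */
#define PS1R_RX_LPI_RECV  0x0400u /* bit 10: RX has received LPI (latched) */
#define PS1R_TX_LPI_RECV  0x0800u /* bit 11: TX has received LPI (latched) */

struct ps1r_flags {
	bool link_up;
	bool clkstop_cap;
	bool fault;
	bool rx_lpi_ind;
	bool tx_lpi_ind;
	bool rx_lpi_recv;
	bool tx_lpi_recv;
};

/* Split a raw PS1R value into its individual status flags. */
struct ps1r_flags ps1r_decode(uint16_t val)
{
	struct ps1r_flags f = {
		.link_up     = (val & PS1R_LINK_STATUS) != 0,
		.clkstop_cap = (val & PS1R_CLKSTOP_CAP) != 0,
		.fault       = (val & PS1R_FAULT) != 0,
		.rx_lpi_ind  = (val & PS1R_RX_LPI_IND) != 0,
		.tx_lpi_ind  = (val & PS1R_TX_LPI_IND) != 0,
		.rx_lpi_recv = (val & PS1R_RX_LPI_RECV) != 0,
		.tx_lpi_recv = (val & PS1R_TX_LPI_RECV) != 0,
	};

	return f;
}
```

Calling ps1r_decode(0x0f40) and checking rx_lpi_ind is enough to tell whether
a given read caught the PHY with its receive path in LPI.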
>>>
>>> While commit dd557266cf5f ("net: stmmac: block PHY RXC clock-stop")
>>
>> Just wondering: have you checked whether mainline (e.g. 7.1-rc1) is still
>> affected? This is always advisable (some people would call it required).
>> In this case even more so, as mainline has for a while contained a fix
>> for the change you mentioned that wasn't backported:
>> c171e679ee66d7 ("net: stmmac: Disable EEE RX clock stop when VLAN is
>> enabled"). But this is not my area of expertise (and it is in a different
>> area of the code), so that fix might be unrelated to your issue.
>
> Thanks for the pointer.
> As you suggested, I have tested the mainline and confirmed that the
> issue is not present in v7.1-rc2, nor as early as v6.19-rc1. However,
> I verified that the issue persists in the latest stable v6.18.26.
> I performed a git bisect and the result pointed exactly to the commit
> you mentioned: c171e679ee66d7 ("net: stmmac: Disable EEE RX clock stop
> when VLAN is enabled").
Do you mean that c171e679ee66d7 ("net: stmmac: Disable EEE RX clock stop
when VLAN is enabled") introduces the bug on 6.18.26?
Are you able to bisect to verify exactly when the issue was fixed between
v6.18 and v6.19?
Maxime