Re: R8169: Network lockups in 4.18.{8,9,10} (and 4.19 dev)

From: Heiner Kallweit
Date: Tue Oct 09 2018 - 17:39:29 EST


On 09.10.2018 16:40, Chris Clayton wrote:
> Thanks to Maciej and Heiner for their replies.
>
> On 09/10/2018 13:32, Maciej S. Szmigiero wrote:
>> On 07.10.2018 21:36, Chris Clayton wrote:
>>> Hi again,
>>>
>>> I didn't think there was anything in 4.19-rc7 to fix this regression, but tried it anyway. I can confirm that the
>>> regression is still present and my network still fails when, after a resume from suspend (to ram or disk), I open my
>>> browser or my mail client. In both those cases the failure is almost immediate - e.g. my home page doesn't get displayed
>>> in the browser. Pinging one of my ISP's name servers doesn't fail quite so quickly but the reported time increases from
>>> 14-15ms to more than 1000ms.
>>
>> You can try comparing chip registers (ethtool -d eth0) in the working
>> state (before a suspend) and in the broken state (after a resume).
>> Maybe there will be something obvious in the difference.
>>
>> The same goes for the PCI configuration (lspci -d :8168 -vv).
>>
> Maciej suggested comparing the output from lspci -vv for the ethernet device. They are identical.
>
> Both Maciej and Heiner suggested comparing the output from "ethtool -d" pre and post suspend. Again, they are identical.
> Heiner specifically suggested looking at the RxConfig. The value of that is 0x0002870e both pre and post suspend.
>
> I've attached files I redirected the outputs to.
>
> Please don't hesitate to ask for any other information needed to solve this problem. In the meantime, I've now got
> scripts that stop the network during suspend and restart it during resume. (Those scripts were removed whilst I gathered
> the diagnostics shown in the attachments.)
>
I'd like to check whether it may be a timing issue. The following experimental patch
adds a PCI commit after writing register ChipCmd. Could you please check whether
it changes anything?

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 7d3f671e1..f3c359492 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -4641,6 +4641,7 @@ static void rtl_hw_start(struct rtl8169_private *tp)
 	/* Initially a 10 us delay. Turned it into a PCI commit. - FR */
 	RTL_R8(tp, IntrMask);
 	RTL_W8(tp, ChipCmd, CmdTxEnb | CmdRxEnb);
+	RTL_R8(tp, ChipCmd);
 	rtl_init_rxcfg(tp);
 	rtl_set_tx_config_registers(tp);

--
2.19.1
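
For anyone wondering why a read-back can act as a "PCI commit": MMIO writes over PCI may be posted, i.e. the CPU continues before the device has seen the write, while a read from the same device cannot pass those earlier writes. The following is only a toy userspace model of that ordering rule, not driver code. All names in it (toy_bus, toy_write8, toy_read8, ...) are made up for illustration, and the register index and value are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: every name here is hypothetical, nothing is r8169 code. */
#define TOY_QUEUE_LEN 8

struct toy_write {
	unsigned reg;
	uint8_t val;
};

struct toy_bus {
	uint8_t dev_regs[4];                   /* what the device actually sees */
	struct toy_write queue[TOY_QUEUE_LEN]; /* posted, not yet delivered */
	size_t pending;
};

/* Deliver all queued writes to the device, in order. */
static void toy_flush(struct toy_bus *bus)
{
	for (size_t i = 0; i < bus->pending; i++)
		bus->dev_regs[bus->queue[i].reg] = bus->queue[i].val;
	bus->pending = 0;
}

/* A posted write returns to the caller before the device has seen it. */
static void toy_write8(struct toy_bus *bus, unsigned reg, uint8_t val)
{
	if (bus->pending == TOY_QUEUE_LEN)
		toy_flush(bus);
	bus->queue[bus->pending].reg = reg;
	bus->queue[bus->pending].val = val;
	bus->pending++;
}

/* A read must not pass earlier posted writes, so it flushes them first. */
static uint8_t toy_read8(struct toy_bus *bus, unsigned reg)
{
	toy_flush(bus);
	return bus->dev_regs[reg];
}

/* Same pattern as the experimental patch: write, then read back. */
static int toy_demo(void)
{
	struct toy_bus bus = { 0 };
	enum { TOY_CHIPCMD = 1 };              /* arbitrary register index */

	toy_write8(&bus, TOY_CHIPCMD, 0x0c);   /* arbitrary value */
	assert(bus.dev_regs[TOY_CHIPCMD] == 0);    /* write still in flight */
	(void)toy_read8(&bus, TOY_CHIPCMD);        /* the "commit" */
	assert(bus.dev_regs[TOY_CHIPCMD] == 0x0c); /* device has seen it */
	return 0;
}
```

In this model the write only becomes visible to the "device" once the read has been issued, which is the effect the added RTL_R8(tp, ChipCmd) is meant to have before the Rx/Tx config registers are written. Whether that ordering actually matters for this chip is exactly what the test should tell us.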