Re: kernel 4.18.5 Realtek 8111G network adapter stops responding under high system load

From: Maciej S. Szmigiero
Date: Sat Sep 15 2018 - 19:54:32 EST


[ I've added Realtek Linux NIC and netdev mailing lists to CC ]

Hi David,

On 15.09.2018 23:23, David Arendt wrote:
> Hi,
>
> just a follow up:
>
> In kernel 4.18.8 the behaviour is different.
>
> The network becomes unreachable a number of times but recovers by
> itself, before it finally becomes unreachable for good.
>
> Here the logging output:
>
> Sep 15 17:44:43 server kernel: NETDEV WATCHDOG: enp3s0 (r8169): transmit
> queue 0 timed out
> Sep 15 17:44:43 server kernel: r8169 0000:03:00.0 enp3s0: link up
> Sep 15 18:10:26 server kernel: r8169 0000:03:00.0 enp3s0: link up
> Sep 15 18:12:24 server kernel: r8169 0000:03:00.0 enp3s0: link up
> Sep 15 18:13:19 server kernel: r8169 0000:03:00.0 enp3s0: link up
> Sep 15 18:14:48 server kernel: r8169 0000:03:00.0 enp3s0: link up
> Sep 15 18:20:24 server kernel: r8169 0000:03:00.0 enp3s0: link up
> Sep 15 18:34:19 server kernel: r8169 0000:03:00.0 enp3s0: link up
> Sep 15 18:43:43 server kernel: r8169 0000:03:00.0 enp3s0: link up
> Sep 15 18:46:26 server kernel: r8169 0000:03:00.0 enp3s0: link up
> Sep 15 19:00:24 server kernel: r8169 0000:03:00.0 enp3s0: link up
>
> From 17:44 to 18:46 the network is recovering automatically. After the
> link up at 19:00, the network is no longer reachable, and no further
> message is logged.
>
> Looking at ifconfig, the counter for TX packets is incrementing, but
> the counter for RX packets is not.
>
> Here again the driver from 4.17.14 is working flawlessly.

Could you please try this patch on top of 4.18.8:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f74dd480cf4e31e12971c58a1d832044db945670

In my case the problem fixed by the above commit was only limited to
bad TX performance but my r8169 NIC models were different from what
you have.

If this does not help then try bisecting the issue
(maybe limited to drivers/net/ethernet/realtek/r8169.c to save time).
If the NIC dies after a heavy load it might be possible to generate
such a load quickly with the in-kernel pktgen.
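
A rough sketch of how pktgen could be driven for this, assuming the
pktgen module is loaded (modprobe pktgen) and the script runs as root.
The destination IP and MAC below are placeholders, not values from this
thread, and the helper names are made up for illustration:

```python
from pathlib import Path

PKTGEN_DIR = Path("/proc/net/pktgen")


def pktgen_plan(ifname, dst_ip, dst_mac, pkt_size=1500, thread=0):
    """Return (file, command) pairs that set up and start one pktgen flow."""
    thread_file = PKTGEN_DIR / f"kpktgend_{thread}"
    dev_file = PKTGEN_DIR / ifname
    return [
        (thread_file, "rem_device_all"),        # clear earlier configuration
        (thread_file, f"add_device {ifname}"),  # bind the NIC to this thread
        (dev_file, "count 0"),                  # 0 = keep sending until stopped
        (dev_file, "delay 0"),                  # no pause between packets
        (dev_file, f"pkt_size {pkt_size}"),
        (dev_file, f"dst {dst_ip}"),
        (dev_file, f"dst_mac {dst_mac}"),
        (PKTGEN_DIR / "pgctrl", "start"),       # blocks while packets are sent
    ]


def run_plan(plan):
    """Write each command to its pktgen control file, in order."""
    for path, command in plan:
        path.write_text(command + "\n")


# Example (placeholder addresses, replace with a host on your network):
# run_plan(pktgen_plan("enp3s0", "192.0.2.1", "00:11:22:33:44:55"))
```

With count 0 the final write does not return until the run is
interrupted (Ctrl-C) or "stop" is written to pgctrl from another shell.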

If that's not possible then please at least compare NIC register
values displayed by "ethtool -d enp3s0" between working and
non-working kernels.
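
One possible way to do the comparison, sketched under the assumption
that the "ethtool -d enp3s0" output was saved to a text file under each
kernel. The file names in the usage comment are hypothetical, and the
script makes no assumption about the dump layout beyond it being
line-oriented text:

```python
import difflib
from pathlib import Path


def changed_lines(good, bad):
    """Return only the lines that differ between two register dumps.

    Lines present only in the working dump are prefixed with "-",
    lines present only in the non-working dump with "+".
    """
    diff = difflib.unified_diff(
        good, bad, "working", "non-working", lineterm="", n=0
    )
    body = list(diff)[2:]  # skip the two "---" / "+++" file header lines
    return [line for line in body if line.startswith(("+", "-"))]


def compare_dump_files(good_path, bad_path):
    """Load two saved dumps and return their differing lines."""
    good = Path(good_path).read_text().splitlines()
    bad = Path(bad_path).read_text().splitlines()
    return changed_lines(good, bad)


# Example (hypothetical file names):
# for line in compare_dump_files("regs-4.17.14.txt", "regs-4.18.8.txt"):
#     print(line)
```

Some registers (counters, for example) may differ between any two
dumps, so it helps to take more than one dump per kernel to see which
differences are stable.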

> Thanks in advance,
> David Arendt

Maciej

>
>
> On 9/4/18 8:19 AM, David Arendt wrote:
>> Hi,
>>
>> When using kernel 4.18.5 the Realtek 8111G network adapter stops
>> responding under high system load.
>>
>> Dmesg is showing no errors.
>>
>> Sometimes an ifconfig enp3s0 down followed by an ifconfig enp3s0 up is
>> enough for the network adapter to start responding again. Sometimes a
>> reboot is necessary.
>>
>> When copying r8169.c from 4.17.14 to the 4.18.5 kernel, networking works
>> perfectly stable on 4.18.5 so the problem seems r8169.c related.
>>
>> Here the output from lshw:
>>
>>         *-pci:2
>>              description: PCI bridge
>>              product: 8 Series/C220 Series Chipset Family PCI Express
>> Root Port #3
>>              vendor: Intel Corporation
>>              physical id: 1c.2
>>              bus info: pci@0000:00:1c.2
>>              version: d5
>>              width: 32 bits
>>              clock: 33MHz
>>              capabilities: pci pciexpress msi pm normal_decode
>> bus_master cap_list
>>              configuration: driver=pcieport
>>              resources: irq:18 ioport:d000(size=4096)
>> memory:f7300000-f73fffff ioport:f2100000(size=1048576)
>>            *-network
>>                 description: Ethernet interface
>>                 product: RTL8111/8168/8411 PCI Express Gigabit Ethernet
>> Controller
>>                 vendor: Realtek Semiconductor Co., Ltd.
>>                 physical id: 0
>>                 bus info: pci@0000:03:00.0
>>                 logical name: enp3s0
>>                 version: 0c
>>                 serial: <hidden>
>>                 size: 1Gbit/s
>>                 capacity: 1Gbit/s
>>                 width: 64 bits
>>                 clock: 33MHz
>>                 capabilities: pm msi pciexpress msix vpd bus_master
>> cap_list ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd 1000bt
>> 1000bt-fd autonegotiation
>>                 configuration: autonegotiation=on broadcast=yes
>> driver=r8169 driverversion=2.3LK-NAPI duplex=full
>> firmware=rtl8168g-2_0.0.1 02/06/13 latency=0 link=yes multicast=yes
>> port=MII speed=1Gbit/s
>>                 resources: irq:18 ioport:d000(size=256)
>> memory:f7300000-f7300fff memory:f2100000-f2103fff
>>
>> Thanks in advance for looking into this,
>>
>> David Arendt
>>
>>
>