Fwd: ath11k: QCNFA765: Bug with non-standard router setting. Crashes, terrible latency and speed.

From: Bagas Sanjaya
Date: Wed Dec 20 2023 - 22:50:09 EST


Hi all,

On Bugzilla [1], Evgenii Ilchenko <evgenii.ilchenko@xxxxxxxxxxxxxx> (Cc'ed)
reported an ath11k bug that occurs with his non-standard router setup:

> Hardware:
> Lenovo Thinkpad P14s (21K5001JUS)
> AMD Ryzen 7450u with Qualcomm QCNFA765 Wireless Network Adapter
> Router: Huawei HG8245X6-10
>
> Software:
> Debian Testing (trixie).
> Testing with 6.1.0, 6.5.0, 6.5.13 kernels.
>
>
> The problem is reproduced in the following environment:
>
> 802.11ax is turned off on the router.
> In this case, a lot of messages like this are printed to the logs:
>
> ath11k_pci 0000:02:00.0: Received with invalid mcs in VHT mode 11
> ath11k_pci 0000:02:00.0: Received with invalid mcs in VHT mode 10
>
> and:
>
> [ 19.498035] ------------[ cut here ]------------
> [ 19.498039] Rate marked as a VHT rate but data is invalid: MCS: 10, NSS: 0
> [ 19.498138] WARNING: CPU: 12 PID: 3107 at net/mac80211/rx.c:5337 ieee80211_rx_list+0x2b3/0xda0 [mac80211]
> .........
> [ 19.498631] RIP: 0010:ieee80211_rx_list+0x2b3/0xda0 [mac80211]
> [ 19.498684] Code: 00 00 80 3d 96 a7 07 00 00 0f 85 2d ff ff ff 0f b6 53 4a 40 0f b6 f7 48 c7 c7 e0 a4 e2 c1 c6 05 7a a7 07 00 01 e8 dd 5d b6 e3 <0f> 0b e9 0b ff ff ff 40 80 ff 0b 0f 86 26 03 00 00 80 3d 5c a7 07
> .......
> [ 19.498724] Call Trace:
> [ 19.498731] <IRQ>
> [ 19.498735] ? ieee80211_rx_list+0x2b3/0xda0 [mac80211]
> [ 19.498785] ? __warn+0x81/0x130
> [ 19.498799] ? ieee80211_rx_list+0x2b3/0xda0 [mac80211]
> [ 19.498852] ? report_bug+0x171/0x1a0
> [ 19.498861] ? prb_read_valid+0x1b/0x30
> [ 19.498871] ? srso_alias_return_thunk+0x5/0x7f
> [ 19.498882] ? handle_bug+0x3c/0x80
> [ 19.498891] ? exc_invalid_op+0x17/0x70
> [ 19.498897] ? asm_exc_invalid_op+0x1a/0x20
> [ 19.498910] ? ieee80211_rx_list+0x2b3/0xda0 [mac80211]
> [ 19.498941] ? srso_alias_return_thunk+0x5/0x7f
> [ 19.498944] ? _dev_warn+0x79/0xa0
> [ 19.498952] ? srso_alias_return_thunk+0x5/0x7f
> [ 19.498956] ? ath11k_peer_find_by_id+0x100/0x1c0 [ath11k]
> [ 19.498978] ieee80211_rx_napi+0x53/0xe0 [mac80211]
> [ 19.498999] ath11k_dp_rx_process_received_packets+0x23e/0x660 [ath11k]
> [ 19.499013] ath11k_dp_process_rx+0x2cf/0x3c0 [ath11k]
> [ 19.499026] ath11k_dp_service_srng+0x2e0/0x320 [ath11k]
> [ 19.499037] ath11k_pcic_ext_grp_napi_poll+0x25/0x80 [ath11k]
> [ 19.499047] __napi_poll+0x28/0x1b0
> [ 19.499055] net_rx_action+0x2a4/0x380
> [ 19.499058] ? srso_alias_return_thunk+0x5/0x7f
> [ 19.499060] ? __napi_schedule+0xb0/0xc0
> [ 19.499065] __do_softirq+0xc7/0x2ae
> [ 19.499070] ? handle_edge_irq+0x8b/0x230
> [ 19.499076] __irq_exit_rcu+0x96/0xb0
> [ 19.499083] common_interrupt+0x86/0xa0
> [ 19.499086] </IRQ>
> [ 19.499087] <TASK>
> [ 19.499089] asm_common_interrupt+0x26/0x40
> .........
> [ 19.499179] ---[ end trace 0000000000000000 ]---
> The full dmesg is attached.
>
> Under these conditions, there is a high proportion of packet loss and terrible
> network speed.
> --- 8.8.8.8 ping statistics ---
> 1897 packets transmitted, 1868 received, 1.52873% packet loss, time 1899829ms
> rtt min/avg/max/mdev = 8.361/19.235/182.594/10.237 ms
>
> Workaround: When you enable 802.11ax in the router settings, everything becomes
> fine.
>
> From my side, it looks like the router is sending an MCS setting that is
> incompatible with 802.11ac mode, and this causes the problem. However, many
> other devices (including a ThinkPad T14 G2 with Intel AX201 Wi-Fi) work well
> with this router and this setting.

To see the full dmesg attachment, visit Bugzilla [1].
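
For context on the "invalid mcs in VHT mode" messages, here is a minimal
sketch of the index ranges involved. It assumes only the standard ranges
(802.11ac/VHT defines MCS 0-9, 802.11ax/HE defines MCS 0-11); the names
MAX_MCS and is_valid_mcs are hypothetical, and this is not the ath11k or
mac80211 code:

```python
# Simplified model of per-standard MCS index ranges.
# Hypothetical helper for illustration, not taken from ath11k or mac80211.
MAX_MCS = {
    "VHT": 9,   # 802.11ac: MCS 0-9 per spatial stream
    "HE": 11,   # 802.11ax: MCS 0-11 per spatial stream
}


def is_valid_mcs(mode: str, mcs: int) -> bool:
    """Return True if the MCS index lies within the standard range for mode."""
    return 0 <= mcs <= MAX_MCS[mode]


if __name__ == "__main__":
    # MCS 10 and 11 are the values that appear in the reported log lines.
    for mcs in (9, 10, 11):
        print(f"MCS {mcs}: VHT valid={is_valid_mcs('VHT', mcs)}, "
              f"HE valid={is_valid_mcs('HE', mcs)}")
```

Under that assumption, MCS 10 and 11 only fall in range when the frame is
treated as HE, which would be consistent with the reported workaround of
enabling 802.11ax on the router.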

Later, after I asked him to test the mainline kernel, he could still reproduce
the bug:

> > Can you check current mainline (v6.7-rc5)?
> Of course.
> At first glance it seemed to be better, but the problem is still reproducible.
> 1800 packets transmitted, 1715 received, 4.72222% packet loss, time 1803370ms
>
> Dmesg:
> https://drive.proton.me/urls/ANXKYVSSE0#1UAg2yv5RbvD
> Ping with timestamps:
> https://drive.proton.me/urls/0X1YVJ0QEG#HWiaF4ZtM2YZ
>
> There appears to be a correlation between log messages (ath11k_pci ... Received
> with invalid mcs) and packet loss.

Visit the Proton Drive links above for the full dmesg and ping test output.
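
For anyone who wants to check the ping numbers or locate the lost probes, a
rough sketch follows. It assumes the usual iputils ping line format with an
"icmp_seq=N" field; loss_percent and missing_sequences are hypothetical helper
names, and the script does not attempt to align ping timestamps with dmesg:

```python
import re

# Matches the sequence number in a ping reply line (assumed iputils format).
SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def loss_percent(transmitted: int, received: int) -> float:
    """Packet loss in percent, as reported in the ping summary line."""
    return 100.0 * (transmitted - received) / transmitted


def missing_sequences(ping_lines):
    """Return icmp_seq values absent between the first and last reply seen."""
    seen = set()
    for line in ping_lines:
        match = SEQ_RE.search(line)
        if match:
            seen.add(int(match.group(1)))
    if not seen:
        return []
    return [seq for seq in range(min(seen), max(seen) + 1) if seq not in seen]


if __name__ == "__main__":
    # Counts taken from the two ping summaries quoted in this report.
    print(f"{loss_percent(1897, 1868):.5f}%")
    print(f"{loss_percent(1800, 1715):.5f}%")
```

The missing sequence numbers, together with the timestamps in the ping log,
could then be compared by hand against the times of the "Received with invalid
mcs" messages.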

Thanks.

[1]: https://bugzilla.kernel.org/show_bug.cgi?id=218276

--
An old man doll... just what I always wanted! - Clara
