Re: [BUG] wifi: rtw88: Hard system freeze on RTL8821CE when power_save is enabled (LPS/ASPM conflict)

From: Bitterblue Smith

Date: Fri Mar 27 2026 - 06:58:48 EST


On 27/03/2026 01:52, LB F wrote:
> Hi Ping-Ke,
>
> This is Oleksandr Havrylov again. Thank you for the ASPM/LPS Deep
> quirk and the rate validation patches — they are both working correctly
> (zero h2c timeouts, zero lps failures, zero mac80211 warnings).
>
> However, I'm experiencing a different, separate bug that causes kernel
> oops and makes the system completely unresponsive, requiring a hard
> power-off. After disassembling the crash site, I believe I've found
> the root cause.
>
> == Summary ==
>
> When firmware sends a C2H_ADAPTIVITY (0x37) command to an RTL8821CE
> adapter, rtw_fw_adaptivity_result() dereferences rtwdev->chip->edcca_th
> without a NULL check. The RTL8821C chip_info (rtw8821c_hw_spec) does
> not define edcca_th, so the pointer is NULL, causing a kernel oops.
>
> The crash occurs on the phy0 workqueue while holding rtwdev->mutex,
> which never gets released. This causes all subsequent processes that
> touch the network stack to hang in uninterruptible D-state, making
> the system completely unresponsive and requiring a hard power-off.
>
> == Root cause analysis ==
>
> rtw_fw_adaptivity_result() in fw.c (line ~282):
>
>   static void rtw_fw_adaptivity_result(struct rtw_dev *rtwdev, u8 *payload,
>                                        u8 length)
>   {
>       const struct rtw_hw_reg_offset *edcca_th = rtwdev->chip->edcca_th;
>       ...
>       rtw_dbg(rtwdev, RTW_DBG_ADAPTIVITY, "Reg Setting: L2H %x H2L %x\n",
>               rtw_read32_mask(rtwdev, edcca_th[EDCCA_TH_L2H_IDX].hw_reg.addr,
>                                       ^^^^^^^^^ NULL dereference here
>                               edcca_th[EDCCA_TH_L2H_IDX].hw_reg.mask),
>       ...
>
> The RTL8822C defines .edcca_th = rtw8822c_edcca_th in its chip_info,
> but RTL8821C does not set this field at all — it remains NULL.
>
> I verified this by disassembling the compiled rtw_core.ko module:
>
>   Crash RIP: rtw_fw_c2h_cmd_handle+0x127
>   Address:   0x1d527 = movl (%r12), %esi
>
>   R12 is loaded at +0xe5 (0x1d4e5):
>     movq 0x140(%rax), %r12   ; rax = rtwdev->chip
>                              ; 0x140 = offset of edcca_th in rtw_chip_info
>                              ; R12 = chip->edcca_th = NULL for 8821c
>
> The function is entered via:
>   +0xd8 (0x1d4d8): cmpl $0x37, %ecx   ; c2h->id == C2H_ADAPTIVITY (0x37)
>
> With R12 = 0, the instruction at +0x127:
>   movl (%r12), %esi   ; reads from address 0x0 → NULL pointer dereference
>
> I also confirmed that rtw8821c_hw_spec in the mainline kernel
> (torvalds/linux master, rtw8821c.c) does NOT set .edcca_th.
>
> == Reproduction ==
>
> The crash is highly reproducible: it occurred in 4 out of 7 recent
> boots. It happens during normal active usage with no specific trigger.
>
>   boot   date/time of crash     uptime at crash
>   -5     2026-03-25 00:58:06    ~2 min
>   -4     2026-03-25 21:32:00    ~6h
>   -3     2026-03-26 00:28:14    ~2.5h
>   -1     2026-03-27 00:56:58    ~23.5h
>
> Both ASPM and LPS Deep are disabled via the DMI quirk. The crash
> occurs every time with the same pattern and same RIP offset (+0x127).
>
> == Crash pattern ==
>
> Every crash follows the same sequence:
>
> 1) Burst of 50-60 "unused phy status page" messages in ~1 second:
>
>   rtw_8821ce 0000:13:00.0: unused phy status page (8)
>   rtw_8821ce 0000:13:00.0: unused phy status page (2)
>   ... (50+ more within same second)
>
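
Independent of where the 0x37 ID comes from, the failure mode you
describe is a handler indexing a per-chip table that one chip never
populates. Below is a minimal userspace sketch of that pattern with a
defensive check. The structs are simplified stand-ins rather than the
real rtw88 layouts, the register values are placeholders, and
handle_adaptivity() is a hypothetical name. It is an illustration, not
a proposed fix.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-ins for the driver structures (not the real layouts). */
struct hw_reg {
	uint32_t addr;
	uint32_t mask;
};

struct hw_reg_offset {
	struct hw_reg hw_reg;
	uint32_t offset;
};

struct chip_info {
	const struct hw_reg_offset *edcca_th;	/* NULL if the chip has no table */
};

enum { EDCCA_TH_L2H_IDX, EDCCA_TH_H2L_IDX, EDCCA_TH_MAX };

/* Placeholder values, chosen only so the sketch has something to print. */
static const struct hw_reg_offset demo_edcca_th[EDCCA_TH_MAX] = {
	[EDCCA_TH_L2H_IDX] = { { 0x1000, 0x00ff0000 }, 0 },
	[EDCCA_TH_H2L_IDX] = { { 0x1000, 0xff000000 }, 0 },
};

/*
 * Hypothetical handler: returns 0 if the table was used, -1 if the chip
 * does not provide one. Without the check, indexing a NULL edcca_th
 * reads from address 0.
 */
static int handle_adaptivity(const struct chip_info *chip)
{
	const struct hw_reg_offset *edcca_th = chip->edcca_th;

	if (!edcca_th)
		return -1;

	printf("L2H addr %#x mask %#x\n",
	       (unsigned int)edcca_th[EDCCA_TH_L2H_IDX].hw_reg.addr,
	       (unsigned int)edcca_th[EDCCA_TH_L2H_IDX].hw_reg.mask);
	return 0;
}
```

A chip_info with .edcca_th = demo_edcca_th takes the printing path,
while one with .edcca_th = NULL returns -1 instead of faulting.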

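It looks like the firmware is not actually sending C2H_ADAPTIVITY,
which would be unexpected for RTL8821CE. Rather, you are getting
garbage RX data. I am curious what kind of garbage it is. Can you try
this debug patch? It dumps the RX descriptor whenever an unused phy
status page is seen.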


diff --git a/drivers/net/wireless/realtek/rtw88/rtw8821c.c b/drivers/net/wireless/realtek/rtw88/rtw8821c.c
index da67a6845fd5..aae246c2bc8e 100644
--- a/drivers/net/wireless/realtek/rtw88/rtw8821c.c
+++ b/drivers/net/wireless/realtek/rtw88/rtw8821c.c
@@ -665,6 +665,7 @@ static void query_phy_status(struct rtw_dev *rtwdev, u8 *phy_status,
 			     struct rtw_rx_pkt_stat *pkt_stat)
 {
 	u8 page;
+	u8 *rxdesc = phy_status - rtwdev->chip->rx_pkt_desc_sz - pkt_stat->shift;
 
 	page = *phy_status & 0xf;
 
@@ -677,6 +678,10 @@ static void query_phy_status(struct rtw_dev *rtwdev, u8 *phy_status,
 		break;
 	default:
 		rtw_warn(rtwdev, "unused phy status page (%d)\n", page);
+		print_hex_dump(KERN_INFO, "", DUMP_PREFIX_OFFSET, 4, 4,
+			       rxdesc, 56, true);
+		print_hex_dump(KERN_INFO, "", DUMP_PREFIX_OFFSET, 16, 1,
+			       rxdesc, 40, true);
 		return;
 	}
 }
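
For anyone reading along: the rxdesc line in the patch walks backwards
from the PHY status to the start of the RX descriptor, so the dump
begins at the descriptor rather than at the PHY status. Here is a
userspace sketch of that arithmetic, assuming the buffer is laid out as
[RX descriptor][shift bytes][PHY status]. RX_PKT_DESC_SZ = 24 and the
helper names are assumptions made for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define RX_PKT_DESC_SZ 24	/* assumed descriptor size for this sketch */

/* Forward direction: where the PHY status sits relative to the descriptor. */
static uint8_t *phy_status_from_rxdesc(uint8_t *rxdesc, uint8_t shift)
{
	return rxdesc + RX_PKT_DESC_SZ + shift;
}

/* Reverse direction, mirroring the pointer arithmetic in the debug patch. */
static uint8_t *rxdesc_from_phy_status(uint8_t *phy_status, uint8_t shift)
{
	return phy_status - RX_PKT_DESC_SZ - shift;
}

/* The low nibble of the first PHY status byte selects the page. */
static uint8_t phy_status_page(const uint8_t *phy_status)
{
	return *phy_status & 0xf;
}
```

The two helpers are inverses of each other for the same shift, which is
why the patch can recover the descriptor from the only pointer that
query_phy_status() receives.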