Re: [BUG] wifi: rtw88: Hard system freeze on RTL8821CE when power_save is enabled (LPS/ASPM conflict)
From: Bitterblue Smith
Date: Sat Mar 28 2026 - 11:59:38 EST
On 28/03/2026 13:41, LB F wrote:
> Hi Bitterblue,
>
> Apologies for the delayed response. I applied your diagnostic patch
> right away but held off on replying because the NULL pointer crash
> has not reproduced since — it has been over 36 hours now with no
> oops, which is unusual (previously it occurred in 4 out of 7 boots,
> typically within 2 minutes to 24 hours).
>
> I wanted to wait and collect the hex dumps from the crash-time burst
> (the 50+ "unused phy status page" events that always preceded the
> oops), as those would be the most valuable. Unfortunately, the crash
> hasn't happened yet during this session. If/when it does, I will
> follow up immediately with those dumps.
>
> In the meantime, here is what I have so far. The patch is working
> and producing output. I collected 76 "unused phy status page" events
> during this boot, with the following time distribution:
>
> 14:01          1 event   (isolated)
> 16:33          1 event
> 16:57-17:00   73 events  (burst over ~3 minutes, no crash followed)
> 00:03          1 event   (isolated)
>
> Page number distribution (no page 0 or 1, all are "garbage" pages):
>
>   page 10: 10    page  7: 8    page  8: 7    page 13: 7
>   page 11:  7    page  9: 6    page 15: 6    page 12: 6
>   page  4:  5    page  2: 5    page 14: 4    page  5: 2
>   page  3:  2    page  6: 1
>
> Here are representative hex dumps. I'm showing the byte-level dump
> (second print_hex_dump) since it is easier to read:
>
> Isolated event (page 9):
>
> rtw_8821ce 0000:13:00.0: unused phy status page (9)
> 00000000: c7 5e 9c 9d 91 69 4d dc b0 67 c2 09 84 33 00 00 .^...iM..g...3..
> 00000010: 00 1e fe 3f cf f2 f0 08 01 29 00 00 00 11 2a 01 ...?.....)....*.
> 00000020: 0e 00 00 00 00 00 00 20 .......
>
> Burst event (page 14):
>
> rtw_8821ce 0000:13:00.0: unused phy status page (14)
> 00000000: bd 2c e0 3d 00 00 00 11 87 0a 40 80 88 33 00 00 .,.=......@..3..
> 00000010: 00 1e fe 3f 3e b6 9b 44 01 2e 00 00 00 11 2a 01 ...?>..D......*.
> 00000020: 20 00 00 00 00 00 00 20 ......
>
> Burst event (page 12) — byte 0x10 is 0x7e instead of usual 0x00:
>
> rtw_8821ce 0000:13:00.0: unused phy status page (12)
> 00000000: 1c b3 7f 15 d1 94 95 7e 70 5e f4 e3 b4 a1 bf 10 .......~p^......
> 00000010: 7e 1e fe 3f 2e f1 62 44 01 2c 00 00 00 11 2a 01 ~..?..bD.,....*.
> 00000020: 14 00 00 00 00 00 00 20 .......
>
> Burst event (page 2) — contains MAC addresses:
>
> rtw_8821ce 0000:13:00.0: unused phy status page (2)
> 00000000: 88 55 51 95 d1 66 ad 50 2f 25 3f 89 ae 35 ef 77 .UQ..f.P/%?..5.w
> 00000010: 00 1e fe 3f 89 68 62 4d 88 42 40 00 8c c8 4b 68 ...?.hbM.B@xxxxx
> 00000020: d1 63 6c 68 a4 1c 97 5b .clh...[
>
> Note: bytes 0x1c-0x21 are 8c:c8:4b:68:d1:63, my adapter's MAC.
> Bytes 0x22-0x27 are 6c:68:a4:1c:97:5b, the AP's BSSID. The dump is
> only 40 bytes, so it ends right after the BSSID.
>
> Burst event (page 15) — completely random, no recognizable structure:
>
> rtw_8821ce 0000:13:00.0: unused phy status page (15)
> 00000000: c6 a1 92 1c a7 68 6b 97 12 bd ad 89 30 98 ab 94 .....hk.....0...
> 00000010: 00 1e fe 3f ec 3f 3e 44 1f c2 91 41 0e 9b 54 5f ...?.?>D...A..T_
> 00000020: 30 eb 40 18 6f d3 25 62 0.@.o.%b
>
> Burst event (page 10) — offset 0x10 is completely different pattern:
>
> rtw_8821ce 0000:13:00.0: unused phy status page (10)
> 00000000: cb 1c 2a df f1 69 d0 05 58 c0 e8 0e d0 59 87 6e ..*..i..X....Y.n
> 00000010: 63 7e 56 f0 95 fa b8 d3 d5 4b 3e fa b0 0c 0e be c~V......K>.....
> 00000020: 42 28 14 89 15 c1 fd ad B(......
>
> Last isolated event (page 4):
>
> rtw_8821ce 0000:13:00.0: unused phy status page (4)
> 00000000: 97 ee fa 4e 04 90 00 21 c0 0f 89 80 b3 33 00 00 ...N...!.....3..
> 00000010: 00 1e fe 3f 97 7e 64 90 5d 3e 74 fa 70 e0 39 65 ...?.~d.]>t.p.9e
> 00000020: 48 a4 40 d3 de a9 85 15 H.@.....
>
> Observations:
>
> - Bytes at offset 0x0e-0x0f are usually 00 00 or have low values
> in most dumps, but some are completely random.
> - Bytes 0x11-0x13 are almost always 1e fe 3f (with byte 0x10
> being 00 or 7e), suggesting this is a consistent part of the
> RX descriptor that is not corrupted.
> - The "page 2" dump at 17:00:23 clearly contains the adapter
> and AP MAC addresses, confirming this is real RX frame data.
> - Some dumps (page 10, page 5, page 15) have completely random
> data with no recognizable RX descriptor structure at all.
> - The 73-event burst at 16:57-17:00 happened over ~3 minutes but
> did NOT result in a crash this time. Previously, similar bursts
> of 50+ events within ~1 second always led to the NULL pointer
> dereference in rtw_fw_c2h_cmd_handle+0x127.
>
> I will keep monitoring and will send the crash-time dumps as soon as
> the oops reproduces.
>
> Thanks for looking into this.
>
> Best regards,
> Oleksandr Havrylov

The other print_hex_dump is important too, so please attach the
full dmesg.
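
For going through a full dmesg, the events per page can be tallied with a few lines of Python. This is only a sketch: it assumes the log was saved as text and that the messages look like the "unused phy status page (N)" lines quoted above. count_pages is a made-up helper name, and the sample text just repeats lines from this thread.

```python
import re
from collections import Counter

# Matches the driver message as it appears in the quoted dumps.
PAGE_RE = re.compile(r"unused phy status page \((\d+)\)")

def count_pages(dmesg_text):
    """Return a Counter mapping page number -> number of events."""
    return Counter(int(m.group(1)) for m in PAGE_RE.finditer(dmesg_text))

# Small sample built from lines quoted in this thread.
sample = (
    "rtw_8821ce 0000:13:00.0: unused phy status page (9)\n"
    "rtw_8821ce 0000:13:00.0: unused phy status page (14)\n"
    "rtw_8821ce 0000:13:00.0: unused phy status page (12)\n"
)

counts = count_pages(sample)
for page, n in sorted(counts.items()):
    print(f"page {page}: {n}")
print("total:", sum(counts.values()))
```

For a real log, read the saved file into a string and pass it to count_pages instead of the sample.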

You don't need to wait for a crash. It appears to be caused by
random data, so I don't expect those dumps to be more useful than
these. Of course, adding a NULL check like you said before is a
good idea.
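
To get a rough feel for how much of each dump is random, one option is to check which dumps still carry the 1e fe 3f bytes at 0x11-0x13 mentioned in the observations. The sketch below uses only the 0x10 rows of the dumps quoted above. has_marker and ROW_0X10 are illustrative names, and what those three bytes actually mean is not assumed here.

```python
# The 0x10 row (bytes 0x10-0x1f) of each quoted dump, keyed by page number.
ROW_0X10 = {
    9:  "00 1e fe 3f cf f2 f0 08 01 29 00 00 00 11 2a 01",
    14: "00 1e fe 3f 3e b6 9b 44 01 2e 00 00 00 11 2a 01",
    12: "7e 1e fe 3f 2e f1 62 44 01 2c 00 00 00 11 2a 01",
    2:  "00 1e fe 3f 89 68 62 4d 88 42 40 00 8c c8 4b 68",
    15: "00 1e fe 3f ec 3f 3e 44 1f c2 91 41 0e 9b 54 5f",
    10: "63 7e 56 f0 95 fa b8 d3 d5 4b 3e fa b0 0c 0e be",
    4:  "00 1e fe 3f 97 7e 64 90 5d 3e 74 fa 70 e0 39 65",
}

MARKER = bytes.fromhex("1e fe 3f")

def has_marker(row_hex):
    """True if bytes 0x11-0x13 (indexes 1..3 of the 0x10 row) match MARKER."""
    row = bytes.fromhex(row_hex)
    return row[1:4] == MARKER

for page, row in ROW_0X10.items():
    print(f"page {page:2d}: {'marker present' if has_marker(row) else 'no marker'}")
```

Of the dumps shown in this thread, only the page 10 one lacks the pattern.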

The one dump that contains your MAC addresses has them 24 bytes
lower than they are supposed to be.
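
The offsets can be checked by searching the page 2 dump for the two addresses. This is a minimal sketch using the 40 bytes quoted above; mac_offset is a hypothetical helper, not something from the driver.

```python
# The 40 bytes of the "page 2" dump quoted above.
PAGE2_DUMP = bytes.fromhex(
    "88 55 51 95 d1 66 ad 50 2f 25 3f 89 ae 35 ef 77"
    "00 1e fe 3f 89 68 62 4d 88 42 40 00 8c c8 4b 68"
    "d1 63 6c 68 a4 1c 97 5b"
)

def mac_offset(dump, mac):
    """Return the byte offset of a colon-separated MAC in dump, or -1."""
    return dump.find(bytes.fromhex(mac.replace(":", "")))

adapter = mac_offset(PAGE2_DUMP, "8c:c8:4b:68:d1:63")
bssid = mac_offset(PAGE2_DUMP, "6c:68:a4:1c:97:5b")
print(hex(adapter), hex(bssid))  # prints: 0x1c 0x22
```

If "24 bytes lower" is read as a lower offset, the addresses would normally be expected 24 bytes further into the buffer, which is past the end of this 40-byte dump.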