Re: [PATCH] wifi: ath11k: fix warning when unbinding

From: Rameshkumar Sundaram

Date: Thu May 14 2026 - 22:28:07 EST


On 5/14/2026 1:45 PM, Baochen Qiang wrote:


On 5/14/2026 2:55 PM, Rameshkumar Sundaram wrote:
On 5/14/2026 11:48 AM, Jose Ignacio Tornos Martinez wrote:
Hello Rameshkumar,

I agree that setting tx_status to NULL makes ath11k_dp_free() more
defensive, and it matches the ath12k fix.
Ok, I agree too.
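
For readers following along, the defensive pattern agreed on above can be sketched in user-space C. This is only a model, not the driver code: struct dp, struct tx_ring, dp_alloc(), dp_free() and NUM_TX_RINGS are hypothetical stand-ins, and free() stands in for kfree(). The point is that clearing the pointer after freeing makes a second call harmless.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_TX_RINGS 3          /* hypothetical ring count for this sketch */

struct tx_ring {
    int *tx_status;             /* stand-in for the per-ring status buffer */
};

struct dp {
    struct tx_ring tx_ring[NUM_TX_RINGS];
};

/* Allocate one status buffer per ring; returns 0 on success, -1 on failure.
 * The caller is expected to pass a zero-initialized struct dp. */
static int dp_alloc(struct dp *dp)
{
    assert(dp != NULL);
    for (int i = 0; i < NUM_TX_RINGS; i++) {
        dp->tx_ring[i].tx_status = calloc(16, sizeof(int));
        if (!dp->tx_ring[i].tx_status)
            return -1;
    }
    return 0;
}

/* Free every status buffer and clear the pointer, so that a second call
 * only ever sees NULL (free(NULL) is a no-op) and cannot double free. */
static void dp_free(struct dp *dp)
{
    assert(dp != NULL);
    for (int i = 0; i < NUM_TX_RINGS; i++) {
        free(dp->tx_ring[i].tx_status);
        dp->tx_ring[i].tx_status = NULL;
    }
}
```

With this shape, calling dp_free() from both an error path and a later remove path is safe, although it hides rather than explains why the second call happens.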

However, I am still wondering how the second ath11k_dp_free() is reached
if ATH11K_FLAG_QMI_FAIL is set.

In ath11k_pci_remove(), when ATH11K_FLAG_QMI_FAIL is set, we take the
qmi_fail path and skip ath11k_core_deinit(). So the normal remove path:

     ath11k_pci_remove()
       ath11k_core_deinit()
         ath11k_core_soc_destroy()
           ath11k_dp_free()

should not run.

So if the double free is still reproducible with QMI_FAIL set (with the
change I proposed), either the flag is not actually set in this failure
case, or there is another path calling ath11k_dp_free()?
Let me try to clarify the issue more.
There are two steps:
- First, the original error. I reproduce it as I described earlier: running
the default upstream kernel in a VM (with this card attached via PCI
passthrough), where this always fails. Here are the logs for this situation:
[   15.906564] ath11k_pci 0000:07:00.0: BAR 0 [mem 0xfdc00000-0xfddfffff 64bit]: assigned
[   15.926520] ath11k_pci 0000:07:00.0: MSI vectors: 32
[   15.928572] ath11k_pci 0000:07:00.0: wcn6855 hw2.0
[   16.984192] ath11k_pci 0000:07:00.0: chip_id 0x2 chip_family 0xb board_id 0xff soc_id 0x400c0200
[   16.984351] ath11k_pci 0000:07:00.0: fw_version 0x11088c35 fw_build_timestamp 2024-04-17 08:34 fw_build_id WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
[   18.186971] ath11k_pci 0000:07:00.0: failed to receive control response completion, polling..
[   19.211036] ath11k_pci 0000:07:00.0: Service connect timeout
[   19.211815] ath11k_pci 0000:07:00.0: failed to connect to HTT: -110
[   19.214181] ath11k_pci 0000:07:00.0: failed to start core: -110
[   19.531989] ath11k_pci 0000:07:00.0: firmware crashed: MHI_CB_EE_RDDM
[   19.532930] ath11k_pci 0000:07:00.0: ignore reset dev flags 0xc000
[   29.259157] ath11k_pci 0000:07:00.0: failed to wait wlan mode request (mode 4): -110
[   29.259229] ath11k_pci 0000:07:00.0: qmi failed to send wlan mode off: -110
- Second, after this, I triggered the unbind (ath11k_pci) and got the
warning. Let me expand the stack trace here:
[   24.238198]  ? free_large_kmalloc+0x57/0x90
[   24.238199]  ? report_bug+0x16b/0x180
[   24.238210]  ? handle_bug+0x3c/0x70
[   24.238218]  ? exc_invalid_op+0x14/0x70
[   24.238218]  ? asm_exc_invalid_op+0x16/0x20
[   24.238224]  ? free_large_kmalloc+0x57/0x90
[   24.238227]  ath11k_dp_free+0x99/0xb0 [ath11k]
[   24.238275]  ath11k_core_deinit+0x12b/0x1a0 [ath11k]
[   24.238287]  ath11k_pci_remove+0x7b/0x120 [ath11k_pci]
[   24.238294]  pci_device_remove+0x3e/0xb0
[   24.238304]  device_release_driver_internal+0x193/0x200
[   24.238315]  unbind_store+0x9d/0xb0
[   24.238320]  kernfs_fop_write_iter+0x13a/0x1d0
[   24.238330]  vfs_write+0x32e/0x470
[   24.238335]  ksys_write+0x5f/0xe0
[   24.238336]  do_syscall_64+0x5f/0xe0
Very easy to reproduce.



Thanks a lot for the logs, that makes sense. The timestamps explain why my earlier
reasoning did not match the trace: unbind reaches ath11k_pci_remove() before the QMI
event worker sets ATH11K_FLAG_QMI_FAIL, because the worker is still held up on the
wlan mode off QMI request.

How could the QMI worker set this flag? The first failure happens in
ath12k_core_qmi_firmware_ready(), and upon this failure the QMI worker just breaks out
without setting any flag, no?



You mean ath1*1*k_core_qmi_firmware_ready()? Yes, in ToT it breaks out without setting
any flags, so in this mail thread I proposed setting the flag in the failure path of
case ATH11K_QMI_EVENT_FW_READY: (similar to case ATH11K_QMI_EVENT_FW_INIT_DONE:).


--
Ramesh