Re: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
From: Tejun Heo
Date: Wed Mar 04 2026 - 12:18:53 EST
Hello,
(Partially drafted with the help of Claude)
On Tue, Mar 03, 2026 at 04:02:14PM -0800, Ben Greear wrote:
> Could the logic that detects blocked work-queues instead be instrumented
> to print out more useful information so that just reproducing the problem
> and providing dmesg output will be sufficient? Or does dmesg already provide
> enough that would give you a clue as to what is going on?
It may not be exactly the same issue, but Breno just posted a patch that
might help. The current watchdog only prints backtraces for workers that
are actively running on CPU, so sleeping culprits are invisible. His
patch removes that filter so all in-flight workers get printed:
http://lkml.kernel.org/r/aag4tTyeiZyw0jID@xxxxxxxxx
Might be worth trying.
> If I were to attempt to use AI on the coredump, would echoing 'c' to
> /proc/sysrq-trigger with kdump enabled (when deadlock is happening) be
> the appropriate action to grab the core file?
Yes, that's right, but you need to set up kdump first. The quickest way
depends on your distro:
- Fedora/RHEL: dnf install kexec-tools, then kdumpctl reset-crashkernel,
  systemctl enable --now kdump
- Ubuntu/Debian: apt install kdump-tools (say Yes to enable), reboot
- Arch: Install kexec-tools, add crashkernel=512M to your kernel
  cmdline, create a kdump.service that runs
    kexec -p /boot/vmlinuz-linux --initrd=/boot/initramfs-linux.img \
      --append="root=<your-root> irqpoll nr_cpus=1 reset_devices"
After reboot, verify with: cat /sys/kernel/kexec_crash_size (should be
non-zero). Then when the deadlock happens:
echo c > /proc/sysrq-trigger
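Before running that, it may be worth a quick pre-flight check, since a
panic without a loaded crash kernel just loses the state. A rough sketch
follows; kdump_ready is a made-up helper name, and the directory argument
exists only so the function can be pointed at a fake tree for testing.
It reads kexec_crash_size (bytes reserved by crashkernel=) and
kexec_crash_loaded (1 once a crash kernel has been loaded):

```shell
#!/bin/sh
# Hypothetical helper: report whether kdump looks ready to capture a crash.
# $1 is the sysfs directory to inspect; it defaults to /sys/kernel.
kdump_ready() {
    sysfs="${1:-/sys/kernel}"

    # Bytes reserved via crashkernel=; 0 means nothing was reserved.
    size=$(cat "$sysfs/kexec_crash_size" 2>/dev/null || echo 0)
    size=${size:-0}

    # 1 once a crash kernel has been loaded (kexec -p or the distro service).
    loaded=$(cat "$sysfs/kexec_crash_loaded" 2>/dev/null || echo 0)
    loaded=${loaded:-0}

    if [ "$size" -eq 0 ]; then
        echo "no crashkernel memory reserved"
        return 1
    fi
    if [ "$loaded" -ne 1 ]; then
        echo "crash kernel not loaded"
        return 1
    fi
    echo "kdump ready"
}
```

With that defined, something like
"kdump_ready && echo c > /proc/sysrq-trigger" only forces the crash when
both checks pass.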
The system will panic and boot into the kdump kernel. Note that the
kdump kernel runs with very limited memory, so you can't do much there
directly. Use makedumpfile to save a compressed dump to disk:
makedumpfile -l -d 31 /proc/vmcore /var/crash/vmcore
Most distros' kdump setups do this automatically. Once the dump is saved,
the system reboots back to normal and you can analyze it at your leisure
with drgn:
drgn -c /var/crash/vmcore
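For reference, the -d 31 passed to makedumpfile above is a bitmask of
page types to leave out of the dump. A small sketch that decodes a dump
level, with the bit meanings taken from the makedumpfile man page
(decode_dump_level is a made-up name, not part of any tool):

```shell
#!/bin/sh
# Hypothetical helper: list the page types a makedumpfile dump level excludes.
# Each bit of the -d value drops one class of pages from the saved dump.
decode_dump_level() {
    level="$1"
    if [ $((level & 1)) -ne 0 ]; then echo "zero pages"; fi
    if [ $((level & 2)) -ne 0 ]; then echo "non-private cache pages"; fi
    if [ $((level & 4)) -ne 0 ]; then echo "private cache pages"; fi
    if [ $((level & 8)) -ne 0 ]; then echo "user process data pages"; fi
    if [ $((level & 16)) -ne 0 ]; then echo "free pages"; fi
}
```

"decode_dump_level 31" lists all five classes, so 31 is the most
aggressive filter. That should still leave the kernel's own in-use memory
in the dump, which is the part that matters for looking at kernel stacks
and workqueue state.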
Thanks.
--
tejun