iTCO_wdt: watchdog pretimeout panic after KVM guest suspend/resume
From: Evict7837
Date: Tue Jun 30 2026 - 22:46:51 EST
`Hi,`
`I'd like to report a kernel panic in iTCO_wdt that triggers reproducibly in a`
`KVM guest after the host is suspended and resumed.`
`== Environment ==`
`Host:`
`OS: Ubuntu 24.04.4 LTS`
`Kernel: 6.17.0-35-generic`
`CPU: AMD Ryzen 7 6800H`
`QEMU: 8.2.2 (Debian 1:8.2.2+ds-0ubuntu1.17)`
`libvirt: 10.0.0`
`Guest:`
`OS: Rocky Linux 10.2`
`Kernel: 6.12.0-211.26.1.el10_2.x86_64`
`vCPU: 1`
`RAM: 1 GiB`
`Machine: Q35 + ICH9`
`Watchdog device exposed via libvirt XML:`
`<watchdog model="itco" action="reset"/>`
`Modules loaded: iTCO_wdt, iTCO_vendor_support`
`== Bug Description ==`
`After the KVM host has been suspended for >= 1 minute and resumed, the guest`
`kernel panics with "watchdog pretimeout event". This is 100% reproducible;`
`suspend durations shorter than 1 minute do not trigger the panic.`
`The bug is a regression: it does NOT occur with the Rocky Linux 10.1`
`kernel under identical conditions. Blacklisting iTCO_wdt and`
`iTCO_vendor_support in the guest completely eliminates the panic.`
`Notably, the host CPU is AMD Ryzen, meaning iTCO is purely a QEMU`
`emulated device with no real Intel TCO hardware involved.`
`== Panic Call Trace ==`
`[ 95.753169] Kernel panic - not syncing: watchdog pretimeout event`
`[ 95.764046] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.12.0-211.26.1.el10_2.x86_64 #1`
`[ 95.770099] Hardware name: QEMU Ubuntu 24.04 PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2`
`[ 95.774960] Call Trace:`
`[ 95.776311] <IRQ>`
`[ 95.779071] dump_stack_lvl+0x4e/0x70`
`[ 95.780023] vpanic+0xc2/0x290`
`[ 95.781436] panic+0x6c/0x6c`
`[ 95.783045] pretimeout_panic+0x15/0x20`
`[ 95.784403] watchdog_notify_pretimeout+0x32/0x50`
`[ 95.786789] watchdog_hrtimer_pretimeout+0x15/0x20`
`[ 95.789221] __hrtimer_run_queues+0x13c/0x2a0`
`[ 95.791593] hrtimer_interrupt+0xff/0x230`
`[ 95.793728] __sysvec_apic_timer_interrupt+0x55/0x100`
`[ 95.796387] sysvec_apic_timer_interrupt+0x6c/0x90`
`[ 95.798942] </IRQ>`
`[ 95.800091] <TASK>`
`[ 95.803047] asm_sysvec_apic_timer_interrupt+0x1a/0x20`
`[ 95.803047] RIP: 0010:default_idle+0xf/0x20`
`[ 95.826719] default_idle_call+0x29/0xf0`
`[ 95.827574] cpu_startup_entry+0x29/0x30`
`[ 95.828663] rest_init+0xcc/0xd0`
`[ 95.829644] start_kernel+0x435/0x440`
`[ 95.840922] ---[ end Kernel panic - not syncing: watchdog pretimeout event ]---`
`== Root Cause ==`
`The issue is in drivers/watchdog/iTCO_wdt.c, in the suspend path.`
`The driver comment (line ~595) states:`
`"In ACPI sleep states the watchdog is stopped by the platform firmware."`
`Based on this assumption, need_suspend() only returns true for S0`
`(suspend-to-idle), leaving non-S0 states to firmware:`
`static inline bool __maybe_unused need_suspend(void)`
`{`
`	return acpi_target_system_state() == ACPI_STATE_S0;`
`}`
`static int __maybe_unused iTCO_wdt_suspend_noirq(struct device *dev)`
`{`
`	struct iTCO_wdt_private *p = dev_get_drvdata(dev);`
`	int ret = 0;`
`	p->suspended = false;`
`	if (watchdog_active(&p->wddev) && need_suspend()) {`
`		ret = iTCO_wdt_stop(&p->wddev);`
`		if (!ret)`
`			p->suspended = true;`
`	}`
`	return ret;`
`}`
`This assumption is valid on physical machines. However, under KVM:`
`- Suspending the host (or running "virsh suspend") freezes the guest from the outside`
`- There is no real ACPI S3 transition`
`- There is no platform firmware to stop the watchdog`
`- need_suspend() returns false (target state is not S0)`
`- iTCO_wdt_suspend_noirq() does NOT stop the watchdog or its hrtimer`
`- After resume, the pretimeout hrtimer has already expired`
`- pretimeout_panic() fires immediately`
`An additional anomaly: /sys/class/watchdog/watchdog0/pretimeout reads "0",`
`which should prevent the pretimeout path from being entered at all.`
`Yet the panic still occurs via watchdog_hrtimer_pretimeout, suggesting`
`the driver's internal pretimeout hrtimer state becomes inconsistent`
`after KVM suspend/resume.`
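`The "hrtimer has already expired" step can be illustrated with a small`
`user-space model. This is a sketch only, not kernel code: struct model_timer,`
`model_timer_arm(), model_timer_expired() and fires_on_resume() are hypothetical`
`names, and the model assumes the guest clock jumps forward by the full freeze`
`duration on resume, which this report has not verified.`

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a timer armed with an absolute deadline. */
struct model_timer {
	uint64_t deadline_s;	/* absolute expiry on the guest clock, in seconds */
	bool armed;
};

/* Arm the timer to expire timeout_s seconds after now_s. */
static void model_timer_arm(struct model_timer *t, uint64_t now_s,
			    uint64_t timeout_s)
{
	t->deadline_s = now_s + timeout_s;
	t->armed = true;
}

/* True if the timer would fire at the next timer interrupt seen at now_s. */
static bool model_timer_expired(const struct model_timer *t, uint64_t now_s)
{
	return t->armed && now_s >= t->deadline_s;
}

/*
 * Model one freeze: the timer was last re-armed at guest time 0, the guest
 * was frozen at frozen_at_s, and stayed frozen for frozen_for_s seconds.
 * ASSUMPTION: the guest clock advances across the freeze, so the first
 * timer interrupt after resume sees frozen_at_s + frozen_for_s.
 */
static bool fires_on_resume(uint64_t timeout_s, uint64_t frozen_at_s,
			    uint64_t frozen_for_s)
{
	struct model_timer t;

	model_timer_arm(&t, 0, timeout_s);
	return model_timer_expired(&t, frozen_at_s + frozen_for_s);
}
```

`With an illustrative 30 second timeout (not a value measured in this report),`
`fires_on_resume(30, 5, 60) is true and fires_on_resume(30, 5, 10) is false,`
`which matches the shape of the observed threshold: long freezes trip the timer`
`before userspace can ping the watchdog again, short ones do not.`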
`== Reproduction Steps ==`
`1. Create a KVM guest (Q35 machine type) with iTCO watchdog:`
`libvirt XML: <watchdog model="itco" action="reset"/>`
`2. Boot Rocky Linux 10.2 (6.12.0-211.26.1.el10_2) as the guest`
`3. Confirm in guest:`
`lsmod | grep iTCO # iTCO_wdt loaded`
`cat /sys/class/watchdog/watchdog0/pretimeout # 0`
`cat /sys/class/watchdog/watchdog0/pretimeout_governor # panic`
`4. Suspend the host for >= 1 minute:`
`systemctl suspend`
`5. Resume: guest panics within ~60-95 seconds after resume`
`== Suggested Fix Direction ==`
`need_suspend() should also return true when running under a hypervisor,`
`since no real ACPI firmware will stop the watchdog during guest suspension:`
`static inline bool __maybe_unused need_suspend(void)`
`{`
`#ifdef CONFIG_ACPI`
`	/*`
`	 * Under a hypervisor there is no real platform firmware to stop`
`	 * the watchdog during non-S0 sleep states, so handle it ourselves.`
`	 */`
`	if (boot_cpu_has(X86_FEATURE_HYPERVISOR))`
`		return true;`
`	return acpi_target_system_state() == ACPI_STATE_S0;`
`#else`
`	return true;`
`#endif`
`}`
`Alternatively, unconditionally cancel the pretimeout hrtimer in`
`iTCO_wdt_suspend_noirq() regardless of need_suspend().`
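`The difference between the quoted need_suspend() and the hypervisor-aware`
`variant can be tabulated with a user-space model. This is a sketch under`
`stated assumptions: enum model_sleep_state, need_suspend_current() and`
`need_suspend_proposed() are hypothetical stand-ins for`
`acpi_target_system_state() and boot_cpu_has(X86_FEATURE_HYPERVISOR), and the`
`model only applies when the driver's suspend callback actually runs.`

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the ACPI target sleep state. */
enum model_sleep_state {
	MODEL_S0,	/* suspend-to-idle */
	MODEL_S3	/* suspend-to-RAM */
};

/* Behavior as quoted from the driver: stop only for suspend-to-idle. */
static bool need_suspend_current(enum model_sleep_state target)
{
	return target == MODEL_S0;
}

/*
 * Proposed behavior: always stop the watchdog when a hypervisor is
 * detected, otherwise fall back to the existing S0-only check.
 */
static bool need_suspend_proposed(enum model_sleep_state target,
				  bool hypervisor)
{
	if (hypervisor)
		return true;
	return target == MODEL_S0;
}
```

`The only case that changes is a non-S0 target under a hypervisor:`
`need_suspend_current(MODEL_S3) is false, while`
`need_suspend_proposed(MODEL_S3, true) is true. Bare-metal behavior`
`(hypervisor == false) is identical in both versions.`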
`== Workaround ==`
`Blacklist the driver in the guest:`
`echo "blacklist iTCO_wdt" >> /etc/modprobe.d/blacklist-watchdog.conf`
`echo "blacklist iTCO_vendor_support" >> /etc/modprobe.d/blacklist-watchdog.conf`
`dracut --force && reboot`
`Thanks for looking into this.`