iTCO_wdt: watchdog pretimeout panic after KVM guest suspend/resume

From: Evict7837

Date: Tue Jun 30 2026 - 22:46:51 EST


`Hi,`

`I'd like to report a guest kernel panic in iTCO_wdt that is reproducibly`
`triggered after the KVM host is suspended and resumed.`

`== Environment ==`

`Host:`
  `OS:      Ubuntu 24.04.4 LTS`
  `Kernel:  6.17.0-35-generic`
  `CPU:     AMD Ryzen 7 6800H`
  `QEMU:    8.2.2 (Debian 1:8.2.2+ds-0ubuntu1.17)`
  `libvirt: 10.0.0`

`Guest:`
  `OS:      Rocky Linux 10.2`
  `Kernel:  6.12.0-211.26.1.el10_2.x86_64`
  `vCPU:    1`
  `RAM:     1 GiB`
  `Machine: Q35 + ICH9`
  `Watchdog device exposed via libvirt XML:`
    `<watchdog model="itco" action="reset"/>`
  `Modules loaded: iTCO_wdt, iTCO_vendor_support`

`== Bug Description ==`

`After suspending the KVM host for >= 1 minute and resuming, the guest`
`kernel panics with "watchdog pretimeout event". The panic is 100%`
`reproducible with a suspend duration >= 1 minute. Shorter durations`
`do not trigger the panic.`

`The bug is a regression: it does NOT occur with the Rocky Linux 10.1`
`kernel under identical conditions. Blacklisting iTCO_wdt and`
`iTCO_vendor_support in the guest completely eliminates the panic.`

`Notably, the host CPU is an AMD Ryzen, so iTCO here is a purely`
`QEMU-emulated device with no real Intel TCO hardware involved.`

`== Panic Call Trace ==`

`[   95.753169] Kernel panic - not syncing: watchdog pretimeout event`
`[   95.764046] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.12.0-211.26.1.el10_2.x86_64 #1`
`[   95.770099] Hardware name: QEMU Ubuntu 24.04 PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2`
`[   95.774960] Call Trace:`
`[   95.776311]  <IRQ>`
`[   95.779071]  dump_stack_lvl+0x4e/0x70`
`[   95.780023]  vpanic+0xc2/0x290`
`[   95.781436]  panic+0x6c/0x6c`
`[   95.783045]  pretimeout_panic+0x15/0x20`
`[   95.784403]  watchdog_notify_pretimeout+0x32/0x50`
`[   95.786789]  watchdog_hrtimer_pretimeout+0x15/0x20`
`[   95.789221]  __hrtimer_run_queues+0x13c/0x2a0`
`[   95.791593]  hrtimer_interrupt+0xff/0x230`
`[   95.793728]  __sysvec_apic_timer_interrupt+0x55/0x100`
`[   95.796387]  sysvec_apic_timer_interrupt+0x6c/0x90`
`[   95.798942]  </IRQ>`
`[   95.800091]  <TASK>`
`[   95.803047]  asm_sysvec_apic_timer_interrupt+0x1a/0x20`
`[   95.803047] RIP: 0010:default_idle+0xf/0x20`
`[   95.826719]  default_idle_call+0x29/0xf0`
`[   95.827574]  cpu_startup_entry+0x29/0x30`
`[   95.828663]  rest_init+0xcc/0xd0`
`[   95.829644]  start_kernel+0x435/0x440`
`[   95.840922] ---[ end Kernel panic - not syncing: watchdog pretimeout event ]---`

`== Root Cause ==`

`The issue is in drivers/watchdog/iTCO_wdt.c, in the suspend path.`

`The driver comment (line ~595) states:`

  `"In ACPI sleep states the watchdog is stopped by the platform firmware."`

`Based on this assumption, need_suspend() only returns true for S0`
`(suspend-to-idle), leaving non-S0 states to firmware:`

  `static inline bool __maybe_unused need_suspend(void)`
  `{`
      `return acpi_target_system_state() == ACPI_STATE_S0;`
  `}`

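  `static int __maybe_unused iTCO_wdt_suspend_noirq(struct device *dev)`
  `{`
      `struct iTCO_wdt_private *p = dev_get_drvdata(dev);`
      `int ret = 0;`

      `/* Stop the watchdog only when need_suspend() says the driver must. */`
      `p->suspended = false;`
      `if (watchdog_active(&p->wddev) && need_suspend()) {`
          `ret = iTCO_wdt_stop(&p->wddev);`
          `if (!ret)`
              `p->suspended = true;`
      `}`
      `return ret;`
  `}`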

`This assumption is valid on physical machines. However, under KVM:`

  `- "virsh suspend" causes QEMU to freeze the VM process directly`
  `- There is no real ACPI S3 transition`
  `- There is no platform firmware to stop the watchdog`
  `- need_suspend() returns false (target state is not S0)`
  `- iTCO_wdt_suspend_noirq() does NOT stop the watchdog or its hrtimer`
  `- After resume, the pretimeout hrtimer has already expired`
  `- pretimeout_panic() fires immediately`

`An additional anomaly: /sys/class/watchdog/watchdog0/pretimeout reads "0",`
`which should prevent the pretimeout path from being entered at all.`
`Yet the panic still occurs via watchdog_hrtimer_pretimeout, suggesting`
`the driver's internal pretimeout hrtimer state becomes inconsistent`
`after KVM suspend/resume.`

`== Reproduction Steps ==`

`1. Create a KVM guest (Q35 machine type) with iTCO watchdog:`
     `libvirt XML: <watchdog model="itco" action="reset"/>`

`2. Boot Rocky Linux 10.2 (6.12.0-211.26.1.el10_2) as the guest`

`3. Confirm in guest:`
     `lsmod | grep iTCO               # iTCO_wdt loaded`
     `cat /sys/class/watchdog/watchdog0/pretimeout          # 0`
     `cat /sys/class/watchdog/watchdog0/pretimeout_governor # panic`

`4. Suspend the host for >= 1 minute:`
     `systemctl suspend`

`5. Resume the host: the guest panics roughly 60-95 seconds after resume`
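
To make before/after comparisons easier, the sysfs checks from step 3 can be wrapped in a small helper and run in the guest before step 4 and again right after step 5 (if the guest survives long enough). This is only a sketch: the function name `dump_watchdog_state` is made up for illustration, and the attribute list assumes the guest kernel exposes the watchdog sysfs interface. Attributes that are not present are reported as `<absent>` rather than treated as an error.

```shell
# dump_watchdog_state [DIR]
# Print selected watchdog sysfs attributes, one "name=value" per line.
# DIR defaults to the watchdog0 path used in step 3. The function name and
# the choice of attributes are illustrative, not part of any existing tool.
dump_watchdog_state() {
    dir="${1:-/sys/class/watchdog/watchdog0}"
    for attr in identity state timeout timeleft pretimeout pretimeout_governor; do
        if [ -r "$dir/$attr" ]; then
            printf '%s=%s\n' "$attr" "$(cat "$dir/$attr")"
        else
            # Attribute not exposed by this kernel/driver combination.
            printf '%s=<absent>\n' "$attr"
        fi
    done
}
```

Calling `dump_watchdog_state` with no argument reads the live device. Passing a directory makes it possible to run the helper against a saved copy of the attributes.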

`== Suggested Fix Direction ==`

`need_suspend() should also return true when running under a hypervisor,`
`since no real ACPI firmware will stop the watchdog during guest suspension:`

  `static inline bool __maybe_unused need_suspend(void)`
  `{`
  `#ifdef CONFIG_ACPI`
      `/*`
       `* Under a hypervisor there is no real platform firmware to stop`
       `* the watchdog during non-S0 sleep states, so handle it ourselves.`
       `*/`
      `if (boot_cpu_has(X86_FEATURE_HYPERVISOR))`
          `return true;`
      `return acpi_target_system_state() == ACPI_STATE_S0;`
  `#else`
      `return true;`
  `#endif`
  `}`

`Alternatively, unconditionally cancel the pretimeout hrtimer in`
`iTCO_wdt_suspend_noirq() regardless of need_suspend().`
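
For the first option, it may help to confirm from guest userspace that the CPU feature bit tested by boot_cpu_has(X86_FEATURE_HYPERVISOR) is actually set. On x86 Linux guests this bit is normally reported as the `hypervisor` flag in /proc/cpuinfo. The helper below is a hedged sketch; `has_hypervisor_flag` is a hypothetical name, and the check only looks at the flags line, not at how the kernel uses the bit.

```shell
# has_hypervisor_flag [FILE]
# Succeed (exit 0) if a "flags" line in FILE lists the "hypervisor" CPU flag.
# FILE defaults to /proc/cpuinfo; the function name is illustrative only.
has_hypervisor_flag() {
    grep -E '^flags[[:space:]]*:' "${1:-/proc/cpuinfo}" | grep -qw hypervisor
}
```

Example use in the guest: `has_hypervisor_flag && echo "hypervisor bit set"`.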

`== Workaround ==`

`Blacklist the driver in the guest:`

  `echo "blacklist iTCO_wdt" >> /etc/modprobe.d/blacklist-watchdog.conf`
  `echo "blacklist iTCO_vendor_support" >> /etc/modprobe.d/blacklist-watchdog.conf`
  `dracut --force && reboot`
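
Before rebooting, the blacklist file can be checked for both entries. The following is a minimal sketch with a hypothetical helper name, `blacklist_missing`. It matches module names literally and does not cover other modprobe.d files that might already blacklist the modules.

```shell
# blacklist_missing FILE MODULE...
# Print each MODULE that has no "blacklist MODULE" line in FILE.
# Prints nothing when every listed module is covered. A missing FILE
# counts as "not covered". The function name is illustrative only.
blacklist_missing() {
    file="$1"; shift
    for mod in "$@"; do
        grep -Eqs "^[[:space:]]*blacklist[[:space:]]+${mod}[[:space:]]*$" "$file" \
            || printf '%s\n' "$mod"
    done
}
```

Example: `blacklist_missing /etc/modprobe.d/blacklist-watchdog.conf iTCO_wdt iTCO_vendor_support` should print nothing once both lines are in place. After the reboot, `lsmod | grep iTCO` should likewise print nothing.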

`Thanks for looking into this.`
