Re: [PATCH v12 0/2] Detect stalls on guest vCPUS

From: Greg Kroah-Hartman
Date: Thu Jul 14 2022 - 10:55:04 EST

On Mon, Jul 11, 2022 at 08:17:18AM +0000, Sebastian Ene wrote:
> Minor change from v11 which cleans up the Kconfig option selection.
> This adds a mechanism to detect stalls on the guest vCPUs by creating a
> per CPU hrtimer which periodically 'pets' the host backend driver.
> On a conventional watchdog-core driver, the userspace is responsible for
> delivering the 'pet' events by writing to the particular /dev/watchdogN node.
> In this case we require a strong thread affinity to be able to
> account for lost time on a per vCPU basis.
> This device driver acts as a soft lockup detector by relying on the host
> backend driver to measure the elapsed time between subsequent 'pet' events.
> If the elapsed time doesn't match an expected value, the backend driver
> decides that the guest vCPU is locked and resets the guest. The host
> backend driver takes into account the time that the guest is not
> running. The communication with the backend driver is done through MMIO
> and the register layout of the virtual watchdog is described as part of
> the backend driver changes.
> The host backend driver is implemented as part of:
> Changelog v12:
> - don't select LOCKUP_DETECTOR from Kconfig when VCPU_STALL_DETECTOR is
> compiled in as suggested by Greg
> - add the Reviewed-by tag received from Guenter
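
For anyone following the thread, the host-side accounting described in the
cover letter could be sketched roughly as below. This is only an
illustration under assumptions, not the backend's actual code: the struct,
the function names and the "ran longer than the timeout" check are all
hypothetical, and the real register layout lives in the backend driver
changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-vCPU state kept by the host backend (illustrative names). */
struct vcpu_wd {
	uint64_t last_pet_ns;    /* host time of the most recent 'pet' */
	uint64_t not_running_ns; /* time the vCPU was not running since that pet */
	uint64_t timeout_ns;     /* assumed limit on guest run time between pets */
};

/* Called when the guest's per-CPU timer writes the 'pet' register. */
static void vcpu_wd_pet(struct vcpu_wd *w, uint64_t now_ns)
{
	assert(w != NULL);
	w->last_pet_ns = now_ns;
	w->not_running_ns = 0;
}

/* Called by the host whenever the vCPU was descheduled or paused for ns. */
static void vcpu_wd_account_not_running(struct vcpu_wd *w, uint64_t ns)
{
	assert(w != NULL);
	w->not_running_ns += ns;
}

/*
 * Returns true if the vCPU has actually run for longer than the timeout
 * without petting. Time the guest was not running is subtracted first, so
 * a descheduled vCPU is not reported as stalled.
 */
static bool vcpu_wd_stalled(const struct vcpu_wd *w, uint64_t now_ns)
{
	uint64_t elapsed = now_ns - w->last_pet_ns;
	uint64_t ran = elapsed > w->not_running_ns ?
		       elapsed - w->not_running_ns : 0;

	return ran > w->timeout_ns;
}
```

The point of keeping this state per vCPU is the one made above: a pet
delivered from userspace through /dev/watchdogN cannot tell you which vCPU
lost the time, whereas a pet issued from a timer pinned to each CPU can.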

Thanks for sticking with this, now applied to my tree!

greg k-h