Re: [PATCH] wq: handle VM suspension in stall detection

From: Tejun Heo
Date: Thu May 20 2021 - 13:00:29 EST


On Thu, May 20, 2021 at 07:14:22PM +0900, Sergey Senozhatsky wrote:
> If a VCPU is suspended (VM suspend) in wq_watchdog_timer_fn(), then
> once this VCPU resumes it will see the new jiffies value, while it
> may take a while before an IRQ detects PVCLOCK_GUEST_STOPPED on this
> VCPU and updates all the watchdogs via pvclock_touch_watchdogs().
> There is a small chance of misreported WQ stalls in the meantime,
> because the new jiffies value is time_after() the old 'ts + thresh'.
>
> wq_watchdog_timer_fn()
> {
> 	for_each_pool(pool, pi) {
> 		if (time_after(jiffies, ts + thresh)) {
> 			pr_emerg("BUG: workqueue lockup - pool");
> 		}
> 	}
> }
>
> Save jiffies at the beginning of this function and use that value
> for stall detection. If the VM gets suspended, then we continue using
> the "old" jiffies value and the old WQ touch timestamps. If an IRQ at
> some point restarts the stall detection cycle (pvclock_touch_watchdogs()),
> then the old jiffies value will always be before the new 'ts + thresh'.
>
> Signed-off-by: Sergey Senozhatsky <senozhatsky@xxxxxxxxxxxx>
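
For readers of the archive, the approach described in the quoted text can
be sketched in user space roughly as follows. This is only an illustration
under stated assumptions, not the applied patch: fake_jiffies,
time_after_ul(), stalls_rereading(), stalls_snapshot() and self_check() are
hypothetical names, and the VM suspend is modeled as a plain jump of the
counter right after function entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's global jiffies counter. */
static volatile unsigned long fake_jiffies;

/* Wrap-safe "a is after b" test, modeled on the kernel's time_after(). */
static bool time_after_ul(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Problematic pattern: the counter is re-read for every pool, so a jump
 * that happens after entry (the modeled VM suspend) is compared against
 * touch timestamps that have not been refreshed yet.
 */
static int stalls_rereading(const unsigned long *ts, int n,
			    unsigned long thresh, unsigned long suspend_jump)
{
	int i, stalls = 0;

	fake_jiffies += suspend_jump;	/* modeled VM suspend and resume */
	for (i = 0; i < n; i++)
		if (time_after_ul(fake_jiffies, ts[i] + thresh))
			stalls++;
	return stalls;
}

/*
 * Snapshot pattern: save the counter once on entry and compare only
 * against that saved value, so the later jump cannot cause a report.
 */
static int stalls_snapshot(const unsigned long *ts, int n,
			   unsigned long thresh, unsigned long suspend_jump)
{
	unsigned long now = fake_jiffies;	/* saved before the jump */
	int i, stalls = 0;

	fake_jiffies += suspend_jump;	/* modeled VM suspend and resume */
	for (i = 0; i < n; i++)
		if (time_after_ul(now, ts[i] + thresh))
			stalls++;
	return stalls;
}

/* Two pools touched recently; the threshold is 300 ticks. */
static void self_check(void)
{
	const unsigned long ts[2] = { 900, 950 };

	fake_jiffies = 1000;
	assert(stalls_rereading(ts, 2, 300, 100000) == 2);	/* false stalls */

	fake_jiffies = 1000;
	assert(stalls_snapshot(ts, 2, 300, 100000) == 0);	/* none reported */
}
```

Calling self_check() from a main() exercises both variants with the same
timestamps: only the variant that re-reads the counter reports stalls after
the jump.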

Applied to wq/for-5.13-fixes.

Thanks.

--
tejun