Re: [PATCH] rcu/tree: consider time a VM was suspended

From: Sergey Senozhatsky
Date: Thu May 20 2021 - 01:51:02 EST


On (21/05/18 16:15), Paul E. McKenney wrote:
>
> In the shorter term... PVCLOCK_GUEST_STOPPED is mostly for things like
> guest migration and debugger breakpoints, correct?

Our use case is a bit different. We suspend the VM when the user puts the
host system to sleep (which can happen multiple times a day).

> Either way, I am wondering if rcu_cpu_stall_reset() should take a lighter
> touch. Right now, it effectively disables all stalls for the current grace
> period. Why not make it restart the stall timeout when the stoppage is detected?

Sounds good. I can cook a patch and run some tests.
Or do you want to send a patch?
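
To make the difference between the two policies concrete, here is a rough
userspace model. It is only a sketch, not kernel code: struct stall_model,
the function names and the STALL_TIMEOUT value are made up for illustration.
The "disable" variant assumes the current behaviour is to push the deadline
half the counter space into the future; the "restart" variant is the lighter
touch suggested above.

```c
#include <limits.h>
#include <stdbool.h>

/* Assumed stall timeout in ticks (hypothetical value, e.g. 21s at HZ=1000). */
#define STALL_TIMEOUT 21000UL

/* Hypothetical model of the state involved; not a kernel structure. */
struct stall_model {
	unsigned long jiffies;		/* models the global tick counter */
	unsigned long jiffies_stall;	/* tick at which a stall is reported */
};

/* Wraparound-safe "a >= b" for tick counters. */
static bool ticks_after_eq(unsigned long a, unsigned long b)
{
	return ULONG_MAX / 2 >= a - b;
}

/* A new grace period arms the stall deadline. */
static void gp_start(struct stall_model *m)
{
	m->jiffies_stall = m->jiffies + STALL_TIMEOUT;
}

/* Heavy touch: the deadline is effectively never reached in this GP. */
static void reset_disable(struct stall_model *m)
{
	m->jiffies_stall = m->jiffies + ULONG_MAX / 2;
}

/* Lighter touch: restart the timeout from the current tick. */
static void reset_restart(struct stall_model *m)
{
	m->jiffies_stall = m->jiffies + STALL_TIMEOUT;
}

/* Would a stall be reported at the current tick? */
static bool stall_due(const struct stall_model *m)
{
	return ticks_after_eq(m->jiffies, m->jiffies_stall);
}
```

With reset_restart() a genuine stall that follows the stoppage is still
reported one timeout later, while reset_disable() hides it for the rest of
the grace period.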

> The strange thing is that unless something is updating the jiffies counter
> to make it look like the system was up during the stoppage time interval,
> there should be no reason to tell RCU anything. Is the jiffies counter
> updated in this manner? (Not seeing it right offhand, but I don't claim
> to be familiar with this code.)

VCPUs are not resumed all at once. It's up to the host to schedule VCPUs
for execution. So, for example, when we resume VCPU-3 and it discovers
that its per-CPU PVCLOCK_GUEST_STOPPED flag is set, other VCPUs, e.g.
VCPU-0, can already be resumed, up and running, processing timer
interrupts and adding ticks to jiffies.

I can reproduce it.
While VCPU-2 still has PVCLOCK_GUEST_STOPPED set (it is resuming) and is
in check_cpu_stall(), VCPU-3 is executing:

	apic_timer_interrupt()
	 tick_irq_enter()
	  tick_do_update_jiffies64()
	   do_timer()
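
The interleaving above can be modelled in a few lines of userspace C. This
is a simplified sketch under stated assumptions, not the patch: struct
vm_model and all function names are hypothetical, the shared counter is
assumed to jump by the whole suspended interval on the first timer tick
after resume, and the stall check is assumed to restart the timeout when it
finds the per-VCPU stopped flag set.

```c
#include <limits.h>
#include <stdbool.h>

#define NR_VCPUS	4
#define STALL_TIMEOUT	21000UL	/* assumed timeout in ticks */

/* Hypothetical model of the guest state; not a kernel structure. */
struct vm_model {
	unsigned long jiffies;		/* shared by all VCPUs */
	unsigned long jiffies_stall;	/* shared stall deadline */
	bool guest_stopped[NR_VCPUS];	/* models per-VCPU PVCLOCK_GUEST_STOPPED */
};

/* The host suspends the VM: every VCPU gets its stopped flag set. */
static void host_suspend(struct vm_model *vm)
{
	for (int cpu = 0; cpu < NR_VCPUS; cpu++)
		vm->guest_stopped[cpu] = true;
}

/*
 * Timer interrupt on an already running VCPU. It adds the missed ticks
 * to the shared counter and does not look at any stopped flag.
 */
static void timer_tick(struct vm_model *vm, unsigned long ticks)
{
	vm->jiffies += ticks;
}

/*
 * Stall check on @cpu. Returns true if a stall would be reported.
 * If this VCPU was stopped, clear its flag and restart the timeout.
 */
static bool check_stall(struct vm_model *vm, int cpu)
{
	/* Wraparound-safe "jiffies < jiffies_stall": deadline not reached. */
	if (vm->jiffies - vm->jiffies_stall > ULONG_MAX / 2)
		return false;

	if (vm->guest_stopped[cpu]) {
		vm->guest_stopped[cpu] = false;
		vm->jiffies_stall = vm->jiffies + STALL_TIMEOUT;
		return false;
	}
	return true;
}
```

In this model, if the other VCPU calls timer_tick() with the suspended
interval before VCPU-2 reaches check_stall(2), the deadline has already
passed from VCPU-2's point of view, and only its own stopped flag tells it
that the elapsed time was not a real stall.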