Re: [RFC PATCH] rcu: call kvm_check_and_clear_guest_paused unconditionally

From: Sergey Senozhatsky
Date: Fri Jul 16 2021 - 02:23:21 EST


On (21/07/16 14:41), Sergey Senozhatsky wrote:
> @@ -657,6 +657,13 @@ static void check_cpu_stall(struct rcu_data *rdp)
> unsigned long js;
> struct rcu_node *rnp;
>
> + /*
> + * If a virtual machine is stopped by the host it can look to
> + * the watchdog like an RCU stall. Check to see if the host
> + * stopped the vm.
> + */
> + kvm_check_and_clear_guest_paused();
> +
> lockdep_assert_irqs_disabled();
> if ((rcu_stall_is_suppressed() && !READ_ONCE(rcu_kick_kthreads)) ||
> !rcu_gp_in_progress())
> @@ -699,14 +706,6 @@ static void check_cpu_stall(struct rcu_data *rdp)
> (READ_ONCE(rnp->qsmask) & rdp->grpmask) &&
> cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) {
>
> - /*
> - * If a virtual machine is stopped by the host it can look to
> - * the watchdog like an RCU stall. Check to see if the host
> - * stopped the vm.
> - */
> - if (kvm_check_and_clear_guest_paused())
> - return;
> -
> /* We haven't checked in, so go dump stack. */
> print_cpu_stall(gps);
> if (READ_ONCE(rcu_cpu_stall_ftrace_dump))
> @@ -717,14 +716,6 @@ static void check_cpu_stall(struct rcu_data *rdp)
> ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) &&
> cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) {
>
> - /*
> - * If a virtual machine is stopped by the host it can look to
> - * the watchdog like an RCU stall. Check to see if the host
> - * stopped the vm.
> - */
> - if (kvm_check_and_clear_guest_paused())
> - return;
> -
> /* They had a few time units to dump stack, so complain. */
> print_other_cpu_stall(gs2, gps);
> if (READ_ONCE(rcu_cpu_stall_ftrace_dump))

This patch depends on
https://lore.kernel.org/lkml/20210716053405.1243239-1-senozhatsky@xxxxxxxxxxxx/

If that x86/kvm patch lands, then we need to handle
PVCLOCK_GUEST_STOPPED in watchdogs.


In theory, this patch opens a small race window: if the VCPU gets preempted
(external interrupt, etc.) after kvm_check_and_clear_guest_paused() has
returned, the stall-reporting branches no longer re-check the flag before
printing. But it's hard to tell how likely the problem is.
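
To make the window concrete, here is a minimal userspace sketch, not kernel
code. Every name in it (struct vm_model, host_pause(), stall_reported()) is a
hypothetical stand-in, and it assumes that clearing the paused flag also pushes
the stall deadline out, which simplifies what the real watchdog touch does.

```c
#include <stdbool.h>

/* Hypothetical model state; none of this is the kernel's real data. */
struct vm_model {
	bool guest_paused;	/* stands in for PVCLOCK_GUEST_STOPPED */
	unsigned long now;	/* stands in for jiffies */
	unsigned long stall_at;	/* stands in for rcu_state.jiffies_stall */
};

/* Model of the host stopping the VM: time jumps, the flag gets set. */
static void host_pause(struct vm_model *m, unsigned long ticks)
{
	m->now += ticks;
	m->guest_paused = true;
}

/*
 * Model of kvm_check_and_clear_guest_paused(). Assumption: a set flag
 * is cleared and the stall deadline is pushed into the future.
 */
static bool check_and_clear_guest_paused(struct vm_model *m)
{
	if (!m->guest_paused)
		return false;
	m->guest_paused = false;
	m->stall_at = m->now + 1000;
	return true;
}

/*
 * Model of check_cpu_stall() with the check hoisted to the top.
 * pause_in_window simulates the VCPU being preempted and the VM being
 * stopped for that many ticks after the early check has already run.
 * Returns true if a stall would be reported.
 */
static bool stall_reported(struct vm_model *m, unsigned long pause_in_window)
{
	check_and_clear_guest_paused(m);

	if (pause_in_window)
		host_pause(m, pause_in_window);

	/* Nothing re-checks the flag between here and the report. */
	return m->now >= m->stall_at;
}
```

In this model, a pause that happens before stall_reported() is entered is
absorbed by the early check, while a pause that lands inside the window is
reported as a stall and leaves guest_paused set until the next call.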