[PATCH RFC tip/core/rcu 0/2] srcu: Allow SRCU readers from both process and irq
From: Paul E. McKenney
Date: Mon Jun 05 2017 - 18:09:29 EST
This is a repost of a pair of patches from Paolo Bonzini to a wider
audience.
Linu Cherian reported a WARN in cleanup_srcu_struct when shutting
down a guest that has iperf running on a VFIO assigned device.
This happens because irqfd_wakeup calls srcu_read_lock(&kvm->irq_srcu)
in interrupt context, while a worker thread does the same inside
kvm_set_irq. If the interrupt happens while the worker thread is
executing __srcu_read_lock, the non-irq-safe increment of lock_count can
lose an update, so lock_count can fall behind. (KVM is using SRCU here
not really for the "sleepable" part, but rather due to its faster
detection of grace periods.) One way or another, this needs to be fixed
in v4.12.
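The lost update described above can be modeled in userspace with a rough sketch like the following. This is not kernel code: lock_count here is a plain global standing in for one CPU's counter, and irq_handler_read_lock(), process_read_lock(), and run_scenario() are hypothetical names invented for the illustration. The "interrupt" is simply a function call placed between the load and the store of a split increment.

```c
#include <assert.h>

/* Hypothetical stand-in for one CPU's SRCU reader count (not the kernel's layout). */
static unsigned long lock_count;

/* Models the irq handler's reader entry: one complete increment. */
static void irq_handler_read_lock(void)
{
	lock_count++;
}

/*
 * Models a non-irq-safe increment as separate load and store steps.
 * When irq_between is 1, the simulated interrupt fires after the load
 * and before the store.
 */
static void process_read_lock(int irq_between)
{
	unsigned long tmp;

	assert(irq_between == 0 || irq_between == 1);
	tmp = lock_count;			/* load */
	if (irq_between)
		irq_handler_read_lock();	/* interrupt arrives here */
	lock_count = tmp + 1;			/* store overwrites the handler's update */
}

/* Returns the count after one process-level and one irq-level reader entry. */
unsigned long run_scenario(int irq_between)
{
	lock_count = 0;
	process_read_lock(irq_between);
	if (!irq_between)
		irq_handler_read_lock();	/* interrupt arrives after the increment */
	return lock_count;
}
```

In this model run_scenario(0) returns 2, while run_scenario(1) returns 1: two readers entered but only one was counted, which is the sense in which the count falls behind.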
We discussed three ways of fixing this:
1.	Have KVM protect process-level ->irq_srcu readers with
	local_irq_disable() or similar. This works, and is the most
	confined change, but is a bit of an ugly usage restriction.

2.	Make KVM convert ->irq_srcu uses to RCU-sched. This works, but
	KVM needs fast grace periods, and synchronize_sched_expedited()
	interrupts CPUs, and is thus not particularly friendly to
	real-time workloads.

3.	Make SRCU tolerate use of srcu_read_lock() and srcu_read_unlock()
	from both process context and irq handlers for the same
	srcu_struct. It turns out that only a small change to SRCU is
	required, namely, changing __srcu_read_lock()'s __this_cpu_inc()
	to this_cpu_inc(), matching the existing __srcu_read_unlock()
	usage. In addition, this change simplifies the use of SRCU.
	Of course, any RCU change after -rc1 is a bit scary.
	Nevertheless, the following two patches illustrate this approach.
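As a rough userspace model of why approaches #1 and #3 both close the window, consider the sketch below. It is an illustration under stated assumptions, not kernel code: mask_irqs(), unmask_irqs(), raise_irq(), and the two approach functions are hypothetical names, irq masking is modeled by a flag that defers a simulated interrupt, and a plain lock_count++ stands in for an irq-safe this_cpu_inc().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state for the model; none of these names are kernel APIs. */
static unsigned long lock_count;	/* reader-entry count */
static bool irqs_masked;		/* models local_irq_disable() being in effect */
static bool irq_pending;		/* an interrupt arrived while masked */

/* The irq-level reader entry: one complete increment. */
static void irq_handler(void)
{
	lock_count++;
}

/* Deliver the simulated interrupt, or defer it if interrupts are masked. */
static void raise_irq(void)
{
	if (irqs_masked)
		irq_pending = true;
	else
		irq_handler();
}

static void mask_irqs(void)
{
	assert(!irqs_masked);	/* the model does not nest masking */
	irqs_masked = true;
}

static void unmask_irqs(void)
{
	irqs_masked = false;
	if (irq_pending) {
		irq_pending = false;
		irq_handler();	/* the deferred interrupt runs now */
	}
}

/* Approach 1: keep the split load/store increment, but mask irqs around it. */
unsigned long approach1_masked_increment(void)
{
	unsigned long tmp;

	lock_count = 0;
	mask_irqs();
	tmp = lock_count;	/* load */
	raise_irq();		/* arrives mid-increment, gets deferred */
	lock_count = tmp + 1;	/* store */
	unmask_irqs();
	return lock_count;
}

/* Approach 3: a single-step increment, so an irq lands only before or after it. */
unsigned long approach3_single_step_increment(bool irq_first)
{
	lock_count = 0;
	if (irq_first)
		raise_irq();
	lock_count++;		/* stands in for an irq-safe this_cpu_inc() */
	if (!irq_first)
		raise_irq();
	return lock_count;
}
```

In this model both approaches count both reader entries, so each function returns 2. The difference is where the burden falls: approach #1 requires every process-level caller to mask interrupts, whereas approach #3 puts the irq safety inside the increment itself.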
Coward that I am, my personal preferred approach would be #1 during 4.12,
possibly moving to #3 over time. However, the KVM guys make a good case
for just making a single small change right now and being done with it.
Plus the overall effect of the one-step approach #3 is to make RCU
smaller, even if only by five lines of code.
The reason for splitting this into two patches is to ease backporting.
This means that the two commit logs are quite similar.
Thoughts? In particular, are there better ways to fix this?
Thanx, Paul
------------------------------------------------------------------------
 include/linux/srcu.h     |  2 --
 include/linux/srcutiny.h |  2 +-
 kernel/rcu/rcutorture.c  |  4 ++--
 kernel/rcu/srcu.c        |  5 ++---
 kernel/rcu/srcutiny.c    | 21 ++++++++++-----------
 kernel/rcu/srcutree.c    |  5 ++---
 6 files changed, 17 insertions(+), 22 deletions(-)