Re: [PATCH 2/2] srcuclassic: allow using same SRCU in process and interrupt context
From: Paul E. McKenney
Date: Wed May 31 2017 - 13:58:32 EST
On Wed, May 31, 2017 at 02:03:11PM +0200, Paolo Bonzini wrote:
> Linu Cherian reported a WARN in cleanup_srcu_struct when shutting
> down a guest that has iperf running on a VFIO assigned device.
>
> This happens because irqfd_wakeup calls srcu_read_lock(&kvm->irq_srcu)
> in interrupt context, while a worker thread does the same inside
> kvm_set_irq. If the interrupt happens while the worker thread is
> executing __srcu_read_lock, lock_count can fall behind.
>
> The docs say you are not supposed to call srcu_read_lock() and
> srcu_read_unlock() from irq context, but KVM interrupt injection happens
> from (host) interrupt context and it would be nice if SRCU supported the
> use case. KVM is using SRCU here not really for the "sleepable" part,
> but rather due to its faster detection of grace periods, therefore it
> is not possible to switch back to RCU, effectively reverting commit
> 719d93cd5f5c ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING", 2014-01-16).
>
> However, the docs are painting a worse situation than it actually is.
> You can have an SRCU instance only has users in irq context, and you
> can mix process and irq context as long as process context users
> disable interrupts. In addition, __srcu_read_unlock() actually uses
> this_cpu_dec, so that only srcu_read_lock() is unsafe.
>
> When srcuclassic's __srcu_read_unlock() was changed to use this_cpu_dec(),
> in commit 5a41344a3d83 ("srcu: Simplify __srcu_read_unlock() via
> this_cpu_dec()", 2012-11-29), __srcu_read_lock() did two increments.
> Therefore it kept __this_cpu_inc, with preempt_disable/enable in the
> caller. Nowadays however it only does one increment, so on most
> architectures it is more efficient for __srcu_read_lock to use
> this_cpu_inc, too.
>
> There would be a slowdown if 1) fast this_cpu_inc is not available and
> cannot be implemented (this usually means that atomic_inc has implicit
> memory barriers), and 2) local_irq_save/restore is slower than disabling
> preemption. The main architecture with these constraints is s390, which
> however is already paying the price in __srcu_read_unlock and has not
> complained.
>
> Cc: stable@xxxxxxxxxxxxxxx
> Fixes: 719d93cd5f5c ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING")
> Reported-by: Linu Cherian <linuc.decode@xxxxxxxxx>
> Suggested-by: Linu Cherian <linuc.decode@xxxxxxxxx>
> Cc: kvm@xxxxxxxxxxxxxxx
> Cc: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Again, take your choice, or use both. ;-)
Reviewed-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
Tested-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> ---
> include/linux/srcu.h | 2 --
> kernel/rcu/srcu.c | 5 ++---
> 2 files changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/srcu.h b/include/linux/srcu.h
> index 167ad8831aaf..4c1d5f7e62c4 100644
> --- a/include/linux/srcu.h
> +++ b/include/linux/srcu.h
> @@ -172,9 +172,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
> {
> int retval;
>
> - preempt_disable();
> retval = __srcu_read_lock(sp);
> - preempt_enable();
> rcu_lock_acquire(&(sp)->dep_map);
> return retval;
> }
> diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
> index 584d8a983883..dea03614263f 100644
> --- a/kernel/rcu/srcu.c
> +++ b/kernel/rcu/srcu.c
> @@ -263,7 +263,7 @@ void cleanup_srcu_struct(struct srcu_struct *sp)
>
> /*
> * Counts the new reader in the appropriate per-CPU element of the
> - * srcu_struct. Must be called from process context.
> + * srcu_struct.
> * Returns an index that must be passed to the matching srcu_read_unlock().
> */
> int __srcu_read_lock(struct srcu_struct *sp)
> @@ -271,7 +271,7 @@ int __srcu_read_lock(struct srcu_struct *sp)
> int idx;
>
> idx = READ_ONCE(sp->completed) & 0x1;
> - __this_cpu_inc(sp->per_cpu_ref->lock_count[idx]);
> + this_cpu_inc(sp->per_cpu_ref->lock_count[idx]);
> smp_mb(); /* B */ /* Avoid leaking the critical section. */
> return idx;
> }
> @@ -281,7 +281,6 @@ int __srcu_read_lock(struct srcu_struct *sp)
> * Removes the count for the old reader from the appropriate per-CPU
> * element of the srcu_struct. Note that this may well be a different
> * CPU than that which was incremented by the corresponding srcu_read_lock().
> - * Must be called from process context.
> */
> void __srcu_read_unlock(struct srcu_struct *sp, int idx)
> {
> --
> 1.8.3.1
>