[PATCH v2 00/20] KVM: x86/xen: Fix Xen/GP/PREEMPT_RT issues with rwlock_t
From: Sean Christopherson
Date: Fri May 29 2026 - 13:45:58 EST
This series fixes sleeping-in-hardirq bugs in KVM's Xen emulation on
PREEMPT_RT, cleans up the now-unnecessary IRQ disabling in GPC lock usage
throughout KVM, and then adds CLASS()-based APIs for utilizing GPC mappings
to dedup code and (hopefully) make it easier to use GPCs in other places.
The core issue is that kvm_xen_set_evtchn_fast() and the Xen timer
callback are called from hardirq/atomic context, but on PREEMPT_RT the
GPC rwlock_t is a sleeping lock.
Assuming I can get an Ack on patch 1, I'm planning on grabbing at least these
  KVM: Move {g,p}fn <=> {g,h}pa conversion helpers to kvm_types.h
  KVM: x86/xen: Don't dirty track "vCPU info" page
  KVM: x86/xen: Explicitly tag "shared info" page as never being dirty tracked
  KVM: x86/xen: Extract delivery of event to vCPU into a separate helper
  KVM: x86/xen: Use guard() to grab kvm->srcu around gpc critical sections
  KVM: Remove unnecessary IRQ disabling from GPC lock in pfncache.c
  KVM: x86: Remove unnecessary irqsave from kvm_setup_guest_pvclock()
  KVM: x86/xen: Remove unnecessary irqsave from GPC lock usage in xen.c
  KVM: x86/xen: Use read_trylock() for GPC locks in hardirq/atomic paths
  locking/rt: Use raw_spin_lock_irqsave() in __rwbase_read_unlock()
for 7.2. If people like the CLASS() stuff, I'll also probably grab these:
  KVM: Add "extended" gpc CLASS() APIs for sometimes-atomic cases
  KVM: x86/xen: Convert event injection to gpc's CLASS() APIs
  KVM: x86/xen: Drop local "kick_vcpu" from __kvm_xen_set_evtchn_fast()
  KVM: x86/xen: Convert xen_get_guest_pvclock() to gpc's CLASS() APIs
  KVM: x86/xen: Convert kvm_xen_set_evtchn_fast() to gpc's CLASS() APIs
  KVM: x86/xen: Convert wait_pending_event() to gpc's CLASS() APIs
  KVM: x86/xen: Don't bother waiting on gpc->lock in SCHEDOP_poll
  KVM: x86/xen: Convert kvm_xen_shared_info_init() to gpc's CLASS() APIs
  KVM: Add CLASS() constructs to automagically handle lock+check of gpc
I do NOT plan on grabbing the record_steal_time change for 7.2 no matter
what, even though I do like the end result, as I still have concerns over the
lack of range-based invalidation for GPCs. I 100% agree that such problems
are really only due to flawed VMMs and/or setups, but unfortunately history
has shown that there are a surprising number of deployments running what I
would consider flawed setups, e.g. running with NUMA autobalancing and KSM.
I realize I'm being somewhat paranoid, as KVM already uses a GPC for PV
clocks. But for modern setups, KVM_REQ_CLOCK_UPDATE is a rare event, whereas
KVM will update steal time (when enabled) on every vCPU load. So I want a
high level of confidence that KVM won't regress "imperfect" setups before
switching to a GPC for steal time (though again, I definitely like the end
result and want to do so).
[*] https://lore.kernel.org/all/20240821202814.711673-2-dwmw2@xxxxxxxxxxxxx
v2:
- Add the CLASS() APIs.
- Move the steal time change to the very end.
- "Fix" a dirty logging inconsistency with the Xen vCPU info page.
v1: https://lore.kernel.org/all/20260508181717.3230988-1-dwmw2@xxxxxxxxxxxxx
Carsten Stollmaier (1):
  KVM: x86: Use gfn_to_pfn_cache for record_steal_time
David Woodhouse (5):
  locking/rt: Use raw_spin_lock_irqsave() in __rwbase_read_unlock()
  KVM: x86/xen: Use read_trylock() for GPC locks in hardirq/atomic paths
  KVM: x86/xen: Remove unnecessary irqsave from GPC lock usage in xen.c
  KVM: x86: Remove unnecessary irqsave from kvm_setup_guest_pvclock()
  KVM: Remove unnecessary IRQ disabling from GPC lock in pfncache.c
Sean Christopherson (14):
  KVM: x86/xen: Use guard() to grab kvm->srcu around gpc critical
    sections
  KVM: x86/xen: Extract delivery of event to vCPU into a separate helper
  KVM: x86/xen: Explicitly tag "shared info" page as never being dirty
    tracked
  KVM: x86/xen: Don't dirty track "vCPU info" page
  KVM: Move {g,p}fn <=> {g,h}pa conversion helpers to kvm_types.h
  KVM: Add CLASS() constructs to automagically handle lock+check of gpc
  KVM: x86/xen: Convert kvm_xen_shared_info_init() to gpc's CLASS() APIs
  KVM: x86/xen: Don't bother waiting on gpc->lock in SCHEDOP_poll
  KVM: x86/xen: Convert wait_pending_event() to gpc's CLASS() APIs
  KVM: x86/xen: Convert kvm_xen_set_evtchn_fast() to gpc's CLASS() APIs
  KVM: x86/xen: Convert xen_get_guest_pvclock() to gpc's CLASS() APIs
  KVM: x86/xen: Drop local "kick_vcpu" from __kvm_xen_set_evtchn_fast()
  KVM: x86/xen: Convert event injection to gpc's CLASS() APIs
  KVM: Add "extended" gpc CLASS() APIs for sometimes-atomic cases
 arch/x86/include/asm/kvm_host.h |   2 +-
 arch/x86/kvm/x86.c              | 140 +++++++---------
 arch/x86/kvm/xen.c              | 288 +++++++++++++-------------------
 include/linux/kvm_host.h        |  84 +++++++---
 include/linux/kvm_types.h       |  17 ++
 kernel/locking/rwbase_rt.c      |   5 +-
 virt/kvm/pfncache.c             |  68 ++++++--
 7 files changed, 304 insertions(+), 300 deletions(-)
base-commit: d1568b1332b6b3b36b222c2868fc102727c12a34
--
2.54.0.823.g6e5bcc1fc9-goog