Re: [PATCH v2 1/1] KVM: x86/xen: Use trylock for fast path event channel delivery

From: Sean Christopherson

Date: Thu Apr 02 2026 - 18:43:20 EST


On Thu, Apr 02, 2026, Sebastian Andrzej Siewior wrote:
> On 2026-04-02 07:01:02 [+0530], shaikh.kamal wrote:
>
> > The function uses read_lock_irqsave() to access two gpc structures:
> > shinfo_cache and vcpu_info_cache. On PREEMPT_RT, these rwlocks are
> > rt_mutex-based and cannot be acquired from hard IRQ context.
> >
> > Use read_trylock() instead for both gpc lock acquisitions. If either
> > lock is contended, return -EWOULDBLOCK to trigger the existing slow
> > path: xen_timer_callback() sets vcpu->arch.xen.timer_pending, kicks
> > the vCPU with KVM_REQ_UNBLOCK, and the event gets injected from
> > process context via kvm_xen_inject_timer_irqs().
> >
> > This approach works on all kernels (RT and non-RT) and preserves the
> > "fast path" semantics: acquire the lock only if immediately available,
> > otherwise bail out rather than blocking.
>
> No. This split into local_irq_save() + trylock is something you must not
> do. The fact that it does not lead to any warnings does not mean it is
> good.
> One problem is that your trylock will record the current task on the CPU
> as the owner of this lock, which can lead to odd lock chains if observed
> by other tasks while trying to PI.

Is that a problem with local_irq_save() specifically, or is it a broader problem
with doing read_trylock() inside a raw spinlock? (Or using read_trylock() in
the sched_out path in particular?)

I ask because I _think_ David's suggestion was to drop the irq_save stuff
entirely, because if KVM only ever does trylock, there's no risk of deadlocking
due to waiting on the lock in atomic context.

> So no.
>
> If this is just to shut up syzkaller I would suggest to let Xen depend
> on !PREEMPT_RT until someone figures out what to do.

Heh, I was considering proposing exactly that, but it doesn't actually change
anything in practice, because no one actually uses KVM Xen support with PREEMPT_RT.
Making the two mutually exclusive would completely prevent the badness, but it
wouldn't fix the more annoying (for me at least) problem, which is that
check_wait_context() fires with CONFIG_PROVE_LOCKING=y irrespective of PREEMPT_RT.
I.e. making CONFIG_KVM_XEN depend on !PREEMPT_RT won't eliminate what are already
false positives.
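For concreteness, the dependency Sebastian suggests would amount to something like the following in KVM's Kconfig (sketch only; the real KVM_XEN entry has additional select/help content, and as noted above this wouldn't silence the PROVE_LOCKING splat anyway):

```kconfig
config KVM_XEN
	bool "Support for Xen hypercall interface"
	depends on KVM
	depends on !PREEMPT_RT
```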

More importantly, there's a desire to use the same KVM construct in other code
that runs inside the sched_out() path and thus attempts to take a non-raw rwlock
inside a raw spinlock[*]. And that code isn't mutually exclusive with PREEMPT_RT.
So while I'd be happy to punt on Xen, the underlying problem needs to be solved :-/.

[*] https://lore.kernel.org/all/1d6712ed413ea66ef376d1410811997c3b416e99.camel@xxxxxxxxxxxxx