Re: [PATCH v2 1/2] sched/wait: Add add_wait_queue_priority()

From: Paolo Bonzini
Date: Wed Nov 04 2020 - 06:25:36 EST


On 04/11/20 10:35, David Woodhouse wrote:
> On Wed, 2020-10-28 at 15:35 +0100, Peter Zijlstra wrote:
>> On Tue, Oct 27, 2020 at 02:39:43PM +0000, David Woodhouse wrote:
>>> From: David Woodhouse <dwmw@xxxxxxxxxxxx>
>>>
>>> This allows an exclusive wait_queue_entry to be added at the head of the
>>> queue, instead of the tail as normal. Thus, it gets to consume events
>>> first without allowing non-exclusive waiters to be woken at all.
>>>
>>> The (first) intended use is for KVM IRQFD, which currently has
>>> inconsistent behaviour depending on whether posted interrupts are
>>> available or not. If they are, KVM will bypass the eventfd completely
>>> and deliver interrupts directly to the appropriate vCPU. If not, events
>>> are delivered through the eventfd and userspace will receive them when
>>> polling on the eventfd.
>>>
>>> By using add_wait_queue_priority(), KVM will be able to consistently
>>> consume events within the kernel without accidentally exposing them
>>> to userspace when they're supposed to be bypassed. This, in turn, means
>>> that userspace doesn't have to jump through hoops to avoid listening
>>> on the erroneously noisy eventfd and injecting duplicate interrupts.
>>>
>>> Signed-off-by: David Woodhouse <dwmw@xxxxxxxxxxxx>
>>
>> Acked-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
>
> Thanks. Paolo, the conclusion was that you were going to take this set
> through the KVM tree, wasn't it?

Yes.

Paolo
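
For readers following the thread, the head-insertion behaviour described
in the quoted commit message can be illustrated with a small userspace
model. This is only a sketch, not the kernel code: every model_* name and
the flag value are hypothetical, ordinary waiters are simply appended at
the tail, and a wake-up is modelled as bumping a counter. It assumes that
a wake-up walks the queue from the head and stops once the requested
number of exclusive waiters has been woken.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MODEL_FLAG_EXCLUSIVE 0x01   /* hypothetical value, model only */

struct model_entry {
    const char *name;
    unsigned int flags;
    int woken;                      /* number of times this waiter was woken */
    struct model_entry *next;
};

struct model_queue {
    struct model_entry *head;
    struct model_entry *tail;
};

/* Ordinary waiter: appended at the tail of the queue. */
static void model_add_tail(struct model_queue *q, struct model_entry *e)
{
    assert(q && e);
    e->next = NULL;
    if (q->tail)
        q->tail->next = e;
    else
        q->head = e;
    q->tail = e;
}

/* Priority waiter: marked exclusive and inserted at the head. */
static void model_add_priority(struct model_queue *q, struct model_entry *e)
{
    assert(q && e);
    e->flags |= MODEL_FLAG_EXCLUSIVE;
    e->next = q->head;
    q->head = e;
    if (!q->tail)
        q->tail = e;
}

/*
 * Wake waiters from the head; stop once nr_exclusive exclusive waiters
 * have been woken. Returns the number of waiters woken.
 */
static int model_wake_up(struct model_queue *q, int nr_exclusive)
{
    int woken = 0;

    for (struct model_entry *e = q->head; e; e = e->next) {
        e->woken++;
        woken++;
        if ((e->flags & MODEL_FLAG_EXCLUSIVE) && --nr_exclusive == 0)
            break;
    }
    return woken;
}

/* One poller plus one priority consumer; returns the poller's wake count. */
static int model_demo(void)
{
    struct model_queue q = { NULL, NULL };
    struct model_entry poller = { "userspace-poller", 0, 0, NULL };
    struct model_entry consumer = { "in-kernel-consumer", 0, 0, NULL };

    model_add_tail(&q, &poller);
    model_add_priority(&q, &consumer);   /* jumps ahead of the poller */
    model_wake_up(&q, 1);

    printf("%s: %d, %s: %d\n", consumer.name, consumer.woken,
           poller.name, poller.woken);
    return poller.woken;
}
```

In this model, a single wake-up with nr_exclusive set to 1 reaches the
priority entry first and stops there, so the poller queued behind it is
never woken. Without the priority entry, the same wake-up would reach the
poller.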