Re: [PATCH] sched: Further restrict the preemption modes

From: Peter Zijlstra

Date: Tue Mar 03 2026 - 07:04:05 EST


On Tue, Mar 03, 2026 at 09:15:55AM +0000, Ciunas Bennett wrote:
> A quick update on the issue.
> Introducing kvm_arch_set_irq_inatomic() appears to make the problem go away on my setup.
> That said, this still begs the question: why does irqfd_wakeup behave differently (or poorly) in this scenario compared to the in-atomic IRQ injection path?
> Is there a known interaction with workqueues, contexts, or locking that would explain the divergence here?
>
> Observations:
> irqfd_wakeup: triggers the problematic behaviour.
> Forcing in-atomic IRQ injection (kvm_arch_set_irq_inatomic): issue not observed.
>
> @Peter Zijlstra — Peter, do you have thoughts on how the workqueue scheduling context here could differ enough to cause this regression?
> Any pointers on what to trace specifically in irqfd_wakeup and the work item path would be appreciated.

So the thing that LAZY does differently from FULL is that it delays
preemption a bit.

This has two ramifications:

1) some ping-pong workloads will turn into block+wakeup, adding
overhead.

FULL: running your task A, an interrupt would come in, wake task B and
set Need Resched and the interrupt return path calls schedule() and
you're task B. B does its thing, 'wakes' A and blocks.

LAZY: running your task A, an interrupt would come in, wake task B (no
NR set), you continue running A, A blocks because it needs something
from B, now you schedule() [*], B runs, does its thing, does an actual
wakeup of A and blocks.

The distinct difference here is that LAZY does a block of A and
consequently B has to do a full wakeup of A, whereas FULL doesn't do a
block of A, and hence the wakeup of A is a NOP as well.
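
The two sequences above can be written out as a toy model. This is only a
sketch for counting the extra operations, not kernel code: the `Trace`
class and the `ping_pong()` function are made up for illustration, and the
event strings simply restate the steps described above.

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    """Bookkeeping for one A -> B -> A round trip (hypothetical helper)."""
    events: list = field(default_factory=list)
    a_blocks: int = 0       # how often A really blocked
    full_wakeups: int = 0   # how often B had to do a real wakeup of A


def ping_pong(mode: str) -> Trace:
    """Replay one round trip under 'FULL' or 'LAZY' preemption."""
    if mode not in ("FULL", "LAZY"):
        raise ValueError(f"unknown mode: {mode}")

    t = Trace()
    t.events.append("irq: wake B")

    if mode == "FULL":
        # Need Resched is set, so the interrupt return path calls
        # schedule(). A is preempted but stays runnable.
        t.events.append("irq return: schedule() -> B, A still runnable")
        a_blocked = False
    else:
        # No Need Resched: A keeps running until it blocks on B.
        t.events.append("A keeps running")
        t.events.append("A blocks on B: schedule() [*] -> B")
        t.a_blocks += 1
        a_blocked = True

    if a_blocked:
        # A is off the runqueue, so B must do an actual wakeup.
        t.events.append("B: full wakeup of A")
        t.full_wakeups += 1
    else:
        # A never left the runqueue, so the wakeup has nothing to do.
        t.events.append("B: wakeup of A is a NOP")

    t.events.append("B blocks")
    return t


if __name__ == "__main__":
    for mode in ("FULL", "LAZY"):
        trace = ping_pong(mode)
        print(mode, "blocks of A:", trace.a_blocks,
              "full wakeups:", trace.full_wakeups)
        for event in trace.events:
            print("   ", event)
```

In this model LAZY pays for one extra block and one extra full wakeup per
round trip, which is the overhead described in 1).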


2) Since the schedule() is delayed, it might happen that by the time it
does get around to it, your task B is no longer the most eligible
option.

Same as above, except now C is also woken, and the schedule() marked
with [*] picks C; this results in a detour, delaying things further.
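
The detour can be sketched the same way. This is a toy model under loud
assumptions: `picks_until_b()` is a hypothetical helper, the integer
"rank" stands in for whatever the scheduler really uses to decide
eligibility, and every picked task is assumed to run and then block
before the next pick.

```python
import heapq


def picks_until_b(runnable: dict) -> list:
    """Return the order in which tasks are picked until B gets the CPU.

    `runnable` maps task name -> rank; the lowest rank is picked first.
    The rank is a stand-in, not the real scheduler's selection logic.
    """
    heap = [(rank, name) for name, rank in runnable.items()]
    heapq.heapify(heap)

    order = []
    while heap:
        _, name = heapq.heappop(heap)
        order.append(name)
        if name == "B":
            break
    return order


if __name__ == "__main__":
    # FULL: schedule() happens right after B's wakeup, only B is waiting.
    print("FULL:", picks_until_b({"B": 10}))
    # LAZY: by the time schedule() [*] runs, C was woken too and ranks
    # ahead of B, so B has to wait for C first.
    print("LAZY:", picks_until_b({"B": 10, "C": 5}))
```

With only B waiting the pick goes straight to B; once C is on the queue
with a better rank, C runs first and B is delayed by however long C takes.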