Re: [PATCH] sched/fair: Only increment deadline once on yield

From: Alexander Graf
Date: Sun Apr 13 2025 - 14:38:22 EST



On 01.04.25 14:36, Fernand Sieber wrote:
If a task yields, the scheduler may decide to pick it again. The task in
turn may decide to yield immediately or shortly after, leading to a tight
loop of yields.

If there's another runnable task at this point, the deadline will be
increased by the slice at each loop. This can cause the deadline to run away
pretty quickly, and leads to elevated run delays later on as the task
doesn't get picked again. The reason the scheduler can pick the same task
again and again despite its deadline increasing is that it may be the
only eligible task at that point.

Fix this by updating the deadline only to one slice ahead.

Note, we might want to consider iterating on the implementation of yield as
a follow-up:
* the yielding task could be forfeiting its remaining slice by
  incrementing its vruntime correspondingly
* in case of yield_to the yielding task could be donating its remaining
  slice to the target task

Signed-off-by: Fernand Sieber <sieberf@xxxxxxxxxx>


IMHO it's worth noting that this is not a theoretical issue. We have seen this in real life: A KVM virtual machine's vCPU which runs into a busy guest spin lock calls kvm_vcpu_yield_to() which eventually ends up in the yield_task_fair() function. We have seen such spin locks due to guest contention rather than host overcommit, which means we go into a loop of vCPU execution and spin loop exit, which results in an undesirable increase in the vCPU thread's deadline.
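
To make the runaway concrete, here is a toy userspace model of such a yield loop. It is only a sketch, not kernel code: struct toy_se and the toy_* helpers are hypothetical names, the slice is assumed to be already expressed in virtual time (no weight scaling), and "one slice ahead" is read here as one slice ahead of the current vruntime, which is my interpretation of the patch description.

```c
#include <stdint.h>

/* Toy stand-in for a scheduling entity; NOT the kernel's struct sched_entity. */
struct toy_se {
	uint64_t vruntime;	/* virtual runtime consumed so far */
	uint64_t deadline;	/* virtual deadline */
	uint64_t slice;		/* slice, assumed to be in virtual time units */
};

/* Behaviour described as the bug: each yield adds another full slice. */
static void toy_yield_accumulate(struct toy_se *se)
{
	se->deadline += se->slice;
}

/* Assumed reading of the fix: the deadline ends up one slice past vruntime. */
static void toy_yield_clamped(struct toy_se *se)
{
	se->deadline = se->vruntime + se->slice;
}

/*
 * The task is picked again nr_yields times, runs for 'ran' units each time
 * and then yields. Returns how far the deadline ends up ahead of vruntime.
 */
static uint64_t toy_yield_loop(struct toy_se *se, unsigned int nr_yields,
			       uint64_t ran,
			       void (*yield)(struct toy_se *se))
{
	unsigned int i;

	for (i = 0; i < nr_yields; i++) {
		se->vruntime += ran;
		yield(se);
	}

	return se->deadline - se->vruntime;
}
```

With a slice of 3000000 and 1000 yields after 1000 units of execution each, the accumulating variant leaves the deadline 3002000000 units ahead of vruntime, while the clamped variant leaves it exactly one slice (3000000) ahead.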

Given this impacts real workloads and is a bug present since the introduction of EEVDF, I would say it warrants a

Fixes: 147f3efaa24182 ("sched/fair: Implement an EEVDF-like scheduling policy")

tag.


Alex