Re: [PATCH v2 00/17] sched/paravirt: Introduce cpu_preferred_mask and steal-driven vCPU backoff

From: Shrikanth Hegde

Date: Wed Apr 08 2026 - 09:50:51 EST


Hi Hillf.

On 4/8/26 3:44 PM, Hillf Danton wrote:
On Wed, 8 Apr 2026 00:49:33 +0530 Shrikanth Hegde wrote:

Core idea is:
- Maintain set of CPUs which can be used by workload. It is denoted as
cpu_preferred_mask
- Periodically compute the steal time. If steal time is high/low based
on the thresholds, either reduce/increase the preferred CPUs.
- If a CPU is marked as non-preferred, push the task running on it if
possible.
- Use this CPU state in wakeup and load balance to ensure tasks run
within preferred CPUs.
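
For illustration only, the threshold step in the list above could be
sketched roughly as below. This is not code from the series: the names
steal_pct() and adjust_preferred(), the two threshold values, and the
one-CPU-at-a-time step are all assumptions made for the example.

```c
/* Hypothetical thresholds in percent of elapsed time; not from the series. */
#define STEAL_HIGH_PCT 20
#define STEAL_LOW_PCT   5

/* Steal percentage over one sampling window, from counter deltas. */
static unsigned int steal_pct(unsigned long long steal_delta,
                              unsigned long long total_delta)
{
	if (total_delta == 0)
		return 0;	/* empty window: treat as no steal */
	return (unsigned int)(steal_delta * 100 / total_delta);
}

/*
 * Return the new number of preferred CPUs (assumed policy):
 * shrink by one when steal is above the high threshold,
 * grow by one when it is below the low threshold,
 * otherwise keep the current value. Never go below 1 or above nr_cpus.
 */
static unsigned int adjust_preferred(unsigned int preferred,
                                     unsigned int nr_cpus,
                                     unsigned int pct)
{
	if (pct > STEAL_HIGH_PCT && preferred > 1)
		return preferred - 1;
	if (pct < STEAL_LOW_PCT && preferred < nr_cpus)
		return preferred + 1;
	return preferred;
}
```

The gap between the two thresholds acts as a dead band, so a steal value
hovering around a single cutoff would not flip the preferred set on every
sample.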

For the host kernel, there is no steal time, so no changes to its preferred
CPUs. So series would affect only the guest kernels.

Changes are added to guest in order to detect if pCPU is overloaded, and if
that is true (I mean it is layer violation), why not ask the pCPU governor,
hypervisor, to monitor the loads on pCPU and migrate vCPUs forth and back
if necessary.


AFAIK, there is no information in the host scheduler about what
each vCPU is running. It may be holding a mutex, or a spinlock with IRQs
disabled, or it may be in interrupt context. Migrating the vCPU threads
without that knowledge will hurt the guest. And the host scheduler still
has to ensure fairness.

This has to work across different archs: some have Linux as the hypervisor,
and some, such as powerpc and s390, have a non-Linux hypervisor.

Steal time in the guest is a common construct across all archs. I don't
think such commonality exists in host schedulers.
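
As a small userspace illustration of how uniformly steal time is exposed to
a guest: on Linux, the aggregate "cpu" line of /proc/stat reports it as the
eighth value (user, nice, system, idle, iowait, irq, softirq, steal). The
helper below is only a sketch; parse_cpu_line() is a name made up for this
example, and it is not how the series samples steal time in the kernel.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse the aggregate "cpu" line of /proc/stat.
 * On success, store the cumulative steal ticks and the sum of the first
 * eight fields (guest time is already accounted in user/nice) and return 0.
 * Return -1 for per-CPU lines ("cpu0 ...") or malformed input.
 */
static int parse_cpu_line(const char *line,
                          unsigned long long *steal,
                          unsigned long long *total)
{
	unsigned long long v[8];
	int i, n;

	if (strncmp(line, "cpu ", 4) != 0)
		return -1;

	n = sscanf(line, "cpu %llu %llu %llu %llu %llu %llu %llu %llu",
	           &v[0], &v[1], &v[2], &v[3], &v[4], &v[5], &v[6], &v[7]);
	if (n != 8)
		return -1;

	*steal = v[7];
	*total = 0;
	for (i = 0; i < 8; i++)
		*total += v[i];
	return 0;
}
```

Taking two such samples and dividing the steal delta by the total delta
gives the steal fraction over that window.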

If done in the guest, the guest actually knows what it is running and what
is more important. It can make better decisions IMHO.