Re: Regression on vcpu_is_preempted()

From: Peter Zijlstra
Date: Sat Oct 29 2022 - 04:59:34 EST


On Fri, Oct 28, 2022 at 04:48:21PM +0800, Miaohe Lin wrote:
> When the scheduler tries to select a CPU to run the gc thread,
> available_idle_cpu() will check whether vcpu_is_preempted(). It
> will choose another vcpu to run gc threads when the current vcpu is
> preempted. But the preempted vcpu has no other work to do except
> continuing to do gc. In our guest, there are more vcpus than java gc
> threads. So there could always be some available vcpus when the
> scheduler tries to select an idle vcpu (running on the host). This
> leads to lots of cpu migrations and results in a regression.
>
> I'm not really familiar with this mechanism. Is this a problem that
> needs to be fixed or improved? Or is this just expected behavior?
> Any response would be really appreciated!

This is pretty much expected behaviour. When a vCPU is preempted the
guest cannot know its state or latency. Typically in the overcommitted
case another vCPU will be running on the CPU and getting our vCPU thread
back will take a considerable amount of time.
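
For illustration, here is a rough user-space model of that check. It
is a sketch, not the kernel code: struct vcpu_model, the model_*()
helpers and the selection loop are all made up for this example, and
the real wake-up path is considerably more involved. The only part
meant to mirror the kernel is the shape of available_idle_cpu(): a CPU
counts as available only if it is idle and not reported as preempted.

```c
#include <stdbool.h>

/* Hypothetical per-vCPU state as seen from the guest; not a kernel struct. */
struct vcpu_model {
	bool idle;      /* nothing runnable in the guest on this vCPU */
	bool preempted; /* the host has descheduled this vCPU's thread */
};

/* Stand-in for idle_cpu(): is the guest CPU idle? */
static bool model_idle_cpu(const struct vcpu_model *v)
{
	return v->idle;
}

/* Stand-in for vcpu_is_preempted(): did the host take the vCPU away? */
static bool model_vcpu_is_preempted(const struct vcpu_model *v)
{
	return v->preempted;
}

/* Same shape as available_idle_cpu(): idle AND not preempted. */
static bool model_available_idle_cpu(const struct vcpu_model *v)
{
	if (!model_idle_cpu(v))
		return false;
	if (model_vcpu_is_preempted(v))
		return false;
	return true;
}

/*
 * Grossly simplified selection (an assumption, not the scheduler's
 * actual policy): keep the previous CPU if it is available, otherwise
 * take the first available one, otherwise fall back to the previous CPU.
 */
static int model_select_cpu(const struct vcpu_model *vcpus, int nr, int prev)
{
	int cpu;

	if (model_available_idle_cpu(&vcpus[prev]))
		return prev;

	for (cpu = 0; cpu < nr; cpu++) {
		if (model_available_idle_cpu(&vcpus[cpu]))
			return cpu;
	}

	return prev;
}
```

With vCPU 0 idle but preempted and vCPU 1 idle and running on the
host, model_select_cpu() called with prev == 0 returns 1, which is the
migration described in the report. If the guest has no way of telling
how long vCPU 0 will stay off the CPU, moving is the safer guess.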

If you know you're not over-committed, perhaps you should configure your
VM differently.