Re: [PATCH v3 0/4] implement vcpu preempted check

From: Paolo Bonzini
Date: Fri Sep 30 2016 - 05:08:32 EST

On 30/09/2016 10:52, Pan Xinhui wrote:
>> x86 has no hypervisor support, and I'd like to understand the desired
>> semantics first, so I don't think it should block this series. In
> Once a guest does a hypercall or something similar, IOW, there is a
> kvm_guest_exit, we consider that a lock holder preemption.
> And PPC implements it this way.

Ok, good.

>> particular, there are at least the following choices:
>> 1) exit to userspace (5,000-10,000 clock cycles best case) counts as
>> lock holder preemption
>> 2) any time the vCPU thread not running counts as lock holder
>> preemption
>> To implement the latter you'd need a hypercall or MSR (at least as
>> a slow path), because the KVM preempt notifier is only active
>> during the KVM_RUN ioctl.
> seems a little expensive. :(
> How many clock cycles might it cost?

An MSR read is about 1500 clock cycles, but it need not be the fast path
(e.g. use a bit to check if the vCPU is running; if not, use the MSR to
check if the vCPU is in userspace but its thread is still scheduled).
But it's not necessary if you are just matching PPC semantics.

Then the simplest thing is to use the kvm_steal_time struct, and add a
new field to it that replaces pad[0]. You can write a 0 to the flag in
record_steal_time (not preempted) and a 1 in kvm_arch_vcpu_put
(preempted). record_steal_time is called before the VM starts running,
immediately after KVM_RUN and also after every sched_in.

If KVM doesn't implement the flag, it won't touch that field at all. So
the guest kernel can write a 0, meaning "not preempted", and not care if
the hypervisor implements the flag or not: the answer will always be safe.

The pointer to the flag can be placed in a per-cpu u32*, and again if
the u32* is NULL that means "not preempted".


> I am still looking for one shared struct between kvm and the guest
> kernel on x86, and every time kvm_guest_exit/enter is called, we store
> some info in it, so the guest kernel can quickly check whether a vcpu
> is running.
> thanks
> xinhui
>> Paolo