Re: [PATCH] Revert "KVM: x86: Unconditionally enable irqs in guest context"

From: Vitaly Kuznetsov
Date: Thu Jan 07 2021 - 04:35:06 EST


Sean Christopherson <seanjc@xxxxxxxxxx> writes:

> On Wed, Jan 06, 2021, Vitaly Kuznetsov wrote:
>>
>> Looking back, I don't quite understand why we wanted to account ticks
>> between vmexit and exiting guest context as 'guest' in the first place;
>> to my understanging 'guest time' is time spent within VMX non-root
>> operation, the rest is KVM overhead (system).
>
> With tick-based accounting, if the tick IRQ is received after PF_VCPU is cleared
> then that tick will be accounted to the host/system. The motivation for opening
> an IRQ window after VM-Exit is to handle the case where the guest is constantly
> exiting for a different reason _just_ before the tick arrives, e.g. if the guest
> has its tick configured such that the guest and host ticks get synchronized
> in a bad way.
>
> This is a non-issue when using CONFIG_VIRT_CPU_ACCOUNTING_GEN=y, at least with a
> stable TSC, as the accounting happens during guest_exit_irqoff() itself.
> Accounting might be less-than-stellar if TSC is unstable, but I don't think it
> would be as binary of a failure as tick-based accounting.
>

Oh, yea, I vaguely remember we had to deal with a very similar problem
but for userspace/kernel accounting. It was possible to observe e.g. a
userspace task going 100% kernel while in reality it was just perfectly
synchronized with the tick and doing a syscall just before it arrives
(or something like that, I may be misremembering the details).

So depending on the frequency, it is probably possible to e.g observe
'100% host' with tick based accounting, the guest just has to
synchronize exiting to KVM in a way that the tick will always arrive
past guest_exit_irqoff().

It seems to me this is a fundamental problem in case the frequency of
guest exits can match the frequency of the time accounting tick.

--
Vitaly