Re: [PATCH v10 03/39] KVM: x86: hyper-v: Introduce TLB flush fifo

From: Vitaly Kuznetsov
Date: Thu Sep 22 2022 - 05:42:48 EST


Sean Christopherson <seanjc@xxxxxxxxxx> writes:

> On Wed, Sep 21, 2022, Vitaly Kuznetsov wrote:
>> To allow flushing individual GVAs instead of always flushing the whole
>> VPID a per-vCPU structure to pass the requests is needed. Use standard
>> 'kfifo' to queue two types of entries: individual GVA (GFN + up to 4095
>> following GFNs in the lower 12 bits) and 'flush all'.
>>
>> The size of the fifo is arbitrary set to '16'.
>
> s/arbitrary/arbitrarily
>
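
(Side note for archive readers: the entry layout described in the changelog can be
sketched roughly as below. The helper name is made up purely for illustration and
does not exist in the patch.)

/*
 * Illustrative sketch only: each u64 fifo entry is either the dedicated
 * 'flush all' value or a GVA range, i.e. the first GFN in the upper bits
 * with the number of following GFNs (0..4095) packed into the low 12 bits.
 */
static inline u64 hv_tlb_flush_fifo_entry(u64 start_gfn, u32 following_gfns)
{
	return (start_gfn << PAGE_SHIFT) | (following_gfns & GENMASK_ULL(11, 0));
}
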
>> +static void hv_tlb_flush_enqueue(struct kvm_vcpu *vcpu)
>> +{
>> +	struct kvm_vcpu_hv_tlb_flush_fifo *tlb_flush_fifo;
>> +	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>> +	u64 flush_all_entry = KVM_HV_TLB_FLUSHALL_ENTRY;
>> +
>> +	if (!hv_vcpu)
>> +		return;
>> +
>> +	tlb_flush_fifo = &hv_vcpu->tlb_flush_fifo;
>> +
>> +	kfifo_in_spinlocked(&tlb_flush_fifo->entries, &flush_all_entry,
>> +			    1, &tlb_flush_fifo->write_lock);
>
> Unless I'm missing something, there's no need to disable IRQs, i.e. this can be
> kfifo_in_spinlocked_noirqsave() and the later patch can use spin_lock() instead
> of spin_lock_irqsave(). The only calls to hv_tlb_flush_enqueue() are from
> kvm_hv_hypercall(), i.e. it's always called from process context.
>

Yes, no IRQ/... contexts are expected; the intention was to hold the
spinlock for the shortest possible time, not to protect against a
deadlock. It's probably not worth it and only causes confusion, so I'll
remove it.
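
I.e. the enqueue would then become something like this (just a sketch, the
actual respin may differ slightly):

static void hv_tlb_flush_enqueue(struct kvm_vcpu *vcpu)
{
	struct kvm_vcpu_hv_tlb_flush_fifo *tlb_flush_fifo;
	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
	u64 flush_all_entry = KVM_HV_TLB_FLUSHALL_ENTRY;

	if (!hv_vcpu)
		return;

	tlb_flush_fifo = &hv_vcpu->tlb_flush_fifo;

	/*
	 * Only reachable from process context (kvm_hv_hypercall()), so the
	 * plain-spinlock variant is sufficient, no need to disable IRQs.
	 */
	kfifo_in_spinlocked_noirqsave(&tlb_flush_fifo->entries, &flush_all_entry,
				      1, &tlb_flush_fifo->write_lock);
}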

>> diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
>> index 1030b1b50552..ac30091ab346 100644
>> --- a/arch/x86/kvm/hyperv.h
>> +++ b/arch/x86/kvm/hyperv.h
>> @@ -151,4 +151,20 @@ int kvm_vm_ioctl_hv_eventfd(struct kvm *kvm, struct kvm_hyperv_eventfd *args);
>> int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>> 		     struct kvm_cpuid_entry2 __user *entries);
>>
>> +
>
> Unnecessary newline.
>
>> +static inline void kvm_hv_vcpu_empty_flush_tlb(struct kvm_vcpu *vcpu)
>
> What about "reset" or "purge" instead of "empty"? "empty" is often used as query,
> e.g. list_empty(), it took me a second to realize this is a command.
>

'purge' sounds good to me!

>> +{
>> +	struct kvm_vcpu_hv_tlb_flush_fifo *tlb_flush_fifo;
>> +	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>> +
>> +	if (!hv_vcpu || !kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu))
>> +		return;
>> +
>> +	tlb_flush_fifo = &hv_vcpu->tlb_flush_fifo;
>> +
>> +	kfifo_reset_out(&tlb_flush_fifo->entries);
>> +}
>
> Missing newline.
>
>> +void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu);
>> +
>> +
>
> One too many newlines.
>
>> #endif
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 86504a8bfd9a..45c35c5467f8 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -3385,7 +3385,7 @@ static void kvm_vcpu_flush_tlb_all(struct kvm_vcpu *vcpu)
>> 	static_call(kvm_x86_flush_tlb_all)(vcpu);
>> }
>>
>> -static void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu)
>> +void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu)
>> {
>> 	++vcpu->stat.tlb_flush;
>>
>> @@ -3420,14 +3420,14 @@ void kvm_service_local_tlb_flush_requests(struct kvm_vcpu *vcpu)
>> {
>> 	if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu)) {
>> 		kvm_vcpu_flush_tlb_current(vcpu);
>> -		kvm_clear_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
>> +		kvm_hv_vcpu_empty_flush_tlb(vcpu);
>
> It might be worth adding a comment to call out that emptying the FIFO _after_ the
> TLB flush is ok, because it's impossible for the CPU to insert TLB entries for the
> guest while running in the host. At first glance, it looks like this (and the
> existing similar pattern in vcpu_enter_guest()) has a race where it could miss a
> TLB flush.
>
> Definitely not required, e.g. kvm_vcpu_flush_tlb_all() doesn't have a similar
> comment. I think it's just the existence of the FIFO that made me pause.
>

Np, will add something for future generations of readers)
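
Maybe something along these lines, using the 'purge' name agreed above
(exact wording and placement to be polished in the respin):

	if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu)) {
		kvm_vcpu_flush_tlb_current(vcpu);
		/*
		 * Purging the fifo _after_ the flush cannot miss anything:
		 * the CPU can't insert TLB entries for the guest while the
		 * vCPU runs in the host, so any request queued between the
		 * flush and the purge targets translations that are already
		 * gone from the TLB.
		 */
		kvm_hv_vcpu_purge_flush_tlb(vcpu);
	}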

>> 	}
>>
>> 	if (kvm_check_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu)) {
>> 		kvm_vcpu_flush_tlb_guest(vcpu);
>> -		kvm_clear_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
>> +		kvm_hv_vcpu_empty_flush_tlb(vcpu);
>> 	} else if (kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu)) {
>> -		kvm_vcpu_flush_tlb_guest(vcpu);
>> +		kvm_hv_vcpu_flush_tlb(vcpu);
>
> Rather than expose kvm_vcpu_flush_tlb_guest() to Hyper-V, what about implementing
> this in a similar way to how way KVM-on-HyperV implements remote TLB flushes? I.e.
> fall back to kvm_vcpu_flush_tlb_guest() if the precise flush "fails".
>
> I don't mind exposing kvm_vcpu_flush_tlb_guest(), but burying the calls inside
> Hyper-V code makes it difficult to see the relationship between KVM_REQ_HV_TLB_FLUSH
> and KVM_REQ_TLB_FLUSH_GUEST.
>
> And as a minor bonus, that also helps document that kvm_hv_vcpu_flush_tlb() doesn't
> yet support precise flushing.
>
> E.g.
>
> 	if (kvm_check_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu)) {
> 		kvm_vcpu_flush_tlb_guest(vcpu);
> 	} else if (kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu)) {
> 		/*
> 		 * Fall back to a "full" guest flush if Hyper-V's precise
> 		 * flushing fails.
> 		 */
> 		if (kvm_hv_vcpu_flush_tlb(vcpu))
> 			kvm_vcpu_flush_tlb_guest(vcpu);
> 	}
>
>
> int kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
> {
> 	struct kvm_vcpu_hv_tlb_flush_fifo *tlb_flush_fifo;
> 	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>
> 	if (!hv_vcpu)
> 		return -EINVAL;
>
> 	tlb_flush_fifo = &hv_vcpu->tlb_flush_fifo;
>
> 	kfifo_reset_out(&tlb_flush_fifo->entries);
>
> 	/* Precise flushing isn't implemented yet. */
> 	return -EOPNOTSUPP;
> }
>

Oh, I see, it can certainly be done this way, even if only to improve
readability. Will change.

--
Vitaly