Re: [PATCH v3 09/21] KVM: X86: Don't track dirty for KVM_SET_[TSS_ADDR|IDENTITY_MAP_ADDR]

From: Peter Xu
Date: Fri Jan 31 2020 - 15:56:03 EST


On Fri, Jan 31, 2020 at 12:36:22PM -0800, Sean Christopherson wrote:
> On Fri, Jan 31, 2020 at 03:28:24PM -0500, Peter Xu wrote:
> > On Fri, Jan 31, 2020 at 11:33:01AM -0800, Sean Christopherson wrote:
> > > For the same reason we don't take mmap_sem, it gains us nothing, i.e. KVM
> > > still has to use copy_{to,from}_user().
> > >
> > > In the proposed __x86_set_memory_region() refactor, vmx_set_tss_addr()
> > > would be provided the hva of the memory region. Since slots_lock and SRCU
> > > only protect the gfn->hva translation, why would KVM take slots_lock when
> > > it already has the hva?
> >
> > OK, so you're suggesting releasing the lock earlier so that it doesn't
> > cover init_rmode_tss(), rather than dropping the lock entirely... Yes,
> > that looks good to me. I think that was my main point of confusion.
>
> Ya. And I missed where the -EEXIST was coming from. I think we're on the
> same page.

Good to know. Btw, I would still prefer to keep the unlock after the
__copy_to_user()s, because "the HVA is valid without the lock" is only
true for these private memslots. After all, this is a super slow path,
so I wouldn't mind holding the lock a bit longer. Otherwise, if you
really prefer the unlock() to be earlier, I can add a comment above
the unlock:

/*
* We can unlock before using the HVA only because this KVM private
* memory slot will never change until the end of VM lifecycle.
*/
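
In code, if we do move the unlock earlier, I mean roughly the below
inside vmx_set_tss_addr() (just a sketch against the proposed refactor,
assuming __x86_set_memory_region() returns the hva via *uaddr and
init_rmode_tss() is changed to take that hva; not the final code):

        unsigned long uaddr;
        int ret;

        mutex_lock(&kvm->slots_lock);
        ret = __x86_set_memory_region(kvm, TSS_PRIVATE_MEMSLOT, addr,
                                      PAGE_SIZE * 3, &uaddr);
        /*
         * We can unlock before using the HVA only because this KVM private
         * memory slot will never change until the end of VM lifecycle.
         */
        mutex_unlock(&kvm->slots_lock);
        if (ret)
                return ret;

        /* The __copy_to_user()s happen on the hva, outside the lock. */
        return init_rmode_tss(kvm, (void __user *)uaddr);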

>
> > > Returning -EEXIST is an ABI change, e.g. userspace can currently call
> > > KVM_SET_TSS_ADDR any number of times; it just needs to ensure proper
> > > serialization between calls.
> > >
> > > If you want to change the ABI, then submit a patch to do exactly that.
> > > But don't bury an ABI change under the pretense that it's a bug fix.
> >
> > Could you explain what you mean by "ABI change"?
> >
> > I was talking about the original code, not after applying the
> > patchset. To be explicit, I mean [a] below:
> >
> > int __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size,
> >                             unsigned long *uaddr)
> > {
> >         int i, r;
> >         unsigned long hva;
> >         struct kvm_memslots *slots = kvm_memslots(kvm);
> >         struct kvm_memory_slot *slot, old;
> >
> >         /* Called with kvm->slots_lock held. */
> >         if (WARN_ON(id >= KVM_MEM_SLOTS_NUM))
> >                 return -EINVAL;
> >
> >         slot = id_to_memslot(slots, id);
> >         if (size) {
> >                 if (slot->npages)
> >                         return -EEXIST; <------------------------ [a]
> >         }
> >         ...
> > }
>
> Doh, I completely forgot that the second __x86_set_memory_region() would
> fail. Sorry :-(
>
> > > > Yes, but as I mentioned, I don't think it's an issue to be considered
> > > > by KVM; otherwise we would have the same issue all over the place
> > > > whenever we fetch the cached userspace_addr from any user slot.
> > >
> > > Huh? Of course it's an issue that needs to be considered by KVM, e.g.
> > > kvm_{read,write}_guest_cached() aren't using __copy_{to,}from_user() for
> > > giggles.
> >
> > The cache is for the GPA->HVA translation (struct gfn_to_hva_cache);
> > we still use __copy_{to,}from_user() on the HVAs, no?
>
> I'm still lost on this one. I'm pretty sure I'm incorrectly interpreting:
>
> I don't think it's an issue to be considered by KVM; otherwise we would
> have the same issue all over the place whenever we fetch the cached
> userspace_addr from any user slot.
>
> What is the issue to which you are referring?

The issue I was referring to is "the HVA can be unmapped by userspace
without KVM noticing". I think we're actually on the same page here
too; my follow-up was purely a question about your statement above
that "kvm_{read,write}_guest_cached() aren't using
__copy_{to,}from_user()", because that goes against my understanding.
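
To show what I mean, this is the path I had in mind: the gfn_to_hva_cache
only caches the GPA->HVA translation, and the write itself still goes
through __copy_to_user() on that HVA. Abridged from my reading of
kvm_write_guest_offset_cached() in virt/kvm/kvm_main.c (error handling
and the uncached fallback trimmed, so please don't take it as the exact
code):

int kvm_write_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
                                  void *data, unsigned int offset,
                                  unsigned long len)
{
        struct kvm_memslots *slots = kvm_memslots(kvm);
        gpa_t gpa = ghc->gpa + offset;
        int r;

        /* Refresh the cached GPA->HVA translation if the memslots changed. */
        if (slots->generation != ghc->generation)
                __kvm_gfn_to_hva_cache_init(slots, ghc, ghc->gpa, ghc->len);

        if (kvm_is_error_hva(ghc->hva))
                return -EFAULT;

        /* The actual write is still a __copy_to_user() on the cached HVA. */
        r = __copy_to_user((void __user *)ghc->hva + offset, data, len);
        if (r)
                return -EFAULT;
        mark_page_dirty_in_slot(ghc->memslot, gpa >> PAGE_SHIFT);

        return 0;
}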

--
Peter Xu