Re: [PATCH v2 03/16] KVM: x86: Always use non-compat vcpu_runstate_info size for gfn=>pfn cache
From: Sean Christopherson
Date: Thu Oct 27 2022 - 10:45:42 EST
On Thu, Oct 27, 2022, Paolo Bonzini wrote:
> On 10/13/22 23:12, Sean Christopherson wrote:
> > Always use the size of Xen's non-compat vcpu_runstate_info struct when
> > checking that the GPA+size doesn't cross a page boundary. Conceptually,
> > using the current mode is more correct, but KVM isn't consistent with
> > itself as kvm_xen_vcpu_set_attr() unconditionally uses the "full" size
> > when activating the cache. More importantly, prior to the introduction
> > of the gfn_to_pfn_cache, KVM _always_ used the full size, i.e. allowing
> > the guest (userspace?) to use a poorly aligned GPA in 32-bit mode but not
> > 64-bit mode is more of a bug than a feature, and fixing the bug doesn't
> > break KVM's historical ABI.
>
> I'd rather not introduce additional restrictions in KVM,
But KVM already has this restriction. "struct vcpu_info" is always checked for
the non-compat size, and as above, "struct vcpu_runstate_info" is checked for the
non-compat size during its initialization.
> actually easy to avoid this patch by instead enforcing that attributes are
> set in a sensible order:
I don't care about fixing XEN support, I care about forcing "length" to be immutable
in order to simplify the gfn_to_pfn_cache implementation. Avoiding this patch
prevents doing that later in this series.
> - long mode cannot be changed after the shared info page is enabled (which
> makes sense because the shared info page also has a compat version)
How is this not introducing an additional restriction? This seems way more
onerous than what is effectively a revert.
> - the caches must be activated after the shared info page (which enforces
> that the vCPU attributes are set after the VM attributes)
>
> This is technically a userspace API break, but nobody is really using this
> API outside Amazon so... Patches coming after I finish testing.
It's not just userspace break, it affects the guest ABI as well. arch.xen.long_mode
isn't set just by userspace, it's also snapshot when the guest changes the hypercall
page. Maybe there's something in the XEN ABI that says the hypercall page can never
be changed, but barring that I don't see how to prevent ending up with a misaligned
cache due to the guest enabling long mode.
int kvm_xen_write_hypercall_page(struct kvm_vcpu *vcpu, u64 data)
{
struct kvm *kvm = vcpu->kvm;
u32 page_num = data & ~PAGE_MASK;
u64 page_addr = data & PAGE_MASK;
bool lm = is_long_mode(vcpu);
/* Latch long_mode for shared_info pages etc. */
vcpu->kvm->arch.xen.long_mode = lm;