Re: [PATCH 2/4] KVM: selftests: Setup ucall after loading program into guest memory

From: Sean Christopherson
Date: Thu Dec 08 2022 - 14:02:07 EST


On Thu, Dec 08, 2022, Ricardo Koller wrote:
> On Thu, Dec 08, 2022 at 12:37:23AM +0000, Oliver Upton wrote:
> > On Thu, Dec 08, 2022 at 12:24:20AM +0000, Sean Christopherson wrote:
> > > > Even still, that's just a kludge to make ucalls work. We have other
> > > > MMIO devices (GIC distributor, for example) that work by chance since
> > > > nothing conflicts with the constant GPAs we've selected in the tests.
> > > >
> > > > I'd rather we go down the route of having an address allocator for
> > > > both the VA and PA spaces to provide carveouts at runtime.
> > >
> > > Aren't those two separate issues? The PA, a.k.a. memslots space, can be solved
> > > by allocating a dedicated memslot, i.e. doesn't need a carveout. At worst, collisions
> > > will yield very explicit asserts, which IMO is better than whatever might go wrong
> > > with a carve out.
> >
> > Perhaps the use of the term 'carveout' wasn't right here.
> >
> > What I'm suggesting is we cannot rely on KVM memslots alone to act as an
> > allocator for the PA space. KVM can provide devices to the guest that
> > aren't represented as memslots. If we're trying to fix PA allocations
> > anyway, why not make it generic enough to suit the needs of things
> > beyond ucalls?
>
> One extra bit of information: in arm, IO is any access to an address (within
> bounds) not backed by a memslot. Not the same as x86 where MMIO are writes to
> read-only memslots. No idea what other arches do.

I don't think that's correct. Doesn't this code turn a write abort on a RO memslot
into an io_mem_abort()? Specifically, the "(write_fault && !writable)" check will
match, and assuming none of the edge cases in the if-statement fire, KVM will
send the write down io_mem_abort().

	gfn = fault_ipa >> PAGE_SHIFT;
	memslot = gfn_to_memslot(vcpu->kvm, gfn);
	hva = gfn_to_hva_memslot_prot(memslot, gfn, &writable);
	write_fault = kvm_is_write_fault(vcpu);
	if (kvm_is_error_hva(hva) || (write_fault && !writable)) {
		/*
		 * The guest has put either its instructions or its page-tables
		 * somewhere it shouldn't have. Userspace won't be able to do
		 * anything about this (there's no syndrome for a start), so
		 * re-inject the abort back into the guest.
		 */
		if (is_iabt) {
			ret = -ENOEXEC;
			goto out;
		}

		if (kvm_vcpu_abt_iss1tw(vcpu)) {
			kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
			ret = 1;
			goto out_unlock;
		}

		/*
		 * Check for a cache maintenance operation. Since we
		 * ended-up here, we know it is outside of any memory
		 * slot. But we can't find out if that is for a device,
		 * or if the guest is just being stupid. The only thing
		 * we know for sure is that this range cannot be cached.
		 *
		 * So let's assume that the guest is just being
		 * cautious, and skip the instruction.
		 */
		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) {
			kvm_incr_pc(vcpu);
			ret = 1;
			goto out_unlock;
		}

		/*
		 * The IPA is reported as [MAX:12], so we need to
		 * complement it with the bottom 12 bits from the
		 * faulting VA. This is always 12 bits, irrespective
		 * of the page size.
		 */
		fault_ipa |= kvm_vcpu_get_hfar(vcpu) & ((1 << 12) - 1);
		ret = io_mem_abort(vcpu, fault_ipa);
		goto out_unlock;
	}
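
To spell out the paths through that if-statement, here is a throwaway userspace
model of the branch. It is not kernel code: the enum, the struct and
classify_abort() are hypothetical names made up for illustration, and each bool
just stands in for the result of the corresponding helper in the snippet above.
What happens when the outer condition is false isn't shown in the snippet, so
the model only reports "branch not taken" for that case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome labels; none of these names exist in KVM. */
enum abort_outcome {
	OUTCOME_NOT_TAKEN,	/* outer condition false; path not shown in the snippet */
	OUTCOME_ENOEXEC,	/* instruction abort: ret = -ENOEXEC */
	OUTCOME_INJECT_DABT,	/* abort on a stage-1 page-table walk: re-inject */
	OUTCOME_SKIP_CMO,	/* cache maintenance outside any memslot: skip insn */
	OUTCOME_IO_MEM_ABORT,	/* everything else goes to io_mem_abort() */
};

/* Each field models the result of one helper call in the quoted code. */
struct abort_info {
	bool hva_is_error;	/* kvm_is_error_hva(hva) */
	bool write_fault;	/* kvm_is_write_fault(vcpu) */
	bool writable;		/* value filled in by gfn_to_hva_memslot_prot() */
	bool is_iabt;		/* is_iabt */
	bool s1ptw;		/* kvm_vcpu_abt_iss1tw(vcpu) */
	bool is_cm;		/* kvm_vcpu_dabt_is_cm(vcpu) */
};

/* Mirrors the order of the checks in the quoted if-statement. */
enum abort_outcome classify_abort(const struct abort_info *a)
{
	if (!(a->hva_is_error || (a->write_fault && !a->writable)))
		return OUTCOME_NOT_TAKEN;

	if (a->is_iabt)
		return OUTCOME_ENOEXEC;

	if (a->s1ptw)
		return OUTCOME_INJECT_DABT;

	if (a->hva_is_error && a->is_cm)
		return OUTCOME_SKIP_CMO;

	return OUTCOME_IO_MEM_ABORT;
}

void example_checks(void)
{
	/* Write to a RO memslot: the hva is valid, but !writable. */
	struct abort_info ro_write = { .write_fault = true, .writable = false };
	/* Access to an address with no memslot at all. */
	struct abort_info no_slot = { .hva_is_error = true };
	/* Read from a RO memslot: the outer condition doesn't match. */
	struct abort_info ro_read = { .write_fault = false, .writable = false };

	assert(classify_abort(&ro_write) == OUTCOME_IO_MEM_ABORT);
	assert(classify_abort(&no_slot) == OUTCOME_IO_MEM_ABORT);
	assert(classify_abort(&ro_read) == OUTCOME_NOT_TAKEN);
}
```

In this model, a write to a RO memslot and an access with no memslot both end
up at the io_mem_abort() outcome, which is the point above. Note that the cache
maintenance skip only applies to the no-memslot case, since it also requires
the error hva.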