Re: [PATCHv4] kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVM

From: Thomas Huth
Date: Tue Apr 21 2015 - 03:37:24 EST


Am Tue, 21 Apr 2015 16:51:21 +1000
schrieb David Gibson <david@xxxxxxxxxxxxxxxxxxxxx>:

> On Tue, Apr 21, 2015 at 08:37:02AM +0200, Thomas Huth wrote:
> > Am Tue, 21 Apr 2015 10:41:51 +1000
> > schrieb David Gibson <david@xxxxxxxxxxxxxxxxxxxxx>:
> >
> > > On POWER, storage caching is usually configured via the MMU - attributes
> > > such as cache-inhibited are stored in the TLB and the hashed page table.
> > >
> > > This makes correctly performing cache inhibited IO accesses awkward when
> > > the MMU is turned off (real mode). Some CPU models provide special
> > > registers to control the cache attributes of real mode load and stores but
> > > this is not at all consistent. This is a problem in particular for SLOF,
> > > the firmware used on KVM guests, which runs entirely in real mode, but
> > > which needs to do IO to load the kernel.
> > >
> > > To simplify this qemu implements two special hypercalls, H_LOGICAL_CI_LOAD
> > > and H_LOGICAL_CI_STORE which simulate a cache-inhibited load or store to
> > > a logical address (aka guest physical address). SLOF uses these for IO.
> > >
> > > However, because these are implemented within qemu, not the host kernel,
> > > these bypass any IO devices emulated within KVM itself. The simplest way
> > > to see this problem is to attempt to boot a KVM guest from a virtio-blk
> > > device with iothread / dataplane enabled. The iothread code relies on an
> > > in kernel implementation of the virtio queue notification, which is not
> > > triggered by the IO hcalls, and so the guest will stall in SLOF unable to
> > > load the guest OS.
> > >
> > > This patch addresses this by providing in-kernel implementations of the
> > > 2 hypercalls, which correctly scan the KVM IO bus. Any access to an
> > > address not handled by the KVM IO bus will cause a VM exit, hitting the
> > > qemu implementation as before.
> > >
> > > Note that a userspace change is also required, in order to enable these
> > > new hcall implementations with KVM_CAP_PPC_ENABLE_HCALL.
> > >
> > > Signed-off-by: David Gibson <david@xxxxxxxxxxxxxxxxxxxxx>
> > > ---
> > > arch/powerpc/include/asm/kvm_book3s.h | 3 ++
> > > arch/powerpc/kvm/book3s.c | 76 +++++++++++++++++++++++++++++++++++
> > > arch/powerpc/kvm/book3s_hv.c | 12 ++++++
> > > arch/powerpc/kvm/book3s_pr_papr.c | 28 +++++++++++++
> > > 4 files changed, 119 insertions(+)
> > ...
> > > diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
> > > index cfbcdc6..453a8a4 100644
> > > --- a/arch/powerpc/kvm/book3s.c
> > > +++ b/arch/powerpc/kvm/book3s.c
> > > @@ -821,6 +821,82 @@ void kvmppc_core_destroy_vm(struct kvm *kvm)
> > > #endif
> > > }
> > >
> > > +int kvmppc_h_logical_ci_load(struct kvm_vcpu *vcpu)
> > > +{
> > > + unsigned long size = kvmppc_get_gpr(vcpu, 4);
> > > + unsigned long addr = kvmppc_get_gpr(vcpu, 5);
> > > + u64 buf;
> > > + int ret;
> > > +
> > > + if (!is_power_of_2(size) || (size > sizeof(buf)))
> > > + return H_TOO_HARD;
> > > +
> > > + ret = kvm_io_bus_read(vcpu, KVM_MMIO_BUS, addr, size, &buf);
> > > + if (ret != 0)
> > > + return H_TOO_HARD;
> > > +
> > > + switch (size) {
> > > + case 1:
> > > + kvmppc_set_gpr(vcpu, 4, *(u8 *)&buf);
> > > + break;
> > > +
> >
> > Most of the code in book3s.c seems not to use a empty line after a
> > "break;", so may I suggest to remove these empty lines here, too, to
> > keep the coding style a little bit more consistent?
>
> I don't think it's worth respinning just for that.
>
> > > + case 2:
> > > + kvmppc_set_gpr(vcpu, 4, be16_to_cpu(*(__be16 *)&buf));
> > > + break;
> > > +
> > > + case 4:
> > > + kvmppc_set_gpr(vcpu, 4, be32_to_cpu(*(__be32 *)&buf));
> > > + break;
> > > +
> > > + case 8:
> > > + kvmppc_set_gpr(vcpu, 4, be64_to_cpu(*(__be64 *)&buf));
> > > + break;
> > > +
> > > + default:
> > > + BUG();
> >
> > If I got the code right, a malicious guest could easily trigger this
> > BUG() statement, couldn't it? ... so a BUG() is maybe not the right
> > thing to do here. Would it be appropriate to return an error value to
> > the guest instead?
>
> Actually no - the test at the top of the function for
> is_power_of_2(size) etc. catches this safely before we get here. The
> BUG() is just paranoia.

Ah, missed that, you're right, so the code should be fine!

Thomas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/