Re: [PATCH v5 02/10] KVM: arm64: Setup a framework for hypercall bitmap firmware registers
From: Raghavendra Rao Ananta
Date: Fri Apr 08 2022 - 13:35:02 EST
On Fri, Apr 8, 2022 at 9:59 AM Marc Zyngier <maz@xxxxxxxxxx> wrote:
>
> On Thu, 07 Apr 2022 18:24:14 +0100,
> Raghavendra Rao Ananta <rananta@xxxxxxxxxx> wrote:
> >
> > Hi Marc,
> >
> > > > +#define KVM_REG_ARM_STD_BIT_TRNG_V1_0 BIT(0)
> > >
> > > I'm really in two minds about this. Having one bit per service is easy
> > > from an implementation perspective, but is also means that this
> > > disallow fine grained control over which hypercalls are actually
> > > available. If tomorrow TRNG 1.1 adds a new hypercall and that KVM
> > > implements both, how does the selection mechanism works? You will
> > > need a version selector (a la PSCI), which defeats this API somehow
> > > (and renders the name of the #define invalid).
> > >
> > > I wonder if a more correct way to look at this is to enumerate the
> > > hypercalls themselves (all 5 of them), though coming up with an
> > > encoding is tricky (RNG32 and RNG64 would clash, for example).
> > >
> > > Thoughts?
> > >
> > I was on the fence about this too. The TRNG spec (ARM DEN 0098,
> > Table-4) mentions that v1.0 should have VERSION, FEATURES, GET_UUID,
> > and RND as mandatory features. Hence, if KVM advertised that it
> > supports TRNG v1.0, I thought it would be best to expose all or
> > nothing of v1.0 by guarding them with a single bit.
> > Broadly, the idea is to have a bit per version. If v1.1 comes along,
> > we can have another bit for that. If it's not too ugly to implement,
> > we can be a little more aggressive and ensure that userspace doesn't
> > enable v1.1 without enabling v1.0.
>
> OK, that'd be assuming that we'll never see a service where version A
> is incompatible with version B and that we have to exclude one or the
> other. Meh. Let's cross that bridge once it is actually built.
>
> [...]
>
> > > > + mutex_lock(&kvm->lock);
> > > > +
> > > > + /*
> > > > + * If the VM (any vCPU) has already started running, return success
> > > > + * if there's no change in the value. Else, return -EBUSY.
> > >
> > > No, this should *always* fail if a vcpu has started. Otherwise, you
> > > start allowing hard to spot races.
> > >
> > The idea came from the fact that userspace could spawn multiple
> > threads to configure the vCPU registers. Since we don't have the
> > VM-scoped registers yet, it may be possible that userspace has issued
> > a KVM_RUN on one of the vCPU, while the others are lagging behind and
> > still configuring the registers. The slower threads may see -EBUSY and
> > could panic. But if you feel that it's an overkill and the userspace
> > should deal with it, we can return EBUSY for all writes after KVM_RUN.
>
> I'd rather have that. There already is stuff that rely on things not
> changing once a vcpu has run, so I'd rather be consistent.
>
Sure, I'll return EBUSY if the VM has started regardless of the incoming value.
> >
> > > > + */
> > > > + if (test_bit(KVM_ARCH_FLAG_HAS_RAN_ONCE, &kvm->arch.flags)) {
> > > > + ret = *fw_reg_bmap != val ? -EBUSY : 0;
> > > > + goto out;
> > > > + }
> > > > +
> > > > + WRITE_ONCE(*fw_reg_bmap, val);
> > >
> > > I'm not sure what this WRITE_ONCE guards against. Do you expect
> > > concurrent reads at this stage?
> > >
> > Again, the assumption here is that userspace could have multiple
> > threads reading and writing to these registers. Without the VM scoped
> > registers in place, we may end up with a read/write to the same memory
> > location for all the vCPUs.
>
> We only have one vcpu updating this at any given time (that's what the
> lock ensures). A simple write should be OK, as far as I can tell.
>
I agree that a write against another write should be fine without the
WRITE_ONCE. But my little concern was this write against a read
(unsure how userspace accesses these registers). I'm guessing it
shouldn't hurt to keep them in place, no? :)
Regards,
Raghavendra
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.