Re: [PATCH v3 2/3] KVM: x86: On emulation failure, convey the exit reason, etc. to userspace
From: Sean Christopherson
Date: Fri Aug 06 2021 - 20:59:38 EST
On Mon, Aug 02, 2021, David Edmondson wrote:
> On Monday, 2021-08-02 at 16:58:03 GMT, Sean Christopherson wrote:
>
> > On Mon, Aug 02, 2021, David Edmondson wrote:
> >> On Friday, 2021-07-30 at 22:14:48 GMT, Sean Christopherson wrote:
> >> When we add another flag (presuming that we do, because if not there was
> >> not much point in the flags) this will have to be restructured again. Is
> >> there an objection to the original style? (prime ndata=1, flags=0, OR in
> >> flags and adjust ndata as we go.)
> >
> > No objection, though if you OR in flags then you should truly _adjust_ ndata, not
> > set it, e.g.
>
> My understanding of Aaron's intent is that this would not be the case.
>
> That is, if we add another flag with payload and set that flag, we would
> still have space for the instruction stream in data[] even if
> KVM_INTERNAL_ERROR_EMULATION_FLAG_INSTRUCTION_BYTES is not set.
Hmm, I don't think we have to make that decision yet. Userspace must check the
flag before consuming run->emulation_failure, so we haven't fully commited one
way or the other.
I believe the original thought was indeed to skip unused fields, but I don't think
that's actually the behavior we want for completely unrelated fields, i.e. flag
combinations that will _never_ be valid together. The only reason to skip fields
would be to keep the offset of a particular field constant, so I think the rule
can be that if fields that can coexist but are controlled by different flags, they
must be in the same anonymous struct, but in general a union is ok.
It seems rather unlikely that we'll gain many more flags, but it would be really
unfortunate if we commit to skipping fields and then run out of space because of
that decision.
> Given that, we must *set* ndata each time we add in a flag
Technically we wouldn't have to (being more than a bit pedantic), we could keep
the same flow and just do the ndata_start bump outside of the OR path, e.g.
/* Always include the flags as a 'data' entry. */
ndata_start = 1;
run->emulation_failure.flags = 0;
/* Skip unused fields instead of overloading them when they're not used. */
ndata_start += 2;
if (insn_size) {
run->emulation_failure.flags |=
KVM_INTERNAL_ERROR_EMULATION_FLAG_INSTRUCTION_BYTES;
run->emulation_failure.insn_size = insn_size;
memset(run->emulation_failure.insn_bytes, 0x90,
sizeof(run->emulation_failure.insn_bytes));
memcpy(run->emulation_failure.insn_bytes, insn_bytes, insn_size);
}
so that it's easier to understand the magic numbers used to adjust ndata_start.
But a better solution would be to have no magic numbers at all and set ndata_start
via sizeof(run->emulation_failure). E.g.
/*
* There's currently space for 13 entries, but 5 are used for the exit
* reason and info. Restrict to 4 to reduce the maintenance burden
* when expanding kvm_run.emulation_failure in the future.
*/
if (WARN_ON_ONCE(ndata > 4))
ndata = 4;
ndata_start = sizeof(run->emulation_failure);
memcpy(&run->internal.data[], info, ARRAY_SIZE(info));
memcpy(&run->internal.data[ndata_start + ARRAY_SIZE(info)], data, ndata);
run->internal.ndata = ndata_start + ARRAY_SIZE(info) + ndata;
Though I'd prefer we not skip fields, at least not yet, e.g. to condition userspace
to do the right thing if we decide to not skip when adding a second flag (if that
even ever happens).
> with the value being the extent of data[] used by the payload corresponding
> to that flag, and the flags must be considered in ascending order (or we
> remember a "max" along the way).
>
> Dumping the arbitray debug data after the defined fields would require
> adjusting ndata, of course.
>
> If this is not the case, and the flag indicated payloads are packed at
> the head of data[], then the current structure definition is misleading
> and we should perhaps revise it.
Ah, because there's no wrapping union. I wouldn't object to something like this
to hint to userpace that the struct layout may not be purely additive in the
future.
diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
index d9e4aabcb31a..6c79c1ce3703 100644
--- a/tools/include/uapi/linux/kvm.h
+++ b/tools/include/uapi/linux/kvm.h
@@ -402,8 +402,12 @@ struct kvm_run {
__u32 suberror;
__u32 ndata;
__u64 flags;
- __u8 insn_size;
- __u8 insn_bytes[15];
+ union {
+ struct {
+ __u8 insn_size;
+ __u8 insn_bytes[15];
+ };
+ };
} emulation_failure;
/* KVM_EXIT_OSI */
struct {