Re: [PATCH] KVM: x86: Deflect unknown MSR accesses to user space

From: Alexander Graf
Date: Wed Jul 29 2020 - 16:29:10 EST




On 29.07.20 20:27, Jim Mattson wrote:



On Wed, Jul 29, 2020 at 2:06 AM Alexander Graf <graf@xxxxxxxxxx> wrote:



On 28.07.20 19:13, Jim Mattson wrote:

This sounds similar to Peter Hornyack's RFC from 5 years ago:
https://www.mail-archive.com/kvm@xxxxxxxxxxxxxxx/msg124448.html.

Yeah, looks very similar. Do you know the history why it never got
merged? I couldn't spot a non-RFC version of this on the ML.

I believe Peter got frustrated with all of the pushback he was
getting, and he moved on to other things. While Google still uses that
code, Aaron's new approach should give us equivalent functionality
without having to comment out the MSRs that kvm previously didn't know
about, and which we still want redirected to userspace.

It seems unlikely that userspace is going to know what to do with a
large number of MSRs. I suspect that a small enumerated list will
suffice. In fact, +Aaron Lewis is working on upstreaming a local
Google patch set that does just that.

I tend to disagree with that sentiment. One of the motivations behind this
patch is to propagate invalid MSR accesses to user space, so that logic
like "ignore_msrs"[1] can move into user space. This is not very useful for
the cloud use case, but it does come in handy when you want to run VMs that
can handle unimplemented MSRs in parallel to ones that cannot.

So whatever we implement, I would ideally want a mechanism at the end of
the day that allows me to "trap the rest" into user space.

I do think "the rest" should be explicitly specified, so that
userspace doesn't get surprises when kvm evolves. Maybe this can be
done using the allow-list you refer to later, along with a specified
action for disallowed MSRs: (1) raise #GP, (2) ignore, or (3) exit to
userspace. This actually seems orthogonal to what Aaron is working on,
which is to request that specific MSR accesses exit to userspace. But,
at least the plumbing for {RD,WR}MSR completion when coming back from
userspace can be leveraged by both.
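
To make the suggestion above concrete, here is a minimal sketch in plain C of an allow list combined with an explicitly specified action for disallowed MSRs. It is not KVM code and does not describe any existing KVM ABI: every type and function name (msr_policy, msr_dispatch, and so on) is hypothetical, and the MSR indices used with it are only example values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of this is existing KVM ABI. */

/* What user space asks for when an MSR is NOT on the allow list. */
enum msr_action {
    MSR_ACTION_GP,        /* (1) raise #GP in the guest */
    MSR_ACTION_IGNORE,    /* (2) ignore the access */
    MSR_ACTION_EXIT_USER, /* (3) exit to user space */
};

/* What the dispatcher decides for a single RDMSR/WRMSR. */
enum msr_result {
    MSR_HANDLED_IN_KERNEL,
    MSR_INJECT_GP,
    MSR_IGNORED,
    MSR_EXIT_TO_USER,
};

struct msr_policy {
    const uint32_t *allowed;            /* MSR indices handled in the kernel */
    size_t nr_allowed;
    enum msr_action disallowed_action;  /* explicit choice for "the rest" */
};

static bool msr_is_allowed(const struct msr_policy *p, uint32_t msr)
{
    for (size_t i = 0; i < p->nr_allowed; i++)
        if (p->allowed[i] == msr)
            return true;
    return false;
}

/* Decide how one MSR access is routed under the given policy. */
static enum msr_result msr_dispatch(const struct msr_policy *p, uint32_t msr)
{
    assert(p != NULL);

    if (msr_is_allowed(p, msr))
        return MSR_HANDLED_IN_KERNEL;

    switch (p->disallowed_action) {
    case MSR_ACTION_IGNORE:
        return MSR_IGNORED;
    case MSR_ACTION_EXIT_USER:
        return MSR_EXIT_TO_USER;
    case MSR_ACTION_GP:
    default:
        return MSR_INJECT_GP;
    }
}
```

Because "the rest" is defined by the list that user space supplied, MSRs that kvm learns about later would not silently change which accesses reach user space.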


Thinking about this for a while, I am quite confident that we don't need to complicate this all that much. The #GP path is never performance critical and thus can easily be handled in user space. There are a few niche cases where exiting to user space is "too complicated" (think nVMX MSR restore path). But they are niche, and just bailing out of the user space exit path for them is fine.

So I think one patch that lets us allow-list the MSRs that should be handled in KVM, plus another patch that lets us deflect any MSR-induced #GP into user space, is all it takes to make this a flexible and stable ABI.

The great thing is that by untangling the two bits, we can also support the "user space wants to leave it all to KVM, but be able to implement ignore_msrs itself" use case easily. User space would just not set an allow list.
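
The two independent pieces can be modelled with a minimal sketch in plain C. This is only an illustration under stated assumptions, not the actual patches: all names (vm_msr_config, handle_msr, toy_kvm_knows) are hypothetical, and it assumes that an access to an MSR outside the allow list fails the same way an access to an MSR unknown to KVM does, that is, as a would-be #GP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model; none of these names exist in KVM. */
struct vm_msr_config {
    const uint32_t *allow_list; /* NULL: no allow list set, KVM sees every MSR */
    size_t nr_allow;
    bool deflect_gp;            /* send would-be #GPs to user space instead */
};

enum msr_outcome { OUT_KVM_HANDLED, OUT_GUEST_GP, OUT_USER_EXIT };

/* Stand-in for KVM's in-kernel handling: true if KVM implements the MSR. */
typedef bool (*kvm_knows_msr_fn)(uint32_t msr);

/* Toy stand-in that pretends KVM implements exactly one MSR (index 0x10). */
static bool toy_kvm_knows(uint32_t msr)
{
    return msr == 0x10;
}

static bool in_allow_list(const struct vm_msr_config *c, uint32_t msr)
{
    if (!c->allow_list)
        return true; /* no list set: nothing is filtered */
    for (size_t i = 0; i < c->nr_allow; i++)
        if (c->allow_list[i] == msr)
            return true;
    return false;
}

static enum msr_outcome handle_msr(const struct vm_msr_config *c,
                                   kvm_knows_msr_fn kvm_knows, uint32_t msr)
{
    assert(c != NULL && kvm_knows != NULL);

    /* Piece 1: the allow list decides what KVM may handle at all. */
    bool would_gp = !in_allow_list(c, msr) || !kvm_knows(msr);
    if (!would_gp)
        return OUT_KVM_HANDLED;

    /* Piece 2: any would-be #GP is optionally deflected to user space. */
    return c->deflect_gp ? OUT_USER_EXIT : OUT_GUEST_GP;
}
```

With allow_list left NULL and deflect_gp set, this model covers the "leave it all to KVM, but implement ignore_msrs in user space" case: only MSRs that KVM does not know reach user space.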

Meanwhile, I have cleaned up Karim's old patch that adds allow-listing to KVM and will post it if Aaron doesn't beat me to it :).


Alex


