Re: [PATCH v1 2/3] x86/msr: Switch between WRMSRNS and WRMSR with the alternatives mechanism

From: Andrew Cooper
Date: Fri Aug 16 2024 - 18:28:08 EST


On 16/08/2024 10:26 pm, H. Peter Anvin wrote:
> On 8/16/24 11:40, Andrew Cooper wrote:
>>>
>>> As the CALL instruction is 5-byte long, and we need to pad nop for both
>>> WRMSR and WRMSRNS, what about not using segment prefix at all?
>>
>
> You can use up to 4 prefixes of any kind (which includes opcode
> prefixes before 0F) before most decoders start hurting, so we can pad
> it out to 5 bytes by doing 3f 3f .. .. ..
>
>>
>> My suggestion, not that I've had time to experiment, was to change
>> paravirt to use a non-C ABI and have asm_xen_write_msr() recombine
>> edx:eax into rsi.  That way the top level wrmsr() retains sensible
>> codegen for native even when paravirt is active.
>>
>
> I have attached what should be an "obvious" example... famous last words.

Ah, now I see what you mean about Xen's #GP semantics.

That's a neat way of doing it.  It means the faulting path will really
take 2 faults on Xen, but it's a faulting path anyway so speed is
already out of the window.

Would you object to teaching the #UD handler to deal with WRMSR like that?

I ask because I can't think of anything nicer.

There are plenty of 3-byte instructions which #GP in PV guests (CPL3),
and LTR is my go-to for debugging purposes, as it's not emulated by Xen.

Anything here (and it can't be an actual WRMSR) will be slightly
confusing to read in an OOPS, especially #UD for what is logically a #GP.

But, a clear UD of some form in the disassembly is probably better than
a random other instruction unrelated to the operation.

~Andrew