Re: [PATCH v1 2/3] x86/msr: Switch between WRMSRNS and WRMSR with the alternatives mechanism

From: Andrew Cooper
Date: Fri Aug 16 2024 - 18:35:06 EST


On 16/08/2024 11:27 pm, Andrew Cooper wrote:
> On 16/08/2024 10:26 pm, H. Peter Anvin wrote:
>> On 8/16/24 11:40, Andrew Cooper wrote:
>>>> As the CALL instruction is 5 bytes long, and we need to pad both
>>>> WRMSR and WRMSRNS with NOPs, what about not using a segment prefix at all?
>> You can use up to 4 prefixes of any kind (which includes opcode
>> prefixes before 0F) before most decoders start hurting, so we can pad
>> it out to 5 bytes by doing 3f 3f .. .. ..
>>
>>> My suggestion, not that I've had time to experiment, was to change
>>> paravirt to use a non-C ABI and have asm_xen_write_msr() recombine
>>> edx:eax into rsi.  That way the top level wrmsr() retains sensible
>>> codegen for native even when paravirt is active.
>>>
>> I have attached what should be an "obvious" example... famous last words.
> Ah, now I see what you mean about Xen's #GP semantics.
>
> That's a neat way of doing it.  It means the faulting path will really
> take 2 faults on Xen, but it's a faulting path anyway so speed is
> already out of the window.
>
> Do you mind about teaching the #UD handler to deal with WRMSR like that?
>
> I ask, because I can't think of anything nicer.
>
> There are plenty of 3-byte instructions which #GP in PV guests (CPL3),
> and LTR is my go-to for debugging purposes, as it's not emulated by Xen.
>
> Anything here (and it can't be an actual WRMSR) will be slightly
> confusing to read in an OOPS, especially #UD for what is logically a #GP.
>
> But, a clear UD of some form in the disassembly is probably better than
> a random other instruction unrelated to the operation.
>
> ~Andrew

Oh, P.S.

We can probably drop most of the register manipulation by marking the new
xen_do_write_msr() with the no_caller_saved_registers attribute.  As we're
intentionally not using the C ABI to start with, we might as well not spill
registers we don't use either.
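
For illustration only, an untested userspace sketch of the idea.  All the
demo_* names are made up for this example and are not the actual patch;
demo_do_write_msr() merely stands in for xen_do_write_msr(), and the asm
stub is modelled in plain C.  It assumes GCC or Clang on x86-64; GCC
rejects the attribute while SSE/x87 are enabled, hence the extra
general-regs-only target attribute (the kernel builds that way anyway).

```c
#include <stdint.h>

/* Only x86-64 GCC/Clang know this attribute; elsewhere the macro is empty. */
#if defined(__x86_64__) && defined(__GNUC__)
/*
 * GCC refuses the attribute while SSE/x87 are enabled, so the function
 * is also restricted to general-purpose registers.
 */
#define DEMO_NCSR \
	__attribute__((no_caller_saved_registers, target("general-regs-only")))
#else
#define DEMO_NCSR
#endif

/* Hypothetical stand-ins recording what would go to the hypervisor. */
uint32_t demo_last_msr;
uint64_t demo_last_val;

/*
 * Stand-in for xen_do_write_msr(): with the attribute, the compiler
 * saves and restores every register this function modifies, so an asm
 * stub calling it need not spill anything around the call.
 */
DEMO_NCSR void demo_do_write_msr(uint32_t msr, uint64_t val)
{
	demo_last_msr = msr;
	demo_last_val = val;
}

/* Recombine the edx:eax pair used by WRMSR into one 64-bit value. */
uint64_t demo_combine(uint32_t eax, uint32_t edx)
{
	return ((uint64_t)edx << 32) | eax;
}

/* C model of the stub: recombine, then call the register-preserving body. */
void demo_write_msr(uint32_t msr, uint32_t eax, uint32_t edx)
{
	demo_do_write_msr(msr, demo_combine(eax, edx));
}
```

In the real thing the caller would be the asm stub rather than C, which is
where the saving comes from: the stub only has to build rsi from edx:eax
and CALL, with no push/pop around it.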

~Andrew