Re: [GIT PULL] x86/cpu for v5.17

From: Sean Christopherson
Date: Tue Jan 11 2022 - 17:48:52 EST


On Mon, Jan 10, 2022, Dave Hansen wrote:
> On 1/10/22 10:35, Borislav Petkov wrote:
> > >
> > > Not a big deal, I just thought I'd mention it since I reacted to it.
> > > And we don't seem to have those vendor checks in any of the other code
> > > that uses MSR_CSTAR (just grepping for that and seeing it mentioned in
> > > kvm code etc)
> > Right, the only point for doing the vendor check I see here is, well,
> > because it is Intel who doesn't have CSTAR, let's check for Intel. But
> > yeah, we do avoid the vendor checks if it can be helped.
> >
> > We can do a synthetic X86_FEATURE flag but that would be a waste. So the
> > _safe thing and keep the comment sounds optimal to me.
> >
> > I can whip up a patch ontop if people agree.
>
> There are four basic options here for TDX:
>
> 1. Paper over the #VE in the #VE handler itself
> 2. Do a check for TDX at the wrmsr(MSR_CSTAR) site
> 3. Do a check for X86_VENDOR_INTEL at the wrmsr(MSR_CSTAR) site
> 4. Use wrmsr*_safe() and rely on #VE -> fixup_exception()
>
> TDX originally did #1, passed over #2 and settled on #3 because of a
> comment:
>
> It's an obvious optimization (avoiding the WRMSR with a
> conditional) without TDX because the write is pointless
> independent of TDX." [1]
>
> I think doing wrmsr*_safe() is OK. But, on TDX systems, it will end up
> taking a weird route:
>
> WRMSR -> #VE -> hypercall -> ve_raise_fault() -> fixup_exception()
>
> instead of the "normal" _safe route which goes:
>
> WRMSR -> #GP -> ... -> fixup_exception()
>
> So, we should probably make sure wrmsr*_safe() is fine on TDX before we
> subject ourselves to any additional churn. Kirill, can you test that out?

wrmsr*_safe() isn't, uh, safe. The issue is that the #VE -> hypercall will expose
information about the guest kernel's layout (address of a kernel symbol) to the
untrusted hypervisor, and in theory could be used to attack the guest. Whether
or not the kernel's layout is all that valuable of a secret is debatable, but it's
hard to argue that the kernel should knowingly expose such information to the host.

Most VMMs probably won't even signal failure on the hypercall, i.e. the kernel will
never do fixup_exception(), the real motivation is to avoid the hypercall.

The argument against papering over the fault in the #VE handler is that there's
exactly one wrmsr(MSR_CSTAR) in the entire kernel, i.e. it's a wash from a code
perspective, and avoiding the wrmsr entirely will allow the #VE handler to WARN
if something else in the kernel ends up writing MSR_CSTAR.