Re: [PATCH 2/2] KVM: x86: Don't allow tsc_offset, tsc_scaling_ratio to change
From: Isaku Yamahata
Date: Thu Nov 21 2024 - 18:50:38 EST
On Mon, Oct 14, 2024 at 03:48:03PM +0000,
"Edgecombe, Rick P" <rick.p.edgecombe@xxxxxxxxx> wrote:
> On Sat, 2024-10-12 at 00:55 -0700, Isaku Yamahata wrote:
> > Problem
> > The current x86 KVM implementation conflicts with protected TSC because the
> > VMM can't change the TSC offset/multiplier. Disable or ignore the KVM
> > logic to change/adjust the TSC offset/multiplier somehow.
> >
> > Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
> > offset/multiplier, the TSC timer interrupt is injected into the guest at the
> > wrong time if the KVM TSC offset is different from what the TDX module
> > determined.
> >
> > Originally this issue was found with cyclictest from rt-tests [1], as the
> > latency in the TDX case is worse than the VMX value plus the TDX SEAMCALL
> > overhead. It turned out that the KVM TSC offset is different from what the
> > TDX module determines.
> >
> > Solution
> > The solution is to keep the KVM TSC offset/multiplier the same as the value
> > of the TDX module somehow. Possible solutions are as follows.
> > - Skip the logic
> > Ignore (or don't call related functions) the request to change the TSC
> > offset/multiplier.
> > Pros
> > - Logically clean. This is similar to the guest_protected case.
> > Cons
> > - Needs to identify the call sites.
> >
> > - Revert the change at the hooks after TSC adjustment
> > x86 KVM defines the vendor hooks when TSC offset/multiplier are
> > changed. The callback can revert the change.
> > Pros
> > - We don't need to care about the logic to change the TSC
> > offset/multiplier.
> > Cons
> > - Hacky to revert the KVM x86 common code logic.
> >
> > Choose the first one. With this patch series, SEV-SNP secure TSC can be
> > supported.
> >
> > [1] https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git
> >
> > Reported-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
>
> IIUC this problem was reported by Marcelo and he tested these patches and found
> that they did *not* resolve his issue? But offline you mentioned that you
> reproduced a similar seeming bug on your end that *was* resolved by these
> patches.
That's right. The first experimental patch didn't, but this patch does.
(At least I believe so. Marcelo, please jump in if I'm wrong.)
> If I got that right, I would think we should figure out Marcelo's
> problem before fixing this upstream. If it only affects out-of-tree TDX code we
> can take more time and not thrash the code as it gets untangled further.
Ok. This patch affects TDX code (and potentially SEV-SNP secure TSC host code.)
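
For reference, here is a rough sketch of why a mismatched offset moves the
timer interrupt, as described in the quoted problem statement. It is
illustrative only and is not the KVM code: the helper names are made up, the
48 fractional bits are an assumption matching the VMX TSC multiplier format,
and it assumes KVM and the TDX module agree on the multiplier and differ only
in the offset. It also relies on the GCC/clang unsigned __int128 extension.

```c
#include <stdint.h>

/* Assumption: 48 fractional bits, as in the VMX TSC multiplier format. */
#define SKETCH_FRAC_BITS 48

/*
 * Hypothetical helper, not a KVM function:
 *   guest_tsc = ((host_tsc * ratio) >> frac_bits) + offset
 * The offset is stored as u64 and wraps, so a "negative" offset is
 * simply a large unsigned value.
 */
static inline uint64_t sketch_guest_tsc(uint64_t host_tsc, uint64_t ratio,
					uint64_t offset)
{
	unsigned __int128 scaled = (unsigned __int128)host_tsc * ratio;

	return (uint64_t)(scaled >> SKETCH_FRAC_BITS) + offset;
}

/*
 * Hypothetical helper: error, in guest TSC ticks, of a deadline timer
 * that was armed using kvm_offset while the guest really runs with
 * tdx_offset. A positive result means the interrupt arrives late from
 * the guest's point of view; a negative result means it arrives early.
 */
static inline int64_t sketch_deadline_error(uint64_t kvm_offset,
					    uint64_t tdx_offset)
{
	return (int64_t)(tdx_offset - kvm_offset);
}
```

With a ratio of 1.0 (1ULL << 48), a guest deadline D is converted to the host
TSC value D - kvm_offset, but at that host time the guest actually reads
D + (tdx_offset - kvm_offset). So any difference between the two offsets shows
up directly as timer latency.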
--
Isaku Yamahata <isaku.yamahata@xxxxxxxxx>