Re: [PATCH V2 2/2] x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE is present

From: Huang, Kai
Date: Mon Jul 07 2025 - 07:40:58 EST


On Mon, 2025-07-07 at 15:15 +0800, Chao Gao wrote:
> On Sun, Jul 06, 2025 at 09:23:05PM -0700, Dave Hansen wrote:
> > On 7/6/25 20:16, Chao Gao wrote:
> > > Even on a CPU w/ SEAM_NR and w/o X86_BUG_TDX_PW_MCE, is there still a risk of
> > > poisoned memory being returned to the host kernel? Since only poison
> > > consumption causes #MCE, if a poisoned page is never consumed in SEAM non-root
> > > mode, there will be no #MCE, and the mentioned commit won't mark the page as
> > > poisoned.
> > >
> > > A reclaimed poisoned page could be reused and potentially cause a kernel panic.
> > > While WBINVD could help, we believe it's not worth it as it will slow down the
> > > vast majority of cases. Is my understanding correct?
> >
> > How is this any different from any other kind of hardware poison?
>
> I wasn't arguing that MOVDIR64B should be kept. I was highlighting the risk of
> kernel panic on CPUs even without the partial write bug and guessing why it was
> not worth fixing.
>
> Regarding your question, the poison likely occurs due to software bugs rather
> than hardware issues. And, as stated in the comment removed in patch 1, unlike
> other hardware poison, this poison can be cleared using MOVDIR64B.
>
> >
> > Why should this specific kind of freeing (TDX private memory being freed
> > back to the host) operation be different from any other kind of free?
>
> To limit the impact of software bugs (e.g., TDX module bugs) to TDX guests
> rather than affecting the entire kernel. Debugging a TDX module bug that
> results in a #MCE in a random host context can be quite frustrating, right?
> But, on the other hand, MOVDIR64B incurs a 40% slowdown when shutting down a
> TD. So, it's a tradeoff between containing theoretical software bugs and
> experiencing a 40% slowdown.
>
> Personally, I also prefer to remove MOVDIR64B, but I also want to point out the
> bug triage issue and the risk of kernel panic after removing MOVDIR64B.

If we are only talking about the poison due to TD-mismatch or integrity
failure, per the TDX spec the CPU only marks the memory as poisoned when it
actually reads and consumes the bad data in SEAM non-root mode, in which
case there will be a subsequent #MCE from SEAM non-root mode.

A TDX module bug which causes the module itself to accidentally write TDX
private memory using a different KeyID won't mark the memory as poisoned.  A
further read (due to a bug) from the host kernel using KeyID 0 won't poison
the memory either.

A TDX module bug which causes the module itself to accidentally read TDX
private memory using a different KeyID poisons that memory and causes #MCE
immediately in SEAM, but this is fatal to the system, so no poisoned memory
will be returned to the kernel.

In other words, I think it shouldn't be possible for a page to be poisoned,
never consumed in SEAM non-root mode, and yet later returned to the host
kernel.
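
To make the case analysis above concrete, here is a toy C model of the
three cases.  It is only a sketch under stated assumptions: every name in it
(struct page_model, model_write(), model_read(), latent_poison()) is made up
for illustration and is not kernel or TDX module code, and it assumes a CPU
without X86_BUG_TDX_PW_MCE.

```c
#include <stdbool.h>

/* Toy model only; every name here is hypothetical. */
enum accessor {
	HOST_KEYID0,	/* host kernel access using KeyID 0 */
	SEAM_ROOT,	/* the TDX module itself */
	SEAM_NON_ROOT,	/* a TD guest */
};

struct page_model {
	int written_keyid;	/* KeyID used by the most recent write */
	bool poisoned;
};

struct outcome {
	bool mce;	/* this access raised a #MCE */
	bool fatal;	/* the #MCE is fatal to the whole system */
};

/* A write never poisons, even when it uses the wrong KeyID. */
static void model_write(struct page_model *p, int keyid)
{
	p->written_keyid = keyid;
}

static struct outcome model_read(struct page_model *p, enum accessor who,
				 int keyid)
{
	struct outcome o = { .mce = false, .fatal = false };

	/*
	 * A read with the matching KeyID passes the check.  A host read
	 * using KeyID 0 is assumed never to poison in this model.
	 */
	if (who == HOST_KEYID0 || keyid == p->written_keyid)
		return o;

	/* Mismatched read in SEAM: poison is set and #MCE is raised. */
	p->poisoned = true;
	o.mce = true;
	o.fatal = (who == SEAM_ROOT);	/* module-side #MCE kills the system */
	return o;
}

/* Poison that was created without any #MCE being reported. */
static bool latent_poison(const struct page_model *p, struct outcome last)
{
	return p->poisoned && !last.mce;
}
```

In this model the only path that sets 'poisoned' also raises a #MCE, either
in SEAM non-root mode (where it can be reported) or in the module (where it
is fatal), so latent_poison() cannot become true for a page that is later
reclaimed.  That is the argument above restated, not a proof about real
hardware.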