Re: [RFC PATCH 3/5] x86/kexec: Disable kexec/kdump on platforms with TDX partial write erratum
From: Edgecombe, Rick P
Date: Mon Mar 17 2025 - 19:57:32 EST
On Mon, 2025-03-17 at 01:19 +0000, Huang, Kai wrote:
> On Fri, 2025-03-14 at 19:03 +0000, Edgecombe, Rick P wrote:
> > On Thu, 2025-03-13 at 23:57 +0000, Huang, Kai wrote:
> >
> >
> >
> > So I think the situation is we need at one kernel parameter. We already have one
> > for KVM, which controls the late initialization parts of TDX that we care about
> > here. So what about just using the existing one? I think we don't want two.
>
> Logically, KVM is one user of TDX. I think whether KVM has a parameter should
> not impact whether we should introduce one kernel parameter for TDX host core-
> kernel.
>
> Dan also made a point that in the context of TDX Connect, there's requirement to
> make SEAMCALLs even KVM is not going to run any TDX guest:
>
> https://lore.kernel.org/kvm/cover.1730120881.git.kai.huang@xxxxxxxxx/T/#m6928f5519de25def97d47fc6bbb77f5c3e958f7b
>
> So I agree ideally we don't want two, but I think it is also OK if there's good
> reason to do so.
What is the good reason to have two though? Do we just want one host side one
and lose the KVM one? It seems adding kernel parameters to make code problems go
away is usually frowned upon.
>
> >
> > If KVM has not initialized TDX (based on its own TDX parameter), then kexec is
> > fine.
> >
>
> For now. In the future TDX module could be initialized by other kernel
> components.
>
> > It could work by exposing an interface for features to be exclusive with
> > TDX. Since real TDX module initialization happens late anyway. I don't know if
> > it's better than a kernel one, but I don't see adding a second one going well.
> >
> >
> > Very, very rough:
> >
> > diff --git a/arch/x86/kernel/machine_kexec_64.c
> > b/arch/x86/kernel/machine_kexec_64.c
> > index a68f5a0a9f37..bfea4e78c577 100644
> > --- a/arch/x86/kernel/machine_kexec_64.c
> > +++ b/arch/x86/kernel/machine_kexec_64.c
> > @@ -315,6 +315,12 @@ int machine_kexec_prepare(struct kimage *image)
> > result = init_pgtable(image, __pa(control_page));
> > if (result)
> > return result;
> > +
> > + if (tdx_exclude_feature()) {
> > + pr_info_once("Not allowed once TDX has been used.\n");
> > + return -EOPNOTSUPP;
> > + }
> > +
> > kexec_va_control_page = (unsigned long)control_page;
> > kexec_pa_table_page = (unsigned long)__pa(image->arch.pgd);
> >
> > diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> > index f5e2a937c1e7..9b1f42a1059c 100644
> > --- a/arch/x86/virt/vmx/tdx/tdx.c
> > +++ b/arch/x86/virt/vmx/tdx/tdx.c
> > @@ -1215,6 +1215,21 @@ int tdx_enable(void)
> > }
> > EXPORT_SYMBOL_GPL(tdx_enable);
> >
> > +bool tdx_exclude_feature(void)
> > +{
> > + bool ret = false;
> > +
> > + mutex_lock(&tdx_module_lock);
> > + if (tdx_module_status == TDX_MODULE_INITIALIZED)
> > + ret = true;
> > + else
> > + tdx_module_status = TDX_MODULE_EXCLUDED;
> > + mutex_lock(&tdx_module_lock);
> > +
> > + return ret;
> > +}
>
> Assuming setting module status to "excluded" means we are not able to initialize
> TDX module for ever.
I was going for the simplest approach without adding a new kernel parameter. But
in practice for distros KVM will load at boot and it should work pretty much the
same. If there is the tdx parameter kexec is disabled, otherwise it's enabled.
>
> The thing is Kexec has two phases: 1) loading kernel image, and 2) actually do
> kexec. Your approach basically marks TDX unusable for ever when a user tries to
> load a kxec kernel image, but this is a little bit nasty because loading kexec
> kernel image successfully doesn't mean you have to actually do the kexec, i.e.,
> you can unload the image and move on.
This compared to tdx_host parameter means that sometimes the user may be able to
decide late whether they want TDX or kexec.
>
> I am not saying this doesn't work, but IMHO it is more straightforward to just
> let user make decision via kernel parameter.
Straightforward, yes agree. It's easier to document and the code would be
simpler.
I'm ok trying the tdx_host method, but I do think we need a better reason for
having two tdx kernel parameters when there is only one users of TDX today
(KVM).