Re: [PATCH 2/4] x86/tdx: Use ReportFatalError to report missing SEPT_VE_DISABLE

From: Kirill A. Shutemov
Date: Thu Dec 15 2022 - 12:19:15 EST


On Tue, Dec 13, 2022 at 03:06:07PM -0800, Dave Hansen wrote:
> On 12/9/22 05:25, Kirill A. Shutemov wrote:
> > The check for SEPT_VE_DISABLE happens early in the kernel boot where
> > earlyprintk is not yet functional. Kernel successfully detect broken
> > TD configuration and stops the kernel with panic(), but it cannot
> > communicate the reason to the user.
>
> Linux TDX guests require that the SEPT_VE_DISABLE "attribute" be set.
> If it is not set, the kernel is theoretically required to handle
> exceptions anywhere that kernel memory is accessed, including places
> like NMI handlers and in the syscall entry gap.
>
> Rather than even try to handle these exceptions, the kernel refuses to
> run if SEPT_VE_DISABLE is unset.
>
> However, the SEPT_VE_DISABLE detection and refusal code happens very
> early in boot, even before earlyprintk runs. Calling panic() will
> effectively just hang the system.
>
> Instead, call a TDX-specific panic() function. This makes a very simple
> TDVMCALL which gets a short error string out to the hypervisor without
> any console infrastructure.
>
> --
>
> Is that better?

Yes, thank you.

> Also, are you sure we want to do this? Is there any way to do this
> inside of panic() itself to get panic() itself to call tdx_panic() and
> get a short error message out to the hypervisor?
>
> Getting *all* users of panic this magic ability would be a lot better
> than giving it to one call-site of panic().
>
> I'm all for making the panic() path as short and simple as possible, but
> it would be nice if this fancy hypercall would get used in more than one
> spot.

Well, I don't see an obvious way to integrate this into panic().

There is panic_notifier_list and it kinda/sorta works, see the patch
below.

But it breaks panic_notifier_list contract: the callback will never return
and no other callback will be able to do their stuff. panic_timeout is
also broken.

So ReportFatalError() is no good for the task. And I don't have anything
else :/

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 83ca9a7f0b75..81f9a964dc1f 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -7,6 +7,7 @@
#include <linux/cpufeature.h>
#include <linux/export.h>
#include <linux/io.h>
+#include <linux/panic_notifier.h>
#include <asm/coco.h>
#include <asm/tdx.h>
#include <asm/vmx.h>
@@ -146,8 +147,10 @@ int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport)
}
EXPORT_SYMBOL_GPL(tdx_mcall_get_report0);

-static void __noreturn tdx_panic(const char *msg)
+static int tdx_panic(struct notifier_block *this,
+ unsigned long event, void *ptr)
{
+ const char *msg = ptr;
struct tdx_hypercall_args args = {
.r10 = TDX_HYPERCALL_STANDARD,
.r11 = TDVMCALL_REPORT_FATAL_ERROR,
@@ -219,7 +222,7 @@ static void tdx_parse_tdinfo(u64 *cc_mask)
if (td_attr & ATTR_DEBUG)
pr_warn("%s\n", msg);
else
- tdx_panic(msg);
+ panic(msg);
}
}

@@ -851,6 +854,10 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
return true;
}

+static struct notifier_block panic_block = {
+ .notifier_call = tdx_panic,
+};
+
void __init tdx_early_init(void)
{
u64 cc_mask;
@@ -863,6 +870,7 @@ void __init tdx_early_init(void)

setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);

+ atomic_notifier_chain_register(&panic_notifier_list, &panic_block);
cc_set_vendor(CC_VENDOR_INTEL);
tdx_parse_tdinfo(&cc_mask);
cc_set_mask(cc_mask);
--
Kiryl Shutsemau / Kirill A. Shutemov