Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests

From: Borislav Petkov
Date: Thu Feb 29 2024 - 12:00:50 EST


On Thu, Feb 22, 2024 at 08:24:04PM +0000, Kevin Loughlin wrote:
> SEV-SNP requires encrypted memory to be validated before access.
> Because the ROM memory range is not part of the e820 table, it is not
> pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
> to access this range, the guest must first validate the range.
>
> The current SEV-SNP code does indeed scan the ROM range during early
> boot and thus attempts to validate the ROM range in probe_roms().
> However, this behavior is neither necessary nor sufficient.

Why is this not necessary, all of a sudden?

> With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
> CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will

What is that use case exactly?

CONFIG_DMI_... is usually enabled but the absence of EFI_CONFIG_TABLES
tells me that you're booting some guest with some special OVMF which
doesn't sport such tables.

Why?

/me scrolls upthread

Aha, some Project Oak thing doing a minimal fw. I can see why, but the
commit message should explain why this is a relevant use case, what it
is using and so on, so that future readers can piece it all together.

> attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> falls in the ROM range) prior to validation. The specific problematic
> call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> in a crash during boot if SEV-SNP is enabled under these conditions.
>
> With regards to necessity, SEV-SNP guests currently read garbage (which
> changes across boots) from the ROM range, meaning these scans are
> unnecessary. The guest reads garbage because the legacy ROM range
> is unencrypted data but is accessed via an encrypted PMD during early
> boot (where the PMD is marked as encrypted due to potentially mapping
> actually-encrypted data in other PMD-contained ranges).

I don't mind ripping out that ROM probing thing, but the thread we're on
here talks more about why it could be problematic to keep doing it, so
pls summarize that here.

A commit message should contain all the arguments for why the decision
was made to do it this way.

> While one solution would be to overhaul the early PMD mapping to treat
> the ROM region of the PMD as unencrypted, SEV-SNP guests do not rely on
> data from the legacy ROM region during early boot (nor can they
> currently, since the data would be garbage that changes across boots).

That's better.

> As such, this patch opts for the simpler approach of skipping the ROM

Avoid having "This patch" or "This commit" in the commit message. It is
tautologically useless.

Also, do

$ git grep 'This patch' Documentation/process

for more details.

> range scans (and the otherwise-necessary range validation) during
> SEV-SNP guest early boot.
>
> Ultimatly, the potential SEV-SNP guest crash due to lack of ROM range
  ^^^^^^^^^^

Please introduce a spellchecker into your patch creation workflow.

> validation is avoided by simply not accessing the ROM range.
>
> Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
> Signed-off-by: Kevin Loughlin <kevinloughlin@xxxxxxxxxx>
> ---
>  arch/x86/include/asm/sev.h   |  2 --
>  arch/x86/kernel/mpparse.c    |  7 +++++++
>  arch/x86/kernel/probe_roms.c | 11 ++++-------
>  arch/x86/kernel/sev.c        | 15 ---------------
>  drivers/firmware/dmi_scan.c  |  7 ++++++-
>  5 files changed, 17 insertions(+), 25 deletions(-)

..

> diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
> index b223922248e9..39ea771e2d4c 100644
> --- a/arch/x86/kernel/mpparse.c
> +++ b/arch/x86/kernel/mpparse.c
> @@ -553,6 +553,13 @@ static int __init smp_scan_config(unsigned long base, unsigned long length)
>  		    base, base + length - 1);
>  	BUILD_BUG_ON(sizeof(*mpf) != 16);
>
> +	/*
> +	 * Skip scan in SEV-SNP guest if it would touch the legacy ROM region,
> +	 * as this memory is not pre-validated and would thus cause a crash.
> +	 */
> +	if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && base < 0x100000 && base + length >= 0xC0000)
> +		return 0;

I don't like spreading around CoCo checks everywhere around the tree.

Think of a better way pls.
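
One possible shape for that, as a minimal user-space sketch and not
kernel code: decide once, at platform setup, which early-boot hooks run,
in the spirit of an init-hook table like x86_init. Every name below
(early_init_ops, platform_setup(), run_early_boot(), init_noop()) is
hypothetical and only illustrates the idea; the counter stands in for
"this hook would touch the legacy ROM range".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical boot-time hook table; all names here are illustrative. */
struct early_init_ops {
	void (*probe_roms)(void);
	void (*find_mptable)(void);
	void (*dmi_setup)(void);
};

/* Counts simulated accesses to the legacy ROM range. */
static int rom_range_accesses;

static void default_probe_roms(void)   { rom_range_accesses++; }
static void default_find_mptable(void) { rom_range_accesses++; }
static void default_dmi_setup(void)    { rom_range_accesses++; }
static void init_noop(void)            { }

static struct early_init_ops ops;

/* The only place that knows about the guest type. */
static void platform_setup(bool snp_guest)
{
	ops.probe_roms   = default_probe_roms;
	ops.find_mptable = default_find_mptable;
	ops.dmi_setup    = default_dmi_setup;

	if (snp_guest) {
		/* Replace every hook that would touch unvalidated ROM memory. */
		ops.probe_roms   = init_noop;
		ops.find_mptable = init_noop;
		ops.dmi_setup    = init_noop;
	}
}

/* Generic boot path: calls the hooks and contains no guest-type checks. */
static int run_early_boot(bool snp_guest)
{
	rom_range_accesses = 0;
	platform_setup(snp_guest);

	/* Hooks are never NULL; disabled ones point at the no-op. */
	assert(ops.probe_roms && ops.find_mptable && ops.dmi_setup);

	ops.probe_roms();
	ops.find_mptable();
	ops.dmi_setup();

	return rom_range_accesses;
}
```

In this model the generic scanning code stays free of guest-type checks
and the policy lives in one function. A real version would likely need
a replacement hook that keeps the non-ROM work (for example the EFI
table path of the DMI setup) instead of a plain no-op.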

> diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
> index 015c95a825d3..22e27087eb5b 100644
> --- a/drivers/firmware/dmi_scan.c
> +++ b/drivers/firmware/dmi_scan.c
> @@ -703,7 +703,12 @@ static void __init dmi_scan_machine(void)
>  			dmi_available = 1;
>  			return;
>  		}
> -	} else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK)) {
> +	} else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK) &&
> +		   !cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {

Ditto.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette