Re: [PATCH] x86/sev: Apply RMP table fixups for kexec.

From: Kalra, Ashish
Date: Thu Apr 04 2024 - 14:07:51 EST


Hello Jeremi,

On 4/4/2024 3:17 AM, Jeremi Piotrowski wrote:
On Wed, Apr 03, 2024 at 04:08:35PM -0500, Kalra, Ashish wrote:
On 4/2/2024 5:42 PM, Michael Roth wrote:
On Tue, Apr 02, 2024 at 05:31:09PM -0500, Kalra, Ashish wrote:
On 4/2/2024 5:09 PM, Tom Lendacky wrote:
On 3/12/24 13:47, Ashish Kalra wrote:
From: Ashish Kalra <ashish.kalra@xxxxxxx>

RMP table start and end physical range may not be aligned to 2MB in
the e820 tables causing fatal RMP page faults during kexec boot when
new page allocations are done in the same 2MB page as the RMP table.
Check if RMP table start and end physical range in e820_table is not
aligned to 2MB and in that case use e820__range_update() to map this
range to reserved.

Override e820__memory_setup_default() to check and apply these RMP table
fixups in e820_table before e820_table is used to setup
e280_table_firmware and e820_table_kexec.

Fixes: c3b86e61b756 ("x86/cpufeatures: Enable/unmask SEV-SNP CPU
feature")
Signed-off-by: Ashish Kalra <ashish.kalra@xxxxxxx>
---
  arch/x86/virt/svm/sev.c | 52 +++++++++++++++++++++++++++++++++++++++++
  1 file changed, 52 insertions(+)

diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index cffe1157a90a..e0d7584df28f 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -65,6 +65,8 @@ static u64 probed_rmp_base, probed_rmp_size;
  static struct rmpentry *rmptable __ro_after_init;
  static u64 rmptable_max_pfn __ro_after_init;
  +static char *__init snp_rmptable_e820_fixup(void);
+
  static LIST_HEAD(snp_leaked_pages_list);
  static DEFINE_SPINLOCK(snp_leaked_pages_list_lock);
  @@ -160,9 +162,59 @@ bool snp_probe_rmptable_info(void)
      pr_info("RMP table physical range [0x%016llx - 0x%016llx]\n",
          probed_rmp_base, probed_rmp_base + probed_rmp_size - 1);
  +    /*
+     * Override e820__memory_setup_default() to do any RMP table fixups
+     * for kexec if required.
+     */
+    x86_init.resources.memory_setup = snp_rmptable_e820_fixup;
This produces a build warning:

WARNING: modpost: vmlinux: section mismatch in reference:
snp_probe_rmptable_info+0x95 (section: .text) -> x86_init (section:
.init.data)
WARNING: modpost: vmlinux: section mismatch in reference:
snp_probe_rmptable_info+0x99 (section: .text) -> snp_rmptable_e820_fixup
(section: .init.text)

Oh, so this requires snp_probe_rmptable_info() to be fixed to use the __init
macro.

I believe that snp_probe_rmptable_info() should be anyway using the __init
macro and this fix for snp_probe_rmptable_info() needs to be sent as a
separate patch and regardless of this patch getting merged or not.
I think you'll hit issues with:

bsp_determine_snp() -> //non-__init
snp_probe_rmptable_info() //__init

and bsp_determine_snp() sticks around as a function pointer assigned to
cpuinfo_x86 so I don't think you can use __init there.

So might need to just drop __init from snp_rmptable_e820_fixup().
Actually, that will not help as snp_probe_rmptable_info() is *also*
accessing x86_init.resources.memory_setup

What if we flipped the whole flow? Borislav is adding a CC_ATTR to indicate
HOST_SEV_SNP support, we don't need to clear X86_FEATURE_SEV_SNP at this phase
(or at all for that matter). snp_probe_rmptable_info() can be done later
during kernel init, once e820 has been parsed.

One way of doing this would be to override x86_init.resources.memory_setup()
to do both snp_probe_rmptable_info() and snp_rmptable_e820_fixup().

What do you think?

I like this idea of overriding x86_init.resources_memory_setup() to do both snp_probe_rmptable_info() and snp_rmptable_e820_fixup().

This callback seems to be meant for platform specific memory setup and SNP enabled RMP table platforms seem to be a good candidate to use this callback to probe RMP table, sanitize it and then apply e820 fixups or adjustments specific to the RMP table.

Additionally. setup_arch() -> e820__memory_setup() initializes all the three e820 tables (including the two tables used for kexec - e820_table_kexec and e820_table_firmware) and then there is *no* e820 API functions available to update e820_table_kexec and e820_table_firmware. Even if i add new e820 API interfaces to allow e820 memmap updates to e820_table_kexec and e820_table_firmware, before setup_arch() returns it calls e820__reserve_resources() which exposes the e820_table_firmware to sysfs (/sys/firmware/memmap) which is then used directly by kexec-tools.

I don't see any callbacks i can use to call any newly added e820 API to update e820_table_firmware before it is exposed to sysfs between setup_arch()->e820__memory_setup() and setup_arch()->e820__reserve_resources().

For the same reason, i tried to use initcalls to invoke snp_rmptable_e820_fixup() but then initcalls happen after setup_arch() and by that time e820_table_firmware has already been exposed to sysfs.

That's why i see x86_init.resources_memory_setup() an ideal callback to probe RMP table itself and then apply any RMP table platform specific quirks.

Thanks, Ashish