Re: [PATCH 0/3] Resolve problems with kexec identity mapping

From: Dave Young
Date: Wed May 22 2024 - 22:54:22 EST


Cc kexec list as well.

On Thu, 23 May 2024 at 10:52, Dave Young <dyoung@xxxxxxxxxx> wrote:
>
> Add Tao in the cc list.
>
> On Tue, 21 May 2024 at 02:37, Steve Wahl <steve.wahl@xxxxxxx> wrote:
> >
> > Although there was a previous fix to avoid early kernel access to the
> > EFI config table on Intel systems, the problem can still exist on AMD
> > systems that support SEV (Secure Encrypted Virtualization). The
> > command line option "nogbpages" brings this bug to the surface. And
> > this is what caused the regression with my earlier patch that
> > attempted to reduce the use of gbpages. This patch series fixes that
> > problem and restores my earlier patch.
> >
> > The following 2 commits caused the EFI config table, and the CC_BLOB
> > entry in that table, to be accessed when enabling SEV at kernel
> > startup.
> >
> > commit ec1c66af3a30 ("x86/compressed/64: Detect/setup SEV/SME features
> > earlier during boot")
> > commit c01fce9cef84 ("x86/compressed: Add SEV-SNP feature
> > detection/setup")
> >
> > These accesses happen before the new kernel establishes its own
> > identity map, and before establishing a routine to handle page faults.
> > But the areas referenced are not explicitly added to the kexec
> > identity map.
> >
> > This goes unnoticed when these areas happen to be placed close enough
> > to others areas that are explicitly added to the identity map, but
> > that is not always the case.
> >
> > Under certain conditions, for example Intel Atom processors that don't
> > support 1GB pages, it was found that these areas don't end up mapped,
> > and the SEV initialization code causes an unrecoverable page fault,
> > and the kexec fails.
> >
> > Tau Liu had offered a patch to put the config table into the kexec
> > identity map to avoid this problem:
> >
> > https://lore.kernel.org/all/20230601072043.24439-1-ltao@xxxxxxxxxx/
> >
> > But the community chose instead to avoid referencing this memory on
> > non-AMD systems where the problem was reported.
> >
> > commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
> > on non-AMD hardware")
> >
> > I later wanted to make a different change to kexec identity map
> > creation, and had this patch accepted:
> >
> > commit d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
> >
> > but it quickly needed to be reverted because of problems on AMD systems.
> >
> > The reported regression problems on AMD systems were due to the above
> > mentioned references to the EFI config table. In fact, on the same
> > systems, the "nogbpages" command line option breaks kexec as well.
> >
> > So I resubmit Tau Liu's original patch that maps the EFI config
> > table, add an additional patch by me that ensures that the CC blob is
> > also mapped (if present), and also resubmit my earlier patch to use
> > gpbages only when a full GB of space is requested to be mapped.
> >
> > I do not advocate for removing the earlier, non-AMD fix. With kexec,
> > two different kernel versions can be in play, and the earlier fix
> > still covers non-AMD systems when the kexec'd-from kernel doesn't have
> > these patches applied.
> >
> > All three of the people who reported regression with my earlier patch
> > have retested with this patch series and found it to work where my
> > single patch previously did not. With current kernels, all fail to
> > kexec when "nogbpages" is on the command line, but all succeed with
> > "nogbpages" after the series is applied.
> >
> > Tao Liu (1):
> > x86/kexec: Add EFI config table identity mapping for kexec kernel
> >
> > Steve Wahl (2):
> > x86/kexec: Add EFI Confidential Computing blob to kexec identity
> > mapping.
> > x86/mm/ident_map: Use gbpages only where full GB page should be
> > mapped.
> >
> > arch/x86/kernel/machine_kexec_64.c | 82 ++++++++++++++++++++++++++++--
> > arch/x86/mm/ident_map.c | 23 +++++++--
> > 2 files changed, 95 insertions(+), 10 deletions(-)
> >
> > --
> > 2.26.2
> >