Re: [PATCH v5 11/16] kexec: add config option for KHO
From: Dave Young
Date: Mon Mar 24 2025 - 23:06:58 EST
On Tue, 25 Mar 2025 at 09:24, Dave Young <dyoung@xxxxxxxxxx> wrote:
>
> On Tue, 25 Mar 2025 at 03:27, Pasha Tatashin <pasha.tatashin@xxxxxxxxxx> wrote:
> >
> > On Mon, Mar 24, 2025 at 12:18 AM Dave Young <dyoung@xxxxxxxxxx> wrote:
> > >
> > > On Thu, 20 Mar 2025 at 23:05, Changyuan Lyu <changyuanl@xxxxxxxxxx> wrote:
> > > >
> > > > From: Alexander Graf <graf@xxxxxxxxxx>
> > > >
> > > > We have all generic code in place now to support Kexec with KHO. This
> > > > patch adds a config option that depends on architecture support to
> > > > enable KHO support.
> > > >
> > > > Signed-off-by: Alexander Graf <graf@xxxxxxxxxx>
> > > > Co-developed-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>
> > > > Signed-off-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>
> > > > Co-developed-by: Changyuan Lyu <changyuanl@xxxxxxxxxx>
> > > > Signed-off-by: Changyuan Lyu <changyuanl@xxxxxxxxxx>
> > > > ---
> > > > kernel/Kconfig.kexec | 15 +++++++++++++++
> > > > 1 file changed, 15 insertions(+)
> > > >
> > > > diff --git a/kernel/Kconfig.kexec b/kernel/Kconfig.kexec
> > > > index 4d111f871951..57db99e758a8 100644
> > > > --- a/kernel/Kconfig.kexec
> > > > +++ b/kernel/Kconfig.kexec
> > > > @@ -95,6 +95,21 @@ config KEXEC_JUMP
> > > >  	  Jump between original kernel and kexeced kernel and invoke
> > > >  	  code in physical address mode via KEXEC
> > > >
> > > > +config KEXEC_HANDOVER
> > > > +	bool "kexec handover"
> > > > +	depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE
> > > > +	select MEMBLOCK_KHO_SCRATCH
> > > > +	select KEXEC_FILE
> > > > +	select DEBUG_FS
> > > > +	select LIBFDT
> > > > +	select CMA
> > > > +	select XXHASH
> > > > +	help
> > > > +	  Allow kexec to hand over state across kernels by generating and
> > > > +	  passing additional metadata to the target kernel. This is useful
> > > > +	  to keep data or state alive across the kexec. For this to work,
> > > > +	  both source and target kernels need to have this option enabled.
> > > > +
> > >
> > > Have you tested kdump? In my mind there are two issues. First, with
> > > CMA enabled, kdump crashkernel memory reservation could fail more
> > > often because low memory becomes fragmented. Secondly, in
> >
> > As I understand it, the CMA low-memory scratch reservation is needed
> > only to support some legacy PCI devices that cannot use the full
> > 64-bit space. If so, I am not sure KHO needs to be supported on
> > machines with such devices. However, even if we keep it, it should be
> > really small, so I would not expect it to be a problem for crash
> > kernel memory reservation.
>
> It is not easy to estimate how much KHO reserved memory is needed. I
> assume this is a mechanism for many different users, so the size is
> not predictable. Also, it is not only about the size; the reservation
> also fragments memory.
>
> >
> > > the kdump kernel, dumping the crazy scratch memory into the vmcore
> > > is not very meaningful. Otherwise, I suspect this has not been
> > > tested under kdump. If so, please disable this option for kdump.
> >
> > The scratch memory will appear as regular CMA in the vmcore. The crash
> > kernel can be kexec loaded only from userland, long after the scratch
> > memory is converted to CMA.
>
> Depending on the reserved size, if it is big enough it should be
> excluded from vmcore dumping.
> Otherwise, if it is a kdump kernel, it should skip handling the
> previous state passed via KHO.
If you do not want KHO to conflict with kdump, then the above should
be handled and well tested. After that, leave it to end users and
distributions to decide whether they want both enabled, considering
the risk of crashkernel reservation failure.
>
> >
> > Pasha
> >