Re: [PATCH v5 07/16] kexec: add Kexec HandOver (KHO) generation helpers

From: Frank van der Linden
Date: Mon Mar 24 2025 - 14:41:08 EST


On Wed, Mar 19, 2025 at 6:56 PM Changyuan Lyu <changyuanl@xxxxxxxxxx> wrote:
>
> From: Alexander Graf <graf@xxxxxxxxxx>
>
> Add the core infrastructure to generate Kexec HandOver metadata. Kexec
> HandOver is a mechanism that allows Linux to preserve state - arbitrary
> properties as well as memory locations - across kexec.
>
> It does so using 2 concepts:
>
> 1) State Tree - Every KHO kexec carries a state tree that describes the
> state of the system. The state tree is represented as hash-tables.
> Device drivers can add/remove their data into/from the state tree at
> system runtime. On kexec, the tree is converted to FDT (flattened
> device tree).
>
> 2) Scratch Regions - CMA regions that we allocate in the first kernel.
> CMA gives us the guarantee that no handover pages land in those
> regions, because handover pages must be at a static physical memory
> location. We use these regions as the place to load future kexec
> images so that they won't collide with any handover data.
>
> Signed-off-by: Alexander Graf <graf@xxxxxxxxxx>
> Co-developed-by: Pratyush Yadav <ptyadav@xxxxxxxxx>
> Signed-off-by: Pratyush Yadav <ptyadav@xxxxxxxxx>
> Co-developed-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>
> Co-developed-by: Changyuan Lyu <changyuanl@xxxxxxxxxx>
> Signed-off-by: Changyuan Lyu <changyuanl@xxxxxxxxxx>
> ---
> MAINTAINERS | 2 +-
> include/linux/kexec_handover.h | 109 +++++
> kernel/Makefile | 1 +
> kernel/kexec_handover.c | 865 +++++++++++++++++++++++++++++++++
> mm/mm_init.c | 8 +
> 5 files changed, 984 insertions(+), 1 deletion(-)
> create mode 100644 include/linux/kexec_handover.h
> create mode 100644 kernel/kexec_handover.c
[...]
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index 04441c258b05..757659b7a26b 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -30,6 +30,7 @@
> #include <linux/crash_dump.h>
> #include <linux/execmem.h>
> #include <linux/vmstat.h>
> +#include <linux/kexec_handover.h>
> #include "internal.h"
> #include "slab.h"
> #include "shuffle.h"
> @@ -2661,6 +2662,13 @@ void __init mm_core_init(void)
>  	report_meminit();
>  	kmsan_init_shadow();
>  	stack_depot_early_init();
> +
> +	/*
> +	 * KHO memory setup must happen while memblock is still active, but
> +	 * as close as possible to buddy initialization
> +	 */
> +	kho_memory_init();
> +
>  	mem_init();
>  	kmem_cache_init();
>  	/*


Thanks for the work on this.

Obviously it needs to happen while memblock is still active - but why
as close as possible to buddy initialization?

Ordering is always a sticky issue when it comes to doing things during
boot, of course. In this case, I can see scenarios where code that
runs a little earlier may want to use some preserved memory. The
current requirement in the patch set seems to be "after sparse/page
init", but I'm not sure why it needs to be as close as possible to
buddy init.

- Frank