Re: [PATCH v12 3/7] crash: add generic infrastructure for crash hotplug support

From: Sourabh Jain
Date: Tue Oct 04 2022 - 02:39:04 EST



On 10/09/22 02:35, Eric DeVolder wrote:
CPU and memory change notifications are received in order to
regenerate the elfcorehdr.

To support cpu hotplug, a callback is registered to capture the
CPUHP_AP_ONLINE_DYN online and offline events via
cpuhp_setup_state_nocalls().

To support memory hotplug, a notifier is registered to capture the
MEM_ONLINE and MEM_OFFLINE events via register_memory_notifier().

The cpu callback and memory notifiers call handle_hotplug_event()
which performs needed tasks and then dispatches the event to the
architecture specific arch_crash_handle_hotplug_event(). During the
process, the kexec_mutex is held.

Signed-off-by: Eric DeVolder <eric.devolder@xxxxxxxxxx>
Acked-by: Baoquan He <bhe@xxxxxxxxxx>
---
include/linux/crash_core.h | 8 +++
include/linux/kexec.h | 26 +++++++
kernel/crash_core.c | 134 +++++++++++++++++++++++++++++++++++++
3 files changed, 168 insertions(+)

diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index de62a722431e..a270f8660538 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -84,4 +84,12 @@ int parse_crashkernel_high(char *cmdline, unsigned long long system_ram,
int parse_crashkernel_low(char *cmdline, unsigned long long system_ram,
unsigned long long *crash_size, unsigned long long *crash_base);
+#define KEXEC_CRASH_HP_REMOVE_CPU 0
+#define KEXEC_CRASH_HP_ADD_CPU 1
+#define KEXEC_CRASH_HP_REMOVE_MEMORY 2
+#define KEXEC_CRASH_HP_ADD_MEMORY 3
+#define KEXEC_CRASH_HP_INVALID_CPU -1U
+
+struct kimage;
+
#endif /* LINUX_CRASH_CORE_H */
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 4eefa631e0ae..9597b41136ec 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -374,6 +374,13 @@ struct kimage {
struct purgatory_info purgatory_info;
#endif
+#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
+ bool hotplug_event;
+ unsigned int offlinecpu;
+ bool elfcorehdr_index_valid;
+ int elfcorehdr_index;

Do we really need elfcorehdr_index_valid to decide elfcorehdr_index holds a valid index?

How about initializing elfcorehdr_index to a negative number while loading kdump kernel (or kexec kernel if needed)
for both kexec_load and kexec_file_load case and consider that as invalid index to find the correct one.

Some thing like this:

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 5bc5159d9cb1..0cccdb2f7f26 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -656,7 +656,7 @@ static void handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
                 * segment containing the elfcorehdr, if not already found.
                 * This works for both the kexec_load and kexec_file_load paths.
                 */
-               if (!image->elfcorehdr_index_valid) {
+               if (image->elfcorehdr_index < 0) {
                        unsigned char *ptr;
                        unsigned long mem, memsz;
                        unsigned int n;
diff --git a/kernel/kexec.c b/kernel/kexec.c
index b5e40f069768..ed1c6a88879b 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -156,6 +156,10 @@ static int do_kexec_load(unsigned long entry, unsigned long nr_segments,
        if (ret)
                goto out;

+       /* Below check is not necessary */
+       if (flags & KEXEC_FILE_ON_CRASH)
+               image->elfcorehdr_index = -1;
+
        /* Install the new kernel and uninstall the old */
        image = xchg(dest_image, image);

diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index d0c2661b3509..535dbc26930a 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -400,6 +400,10 @@ SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
        if (ret)
                goto out;

+       /* Below check is not necessary */
+       if (flags & KEXEC_FILE_ON_CRASH)
+               image->elfcorehdr_index = -1;
+
        /*
         * Free up any temporary buffers allocated which are not needed
         * after image has been loaded

Thanks,
Sourabh Jain