On 08/05/24 at 10:38am, Sourabh Jain wrote:
Commit 79365026f869 ("crash: add a new kexec flag for hotplug support")The overall looks good to me, except of concern from Petr. Thanks.
generalizes the crash hotplug support to allow architectures to update
multiple kexec segments on CPU/Memory hotplug and not just elfcorehdr.
Therefore, update the relevant kernel documentation to reflect the same.
No functional change.
Cc: Petr Tesarik <petr@xxxxxxxxxxx>
Cc: Hari Bathini <hbathini@xxxxxxxxxxxxx>
Cc: kexec@xxxxxxxxxxxxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
Cc: linuxppc-dev@xxxxxxxxxxxxxxxx
Cc: x86@xxxxxxxxxx
Signed-off-by: Sourabh Jain <sourabhjain@xxxxxxxxxxxxx>
---
Discussion about the documentation update:
https://lore.kernel.org/all/68d0328d-531a-4a2b-ab26-c97fd8a12e8b@xxxxxxxxxxxxx/
---
.../ABI/testing/sysfs-devices-memory | 6 ++--
.../ABI/testing/sysfs-devices-system-cpu | 6 ++--
.../admin-guide/mm/memory-hotplug.rst | 5 ++--
Documentation/core-api/cpu_hotplug.rst | 10 ++++---
kernel/crash_core.c | 29 ++++++++++++-------
5 files changed, 33 insertions(+), 23 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-devices-memory b/Documentation/ABI/testing/sysfs-devices-memory
index a95e0f17c35a..421acc8e2c6b 100644
--- a/Documentation/ABI/testing/sysfs-devices-memory
+++ b/Documentation/ABI/testing/sysfs-devices-memory
@@ -115,6 +115,6 @@ What: /sys/devices/system/memory/crash_hotplug
Date: Aug 2023
Contact: Linux kernel mailing list <linux-kernel@xxxxxxxxxxxxxxx>
Description:
- (RO) indicates whether or not the kernel directly supports
- modifying the crash elfcorehdr for memory hot un/plug and/or
- on/offline changes.
+ (RO) indicates whether or not the kernel update of kexec
+ segments on memory hot un/plug and/or on/offline events,
+ avoiding the need to reload kdump kernel.
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 325873385b71..f4ada1cd2f96 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -703,9 +703,9 @@ What: /sys/devices/system/cpu/crash_hotplug
Date: Aug 2023
Contact: Linux kernel mailing list <linux-kernel@xxxxxxxxxxxxxxx>
Description:
- (RO) indicates whether or not the kernel directly supports
- modifying the crash elfcorehdr for CPU hot un/plug and/or
- on/offline changes.
+ (RO) indicates whether or not the kernel update of kexec
+ segments on CPU hot un/plug and/or on/offline events,
+ avoiding the need to reload kdump kernel.
What: /sys/devices/system/cpu/enabled
Date: Nov 2022
diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
index 098f14d83e99..cb2c080f400c 100644
--- a/Documentation/admin-guide/mm/memory-hotplug.rst
+++ b/Documentation/admin-guide/mm/memory-hotplug.rst
@@ -294,8 +294,9 @@ The following files are currently defined:
``crash_hotplug`` read-only: when changes to the system memory map
occur due to hot un/plug of memory, this file contains
'1' if the kernel updates the kdump capture kernel memory
- map itself (via elfcorehdr), or '0' if userspace must update
- the kdump capture kernel memory map.
+ map itself (via elfcorehdr and other relevant kexec
+ segments), or '0' if userspace must update the kdump
+ capture kernel memory map.
Availability depends on the CONFIG_MEMORY_HOTPLUG kernel
configuration option.
diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst
index dcb0e379e5e8..a21dbf261be7 100644
--- a/Documentation/core-api/cpu_hotplug.rst
+++ b/Documentation/core-api/cpu_hotplug.rst
@@ -737,8 +737,9 @@ can process the event further.
When changes to the CPUs in the system occur, the sysfs file
/sys/devices/system/cpu/crash_hotplug contains '1' if the kernel
-updates the kdump capture kernel list of CPUs itself (via elfcorehdr),
-or '0' if userspace must update the kdump capture kernel list of CPUs.
+updates the kdump capture kernel list of CPUs itself (via elfcorehdr and
+other relevant kexec segment), or '0' if userspace must update the kdump
+capture kernel list of CPUs.
The availability depends on the CONFIG_HOTPLUG_CPU kernel configuration
option.
@@ -750,8 +751,9 @@ file can be used in a udev rule as follows:
SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
For a CPU hot un/plug event, if the architecture supports kernel updates
-of the elfcorehdr (which contains the list of CPUs), then the rule skips
-the unload-then-reload of the kdump capture kernel.
+of the elfcorehdr (which contains the list of CPUs) and other relevant
+kexec segments, then the rule skips the unload-then-reload of the kdump
+capture kernel.
Kernel Inline Documentations Reference
======================================
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 63cf89393c6e..64dad01e260b 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -520,18 +520,25 @@ int crash_check_hotplug_support(void)
}
/*
- * To accurately reflect hot un/plug changes of cpu and memory resources
- * (including onling and offlining of those resources), the elfcorehdr
- * (which is passed to the crash kernel via the elfcorehdr= parameter)
- * must be updated with the new list of CPUs and memories.
+ * To accurately reflect hot un/plug changes of CPU and Memory resources
+ * (including onling and offlining of those resources), the relevant
+ * kexec segments must be updated with latest CPU and Memory resources.
*
- * In order to make changes to elfcorehdr, two conditions are needed:
- * First, the segment containing the elfcorehdr must be large enough
- * to permit a growing number of resources; the elfcorehdr memory size
- * is based on NR_CPUS_DEFAULT and CRASH_MAX_MEMORY_RANGES.
- * Second, purgatory must explicitly exclude the elfcorehdr from the
- * list of segments it checks (since the elfcorehdr changes and thus
- * would require an update to purgatory itself to update the digest).
+ * Architectures must ensure two things for all segments that need
+ * updating during hotplug events:
+ *
+ * 1. Segments must be large enough to accommodate a growing number of
+ * resources.
+ * 2. Exclude the segments from SHA verification.
+ *
+ * For example, on most architectures, the elfcorehdr (which is passed
+ * to the crash kernel via the elfcorehdr= parameter) must include the
+ * new list of CPUs and memory. To make changes to the elfcorehdr, it
+ * should be large enough to permit a growing number of CPU and Memory
+ * resources. One can estimate the elfcorehdr memory size based on
+ * NR_CPUS_DEFAULT and CRASH_MAX_MEMORY_RANGES. The elfcorehdr is
+ * excluded from SHA verification by default if the architecture
+ * supports crash hotplug.
*/
static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu, void *arg)
{
--
2.45.2