Re: [PATCH linux 1/2] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32

From: Juergen Gross
Date: Tue Oct 12 2021 - 04:48:50 EST


On 12.10.21 09:24, Dongli Zhang wrote:
sched_clock() can be used very early since upstream
commit 857baa87b642 ("sched/clock: Enable sched clock early"). In addition,
with upstream commit 38669ba205d1 ("x86/xen/time: Output xen sched_clock
time from 0"), the kdump kernel in a Xen HVM guest may panic at a very early
stage when accessing &__this_cpu_read(xen_vcpu)->time, as shown below:

setup_arch()
 -> init_hypervisor_platform()
     -> x86_init.hyper.init_platform = xen_hvm_guest_init()
         -> xen_hvm_init_time_ops()
             -> xen_clocksource_read()
                 -> src = &__this_cpu_read(xen_vcpu)->time;

This is because, during the early boot stage, Xen HVM supports at most
MAX_VIRT_CPUS=32 'vcpu_info' structures embedded inside 'shared_info', until
xen_vcpu_setup() is used to allocate/relocate the 'vcpu_info' for the boot
cpu at an arbitrary address.

However, when a Xen HVM guest panics on vcpu >= 32, xen_vcpu_info_reset(0)
sets per_cpu(xen_vcpu, cpu) = NULL because vcpu >= 32, so
xen_clocksource_read() on that vcpu dereferences a NULL pointer and panics.
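
To make the NULL case concrete, here is a minimal userspace sketch of the behaviour described above. It is not kernel code: the *_model types, MODEL_NR_CPUS, cpu_to_vcpu[] and the model_* helpers are hypothetical stand-ins for 'shared_info', the per-cpu xen_vcpu pointer and xen_vcpu_nr(), assumed only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VIRT_CPUS 32   /* limit named in the patch description */
#define MODEL_NR_CPUS 64   /* hypothetical size of the per-CPU table */

struct vcpu_time_info_model { unsigned long long system_time; };
struct vcpu_info_model { struct vcpu_time_info_model time; };

/* Only MAX_VIRT_CPUS slots are embedded in the shared page. */
struct shared_info_model { struct vcpu_info_model vcpu_info[MAX_VIRT_CPUS]; };

static struct shared_info_model shared_info;
static struct vcpu_info_model *xen_vcpu[MODEL_NR_CPUS]; /* per-CPU pointer */
static int cpu_to_vcpu[MODEL_NR_CPUS];                  /* models xen_vcpu_nr() */

/* Point the CPU at its embedded slot, or at NULL if no slot exists. */
static void model_vcpu_info_reset(int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    int vcpu = cpu_to_vcpu[cpu];

    if (vcpu >= 0 && vcpu < MAX_VIRT_CPUS)
        xen_vcpu[cpu] = &shared_info.vcpu_info[vcpu];
    else
        xen_vcpu[cpu] = NULL;
}

/* A clocksource read is only safe when the pointer is non-NULL. */
static int model_clock_source_available(int cpu)
{
    return xen_vcpu[cpu] != NULL;
}
```

In this model, a kdump kernel whose boot CPU maps to vcpu 33 ends up with a NULL pointer after the reset, which is the state in which the early clocksource read would fault.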

When the boot vcpu is >= 32, this patch delays xen_hvm_init_time_ops()
until xen_hvm_smp_prepare_boot_cpu(), after the 'vcpu_info' for the boot
vcpu has been registered.
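
The ordering this patch aims for can be sketched in plain C as follows. This is a simplified model under stated assumptions, not the kernel implementation: struct model_state and the model_* functions are hypothetical names, and the assert() merely stands in for the NULL dereference that the real code would hit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_VIRT_CPUS 32

struct model_state {
    int  boot_vcpu;        /* models xen_vcpu_nr(0) */
    bool vcpu_info_ready;  /* true once the boot vcpu has a usable vcpu_info */
    bool time_ops_ready;   /* true once the time ops have been initialised */
};

/* Models xen_hvm_init_time_ops(): requires a usable vcpu_info. */
static void model_init_time_ops(struct model_state *s)
{
    assert(s->vcpu_info_ready);
    s->time_ops_ready = true;
}

/* Models the early path: skip time ops if the boot vcpu has no slot yet. */
static void model_guest_init(struct model_state *s, int boot_vcpu)
{
    s->boot_vcpu = boot_vcpu;
    s->vcpu_info_ready = boot_vcpu < MAX_VIRT_CPUS;
    s->time_ops_ready = false;

    if (s->boot_vcpu < MAX_VIRT_CPUS)
        model_init_time_ops(s);
}

/* Models the later path: register vcpu_info, then finish the delayed init. */
static void model_smp_prepare_boot_cpu(struct model_state *s)
{
    s->vcpu_info_ready = true;  /* stands in for xen_vcpu_setup(0) */

    if (s->boot_vcpu >= MAX_VIRT_CPUS)
        model_init_time_ops(s);
}
```

Both paths test the same condition, so the time ops are initialised exactly once: early when the boot vcpu is below MAX_VIRT_CPUS, and after registration otherwise.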

The issue can be reproduced deliberately by running the following command
in the guest when kdump/kexec is enabled:

"taskset -c 33 echo c > /proc/sysrq-trigger"

Cc: Joe Jin <joe.jin@xxxxxxxxxx>
Signed-off-by: Dongli Zhang <dongli.zhang@xxxxxxxxxx>
---
arch/x86/xen/enlighten_hvm.c | 20 +++++++++++++++++++-
arch/x86/xen/smp_hvm.c | 3 +++
2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index e68ea5f4ad1c..152279416d9a 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -216,7 +216,25 @@ static void __init xen_hvm_guest_init(void)
 	WARN_ON(xen_cpuhp_setup(xen_cpu_up_prepare_hvm, xen_cpu_dead_hvm));
 	xen_unplug_emulated_devices();
 	x86_init.irqs.intr_init = xen_init_IRQ;
-	xen_hvm_init_time_ops();
+
+	/*
+	 * Only MAX_VIRT_CPUS 'vcpu_info' are embedded inside 'shared_info'
+	 * and the VM would use them until xen_vcpu_setup() is used to
+	 * allocate/relocate them at arbitrary address.
+	 *
+	 * However, when Xen HVM guest panic on vcpu >= MAX_VIRT_CPUS,
+	 * per_cpu(xen_vcpu, cpu) is still NULL at this stage. To access
+	 * per_cpu(xen_vcpu, cpu) via xen_clocksource_read() would panic.
+	 *
+	 * Therefore we delay xen_hvm_init_time_ops() to
+	 * xen_hvm_smp_prepare_boot_cpu() when boot vcpu is >= MAX_VIRT_CPUS.
+	 */
+	if (xen_vcpu_nr(0) >= MAX_VIRT_CPUS)
+		pr_info("Delay xen_hvm_init_time_ops() as kernel is running on vcpu=%d\n",
+			xen_vcpu_nr(0));
+	else
+		xen_hvm_init_time_ops();
+
 	xen_hvm_init_mmu_ops();
 #ifdef CONFIG_KEXEC_CORE
diff --git a/arch/x86/xen/smp_hvm.c b/arch/x86/xen/smp_hvm.c
index 6ff3c887e0b9..60cd4fafd188 100644
--- a/arch/x86/xen/smp_hvm.c
+++ b/arch/x86/xen/smp_hvm.c
@@ -19,6 +19,9 @@ static void __init xen_hvm_smp_prepare_boot_cpu(void)
 	 */
 	xen_vcpu_setup(0);
+	if (xen_vcpu_nr(0) >= MAX_VIRT_CPUS)
+		xen_hvm_init_time_ops();
+

Please add a comment referencing the related code in
xen_hvm_guest_init().


Juergen
