Re: [PATCH linux v3 3/9] xen: introduce xen_vcpu_id mapping

From: Stefano Stabellini
Date: Mon Sep 05 2016 - 15:21:02 EST


On Mon, 5 Sep 2016, Vitaly Kuznetsov wrote:
> Julien Grall <julien.grall@xxxxxxx> writes:
>
> > Hi Vitaly,
> >
> > On 26/07/16 13:30, Vitaly Kuznetsov wrote:
> >> It may happen that Xen's and Linux's ideas of vCPU id diverge. In
> >> particular, when we crash on a secondary vCPU we may want to do kdump
> >> and unlike plain kexec where we do migrate_to_reboot_cpu() we try booting
> >> on the vCPU which crashed. This doesn't work very well for PVHVM guests as
> >> we have a number of hypercalls where we pass vCPU id as a parameter. These
> >> hypercalls either fail or do something unexpected. To solve the issue
> >> introduce percpu xen_vcpu_id mapping. ARM and PV guests get direct mapping
> >> for now. Boot CPU for PVHVM guest gets its id from CPUID. With secondary
> >> CPUs it is a bit trickier. Currently, we initialize IPI vectors
> >> before these CPUs boot so we can't use CPUID. Use ACPI ids from MADT
> >> instead.
> >>
> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> >> ---
> >> Changes since v2:
> >> - Use uint32_t for xen_vcpu_id mapping [Julien Grall]
> >>
> >> Changes since v1:
> >> - Introduce xen_vcpu_nr() helper [David Vrabel]
> >> - Use ACPI ids instead of vLAPIC ids /2 [Andrew Cooper, Jan Beulich]
> >> ---
> >> arch/arm/xen/enlighten.c | 10 ++++++++++
> >> arch/x86/xen/enlighten.c | 23 ++++++++++++++++++++++-
> >> include/xen/xen-ops.h | 6 ++++++
> >> 3 files changed, 38 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
> >> index 75cd734..fe32267 100644
> >> --- a/arch/arm/xen/enlighten.c
> >> +++ b/arch/arm/xen/enlighten.c
> >> @@ -46,6 +46,10 @@ struct shared_info *HYPERVISOR_shared_info = (void *)&xen_dummy_shared_info;
> >> DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
> >> static struct vcpu_info __percpu *xen_vcpu_info;
> >>
> >> +/* Linux <-> Xen vCPU id mapping */
> >> +DEFINE_PER_CPU(uint32_t, xen_vcpu_id) = U32_MAX;
> >> +EXPORT_PER_CPU_SYMBOL(xen_vcpu_id);
> >> +
> >> /* These are unused until we support booting "pre-ballooned" */
> >> unsigned long xen_released_pages;
> >> struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS] __initdata;
> >> @@ -179,6 +183,9 @@ static void xen_percpu_init(void)
> >> pr_info("Xen: initializing cpu%d\n", cpu);
> >> vcpup = per_cpu_ptr(xen_vcpu_info, cpu);
> >>
> >> + /* Direct vCPU id mapping for ARM guests. */
> >> + per_cpu(xen_vcpu_id, cpu) = cpu;
> >> +
> >
> > We did some internal testing on ARM64 with the latest Linux kernel
> > (4.8-rc4) and noticed that this patch is breaking SMP support. Sorry
> > for noticing the issue so late.
>
> Sorry for the breakage :-(
>
> >
> > This function is called on the running CPU, whilst some code (e.g.
> > init_control_block in drivers/xen/events/events_fifo.c) is executed
> > on the boot CPU while preparing the new CPU.
> >
> > So xen_vcpu_nr(cpu) will always return 0 in this case and
> > init_control_block will fail to execute.
> >
>
> I see,
>
> CPU_UP_PREPARE event happens before xen_starting_cpu() is called.
>
>
> > I am not sure how to fix it. I guess we could set up
> > per_cpu(xen_vcpu_id, *) in xen_guest_init. Any opinions?
>
> As we're not doing kexec on ARM we can fix the immediate issue. I don't
> know much about ARM and unfortunately I don't have a setup to test but
> it seems there is no early_per_cpu* infrastructure for ARM so we may fix
> it with the following:
>
> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
> index 3d2cef6..f193414 100644
> --- a/arch/arm/xen/enlighten.c
> +++ b/arch/arm/xen/enlighten.c
> @@ -170,9 +170,6 @@ static int xen_starting_cpu(unsigned int cpu)
> pr_info("Xen: initializing cpu%d\n", cpu);
> vcpup = per_cpu_ptr(xen_vcpu_info, cpu);
>
> - /* Direct vCPU id mapping for ARM guests. */
> - per_cpu(xen_vcpu_id, cpu) = cpu;
> -
> info.mfn = virt_to_gfn(vcpup);
> info.offset = xen_offset_in_page(vcpup);
>
> @@ -330,6 +327,7 @@ static int __init xen_guest_init(void)
> {
> struct xen_add_to_physmap xatp;
> struct shared_info *shared_info_page = NULL;
> + int cpu;
>
> if (!xen_domain())
> return 0;
> @@ -380,7 +378,8 @@ static int __init xen_guest_init(void)
> return -ENOMEM;
>
> /* Direct vCPU id mapping for ARM guests. */
> - per_cpu(xen_vcpu_id, 0) = 0;
> + for_each_possible_cpu(cpu)
> + per_cpu(xen_vcpu_id, cpu) = cpu;
>
> xen_auto_xlat_grant_frames.count = gnttab_max_grant_frames();
> if (xen_xlate_map_ballooned_pages(&xen_auto_xlat_grant_frames.pfn,
>
> (not tested, if we can't use for_each_possible_cpu() that early we'll
> have to check against NR_CPUS instead).

Kind of defeats the purpose of xen_vcpu_id, but I guess it should work.
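
For illustration only, here is a user-space sketch of the ordering problem
and of the proposed fix. It is not kernel code: the array stands in for the
per-CPU xen_vcpu_id variable, and NR_CPUS, VCPU_ID_UNSET and every model_*
name are hypothetical. The prepare step is simplified to "the mapping for
this CPU must already be valid", which is the property init_control_block
depends on.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS        4           /* hypothetical CPU count for the model */
#define VCPU_ID_UNSET  UINT32_MAX  /* marks "no mapping recorded yet" */

/* Stand-in for the per-CPU xen_vcpu_id variable. */
static uint32_t xen_vcpu_id[NR_CPUS];

static void model_reset(void)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		xen_vcpu_id[cpu] = VCPU_ID_UNSET;
}

/* Stand-in for xen_vcpu_nr(): Xen's id for a given Linux CPU. */
static uint32_t model_vcpu_nr(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return xen_vcpu_id[cpu];
}

/* Before the fix: only the boot CPU is mapped at guest init. */
static void model_guest_init_boot_only(void)
{
	xen_vcpu_id[0] = 0;
}

/* After the fix: every CPU gets the direct mapping at guest init. */
static void model_guest_init_all(void)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		xen_vcpu_id[cpu] = cpu;
}

/* Prepare step: runs on the boot CPU before the target CPU executes. */
static int model_prepare_cpu(unsigned int cpu)
{
	return model_vcpu_nr(cpu) == cpu ? 0 : -1;
}

/* Starting step: runs on the target CPU itself, so it comes too late. */
static void model_starting_cpu(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	xen_vcpu_id[cpu] = cpu;
}

/* Bring-up order as described in this thread: prepare, then starting. */
static int model_bring_up(unsigned int cpu)
{
	int ret = model_prepare_cpu(cpu);

	if (ret)
		return ret;
	model_starting_cpu(cpu);
	return 0;
}
```

With model_guest_init_boot_only() the prepare step fails for every
secondary CPU, because the mapping is only written in the starting step.
With model_guest_init_all() the mapping is already in place when the boot
CPU prepares each secondary.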


> But unfortunately we'll have to get back to this in the future. Turns out
> we need to know Xen's idea of vCPU id _before_ this vCPU starts
> executing code.

Why?


> On x86 we used ACPI_ID from MADT. Is there anything like
> it on ARM?

MPIDR:

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0388e/CIHEBGFG.html

But first we should formally document the relationship between MPIDR and
vcpu id.
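
As a rough sketch of what reading the affinity fields out of a 32-bit
MPIDR value looks like: the field layout (Aff0 in bits [7:0], Aff1 in
bits [15:8], Aff2 in bits [23:16]) follows the ARMv7 architecture, but
the helper and macro names below are made up for this example. How those
fields correspond to a Xen vcpu id is exactly the part that is not
documented yet, so the sketch deliberately stops at extracting them.

```c
#include <stdint.h>

#define MPIDR_AFF_BITS  8u           /* width of one affinity field */
#define MPIDR_AFF_MASK  0xffu        /* mask for one affinity field */
#define MPIDR_HWID_MASK 0x00ffffffu  /* Aff2:Aff1:Aff0 taken together */

/* Return affinity level 0, 1 or 2 of a 32-bit MPIDR value. */
static inline uint32_t mpidr_aff(uint32_t mpidr, unsigned int level)
{
	return (mpidr >> (MPIDR_AFF_BITS * level)) & MPIDR_AFF_MASK;
}

/* Strip the flag bits and keep only the three affinity fields. */
static inline uint32_t mpidr_hwid(uint32_t mpidr)
{
	return mpidr & MPIDR_HWID_MASK;
}
```

For example, the value 0x80000102 decodes to Aff2 = 0, Aff1 = 1 and
Aff0 = 2.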