Re: [PATCH RESEND v2 2/2] xen: enable vnuma for PV guest

From: David Vrabel
Date: Tue Nov 19 2013 - 09:57:12 EST


On 19/11/13 14:46, Konrad Rzeszutek Wilk wrote:
> On Tue, Nov 19, 2013 at 02:35:59PM +0000, David Vrabel wrote:
>> On 19/11/13 14:16, Konrad Rzeszutek Wilk wrote:
>>> On Tue, Nov 19, 2013 at 11:54:08AM +0000, David Vrabel wrote:
>>>> On 18/11/13 21:58, Elena Ufimtseva wrote:
>>>>> Enables numa if vnuma topology hypercall is supported and it is domU.
>>>> [...]
>>>>> --- a/arch/x86/xen/setup.c
>>>>> +++ b/arch/x86/xen/setup.c
>>>>> @@ -20,6 +20,7 @@
>>>>> #include <asm/numa.h>
>>>>> #include <asm/xen/hypervisor.h>
>>>>> #include <asm/xen/hypercall.h>
>>>>> +#include <asm/xen/vnuma.h>
>>>>>
>>>>> #include <xen/xen.h>
>>>>> #include <xen/page.h>
>>>>> @@ -598,6 +599,9 @@ void __init xen_arch_setup(void)
>>>>>          WARN_ON(xen_set_default_idle());
>>>>>          fiddle_vdso();
>>>>>  #ifdef CONFIG_NUMA
>>>>> -        numa_off = 1;
>>>>> +        if (!xen_initial_domain() && xen_vnuma_supported())
>>>>> +                numa_off = 0;
>>>>> +        else
>>>>> +                numa_off = 1;
>>>>>  #endif
>>>>>  }
>>>>
>>>> I think this whole #ifdef CONFIG_NUMA can be removed and hence
>>>> xen_vnuma_supported() can be removed as well.
>>>>
>>>> For any PV guest we can call the xen_numa_init() and it will do the
>>>> right thing.
>>>>
>>>> For dom0, the hypercall will either: return something sensible (if in
>>>> the future Xen sets something up), or it will error.
>>>>
>>>> If Xen does not have vnuma support, the hypercall will error.
>>>>
>>>> In both error cases, the dummy numa node is set up as required.
>>>
>>> Incorrect. It will end up calling:
>>>
>>> if (!numa_init(amd_numa_init))
>>>
>>> which will crash dom0 (see 8d54db795 "xen/boot: Disable NUMA for PV guests.")
>>> as that amd_numa_init is called before the dummy node init.
>>
>> No it won't. Any error path after the check for a PV guest will add the
>> dummy node and return success, skipping any of the hardware-specific setup.
>
> Duh! I totally missed 'return' at the end of the check!
>
> However, even with that (so the return), that means
> this part won't be called:
>
>         numa_init(dummy_numa_init);
>
> Which means there won't be any dummy numa setup?

The relevant bits in dummy_numa_init are in the error path of
xen_numa_init().

I do think this approach (using the provided API to set up the single
(dummy) node) is preferable to calling dummy_numa_init().
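
To make the control flow concrete, here is a rough user-space mock of
what I am describing. It is only a sketch and not the code from the
series: every mock_* name, the struct, the node limit and the error
value are assumptions made for illustration. The point it shows is that
a PV guest returns before any hardware-specific probing, and that a
failed hypercall still leaves a single node registered.

```c
/*
 * User-space mock of the NUMA init flow discussed in this thread.
 * Not kernel code: all mock_* names, the struct layout, the node limit
 * and the -1 error value are assumptions for illustration only.
 */
#include <assert.h>
#include <stdbool.h>

#define MOCK_MAX_NODES 8

struct mock_state {
	bool pv_guest;		/* running as a Xen PV guest? */
	bool vnuma_hcall_ok;	/* does the mocked vnuma hypercall succeed? */
	int vnuma_nodes;	/* node count the hypercall would report */
	int nr_nodes;		/* nodes registered so far */
	bool hw_probe_ran;	/* did hardware-specific probing run? */
};

/* Stand-in for the vnuma topology hypercall (fails on dom0 or old Xen). */
static int mock_vnuma_hypercall(const struct mock_state *s)
{
	return s->vnuma_hcall_ok ? s->vnuma_nodes : -1;
}

/* Register one node covering all memory, as the dummy init would. */
static void mock_add_single_node(struct mock_state *s)
{
	s->nr_nodes = 1;
}

/* Mock of xen_numa_init(): every error path falls back to one node. */
static int mock_xen_numa_init(struct mock_state *s)
{
	int n = mock_vnuma_hypercall(s);

	if (n <= 0 || n > MOCK_MAX_NODES) {
		mock_add_single_node(s);
		return 0;	/* still reported as success */
	}

	s->nr_nodes = n;
	return 0;
}

/* Stand-in for hardware-specific probing such as amd_numa_init(). */
static int mock_hw_numa_init(struct mock_state *s)
{
	s->hw_probe_ran = true;
	s->nr_nodes = 2;
	return 0;
}

/* Mock of the top-level init: PV guests return before hardware probing. */
static void mock_x86_numa_init(struct mock_state *s)
{
	if (s->pv_guest && !mock_xen_numa_init(s)) {
		assert(s->nr_nodes >= 1);
		return;
	}

	if (!mock_hw_numa_init(s))
		return;

	mock_add_single_node(s);
}
```

Calling mock_x86_numa_init() with pv_guest set and vnuma_hcall_ok clear
models dom0 (or a Xen without vnuma): nr_nodes ends up as 1 and
hw_probe_ran stays false.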

If I thought the hypervisor ABI was finalized, I'd be happy with this
series as-is -- the remaining issues are superficial.

David