Re: [PATCH 1/2] sched: add virt sched domain for the guest

From: Liu ping fan
Date: Wed May 23 2012 - 05:58:27 EST


On Wed, May 23, 2012 at 4:48 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Wed, 2012-05-23 at 16:34 +0800, Liu ping fan wrote:
>> so we need to migrate some of vcpus from node-B to node-A, or to
>> node-C.
>
> This is absolutely broken, you cannot do that.
>
> A guest task might want to be node affine: it looks at the topology, sets
> a cpu affinity mask and expects to stay on that node.
>
> But then you come along, and flip one of those cpus to another node. The
> guest task will now run on another node and get remote memory accesses.
>
Oh, I had thought of using -smp to handle such a situation. The memory
access cost problem can be partly handled by kvm, while opening a gap
for the guest's scheduler to see the host numa info.

> Similarly for the guest kernel, it assumes cpu:node maps are static, it
> will use this for all kinds of things, including the allocation of
> per-cpu memory to be node affine to that cpu.
>
> If you go migrate cpus across nodes everything comes down.
>
>
> Please go do something else, I'll do this.

OK, thanks.
pingfan