Re: [RFC 00/60] Coscheduling for Linux

From: Jan H. Schönherr
Date: Wed Sep 12 2018 - 15:34:26 EST


On 09/12/2018 02:24 AM, Nishanth Aravamudan wrote:
> [ I am not subscribed to LKML, please keep me CC'd on replies ]
>
> I tried a simple test with several VMs (in my initial test, I have 48
> idle 1-cpu 512-mb VMs and 2 idle 2-cpu, 2-gb VMs) using libvirt, none
> pinned to any CPUs. When I tried to set all of the top-level libvirt cpu
> cgroups to be co-scheduled (/bin/echo 1 >
> /sys/fs/cgroup/cpu/machine/<VM-x>.libvirt-qemu/cpu.scheduled), the
> machine hangs. This is using cosched_max_level=1.
>
> There are several moving parts there, so I tried narrowing it down, by
> only coscheduling one VM, and things seemed fine:
>
> /sys/fs/cgroup/cpu/machine/<VM-1>.libvirt-qemu# echo 1 > cpu.scheduled
> /sys/fs/cgroup/cpu/machine/<VM-1>.libvirt-qemu# cat cpu.scheduled
> 1
>
> One thing that is not entirely obvious to me (but might be completely
> intentional) is that since by default the top-level libvirt cpu cgroups
> are empty:
>
> /sys/fs/cgroup/cpu/machine/<VM-1>.libvirt-qemu# cat tasks
>
> the result of this should be a no-op, right? [This becomes relevant
> below] Specifically, all of the threads of qemu are in sub-cgroups,
> which do not indicate they are co-scheduling:
>
> /sys/fs/cgroup/cpu/machine/<VM-1>.libvirt-qemu# cat emulator/cpu.scheduled
> 0
> /sys/fs/cgroup/cpu/machine/<VM-1>.libvirt-qemu# cat vcpu0/cpu.scheduled
> 0
>

This setup *should* work. It should be possible to set cpu.scheduled
independent of the cpu.scheduled values of parent and child task groups.
Any intermediate regular task group (i.e. cpu.scheduled==0) will still
contribute its group fairness aspects.

That said, I see a hang, too. It seems to happen when there is a
cpu.scheduled!=0 group that is not a direct child of the root task group.
You seem to have "/sys/fs/cgroup/cpu/machine" as an intermediate group.
(The case of a ==0 group within a !=0 group within the root task group
works for me.)
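
To check whether a given setup matches the suspected trigger, something
like the following Python sketch might help. It is only an illustration
of the condition described above, not part of the patch set: the helper
names are made up, and it assumes the cgroup v1 cpu controller is mounted
at /sys/fs/cgroup/cpu and that cpu.scheduled reads back as a plain
integer, as in the transcripts quoted above.

```python
import os
import posixpath

# Assumption: mount point of the cgroup v1 cpu controller, as in the
# paths quoted in this thread.
ROOT = "/sys/fs/cgroup/cpu"


def nested_coscheduled(values, root=ROOT):
    """Return groups with cpu.scheduled != 0 that are not direct
    children of the root task group (the suspected hang trigger).

    `values` maps absolute cgroup paths to their cpu.scheduled value.
    """
    hits = []
    for path, scheduled in values.items():
        rel = posixpath.relpath(path, root)
        if rel == "." or rel.startswith(".."):
            continue  # the root itself, or something outside the hierarchy
        depth = len(rel.split("/"))  # 1 == direct child of the root
        if scheduled != 0 and depth > 1:
            hits.append(path)
    return sorted(hits)


def read_values(root=ROOT):
    """Collect cpu.scheduled for every group below `root`.

    Hypothetical helper; only useful on a kernel with the RFC applied.
    """
    values = {}
    for dirpath, _dirs, files in os.walk(root):
        if "cpu.scheduled" in files:
            with open(os.path.join(dirpath, "cpu.scheduled")) as f:
                values[dirpath] = int(f.read().strip())
    return values
```

Calling nested_coscheduled(read_values()) on the affected machine should
list every group that falls under the suspected case; with the setup
quoted above, that would include the <VM-x>.libvirt-qemu groups below
"machine".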

I'm going to dive into the code.

[...]
> I am happy to do any further debugging I can do, or try patches on top
> of those posted on the mailing list.

If you're willing, you can try to get rid of the intermediate "machine"
cgroup in your setup for the moment. This might tell us whether we're
looking at the same issue.

Thanks,
Jan