Re: [RFC PATCH v4 00/19] Core scheduling v4

From: Aaron Lu
Date: Mon Feb 24 2020 - 22:44:49 EST


On Fri, Feb 14, 2020 at 02:10:53PM +0800, Aubrey Li wrote:
> On Fri, Feb 14, 2020 at 2:37 AM Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> wrote:
> >
> > On 2/12/20 3:07 PM, Julien Desfossez wrote:
> >
> > >>
> > >> Have you guys been able to make progress on the issues with I/O intensive workload?
> > >
> > > I finally have some results with the following branch:
> > > https://github.com/digitalocean/linux-coresched/tree/coresched/v4-v5.5.y
> > >
> > >
> > > So the main conclusion is that for all the test cases we have studied,
> > > core scheduling performs better than nosmt ! This is different than what
> > > we tested a while back, so it's looking really good !
> >
> > Thanks for the data. They look really encouraging.
> >
> > Aubrey is working on updating his patches so it will load balance
> > to the idle cores a bit better. We are testing those and will post
> > the update soon.
>
> I added a helper to check task and cpu cookie match, including the
> entire core idle case. The refined patchset updated at here:
> https://github.com/aubreyli/linux/tree/coresched_v4-v5.5.2
>
> This branch also includes Tim's patchset. According to our testing
> result, the performance data looks on par with the previous version.
> A good news is, v5.4.y stability issue on our 8 numa node machine
> is gone on this v5.5.2 branch.

One problem I hit when testing this branch: the weight of the cgroup
seems to be ignored.

On a 2-socket/16-core/32-thread VM, I grouped 8 sysbench (cpu mode)
threads into one cgroup (cgA) and another 16 sysbench (cpu mode) threads
into another cgroup (cgB). cgA's and cgB's cpusets are set to the same
socket's 8 cores/16 CPUs. cgA's cpu.shares is set to 10240 while cgB's
cpu.shares is set to 2 (so consider cgB the noise workload and cgA
the real workload).
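
The cgroup setup described above can be sketched roughly as follows. This
is only an illustration, assuming a cgroup v1 layout with separate "cpu"
and "cpuset" hierarchies under a common root (typically /sys/fs/cgroup).
The helper name write_cgroup_config, the CPU list "0-15" and the memory
node "0" are placeholders, not values taken from the actual test machine.

```python
import os
import tempfile


def write_cgroup_config(root, name, shares, cpus, mems="0"):
    """Write cpu.shares and cpuset.{cpus,mems} for cgroup `name` under `root`.

    Assumes a cgroup v1 layout: <root>/cpu/<name> and <root>/cpuset/<name>.
    `cpus` and `mems` are placeholders; use the values for the real socket.
    """
    cpu_dir = os.path.join(root, "cpu", name)
    cpuset_dir = os.path.join(root, "cpuset", name)
    os.makedirs(cpu_dir, exist_ok=True)
    os.makedirs(cpuset_dir, exist_ok=True)

    settings = {
        os.path.join(cpuset_dir, "cpuset.cpus"): cpus,
        os.path.join(cpuset_dir, "cpuset.mems"): mems,
        os.path.join(cpu_dir, "cpu.shares"): str(shares),
    }
    for path, value in settings.items():
        with open(path, "w") as f:
            f.write(value)
    return settings


if __name__ == "__main__":
    # Dry run against a temporary directory instead of the real cgroupfs.
    root = tempfile.mkdtemp()
    for name, shares in (("cgA", 10240), ("cgB", 2)):
        for path, value in write_cgroup_config(root, name, shares, "0-15").items():
            print(path, "=", value)
```

Pointing root at the real cgroup mount requires root privileges, and the
sysbench threads still have to be attached to each group separately.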

I had expected cgA to occupy 8 cpus(with each cpu on a different core)
most of the time since it has way more weight than cgB, while cgB should
occupy almost no CPUs since:
- when cgB's task is in the same CPU queue as cgA's task, then cgB's
task is given very little CPU due to its small weight;
- when cgB's task is in a CPU queue whose sibling's queue has cgA's
task, cgB's task should be forced idle(again, due to its small weight).
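
To put a number on "very little CPU": under an idealized proportional-share
model, two groups competing for the same CPU split it in the ratio of
their weights. The sketch below is a simplification; it ignores how the
kernel spreads a group's shares across per-CPU runqueues and any minimum
weight clamping, and the function name ideal_cpu_share is made up for
illustration.

```python
def ideal_cpu_share(weights):
    """Return each group's fraction of one contended CPU.

    Idealized model: fraction = weight / sum of all weights.
    This is not the kernel's exact per-CPU group weight calculation.
    """
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


if __name__ == "__main__":
    shares = ideal_cpu_share({"cgA": 10240, "cgB": 2})
    for name, fraction in shares.items():
        print(f"{name}: {fraction:.4%}")
```

With cpu.shares of 10240 versus 2, cgA would get 10240/10242 of a
contended CPU in this model, i.e. well over 99.9%.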

But testing shows cgA occupies only 2 cpus during the entire run while
cgB enjoys the remaining 14 cpus. As a comparison, when coresched is off,
cgA can occupy 8 cpus during its run.

I haven't taken a look at the patches yet, but I would like to raise the
problem first. My gut feeling is that we didn't balance the load across
the CPUs.

P.S. It's not that I care about the VM's performance; it's just easier to
test kernel stuff in a VM than on bare metal. The VM's CPU setup might
seem weird, but I just set it up to match my host's.