Re: [PATCH v4 11/11] sched/fair: rework find_idlest_group

From: Qais Yousef
Date: Thu Nov 21 2019 - 09:58:15 EST


On 11/20/19 19:55, Qais Yousef wrote:
> On 11/20/19 20:28, Vincent Guittot wrote:
> > I ran a few more tests and I can get hw counters that are either 0 or not.
> > The main difference is which CPU the test runs on: either big or little.
> > little always returns 0 and big always returns a non-zero value
> >
> > on v5.4-rc7 and tip/sched/core, cpu0-3 return 0 and the others non-zero
> > but on next, it's the opposite: cpu0-3 return a non-zero ratio
> >
> > Could you try to run the test with taskset to run it on big or little?
>
> Nice catch!
>
> Yes indeed using taskset and forcing it to run on the big cpus it passes even
> on linux-next/next-20191119.
>
> So the relation to your patch is that it just biased where this test is likely
> to run in my case and highlighted the breakage in the counters, probably?
>
> FWIW, if I use taskset to force always big it passes. Always little, the
> counters are always 0 and it passes too. But if I have a mix I see what I
> pasted before: the counters have valid values but nhw is 0.
>
> So the questions are: why aren't the little counters working, and should we
> generally run the test with taskset, since it can't handle the asymmetry
> correctly?
>
> Let me first try to find out why the little counters aren't working.

So it turns out there's a caveat to using perf counters on big.LITTLE
systems.

Mark on CC can explain this better than me so I'll leave the details to him.

Sorry about the noise, Vincent. It seems your patch shifted things slightly,
causing the task to migrate to another CPU. That triggered the failure when
reading the perf counters and, in turn, the test failure.
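
For anyone reproducing this, here is a minimal sketch of the taskset-style
pinning used above, so the task cannot migrate between CPU types while the
counters are being read. It is only an illustration: the helper name
pin_to_cpus is made up, and the CPU numbers in the usage part are an assumed
layout (cpu0-3 as one cluster, as on the board discussed here). Check the
topology of your own system first.

```python
import os


def pin_to_cpus(cpus):
    """Restrict the caller to the given CPUs, similar to `taskset -c`.

    Hypothetical helper for illustration. On Linux, pid 0 means the
    calling thread. Returns the previous affinity mask so that the
    caller can restore it afterwards.
    """
    previous = os.sched_getaffinity(0)
    # Only keep CPUs we are currently allowed to run on.
    wanted = set(cpus) & previous
    if not wanted:
        raise ValueError(
            f"none of {sorted(cpus)} is in the allowed set {sorted(previous)}"
        )
    os.sched_setaffinity(0, wanted)
    return previous


if __name__ == "__main__":
    # Assumption: cpu0-3 form one cluster; adjust for your board.
    old = pin_to_cpus({0, 1, 2, 3})
    try:
        print("pinned to", sorted(os.sched_getaffinity(0)))
        # ... run the perf counter test here ...
    finally:
        # Restore the original affinity mask.
        os.sched_setaffinity(0, old)
```

This only keeps the task on one set of CPUs for the duration of the test. It
does not explain or fix the counter behaviour itself.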

Thanks

--
Qais Yousef