> also, it almost looks like there is a fundamental assumption in the code
> that you can get the current effective P state to make scheduler decisions on;
> on Intel at least that is basically impossible... and getting more so with
> every generation (likewise for AMD afaics)
> (you can get what you ran at on average over some time in the past, but not
> what you're at now or going forward)
As described above, it is not a strict assumption. From a scheduler
point of view we somehow need to know whether the cpus are truly fully
utilized (at their highest P-state), in which case we need to throw
more cpus at the problem (assuming that we have more than one task per
cpu), or whether we can just go to a higher P-state. We don't need a
strict guarantee that we get exactly the P-state that we request for
each cpu. The power scheduler generates hints and the power driver
gives us feedback on what we can roughly expect to get.
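To make the decision above concrete, here is a rough sketch in C. It is
not code from the patch set: the struct, the enum, the 90% threshold and
pick_hint() are all hypothetical names chosen for illustration, and the
capacity values stand in for whatever approximate feedback the power
driver can give.

```c
#include <stdbool.h>

/* Hypothetical feedback from the power driver; only approximate values. */
struct cpu_power_feedback {
	unsigned int util_pct;     /* measured utilization, 0..100 */
	unsigned int cur_capacity; /* capacity we can roughly expect right now */
	unsigned int max_capacity; /* capacity expected at the highest P-state */
};

/* Hypothetical hints the power scheduler could generate. */
enum power_hint {
	HINT_NONE,         /* enough headroom, nothing to do */
	HINT_RAISE_PSTATE, /* ask the power driver for a higher P-state */
	HINT_ADD_CPU,      /* throw more cpus at the problem */
};

#define BUSY_THRESHOLD_PCT 90 /* assumed value, purely illustrative */

static enum power_hint pick_hint(const struct cpu_power_feedback *fb,
				 unsigned int nr_tasks)
{
	/* Not busy enough to justify any change. */
	if (fb->util_pct < BUSY_THRESHOLD_PCT)
		return HINT_NONE;

	/* Busy, but the driver reports headroom: request a higher P-state. */
	if (fb->cur_capacity < fb->max_capacity)
		return HINT_RAISE_PSTATE;

	/*
	 * Busy at the highest expected capacity. More cpus only help if
	 * there is more than one task to spread.
	 */
	return nr_tasks > 1 ? HINT_ADD_CPU : HINT_NONE;
}
```

Note that nothing here reads the instantaneous P-state. The sketch only
compares two rough capacity estimates, so it still works if the
feedback is an average over some past window.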
> I'm rather nervous about calculating how many cores you want active as a
> core scheduler feature.
> I understand that for your big.LITTLE architecture you need this due to
> the asymmetry, but as a general rule for more symmetric systems it's known
> to be suboptimal by quite a real percentage. For a normal Intel single CPU
> system it's sort of the worst case you can do in that it leads to
> serializing tasks that could have run in parallel over multiple
> cores/threads.
> So at minimum this kind of logic must be enabled/disabled based on
> architecture decisions.
Packing clearly has to take power topology into account and do the right
thing for the particular platform. It is not in place yet, but will be
addressed. I believe it would make sense for dual cpu Intel systems to
pack at socket level?
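As a minimal sketch of what packing at socket level could mean, assuming
a symmetric system with the same number of cpus in every socket:
pack_by_socket() and its parameters are hypothetical and do not exist
in the kernel. The idea is to fill one socket before touching the next,
but never to put more than one task on a cpu, so tasks are not
serialized within a socket.

```c
#include <stddef.h>

/*
 * Hypothetical helper: decide how many tasks each socket should run.
 * Socket i only gets tasks once sockets 0..i-1 have one task per cpu.
 */
static void pack_by_socket(unsigned int nr_tasks,
			   unsigned int cpus_per_socket,
			   unsigned int *tasks_on_socket,
			   size_t nr_sockets)
{
	for (size_t i = 0; i < nr_sockets; i++) {
		/* At most one task per cpu in this socket. */
		unsigned int n = nr_tasks < cpus_per_socket ?
				 nr_tasks : cpus_per_socket;

		tasks_on_socket[i] = n;
		nr_tasks -= n;
	}
	/*
	 * Tasks still left over at this point exceed one task per cpu
	 * system-wide; this sketch does not place them.
	 */
}
```

With two sockets of four cpus each, three tasks all land on the first
socket and the second can stay idle, while six tasks give four on the
first socket and two on the second.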