Re: [PATCH v4 02/16] sched/core: uclamp: map TASK's clamp values into CPU's clamp groups

From: Peter Zijlstra
Date: Wed Sep 12 2018 - 13:42:45 EST


On Wed, Sep 12, 2018 at 06:35:15PM +0100, Patrick Bellasi wrote:
> On 12-Sep 18:12, Peter Zijlstra wrote:

> > No idea; but if you want to go all fancy you can replace the whole
> > uclamp_map thing with something like:
> >
> > struct uclamp_map {
> > 	union {
> > 		struct {
> > 			unsigned long v : 10;
> > 			unsigned long c : BITS_PER_LONG - 10;
> > 		};
> > 		atomic_long_t s;
> > 	};
> > };
>
> That sounds really cool and scary at the same time :)
>
> The v:10 requires that we never set SCHED_CAPACITY_SCALE>1024
> or that we use it to track a percentage value (i.e. [0..100]).

Or we pick 11 bits; it seems unlikely that capacity would be larger than 2k.

> One of the last patches introduces percentage values to userspace.
> But, I was considering that in kernel space we should always track
> full scale utilization values.
>
> The c:(BITS_PER_LONG-10) restricts the range of concurrently active
> SE refcounting the same clamp value. Which, for some 32bit systems is
> only 4 millions among tasks and cgroups... maybe still reasonable...

Yeah, on 32bit having 4M tasks seems 'unlikely'.
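
For illustration only, a rough userspace sketch of the packed value/refcount
idea, assuming the 11-bit value field suggested above. It uses explicit
shifts and C11 atomics instead of the bitfield union, and every name in it
(uclamp_slot, uclamp_slot_get, the UCLAMP_* macros) is hypothetical, not
taken from the patch:

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout: low bits hold the clamp value, the rest a refcount. */
#define UCLAMP_VALUE_BITS 11
#define UCLAMP_VALUE_MASK ((1UL << UCLAMP_VALUE_BITS) - 1)
#define UCLAMP_COUNT_MAX  (~0UL >> UCLAMP_VALUE_BITS)

/* The value field must be able to hold a full-scale capacity of 1024. */
static_assert(1024 <= UCLAMP_VALUE_MASK, "value field too narrow");

struct uclamp_slot {
	atomic_ulong word;	/* value and refcount packed in one word */
};

static unsigned long uclamp_pack(unsigned long value, unsigned long count)
{
	return (count << UCLAMP_VALUE_BITS) | (value & UCLAMP_VALUE_MASK);
}

static unsigned long uclamp_value(unsigned long word)
{
	return word & UCLAMP_VALUE_MASK;
}

static unsigned long uclamp_count(unsigned long word)
{
	return word >> UCLAMP_VALUE_BITS;
}

/*
 * Take a reference on @slot for @value. Succeeds if the slot is free
 * (count == 0) or already tracks the same value; fails if it tracks a
 * different value or the refcount field is saturated.
 */
static bool uclamp_slot_get(struct uclamp_slot *slot, unsigned long value)
{
	unsigned long old = atomic_load(&slot->word);
	unsigned long new;

	do {
		unsigned long count = uclamp_count(old);

		if (count && uclamp_value(old) != value)
			return false;
		if (count == UCLAMP_COUNT_MAX)
			return false;
		new = uclamp_pack(value, count + 1);
		/* On failure, @old is reloaded and the checks are redone. */
	} while (!atomic_compare_exchange_weak(&slot->word, &old, new));

	return true;
}
```

Because value and count share one word, a single compare-and-swap both
claims the slot and bumps the refcount. The cost is the count range: with a
32-bit long, 10 value bits leave 22 bits of count (the ~4M above), while 11
value bits leave 21 bits, i.e. ~2M references.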