Re: [RESEND PATCH v2 1/2] sched/rt: add utilization tracking
From: Vincent Guittot
Date: Tue Aug 08 2017 - 09:56:52 EST
On 7 August 2017 at 18:44, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Fri, Aug 04, 2017 at 03:40:21PM +0200, Vincent Guittot wrote:
>
>> There were several comments on v1:
>> - As raised by Peter for v1, if IRQ time is taken into account in
>> rt_avg, it will not be accounted in rq->clock_task. This means that cfs
>> utilization is not affected by some extra contributions or decays
>> because of IRQ.
>
> Right.
>
> >> - Regarding the sync of rt and cfs utilization: both cfs and rt use the same
> >> rq->clock_task. So we have the same issue as cfs regarding blocked values.
> >> The utilization of idle cfs/rt rqs is not updated regularly but only when a
>> load_balance is triggered (more precisely a call to update_blocked_average).
>> I'd like to fix this issue for both cfs and rt with a separate patch that
>> will ensure that utilization (and load) are updated regularly even for
>> idle CPUs.
>
> Yeah, that needs help.
>
> >> - One last open question is the location of the rt utilization function in
> >> the fair.c file. PELT-related functions should probably move into a dedicated
> >> pelt.c file. This would also help to address one comment about having a place
> >> to update metrics of NOHZ idle CPUs. Thoughts?
>
> Probably, but I have a bunch of patches lined up changing that code, so
> lets not do that now.
OK. I can rebase and move the code once your patches are in.
>
> In any case, would something like the attached patches make sense? It
> completely replaces rt_avg with separate IRQ,RT and DL tracking.
It would be nice if we could replace rt_avg with something that has
the same dynamics as PELT.
The DL patch looks fine, but can't we rely on the deadline running
bandwidth to get the figures instead?
I don't think the IRQ tracking patch is working:
update_irq_load_avg(rq->clock, cpu_of(rq), rq, 1); is called in
update_rq_clock_task(), which is never called in irq context. In
order to use PELT for tracking irq and paravirt time, we would have
to call update_irq_load_avg() for every context switch between
irq/paravirt and task, which will probably be too heavy.
Also, because PELT is cpu invariant, the used value must now be
subtracted from cpu_capacity_orig of the local cpu in
scale_rt_capacity().