On 12 May 2017 at 22:19, Rohit Jain wrote:
> On 05/12/2017 12:46 PM, Peter Zijlstra wrote:
>> On Fri, May 12, 2017 at 11:04:26AM -0700, Rohit Jain wrote:
>>> The patch avoids CPUs which might be considered interrupt-heavy when
>>> trying to schedule threads (on the push side) in the system. Interrupt
>>> awareness has only been added into the fair scheduling class.
>>>
>>> It does so by using the following algorithm:
>>> --------------------------------------------------------------------------
>>> 1) When the interrupt is getting processed, the start and the end times
>>> are noted for the interrupt on a per-cpu basis.
>>
>> IRQ_TIME_ACCOUNTING you mean?
>
> Yes. Exactly.
>
>>> 2) On a periodic basis the interrupt load is processed for each run
>>> queue and this is mapped in terms of percentage in a global array. The
>>> interrupt load for a given CPU is also decayed over time, so that the
>>> most recent interrupt load has the biggest contribution in the interrupt
>>> load calculations. This would mean the scheduler will try to avoid CPUs
>>> (if it can) when scheduling threads which have been recently busy with
>>> handling hardware interrupts.
>>
>> You mean like how it's already added to rt_avg? Which is then used
>> to lower a CPU's capacity.
>
> Right. The only difference I see is that it is not being used on the
> enqueue side as of now.
>
>>> 3) Any CPU which lies above the 80th percentile in terms of percentage
>>> interrupt load is considered interrupt-heavy.
>>> 4) During idle CPU search from the scheduler perspective this
>>> information is used to skip CPUs if better are available.
>>> 5) If none of the CPUs are better in terms of idleness and interrupt
>>> load, then the interrupt-heavy CPU is considered to be the best
>>> available CPU.
>>
>> I would much rather you work with the EAS people and extend the capacity
>> awareness of those code paths. Then, per the existing logic, things
>> should just work out.
>
> Did you mean we should use the capacity as a metric on the enqueue side
> and not introduce a new metric?

In fact, the capacity is already taken into account in the wake-up
path. You can look at wake_affine(), wake_cap() and
capacity_spare_wake().
The current implementation takes care of the original capacity, but it
might be extended to take into account capacity stolen by irq/rt as
well.
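
For readers following along, the five steps quoted from the patch
description can be sketched in user-space C roughly as below. This is
only an illustration, not the code from the patch: struct cpu_sketch,
decay_irq_load(), irq_load_p80() and pick_idle_cpu() are hypothetical
names, the halving decay is an assumed decay rule, and "80th percentile"
is read here as a nearest-rank percentile across the CPUs' loads.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 8	/* hypothetical upper bound for this sketch */

/* Hypothetical per-CPU state; not the structures used by the patch. */
struct cpu_sketch {
	unsigned int irq_load_pct;	/* decayed interrupt load, 0..100 */
	int idle;			/* non-zero if the CPU is idle */
};

/*
 * Step 2: fold the IRQ share of the latest period into the running value.
 * Assumed decay rule: old and new are averaged, so older periods lose
 * half their weight every period.
 */
static unsigned int decay_irq_load(unsigned int old_pct,
				   unsigned int period_pct)
{
	return (old_pct + period_pct) / 2;
}

/* Step 3: nearest-rank 80th percentile of the per-CPU interrupt loads. */
static unsigned int irq_load_p80(const struct cpu_sketch *cpus, size_t n)
{
	unsigned int sorted[NR_CPUS_SKETCH];
	size_t i, j, rank;

	assert(n > 0 && n <= NR_CPUS_SKETCH);
	for (i = 0; i < n; i++) {	/* insertion sort, ascending */
		unsigned int v = cpus[i].irq_load_pct;

		for (j = i; j > 0 && sorted[j - 1] > v; j--)
			sorted[j] = sorted[j - 1];
		sorted[j] = v;
	}
	rank = (8 * n + 9) / 10;	/* ceil(0.8 * n) */
	return sorted[rank - 1];
}

/*
 * Steps 4 and 5: prefer an idle CPU that is not interrupt-heavy; if every
 * idle CPU is interrupt-heavy, fall back to the first such CPU.
 * Returns -1 when no CPU is idle.
 */
static int pick_idle_cpu(const struct cpu_sketch *cpus, size_t n)
{
	unsigned int p80 = irq_load_p80(cpus, n);
	int fallback = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!cpus[i].idle)
			continue;
		if (cpus[i].irq_load_pct <= p80)
			return (int)i;		/* idle, not interrupt-heavy */
		if (fallback < 0)
			fallback = (int)i;	/* idle but interrupt-heavy */
	}
	return fallback;
}
```

With loads {10, 90, 20, 30, 40} and only CPUs 1 and 2 idle, the
threshold is 40, so CPU 1 (load 90) is skipped in favour of CPU 2; if
CPU 2 becomes busy, CPU 1 is chosen anyway, as step 5 describes.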
It doesn't matter how the capacity is lowered, at some point you just
don't want to put tasks on that CPU. It really doesn't matter if that's
because of IRQs, SoftIRQs, (higher priority) Real-Time tasks, thermal
throttling or anything else.
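
As a rough illustration of that point, a single "remaining capacity"
number can absorb every source of lost time, and placement then only
looks at that number. The sketch below is a user-space approximation
under stated assumptions, not the kernel's implementation:
remaining_capacity() and cpu_has_room() are hypothetical helpers, and
stolen_pct stands for the combined share of time taken by IRQs, RT tasks
or anything else.

```c
#include <assert.h>

/*
 * Hypothetical helper: scale a CPU's original capacity by the share of
 * time that is NOT stolen. The caller sums all sources of stolen time
 * (irq, softirq, rt, throttling) into stolen_pct; the cause is irrelevant.
 */
static unsigned int remaining_capacity(unsigned int orig_capacity,
				       unsigned int stolen_pct)
{
	unsigned int cap;

	assert(stolen_pct <= 100);
	cap = orig_capacity * (100 - stolen_pct) / 100;
	return cap ? cap : 1;	/* assumed floor: never report zero */
}

/* Hypothetical placement check: does the task's utilization still fit? */
static int cpu_has_room(unsigned int orig_capacity, unsigned int stolen_pct,
			unsigned int task_util)
{
	return remaining_capacity(orig_capacity, stolen_pct) >= task_util;
}
```

For example, a CPU with an original capacity of 1024 that loses 25% of
its time is treated as having 768 left, so a task with utilization 800
no longer fits there, whatever consumed the missing 25%.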