Re: [PATCH 2/2] sched/fair: Relax task_hot() for misfit tasks

From: Valentin Schneider
Date: Fri May 07 2021 - 09:47:09 EST



Hi Vincent, apologies for the belated reply.

On 30/04/21 08:58, Vincent Guittot wrote:
> On Wed, 21 Apr 2021 at 12:52, Valentin Schneider
> <valentin.schneider@xxxxxxx> wrote:
>> On 20/04/21 16:33, Vincent Guittot wrote:
>> > Is it something that happens often or just a sporadic/transient state?
>> > I mean, is it really worth the extra complexity, and do you see a
>> > performance improvement?
>> >
>>
>> "Unfortunately" yes, this is a relatively common scenario when running "1
>> big task per CPU" types of workloads. The expected behaviour for big.LITTLE
>> systems is to upmigrate tasks stuck on the LITTLE CPUs as soon as a big CPU
>> becomes free, usually via newidle balance (which, since they process work
>> faster than the LITTLEs, is bound to happen), and an extra task being
>> enqueued at "the wrong time" can prevent this from happening.
>>
>> This usually means a misfit task can take a few dozen extra ms than it
>
> A few dozen ms is quite long. With a big core being idle, it should try
> every 8ms on a quad x quad system and I suspect the next try will be
> during the next tick. It would be good to understand why it has to wait
> so long.
>

True, IIRC this was mostly due to a compound effect of the different issues
I've described in that thread (and the previous one). Now that

9bcb959d05ee ("sched/fair: Ignore percpu threads for imbalance pulls")

is in, I'll re-run some tests against upstream and see how we fare.
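
For reference, the "every 8ms on a quad x quad system" figure quoted above
can be reproduced with a rough userspace model. This is only a sketch, not
the kernel code: it assumes the base balance interval of a sched domain (in
ms) equals the number of CPUs it spans, and that a busy CPU stretches that
interval by a multiplicative factor. The names sd_model, sd_model_init() and
sd_model_interval_ms() are hypothetical, and the busy factor is left as a
parameter rather than tied to any particular kernel version. The real code
also converts to jiffies and clamps the result, which is ignored here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few sched-domain fields this model needs. */
struct sd_model {
	unsigned int span_weight;      /* number of CPUs spanned by the domain */
	unsigned int balance_interval; /* base balance interval, in ms */
	unsigned int busy_factor;      /* multiplier applied when the CPU is busy */
};

/*
 * Assumption: the base interval (in ms) starts out equal to the number of
 * CPUs in the domain, which is what yields 8ms for a 4 big + 4 LITTLE system.
 */
static struct sd_model sd_model_init(unsigned int nr_cpus,
				     unsigned int busy_factor)
{
	struct sd_model sd = {
		.span_weight = nr_cpus,
		.balance_interval = nr_cpus,
		.busy_factor = busy_factor,
	};

	return sd;
}

/* Interval between periodic balance attempts, in ms, never below 1ms. */
static unsigned int sd_model_interval_ms(const struct sd_model *sd,
					 bool cpu_busy)
{
	unsigned int interval = sd->balance_interval;

	if (cpu_busy)
		interval *= sd->busy_factor;

	return interval ? interval : 1;
}
```

With sd_model_init(8, 16), where 16 is just an illustrative busy factor, an
idle CPU gets an 8ms interval while a busy one gets 128ms. Under this model
an idle big CPU would be expected to retry well within "a few dozen ms",
which is consistent with the delay being a compound effect rather than a
single missed balance attempt.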