Re: [RFC PATCH v4 00/19] Core scheduling v4

From: Tim Chen
Date: Fri Feb 28 2020 - 18:55:28 EST


On 2/26/20 1:54 PM, Vineeth Remanan Pillai wrote:

> rq->curr being NULL can mean that the sibling is idle or forced idle.
> In both the cases, I think it makes sense to migrate a task so that it can
> compete with the other sibling for a chance to run. This function
> can_migrate_task actually only says if this task is eligible and
> later part of the code decides whether it is okay to migrate it
> based on factors like load and util and capacity. So I think its
> fine to declare the task as eligible if the dest core is running
> idle. Does this thinking make sense?
>
> On our testing, it did not show much degradation in performance with
> this change. I am reworking the fix by removing the check for
> task_est_util. It doesn't seem to be valid to check for util to migrate
> the task.
>

In Aaron's test case, there is a large load imbalance between the one core
where all the grp A tasks are and the other cores where the grp B tasks are
spread around. Normally, the load balancer would migrate grp A tasks off that core.

Aubrey's can_migrate_task patch prevents the load balancer from migrating a
task if the core cookie on the target queue doesn't match. The thought was that
migrating the task there would induce forced idle and reduce cpu utilization.
That kept all the grp A tasks from getting migrated and kept the imbalance
indefinitely in Aaron's test case.

Perhaps we should also look at the load imbalance between the src rq and
target rq. If the imbalance is big (say two fully cpu bound tasks' worth
of load), we should migrate anyway despite the cookie mismatch. We are willing
to pay a bit in forced idle in exchange for balancing the load out more.
I think Aubrey's patch on can_migrate_task should be more friendly to
Aaron's test scenario if such logic is incorporated.
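
Something along these lines is what I have in mind. This is only a rough
standalone model of the check, not kernel code: struct rq_model,
cookie_allows_migration(), TASK_FULL_LOAD and IMBALANCE_OVERRIDE are all
made-up names, and the value 1024 for one cpu bound task's load is an
assumption for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a run queue; NOT the kernel's struct rq. */
struct rq_model {
	unsigned long load;        /* total runnable load on this queue */
	unsigned long core_cookie; /* cookie the core is currently running */
};

/* Assumed load contributed by one fully cpu bound task. */
#define TASK_FULL_LOAD		1024UL
/* Assumed threshold: two cpu bound tasks' worth of load. */
#define IMBALANCE_OVERRIDE	(2 * TASK_FULL_LOAD)

/*
 * Decide whether the cookie check should let a task move from src to dst.
 * A matching cookie is always fine. On a mismatch, still allow the move
 * when src is heavier than dst by at least IMBALANCE_OVERRIDE, accepting
 * some forced idle in exchange for a better balance.
 */
static bool cookie_allows_migration(unsigned long task_cookie,
				    const struct rq_model *src,
				    const struct rq_model *dst)
{
	assert(src && dst);

	if (task_cookie == dst->core_cookie)
		return true;

	if (src->load > dst->load &&
	    src->load - dst->load >= IMBALANCE_OVERRIDE)
		return true;

	return false;
}
```

As with can_migrate_task itself, this would only say the task is eligible.
The later load/util/capacity checks would still decide whether it actually
gets moved.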

In Vineeth's fix, we only look at the currently running task's weight in
src and dst rq. Perhaps the load on the src and dst rq needs to be considered
to prevent too great an imbalance between the run queues?

Tim