Re: [RFC PATCH v3 00/16] Core scheduling v3

From: Tim Chen
Date: Wed Sep 11 2019 - 12:19:07 EST


On 9/11/19 7:02 AM, Aaron Lu wrote:
> Hi Tim & Julien,
>
> On Fri, Sep 06, 2019 at 11:30:20AM -0700, Tim Chen wrote:
>> On 8/7/19 10:10 AM, Tim Chen wrote:
>>
>>> 3) Load balancing between CPU cores
>>> -----------------------------------
>>> Say if one CPU core's sibling threads get forced idled
>>> a lot as it has mostly incompatible tasks between the siblings,
>>> moving the incompatible load to other cores and pulling
>>> compatible load to the core could help CPU utilization.
>>>
>>> So just considering the load of a task is not enough during
>>> load balancing, task compatibility also needs to be considered.
>>> Peter has put in mechanisms to balance compatible tasks between
>>> CPU thread siblings, but not across cores.
>>>
>>> Status:
>>> I have not seen patches on this issue. This issue could lead to
>>> large variance in workload performance based on your luck
>>> in placing the workload among the cores.
>>>
>>
>> I've made an attempt in the following two patches to address
>> the load balancing of mismatched load between the siblings.
>>
>> It is applied on top of Aaron's patches:
>> - sched: Fix incorrect rq tagged as forced idle
>> - wrapper for cfs_rq->min_vruntime
>> https://lore.kernel.org/lkml/20190725143127.GB992@aaronlu/
>> - core vruntime comparison
>> https://lore.kernel.org/lkml/20190725143248.GC992@aaronlu/
>
> So both of you are working on top of my 2 patches that deal with the
> fairness issue, but I had the feeling Tim's alternative patches[1] are
> simpler than mine and achieves the same result(after the force idle tag

I think Julien's results show that my patches did not do as well as
your patches for fairness. Aubrey did some other testing and reached the
same conclusion. So I think keeping the forced idle time balanced is not
enough to maintain fairness.

I would love to see whether my load balancing patches help your workload.

Tim

> fix), so unless there is something I missed, I think we should go with
> the simpler one?
>
> [1]: https://lore.kernel.org/lkml/b7a83fcb-5c34-9794-5688-55c52697fd84@xxxxxxxxxxxxxxx/
>