Re: [RFC PATCH v3 00/16] Core scheduling v3

From: Aaron Lu
Date: Mon Oct 14 2019 - 05:57:38 EST


On Sun, Oct 13, 2019 at 08:44:32AM -0400, Vineeth Remanan Pillai wrote:
> On Fri, Oct 11, 2019 at 11:55 PM Aaron Lu <aaron.lu@xxxxxxxxxxxxxxxxx> wrote:
>
> >
> > I don't think we need do the normalization afterwrads and it appears
> > we are on the same page regarding core wide vruntime.

Should be "we are not on the same page..."

[...]
> > The weird thing about my patch is, the min_vruntime is often increased,
> > it doesn't point to the smallest value as in a traditional cfs_rq. This
> > probabaly can be changed to follow the tradition, I don't quite remember
> > why I did this, will need to check this some time later.
>
> Yeah, I noticed this. In my patch, I had already accounted for this and changed
> to min() instead of max() which is more logical that min_vruntime should be the
> minimum of both the run queue.

I now remembered why I used max().

Assume rq1 and rq2's min_vruntime are both at 2000 and the core wide
min_vruntime is also 2000. Also assume both runqueues are empty at the
moment. Then task t1 is queued to rq1 and runs for a long time while rq2
keeps empty. rq1's min_vruntime will be incremented all the time while
the core wide min_vruntime stays at 2000 if min() is used. Then when
another task gets queued to rq2, it will get really large unfair boost
by using a much smaller min_vruntime as its base.

To fix this, either max() is used as is done in my patch, or adjust
rq2's min_vruntime to be the same as rq1's on each
update_core_cfs_min_vruntime() when rq2 is found empty and then use
min() to get the core wide min_vruntime. Looks not worth the trouble to
use min().