Re: [RFC PATCH v3 00/16] Core scheduling v3

From: Aaron Lu
Date: Fri Oct 11 2019 - 23:55:16 EST


On Fri, Oct 11, 2019 at 08:10:30AM -0400, Vineeth Remanan Pillai wrote:
> > Thanks for the clarification.
> >
> > Yes, this is the initialization issue I mentioned before when core
> > scheduling is initially enabled. rq1's vruntime is bumped the first time
> > update_core_cfs_rq_min_vruntime() is called and if there are already
> > some tasks queued, new tasks queued on rq1 will be starved to some extent.
> >
> > Agree that this needs fix. But we shouldn't need do this afterwards.
> >
> > So do I understand correctly that patch1 is meant to solve the
> > initialization issue?
>
> I think we need this update logic even after initialization. I mean, core
> runqueue's min_vruntime can get updated every time when the core
> runqueue's min_vruntime changes with respect to the sibling's min_vruntime.
> So, whenever this update happens, we would need to propagate the changes
> down the tree right? Please let me know if I am visualizing it wrong.

I don't think we need do the normalization afterwrads and it appears
we are on the same page regarding core wide vruntime.

The intent of my patch is to treat all the root level sched entities of
the two siblings as if they are in a single cfs_rq of the core. With a
core wide min_vruntime, the core scheduler can decide which sched entity
to run next. And the individual sched entity's vruntime shouldn't be
changed based on the change of core wide min_vruntime, or faireness can
hurt(if we add or reduce vruntime of a sched entity, its credit will
change).

The weird thing about my patch is, the min_vruntime is often increased,
it doesn't point to the smallest value as in a traditional cfs_rq. This
probabaly can be changed to follow the tradition, I don't quite remember
why I did this, will need to check this some time later.

All those sub cfs_rq's sched entities are not interesting. Because once
we decided which sched entity in the root level cfs_rq should run next,
we can then pick the final next task from there(using the usual way). In
other words, to make scheduler choose the correct candidate for the core,
we only need worry about sched entities on both CPU's root level cfs_rqs.

Does this make sense?