Re: [RFC PATCH 2/5] sched: Add NOHZ_STATS_KICK
From: Morten Rasmussen
Date: Thu Jan 18 2018 - 05:38:17 EST
On Mon, Jan 15, 2018 at 09:26:09AM +0100, Vincent Guittot wrote:
> On Wednesday 03 Jan 2018 at 10:16:00 (+0100), Vincent Guittot wrote:
> > Hi Peter,
> >
> > On 22 December 2017 at 21:42, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > On Fri, Dec 22, 2017 at 07:56:29PM +0100, Peter Zijlstra wrote:
> > >> Right; but I figured we'd try and do it 'right' and see how horrible it
> > >> is before we try and do funny things.
> > >
> > > So now it should have a 32ms tick for up to .5s when the system goes
> > > completely idle.
> > >
> > > No idea how bad that is..
> >
> > I have tested your branch but the timer doesn't seem to fire correctly
> > because I can still see blocked load in the use case I have run.
> > I haven't found the reason yet.
>
> Hi Peter,
>
> With the patch below on top of your branch, the blocked loads are updated and
> decayed regularly. The main differences are:
> - It doesn't use a timer to trigger ilb; it relies on the tick and on a cpu
> becoming idle instead. The main drawback of this solution is that the blocked
> load is not updated while the system is fully idle. When the system leaves
> idle state, we have to wait for the next tick or newly idle event to update
> the blocked load, which can take up to a tick.
> If this is too long, we can check for kicking ilb when a task wakes up so the
> blocked load will be updated as soon as the system leaves idle state.
> The main advantage is that we don't wake up a fully idle system every 32ms to
> update blocked load that will not be used.
> - I'm working on one more improvement to use nohz_idle_balance in the newly
> idle case when the system is not overloaded and
> (this_rq->avg_idle > sysctl_sched_migration_cost). In this case, we can try to
> use nohz_idle_balance with NOHZ_STATS_KICK and abort as soon as it exceeds
> this_rq->avg_idle. This will remove some calls to kick_ilb and some wake-ups
> of idle cpus.
This sounds like what I meant in my other reply :-)
It seems pointless to have a timer to update PELT if the system is
completely idle, and when it isn't we can piggyback on other events to
make the updates happen.
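
To make the newly idle idea quoted above concrete, here is a rough userspace
sketch, not kernel code. Everything prefixed with sim_ is hypothetical: the
struct is a toy stand-in for a runqueue, the fixed per-cpu cost replaces real
time measurement, halving the load replaces the actual PELT decay, and the
500000ns threshold is only an assumed value for sysctl_sched_migration_cost.
It shows the gating condition and the abort once the time spent exceeds
avg_idle, nothing more.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for a per-cpu runqueue. */
struct sim_rq {
	uint64_t avg_idle;	/* expected idle duration in ns */
	uint64_t blocked_load;	/* load left behind by sleeping tasks */
	bool has_blocked;	/* true while blocked_load still needs decay */
};

/* Assumed threshold in ns, standing in for sysctl_sched_migration_cost. */
static const uint64_t sim_migration_cost = 500000;

/*
 * Update blocked load of idle cpus from a cpu that just became idle.
 * cost_per_cpu is an assumed fixed cost (ns) of updating one remote cpu.
 * Returns the number of cpus visited before the budget ran out.
 */
static size_t sim_newidle_update_blocked(const struct sim_rq *this_rq,
					  struct sim_rq *idle_rqs, size_t nr,
					  bool overloaded,
					  uint64_t cost_per_cpu)
{
	uint64_t spent = 0;
	size_t done = 0;

	/* Only worth trying if we expect to stay idle long enough. */
	if (overloaded || this_rq->avg_idle <= sim_migration_cost)
		return 0;

	for (size_t i = 0; i < nr; i++) {
		if (idle_rqs[i].has_blocked) {
			/* Toy decay; the kernel would apply PELT here. */
			idle_rqs[i].blocked_load /= 2;
			idle_rqs[i].has_blocked = idle_rqs[i].blocked_load != 0;
		}
		spent += cost_per_cpu;
		done++;

		/* Abort as soon as we have exceeded the expected idle time. */
		if (spent > this_rq->avg_idle)
			break;
	}
	return done;
}
```

In this toy model the remaining cpus simply keep their blocked load until the
next tick or newly idle event, which matches the trade-off described above:
no dedicated timer and no extra wake-up of an idle cpu.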