Re: [PATCH v3 2/2] sched: consider missed ticks when updating global cpu load
From: Frederic Weisbecker
Date: Mon Oct 12 2015 - 13:45:47 EST
On Mon, Oct 05, 2015 at 10:15:55AM +0200, Peter Zijlstra wrote:
> On Sun, Oct 04, 2015 at 03:58:19PM +0900, Byungchul Park wrote:
> > anyway, it's wrong for update_process_times() to assume 1 tick because
> > tick_irq_exit() -> tick_nohz_irq_exit() -> tick_nohz_full_update_tick()
> > -> tick_nohz_restart_sched_tick() can happen at full NOHZ as i already
> > said. in this full NOHZ case for tick to restart from non-idle,
>
> NO_HZ_FULL is very much a work in progress, there's plenty wrong with
> it. But yes, if it does this then its broken here too, I'm not sure if
> Frederic is aware of this or not (I'm sure he's got a fairly big list of
> broken for NO_HZ_FULL).
Indeed and cpu load active is part of what needs to be fixed. I hope this
patchset will help.
>
> > 1. update_process_times() -> account_process_tick() must be able to handle
> > more than one tick, or tick_nohz_restart_sched_tick() should handle the
> > case additionally. (i think the latter is better.) i will try to modify
> > the code to handle it if you agree with me.
>
> Yes, and we need to audit all the other stuff called from
> update_process_times().
>
> run_local_timers() seems to be ok.
> rcu_check_callbacks() also doesn't seem to care about ticks.
>
> I _think_ we fixed most of the scheduler_tick()
> stuff (under the assumption that TSC is stable), but I'm not sure.
Concerning the variable pending ticks, we are fine with update_process_times()
except for a few things in scheduler_tick():
* cpu load active
* sched_avg_update() handles missed ticks well as it's based on the rq clock
and a specific period for updates. But I'm worried about remote reads of rt_avg,
if any.
* calc_global_load_tick(), not sure about this one
* trigger_load_balance()
* the infamous task_tick() :-)
But load avg appears to me to be a pretty standalone issue. So is each of these
small issues.
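
To make the "cpu load active" item concrete, here is a rough user-space
model of folding several pending non-idle ticks into the per-index load
averages. Everything in it is an assumption made for illustration:
rq_model, fold_one_tick() and update_cpu_load_model() are hypothetical
names, the rounding is simplified, and a real implementation would want
something cheaper than a loop over the missed ticks.

```c
#include <assert.h>

/* Hypothetical user-space model, not kernel code. */
#define LOAD_IDX_MAX 5

struct rq_model {
	unsigned long cpu_load[LOAD_IDX_MAX];
};

/*
 * Fold one tick of load 'cur' into the average at index 'idx'.
 * The averaging window is 2^idx ticks, so idx 0 simply tracks 'cur'.
 * Rounding is plain integer truncation (a simplification).
 */
static unsigned long fold_one_tick(unsigned long old, unsigned long cur, int idx)
{
	unsigned long scale;

	assert(idx >= 0 && idx < LOAD_IDX_MAX);
	scale = 1UL << idx;
	return (old * (scale - 1) + cur) / scale;
}

/*
 * Account 'pending_ticks' ticks at once, assuming the load was
 * 'cur_load' during the whole gap (the non-idle full NOHZ case).
 * Passing cur_load == 0 models an idle gap instead.
 */
static void update_cpu_load_model(struct rq_model *rq, unsigned long cur_load,
				  unsigned long pending_ticks)
{
	int i;
	unsigned long t;

	for (i = 0; i < LOAD_IDX_MAX; i++) {
		unsigned long load = rq->cpu_load[i];

		for (t = 0; t < pending_ticks; t++)
			load = fold_one_tick(load, cur_load, i);
		rq->cpu_load[i] = load;
	}
}
```

The point of the model is only that assuming one tick when three were
missed leaves the higher indexes noticeably stale: starting from 1024
with zero load, index 2 ends at 768 after one tick but at 432 after three.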
>
> and run_posix_cpu_timers() might also be ok.
>
> > 2. to handle full NOHZ, tick_nohz_restart_sched_tick() should call
> > update_cpu_load_active() instead of update_cpu_load_nohz() with my 1/2
> > patch and 2/2 patch, or we should modify update_cpu_load_nohz() to know
> > full NOHZ, which currently don't know full NOHZ. (you may agree with the
> > latter.) in any case, 1/2 patch is necessary which current code is
> > absolutely missing.
> >
> > peter, what do you think about my opinion? and about my 1/2 patch?
>
> I did not look too closely, but it might have the right shape for
> dealing with !idle ticks. I'd have to look more closely at it.
>
> > i will modify 2/2 patch depending on your feedback.
>
> I think it will take more than a single patch to rework all of
> update_process_times(). And we should also ask Thomas for his opinion,
> but I think we want:
>
> - make update_process_times() take a nr_ticks argument
> - fixup everything below it
>
> - fix tick_nohz_handler to not ignore the hrtimer_forward()
> return value and pass it into
> tick_sched_handle()/update_process_times().
>
> (assuming this is the right oneshot tick part, tick-common
> seems to be about periodic timers which aren't used much ?!)
tick_nohz_handler() is the low res nohz handler. tick_sched_handle()
is the high res one (I should rename these). I think we should rather
find out the pending updates from update_process_times() itself and pass
it to scheduler_tick() which is the one interested in it.
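
Roughly what I have in mind, as a user-space sketch rather than a patch.
The struct, its fields and the *_model() names are made up for
illustration, and I'm assuming the number of pending ticks can be derived
from a per-CPU record of the jiffies value at the last processed tick:

```c
#include <assert.h>

/* Hypothetical per-CPU bookkeeping; all names are illustrative. */
struct cpu_tick_model {
	unsigned long last_tick;	/* jiffies at the last processed tick */
	unsigned long sched_ticks;	/* ticks accounted by the scheduler */
};

/* Stand-in for scheduler_tick() taking the number of elapsed ticks. */
static void scheduler_tick_model(struct cpu_tick_model *cpu,
				 unsigned long pending)
{
	cpu->sched_ticks += pending;
}

/*
 * Stand-in for update_process_times(): it works out by itself how many
 * ticks elapsed since the last call, so the tick handlers above it do
 * not need to pass anything down. Returns the number of ticks handled.
 */
static unsigned long update_process_times_model(struct cpu_tick_model *cpu,
						unsigned long now)
{
	unsigned long pending;

	assert(cpu);
	/* Unsigned subtraction stays correct across counter wraparound. */
	pending = now - cpu->last_tick;
	cpu->last_tick = now;

	scheduler_tick_model(cpu, pending);
	return pending;
}
```

With last_tick at 100, a call at jiffies 101 accounts one tick and a
later call at 105 accounts four, without the caller having to know how
long the tick was stopped.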
Thanks.