Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
From: Ingo Molnar
Date: Sun Aug 23 2015 - 01:41:03 EST
* Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
> From: Vatika Harlalka <vatikaharlalka@xxxxxxxxx>
>
> The problem addressed in this patch is about affining unpinned timers.
> Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
> jitter due to firing of such timers on them.
>
> This patch will affine timers to online CPUs which are not full dynticks
> in NOHZ_FULL configured systems. It should not introduce overhead in
> nohz full off case due to static keys.
>
> Reviewed-by: Preeti U Murthy <preeti@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Vatika Harlalka <vatikaharlalka@xxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Christoph Lameter <cl@xxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> Cc: Chris Metcalf <cmetcalf@xxxxxxxxxx>
> Signed-off-by: Frederic Weisbecker <fweisbec@xxxxxxxxx>
> ---
> include/linux/tick.h | 9 ++++++++-
> kernel/sched/core.c | 7 +++++--
> 2 files changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 3741ba1..51e6493 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -143,13 +143,20 @@ static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
>  	if (tick_nohz_full_enabled())
>  		cpumask_or(mask, mask, tick_nohz_full_mask);
>  }
> -
> +static inline int housekeeping_any_cpu(void)
> +{
> +	return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> +}
>  extern void __tick_nohz_full_check(void);
>  extern void tick_nohz_full_kick(void);
>  extern void tick_nohz_full_kick_cpu(int cpu);
>  extern void tick_nohz_full_kick_all(void);
>  extern void __tick_nohz_task_switch(struct task_struct *tsk);
>  #else
> +static inline int housekeeping_any_cpu(void)
> +{
> +	return smp_processor_id();
> +}
>  static inline bool tick_nohz_full_enabled(void) { return false; }
>  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
>  static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9917c96..4fd42e4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
>  	int i, cpu = smp_processor_id();
>  	struct sched_domain *sd;
>
> -	if (!idle_cpu(cpu))
> +	if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
>  		return cpu;
>
>  	rcu_read_lock();
>  	for_each_domain(cpu, sd) {
>  		for_each_cpu(i, sched_domain_span(sd)) {
> -			if (!idle_cpu(i)) {
> +			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
>  				cpu = i;
>  				goto unlock;
>  			}
>  		}
>  	}
> +
> +	if (!is_housekeeping_cpu(cpu))
> +		cpu = housekeeping_any_cpu();
>  unlock:
>  	rcu_read_unlock();
>  	return cpu;

So I almost applied this yesterday, but had the following question: what ensures
that housekeeping_mask isn't empty? If it's empty then housekeeping_any_cpu()
returns the result of cpumask_any_and() on an empty cpumask - which is an
out-of-range index (>= nr_cpu_ids) AFAICS - which will crash and burn in:

  kernel/time/hrtimer.c:  return &per_cpu(hrtimer_bases, get_nohz_timer_target());
  kernel/time/timer.c:    return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());

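To make the failure mode concrete, here is a rough userspace model of it (not
kernel code). NR_CPUS, mask_any_and(), housekeeping_any_cpu_model() and
timer_target_model() are hypothetical names made up for this sketch; the only
thing taken from the kernel is the convention that an "any CPU" search over an
empty mask yields an index at or past the number of CPUs. The last helper shows
one possible range check, assuming that falling back to the current CPU would
be acceptable:

```c
#include <stdint.h>

#define NR_CPUS 8u	/* hypothetical CPU count for this model */

/*
 * Model of cpumask_any_and(): index of the first bit set in both
 * masks, or NR_CPUS (an out-of-range index) if there is none.
 */
static unsigned int mask_any_and(uint32_t a, uint32_t b)
{
	uint32_t both = a & b;
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (both & (UINT32_C(1) << cpu))
			return cpu;

	return NR_CPUS;
}

/* Unchecked lookup, shaped like the patch's housekeeping_any_cpu(). */
static unsigned int housekeeping_any_cpu_model(uint32_t housekeeping,
					       uint32_t online)
{
	return mask_any_and(housekeeping, online);
}

/*
 * Checked lookup (an assumption, not the patch's behaviour): fall back
 * to this_cpu when no online housekeeping CPU exists, so the result is
 * always a valid per-CPU index.
 */
static unsigned int timer_target_model(uint32_t housekeeping,
				       uint32_t online,
				       unsigned int this_cpu)
{
	unsigned int cpu = mask_any_and(housekeeping, online);

	return cpu < NR_CPUS ? cpu : this_cpu;
}
```

With an empty housekeeping mask the unchecked variant hands back NR_CPUS, which
is exactly the kind of value that must never reach per_cpu().
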
housekeeping_mask itself is derived from tick_nohz_full_mask (it's the inverse of
it in essence), and tick_nohz_full_mask is set via one of two methods. The first
is via a boot parameter:

	if (cpulist_parse(str, tick_nohz_full_mask) < 0) {

in tick_nohz_full_setup(). What ensures here that tick_nohz_full_mask is not
completely full - making housekeeping_mask empty?

The other method is via CONFIG_NO_HZ_FULL_ALL:

	cpumask_setall(tick_nohz_full_mask);

Here it's fully set - triggering the bug I'm worried about. So what am I missing,
what prevents CONFIG_NO_HZ_FULL_ALL from crashing?

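The mask arithmetic behind that worry can be sketched in userspace as well.
This is only a model under stated assumptions: NR_CPUS, ALL_CPUS,
housekeeping_from_nohz_full() and nohz_full_sanitize() are hypothetical names,
and the "always keep one CPU out of the nohz_full set" guard is just one way
such an invariant could be enforced, not a statement about what the kernel
does:

```c
#include <stdint.h>

#define NR_CPUS  8u				/* hypothetical CPU count */
#define ALL_CPUS ((UINT32_C(1) << NR_CPUS) - 1)	/* model of a fully set mask */

/* housekeeping is "in essence" the inverse of the nohz_full mask. */
static uint32_t housekeeping_from_nohz_full(uint32_t nohz_full)
{
	return ~nohz_full & ALL_CPUS;
}

/*
 * Hypothetical guard: always keep keep_cpu (assumed < NR_CPUS) out of
 * the nohz_full set, so the derived housekeeping mask cannot be empty.
 */
static uint32_t nohz_full_sanitize(uint32_t nohz_full, unsigned int keep_cpu)
{
	return nohz_full & ~(UINT32_C(1) << keep_cpu) & ALL_CPUS;
}
```

Feeding ALL_CPUS (the cpumask_setall() case) straight into
housekeeping_from_nohz_full() gives an empty mask; passing it through
nohz_full_sanitize() first leaves exactly one housekeeping CPU.
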
Thanks,
Ingo