Re: [PATCH -rt] timer: upper bound on loops of __run_timers processing

From: Rik van Riel
Date: Tue Feb 17 2015 - 18:08:36 EST


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 12/30/2014 02:52 PM, Marcelo Tosatti wrote:
>
> Commit "timers: do not raise softirq unconditionally" allows for
> timer wheel processing (__run_timers) to be delayed for long
> periods of time.
>
> The effect is that
>
> loops = jiffies - base->timer_jiffies
>
> can grow to very large values, resulting in __run_timers taking
> hundreds of milliseconds to execute.
>
> Fix by creating an upper bound on the number of loops to be
> processed. This allows a nohz=off kernel to achieve desired
> latencies.

Turns out that when we make nohz=off actually work correctly
with realtime tasks, and the scheduler tick is really disabled,
run_local_timers does not get called while the scheduler tick
is disabled, and this patch does nothing...

> +++ b/kernel/timer.c
> @@ -1488,6 +1488,12 @@ void run_local_timers(void)
>  }
>  #endif
>
> +	if (time_after_eq(jiffies, base->timer_jiffies)) {
> +		unsigned long jiffies_delta = jiffies - base->timer_jiffies;
> +		if (jiffies_delta > TVR_SIZE)
> +			raise_softirq(TIMER_SOFTIRQ);
> +	}
> +
>  	if (!base->active_timers)
>  		goto out;
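
For reference, the check in the quoted hunk can be modelled in userspace
roughly as follows. This is only a sketch: backlog_exceeds_wheel() is a
hypothetical helper, TVR_BITS is assumed to be 8 (the value used when
CONFIG_BASE_SMALL is not set), and time_after_eq() is re-declared locally
instead of coming from the kernel headers.

```c
/* Assumption: TVR_BITS is 8, i.e. CONFIG_BASE_SMALL is not set. */
#define TVR_BITS 8
#define TVR_SIZE (1UL << TVR_BITS)

/* Wrap-safe "a is at or after b", same idea as the kernel macro. */
#define time_after_eq(a, b) ((long)((a) - (b)) >= 0)

/*
 * Hypothetical helper, not a kernel function: returns 1 in the cases
 * where the quoted hunk would call raise_softirq(TIMER_SOFTIRQ), i.e.
 * when the wheel has fallen more than TVR_SIZE jiffies behind.
 */
static int backlog_exceeds_wheel(unsigned long jiffies,
				 unsigned long timer_jiffies)
{
	if (time_after_eq(jiffies, timer_jiffies)) {
		unsigned long jiffies_delta = jiffies - timer_jiffies;

		if (jiffies_delta > TVR_SIZE)
			return 1;
	}
	return 0;
}
```

With these assumptions a backlog of 100 jiffies does not trigger the
softirq, a backlog of 500 does, and the unsigned subtraction keeps the
result correct when jiffies has wrapped past zero.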

I suspect we may need an alternate approach, where we
change the way __run_timers works, so it does not go
around its main loop thousands of times.

Is there any sane way we could avoid going around this list
thousands of times?

This is especially annoying when there are no timers
pending in the first place :)

static inline void __run_timers(struct tvec_base *base)
{
	struct timer_list *timer;

	spin_lock_irq(&base->lock);
	while (time_after_eq(jiffies, base->timer_jiffies)) {
		struct list_head work_list;
		struct list_head *head = &work_list;
		int index = base->timer_jiffies & TVR_MASK;

		/*
		 * Cascade timers:
		 */
		if (!index &&
			(!cascade(base, &base->tv2, INDEX(0))) &&
				(!cascade(base, &base->tv3, INDEX(1))) &&
					!cascade(base, &base->tv4, INDEX(2)))
			cascade(base, &base->tv5, INDEX(3));
		++base->timer_jiffies;
		list_replace_init(base->tv1.vec + index, &work_list);
		while (!list_empty(head)) {
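
To put a number on "thousands of times", here is a cut-down userspace
model of the outer loop above. It is a sketch under stated assumptions,
not a proposed patch: struct model_base, its nr_queued field and both
functions are hypothetical names, and the model ignores cascading and
the per-slot lists entirely. The second function shows one possible
shortcut for the case where nothing is queued; whether a counter with
that meaning is available in the real tvec_base is an assumption.

```c
/* Wrap-safe "a is at or after b", same idea as the kernel macro. */
#define time_after_eq(a, b) ((long)((a) - (b)) >= 0)

/* Hypothetical, cut-down stand-in for struct tvec_base. */
struct model_base {
	unsigned long timer_jiffies;	/* next jiffy the wheel will process */
	unsigned long nr_queued;	/* assumed: every timer on the wheel */
};

/* Mirrors the outer loop: one pass per elapsed jiffy. Returns the passes. */
static unsigned long model_run_timers(struct model_base *base,
				      unsigned long jiffies)
{
	unsigned long passes = 0;

	while (time_after_eq(jiffies, base->timer_jiffies)) {
		++base->timer_jiffies;
		++passes;
	}
	return passes;
}

/*
 * Hypothetical shortcut: with nothing queued, jump straight to the
 * state the loop would end in instead of stepping one jiffy at a time.
 */
static unsigned long model_run_timers_skip_idle(struct model_base *base,
						unsigned long jiffies)
{
	if (!base->nr_queued && time_after_eq(jiffies, base->timer_jiffies)) {
		base->timer_jiffies = jiffies + 1;
		return 0;
	}
	return model_run_timers(base, jiffies);
}
```

In this model, a base left at timer_jiffies = 1000 and run at
jiffies = 6000 takes 5001 passes through the plain loop, while the
shortcut reaches the same timer_jiffies in zero passes when nr_queued
is 0. It only helps the idle case; a backlog with timers still queued
would need a different approach.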


- --
All rights reversed
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAEBAgAGBQJU48nfAAoJEM553pKExN6DKpoH/2Cf/nmra2uhsJzSm+FgbWgi
8EMM7TszF24Ys+JwtbdVmTonBsAXOmjJEGxvc9yknUymO3Wj6Iph4Z1cPk/9nS/v
vU7RWBiu9P6lmHfRuK+dN4w7SQnrtkOOuOLWlNSfwklVxCZasIPOsDazt/uivnpk
H3AMhGk0LduzAg1GSfJnCawnrIx/BgCu1UzHOMdE21SNZIS8z8o/MPQJ1qMbneQe
HbumMiB/0PmSmsMK9SVmYp7oEo0KliMJ+MW09Yrjfg0umz/h3TQ1aY3l5JhYK2rX
mmhlOIR4Sg720vZ6DzbDoRaJQHVY9Mc/QammpCU5ZY/f7diLzRsmm1FteNFQeAY=
=q2AN
-----END PGP SIGNATURE-----