Re: [PATCH] hrtimer: Avoid double reprogramming in __hrtimer_start_range_ns()

From: Peter Zijlstra
Date: Mon Apr 26 2021 - 05:40:55 EST


On Mon, Apr 26, 2021 at 10:49:33AM +0200, Thomas Gleixner wrote:
> If __hrtimer_start_range_ns() is invoked with an already armed hrtimer then
> the timer has to be canceled first and then added back. If the timer is the
> first expiring timer then on removal the clockevent device is reprogrammed
> to the next expiring timer to avoid that the pending expiry fires needlessly.
>
> If the new expiry time ends up being the first expiry again then the clock
> event device has to be reprogrammed again.
>
> Avoid this by checking whether the timer is the first to expire and in that
> case, keep the timer on the current CPU and delay the reprogramming up to
> the point where the timer has been enqueued again.
>
> Reported-by: Lorenzo Colitti <lorenzo@xxxxxxxxxx>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> ---
> kernel/time/hrtimer.c | 60 ++++++++++++++++++++++++++++++++++++++++++++------
> 1 file changed, 53 insertions(+), 7 deletions(-)
>
> --- a/kernel/time/hrtimer.c
> +++ b/kernel/time/hrtimer.c
> @@ -1030,12 +1030,13 @@ static void __remove_hrtimer(struct hrti
> * remove hrtimer, called with base lock held
> */
> static inline int
> -remove_hrtimer(struct hrtimer *timer, struct hrtimer_clock_base *base, bool restart)
> +remove_hrtimer(struct hrtimer *timer, struct hrtimer_clock_base *base,
> + bool restart, bool keep_local)
> {
> u8 state = timer->state;
>
> if (state & HRTIMER_STATE_ENQUEUED) {
> - int reprogram;
> + bool reprogram;
>
> /*
> * Remove the timer and force reprogramming when high
> @@ -1048,8 +1049,16 @@ remove_hrtimer(struct hrtimer *timer, st
> debug_deactivate(timer);
> reprogram = base->cpu_base == this_cpu_ptr(&hrtimer_bases);
>
> + /*
> + * If the timer is not restarted then reprogramming is
> + * required if the timer is local. If it is local and about
> + * to be restarted, avoid programming it twice (on removal
> + * and a moment later when it's requeued).
> + */
> if (!restart)
> state = HRTIMER_STATE_INACTIVE;
> + else
> + reprogram &= !keep_local;

reprogram = reprogram && !keep_local;

perhaps?

>
> __remove_hrtimer(timer, base, state, reprogram);
> return 1;
> @@ -1103,9 +1112,31 @@ static int __hrtimer_start_range_ns(stru
> struct hrtimer_clock_base *base)
> {
> struct hrtimer_clock_base *new_base;
> + bool force_local, first;
>
> - /* Remove an active timer from the queue: */
> - remove_hrtimer(timer, base, true);
> + /*
> + * If the timer is on the local cpu base and is the first expiring
> + * timer then this might end up reprogramming the hardware twice
> + * (on removal and on enqueue). To avoid that, prevent the
> + * reprogram on removal, keep the timer local to the current CPU
> + * and enforce reprogramming after it is queued no matter whether
> + * it is the new first expiring timer again or not.
> + */
> + force_local = base->cpu_base == this_cpu_ptr(&hrtimer_bases);
> + force_local &= base->cpu_base->next_timer == timer;

Using bitwise ops on a bool is cute and all, but isn't that more
readable when written like:

	force_local = base->cpu_base == this_cpu_ptr(&hrtimer_bases) &&
		      base->cpu_base->next_timer == timer;
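
For reference, a minimal standalone sketch (plain userspace C, not kernel
code) showing that the two spellings compute the same value for bool
operands, so the suggestion is about readability only. The helper names
combine_bitwise(), combine_logical(), count_mismatches() and self_check()
are made up for this illustration. The one semantic difference is that &&
short-circuits, which does not matter here because the right-hand side has
no side effects.

```c
#include <assert.h>
#include <stdbool.h>

/* Form used in the patch: bitwise AND-assignment on a bool. */
static bool combine_bitwise(bool reprogram, bool keep_local)
{
	reprogram &= !keep_local;
	return reprogram;
}

/* Form suggested in the review: logical AND. */
static bool combine_logical(bool reprogram, bool keep_local)
{
	return reprogram && !keep_local;
}

/* Count the input combinations for which the two forms disagree. */
static int count_mismatches(void)
{
	int mismatches = 0;

	for (int r = 0; r <= 1; r++) {
		for (int k = 0; k <= 1; k++) {
			if (combine_bitwise(r, k) != combine_logical(r, k))
				mismatches++;
		}
	}
	return mismatches;
}

/* Hypothetical self-check: aborts if the two forms ever differ. */
static void self_check(void)
{
	assert(count_mismatches() == 0);
}
```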


> +
> + /*
> + * Remove an active timer from the queue. In case it is not queued
> + * on the current CPU, make sure that remove_hrtimer() updates the
> + * remote data correctly.
> + *
> + * If it's on the current CPU and the first expiring timer, then
> + * skip reprogramming, keep the timer local and enforce
> + * reprogramming later if it was the first expiring timer. This
> + * avoids programming the underlying clock event twice (once at
> + * removal and once after enqueue).
> + */
> + remove_hrtimer(timer, base, true, force_local);
>
> if (mode & HRTIMER_MODE_REL)
> tim = ktime_add_safe(tim, base->get_time());
> @@ -1115,9 +1146,24 @@ static int __hrtimer_start_range_ns(stru
> hrtimer_set_expires_range_ns(timer, tim, delta_ns);
>
> /* Switch the timer base, if necessary: */
> - new_base = switch_hrtimer_base(timer, base, mode & HRTIMER_MODE_PINNED);
> + if (!force_local) {
> + new_base = switch_hrtimer_base(timer, base,
> + mode & HRTIMER_MODE_PINNED);
> + } else {
> + new_base = base;
> + }
>
> - return enqueue_hrtimer(timer, new_base, mode);
> + first = enqueue_hrtimer(timer, new_base, mode);
> + if (!force_local)
> + return first;
> +
> + /*
> + * Timer was forced to stay on the current CPU to avoid
> + * reprogramming on removal and enqueue. Force reprogram the
> + * hardware by evaluating the new first expiring timer.
> + */
> + hrtimer_force_reprogram(new_base->cpu_base, 1);
> + return 0;
> }

There is an unfortunate amount of duplication between
hrtimer_force_reprogram() and hrtimer_reprogram(). The obvious cleanups
don't work, however :/ Still, does that in_hrtirq optimization make sense
to have in hrtimer_force_reprogram()?
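
To make the effect described in the changelog concrete, here is a toy
userspace model, not the kernel implementation. Everything prefixed toy_ is
hypothetical and exists only for this sketch: a fixed array stands in for
the timer queue and a counter stands in for writes to the clock event
device. It compares restarting the first-expiring timer with reprogramming
on both removal and enqueue against deferring the reprogram until after the
enqueue, which is the approach the patch takes.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_TIMERS	8
#define TOY_NEVER	(~0ULL)

struct toy_base {
	unsigned long long expires[TOY_MAX_TIMERS]; /* TOY_NEVER: not queued */
	unsigned long long programmed;		    /* last value "written" */
	unsigned int reprogram_count;		    /* simulated hardware writes */
};

static void toy_init(struct toy_base *b)
{
	for (int i = 0; i < TOY_MAX_TIMERS; i++)
		b->expires[i] = TOY_NEVER;
	b->programmed = TOY_NEVER;
	b->reprogram_count = 0;
}

/* Index of the earliest queued timer, or -1 if nothing is queued. */
static int toy_first(const struct toy_base *b)
{
	int first = -1;

	for (int i = 0; i < TOY_MAX_TIMERS; i++) {
		if (b->expires[i] == TOY_NEVER)
			continue;
		if (first < 0 || b->expires[i] < b->expires[first])
			first = i;
	}
	return first;
}

/* Program the simulated device for the current first expiry. */
static void toy_reprogram(struct toy_base *b)
{
	int first = toy_first(b);

	b->programmed = first < 0 ? TOY_NEVER : b->expires[first];
	b->reprogram_count++;
}

/* Restart with a reprogram on removal and another one on enqueue. */
static void toy_restart_naive(struct toy_base *b, int id,
			      unsigned long long new_expires)
{
	bool was_first = toy_first(b) == id;

	b->expires[id] = TOY_NEVER;		/* remove */
	if (was_first)
		toy_reprogram(b);
	b->expires[id] = new_expires;		/* enqueue */
	if (toy_first(b) == id)
		toy_reprogram(b);
}

/* Restart with the reprogram deferred until after the enqueue. */
static void toy_restart_deferred(struct toy_base *b, int id,
				 unsigned long long new_expires)
{
	bool force_local = toy_first(b) == id;

	b->expires[id] = TOY_NEVER;		/* remove, no reprogram */
	b->expires[id] = new_expires;		/* enqueue */
	if (force_local || toy_first(b) == id)
		toy_reprogram(b);
}

/* Restart the first timer (100 -> 150) and return the device write count. */
static unsigned int toy_demo(bool deferred)
{
	struct toy_base b;

	toy_init(&b);
	b.expires[0] = 100;
	b.expires[1] = 200;
	toy_reprogram(&b);
	b.reprogram_count = 0;			/* only count the restart */

	if (deferred)
		toy_restart_deferred(&b, 0, 150);
	else
		toy_restart_naive(&b, 0, 150);

	assert(b.programmed == 150);		/* same end state either way */
	return b.reprogram_count;
}
```

In this model toy_demo(false) performs two device writes and toy_demo(true)
performs one, with the same final programmed expiry. The model ignores
everything else the real code has to handle (locking, multiple clock bases,
remote CPUs, in_hrtirq).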