Re: [PATCH] sched: tg_set_cfs_bandwidth() causes rq->lock deadlock

From: Peter Zijlstra
Date: Tue May 20 2014 - 10:02:13 EST


On Tue, May 20, 2014 at 03:15:26PM +0200, Peter Zijlstra wrote:
> Which leads us to what I think is a BUG in the current hrtimer code (and
> one wonders why we never hit that), because we drop the cpu_base->lock
> over calling hrtimer::function, hrtimer_start_range_ns() can in fact
> come in and (re)enqueue the timer, if hrtimer::function then returns
> HRTIMER_RESTART, we'll hit that BUG_ON() before trying to enqueue the
> timer once more.

> ---
> kernel/hrtimer.c | 9 ++++++---
> kernel/sched/core.c | 10 ++++++----
> kernel/sched/fair.c | 42 +++---------------------------------------
> kernel/sched/sched.h | 2 +-
> 4 files changed, 16 insertions(+), 47 deletions(-)
>
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index 3ab28993f6e0..28942c65635e 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -1273,11 +1273,14 @@ static void __run_hrtimer(struct hrtimer *timer, ktime_t *now)
> * Note: We clear the CALLBACK bit after enqueue_hrtimer and
> * we do not reprogramm the event hardware. Happens either in
> * hrtimer_start_range_ns() or in hrtimer_interrupt()
> + *
> + * Note: Because we dropped the cpu_base->lock above,
> + * hrtimer_start_range_ns() can have popped in and enqueued the timer
> + * for us already.
> */
> - if (restart != HRTIMER_NORESTART) {
> - BUG_ON(timer->state != HRTIMER_STATE_CALLBACK);
> + if (restart != HRTIMER_NORESTART &&
> + !(timer->state & HRTIMER_STATE_ENQUEUED))
> enqueue_hrtimer(timer, base);
> - }
>
> WARN_ON_ONCE(!(timer->state & HRTIMER_STATE_CALLBACK));
>
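
To spell out the interleaving the new comment describes, here is a toy
userspace model of the tail of __run_hrtimer(). It is only a sketch: the
toy_* names and the concurrent_start flag are made up for illustration and
are not kernel code. The flag stands in for hrtimer_start_range_ns()
running while cpu_base->lock is dropped around the callback.

```c
#include <stdbool.h>

/* Toy state bits, named after the kernel's but NOT the kernel definitions. */
enum {
	TOY_STATE_INACTIVE = 0x00,
	TOY_STATE_ENQUEUED = 0x01,
	TOY_STATE_CALLBACK = 0x02,
};

struct toy_timer {
	unsigned int state;
	int enqueue_count;	/* how often toy_enqueue() ran */
};

static void toy_enqueue(struct toy_timer *t)
{
	t->state |= TOY_STATE_ENQUEUED;
	t->enqueue_count++;
}

/*
 * Models the end of the callback path with the patched check.
 * Returns how many times the timer was enqueued.
 */
static int toy_run_timer(struct toy_timer *t, bool concurrent_start,
			 bool restart)
{
	t->state = TOY_STATE_CALLBACK;	/* dequeued, callback running */

	if (concurrent_start)		/* lock dropped: someone else enqueues */
		toy_enqueue(t);

	/* The old BUG_ON(state != CALLBACK) would have fired at this point. */
	if (restart && !(t->state & TOY_STATE_ENQUEUED))
		toy_enqueue(t);

	t->state &= ~TOY_STATE_CALLBACK;
	return t->enqueue_count;
}
```

With concurrent_start and restart both true, the timer ends up enqueued
exactly once, which is the case the old BUG_ON() tripped over.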

Hmm, doesn't this also mean it's entirely unsafe to call
hrtimer_forward*() from the timer callback, because it might be changing
the expiry time of an already enqueued timer, which would corrupt the
rb-tree order?
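
A rough illustration of that worry, using a sorted array as a stand-in for
the rb-tree. Everything here is hypothetical (the toy_* helpers are not
kernel API); it only shows what happens when a key is changed in place
without removing and re-inserting the entry.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the rb-tree: expiry times kept in ascending order. */
static bool toy_is_ordered(const long *expires, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (expires[i - 1] > expires[i])
			return false;
	return true;
}

/*
 * Stand-in for forwarding an enqueued timer: push one expiry ahead by
 * 'interval' without dequeueing the entry first.
 */
static void toy_forward_in_place(long *expires, size_t idx, long interval)
{
	expires[idx] += interval;
}

/*
 * The first slot plays the role of the leftmost node, i.e. the timer the
 * queue believes expires next.
 */
static long toy_next_expiry(const long *expires)
{
	return expires[0];
}
```

Starting from { 100, 200, 300 } and forwarding the first entry by 250
leaves { 350, 200, 300 }: the ordering invariant no longer holds, and the
"next" expiry reported is 350 even though an entry at 200 is pending.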

Lemme go find a nice way out of this mess, I think I'm responsible for
creating it in the first place :-(
