Re: timers: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
From: Thomas Gleixner
Date: Wed Jan 13 2016 - 04:06:55 EST
Sasha,
On Tue, 12 Jan 2016, Sasha Levin wrote:
Cc'ing Paul, Peter
> While fuzzing with trinity inside a KVM tools guest, running the latest -next
> kernel, I've hit the following lockdep warning:
> [ 3408.474461] Possible interrupt unsafe locking scenario:
> [ 3408.474461]
> [ 3408.475239]        CPU0                    CPU1
> [ 3408.475809]        ----                    ----
> [ 3408.476380]   lock(&lock->wait_lock);
> [ 3408.476925]                                local_irq_disable();
> [ 3408.477640]                                lock(&(&new_timer->it_lock)->rlock);
> [ 3408.478607]                                lock(&lock->wait_lock);
That comes from rcu_read_unlock:
  rcu_read_unlock()
    rcu_read_unlock_special()
      ...
        rt_mutex_unlock(&rnp->boost_mtx);
          raw_spin_lock(&boost_mtx->wait_lock);
> [ 3408.479445]   <Interrupt>
> [ 3408.479796]     lock(&(&new_timer->it_lock)->rlock);
So the task on CPU0 holds rnp->boost_mtx.wait_lock with interrupts enabled,
and the interrupt then deadlocks on timer->it_lock, which CPU1 holds while it
spins on wait_lock.
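To spell out the rule lockdep applies here, a minimal user-space sketch follows. It is not kernel code: the struct and the model_* helpers are hypothetical names made up for illustration, and the rule is reduced to "a lock ever taken in hardirq context must not be held while taking a lock that is ever taken with hardirqs enabled".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a lockdep lock class. */
struct lock_class_model {
	const char *name;
	bool taken_in_hardirq;   /* ever acquired in hardirq context: HARDIRQ-safe */
	bool taken_with_irqs_on; /* ever acquired with hardirqs enabled: HARDIRQ-unsafe */
};

/* Record one acquisition of a lock class. */
static void model_acquire(struct lock_class_model *l, bool in_hardirq, bool irqs_on)
{
	/* In this model a hardirq handler always runs with interrupts off. */
	assert(!(in_hardirq && irqs_on));
	if (in_hardirq)
		l->taken_in_hardirq = true;
	if (irqs_on)
		l->taken_with_irqs_on = true;
}

/*
 * True if taking 'inner' while holding 'outer' creates a
 * HARDIRQ-safe -> HARDIRQ-unsafe dependency.
 */
static bool model_bad_order(const struct lock_class_model *outer,
			    const struct lock_class_model *inner)
{
	return outer->taken_in_hardirq && inner->taken_with_irqs_on;
}

/* Replays the acquisitions from the report; returns true if the order is unsafe. */
static bool model_report_scenario(void)
{
	struct lock_class_model it_lock = { "it_lock", false, false };
	struct lock_class_model wait_lock = { "wait_lock", false, false };

	model_acquire(&it_lock, true, false);    /* interrupt takes it_lock */
	model_acquire(&wait_lock, false, true);  /* rt_mutex_unlock(), irqs enabled */
	model_acquire(&it_lock, false, false);   /* spin_lock_irq() on it_lock */
	model_acquire(&wait_lock, false, false); /* rcu_read_unlock() under it_lock */

	return model_bad_order(&it_lock, &wait_lock);
}
```

model_report_scenario() returns true for the sequence in the report, which corresponds to the "HARDIRQ-safe -> HARDIRQ-unsafe" wording in the subject line.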
We can fix that particular issue in the posix-timer code by making the
locking symmetric:
	rcu_read_lock();
	spin_lock_irq(&timer->it_lock);
	...
	spin_unlock_irq(&timer->it_lock);
	rcu_read_unlock();
instead of:
	rcu_read_lock();
	spin_lock_irq(&timer->it_lock);
	rcu_read_unlock();
	...
	spin_unlock_irq(&timer->it_lock);
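The difference between the two orderings can be checked with a small user-space model. This is only a sketch: the model_* functions are hypothetical stand-ins, not the kernel primitives, and the only thing they track is whether rcu_read_unlock() runs while it_lock is held, i.e. whether wait_lock can nest inside it_lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model state; none of this is kernel API. */
static bool model_it_lock_held;
static bool model_unlock_under_it_lock;

static void model_rcu_read_lock(void)
{
}

static void model_rcu_read_unlock(void)
{
	/* May end up in rt_mutex_unlock(&rnp->boost_mtx) and take wait_lock. */
	if (model_it_lock_held)
		model_unlock_under_it_lock = true;
}

static void model_spin_lock_irq(void)
{
	assert(!model_it_lock_held);
	model_it_lock_held = true;
}

static void model_spin_unlock_irq(void)
{
	assert(model_it_lock_held);
	model_it_lock_held = false;
}

/* Ordering as it is now; returns true if wait_lock can nest in it_lock. */
static bool model_asymmetric(void)
{
	model_unlock_under_it_lock = false;
	model_rcu_read_lock();
	model_spin_lock_irq();
	model_rcu_read_unlock();
	model_spin_unlock_irq();
	return model_unlock_under_it_lock;
}

/* Symmetric ordering; returns true if wait_lock can nest in it_lock. */
static bool model_symmetric(void)
{
	model_unlock_under_it_lock = false;
	model_rcu_read_lock();
	model_spin_lock_irq();
	model_spin_unlock_irq();
	model_rcu_read_unlock();
	return model_unlock_under_it_lock;
}
```

In this model model_asymmetric() returns true and model_symmetric() returns false, so only the current ordering creates the it_lock -> wait_lock dependency.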
But the question is whether this is the only offending code path in the tree.
We can avoid the hassle by making rtmutex->wait_lock irq safe.
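One detail if we go that way: rcu_read_unlock() can be called with interrupts already disabled, as in the posix-timer path above, so the wait_lock sections would presumably need the irqsave/irqrestore variants rather than the plain _irq ones. A rough user-space sketch of why, with hypothetical sim_* helpers that only model the local interrupt state and are not the real primitives:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the local interrupt state. */
static bool sim_irqs_enabled = true;

/* Models raw_spin_lock_irqsave(): save the state, then disable. */
static bool sim_lock_irqsave(void)
{
	bool flags = sim_irqs_enabled;

	sim_irqs_enabled = false;
	return flags;
}

/* Models raw_spin_unlock_irqrestore(): put back the saved state. */
static void sim_unlock_irqrestore(bool flags)
{
	sim_irqs_enabled = flags;
}

/* Models raw_spin_unlock_irq(): enable unconditionally. */
static void sim_unlock_irq(void)
{
	sim_irqs_enabled = true;
}

/*
 * Runs a wait_lock section entered with the given interrupt state and
 * returns the interrupt state after the section.
 */
static bool sim_wait_lock_section(bool irqs_enabled_on_entry, bool use_irqrestore)
{
	bool flags;

	sim_irqs_enabled = irqs_enabled_on_entry;
	flags = sim_lock_irqsave();
	assert(!sim_irqs_enabled); /* wait_lock is held with interrupts off */
	if (use_irqrestore)
		sim_unlock_irqrestore(flags);
	else
		sim_unlock_irq();
	return sim_irqs_enabled;
}
```

With use_irqrestore set, the section leaves the interrupt state as it found it. With the plain _irq unlock, a caller that entered with interrupts off gets them back enabled, which would be wrong while it still holds it_lock.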
Thoughts?
Thanks,
tglx