On Thu, Feb 10, 2022 at 12:04:59PM -0500, Waiman Long wrote:
> On 2/10/22 05:19, Peter Zijlstra wrote:
>> Right, so please just transcribe the relevant bits instead of including
>> this massive splat. It really isn't too complicated.
>
> I am sorry that I might have stripped out too much for the lockdep splat to
> make it understandable. Below is the full lockdep splat:
> [...]
>
>> That can be summarized as:
>>
>>   0:          1:            2:
>>
>>   pi_lock     rq->lock      console_sem
>>   rq->lock    console_sem   pi_lock
>>
>> Which is *much* shorter and *much* easier to read.
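
The three columns are three lock-acquisition orders that close into a cycle:
pi_lock --> rq->lock --> console_sem --> pi_lock. A minimal userspace
illustration of why such a cycle can deadlock (pthread mutexes standing in
for the three kernel locks; purely illustrative, not kernel code):

#include <pthread.h>

/* Stand-ins for pi_lock, rq->lock and (console_sem).lock. */
static pthread_mutex_t pi_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t rq_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t console_sem_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take two locks in the order given by arg. */
static void *lock_pair(void *arg)
{
	pthread_mutex_t **p = arg;

	pthread_mutex_lock(p[0]);
	pthread_mutex_lock(p[1]);	/* may block forever once the cycle closes */
	pthread_mutex_unlock(p[1]);
	pthread_mutex_unlock(p[0]);
	return NULL;
}

int main(void)
{
	pthread_mutex_t *order0[] = { &pi_lock, &rq_lock };		/* 0: */
	pthread_mutex_t *order1[] = { &rq_lock, &console_sem_lock };	/* 1: */
	pthread_mutex_t *order2[] = { &console_sem_lock, &pi_lock };	/* 2: */
	pthread_t t[3];

	/*
	 * If each thread gets its first lock before any gets its second,
	 * 0 waits on 1, 1 waits on 2, and 2 waits on 0: deadlock.
	 */
	pthread_create(&t[0], NULL, lock_pair, order0);
	pthread_create(&t[1], NULL, lock_pair, order1);
	pthread_create(&t[2], NULL, lock_pair, order2);
	for (int i = 0; i < 3; i++)
		pthread_join(t[i], NULL);
	return 0;
}
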
> OK, got it. However, I thought the circular dependency had been explained
> by that part of the lockdep splat.

>> More concerning, that ordering is invalid to begin with, so the above
>> seems like a very poor justification for this patch.
>
> Which lock ordering do you consider invalid?

1: above. You cannot take a semaphore inside a (raw) spinlock.

> You may have been confused by the name "console_sem". A semaphore is
> basically a count protected by a raw spinlock, so console_sem.lock is
> actually a raw spinlock. It is perfectly legal to take a raw spinlock
> while holding another raw spinlock.
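
For reference, the kernel's counting semaphore really is just a count plus a
raw spinlock and a wait list (include/linux/semaphore.h):

struct semaphore {
	raw_spinlock_t		lock;
	unsigned int		count;
	struct list_head	wait_list;
};
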
> The stack trace included in the patch description shows the
> (console_sem).lock --> &p->pi_lock --> &rq->__lock sequence because of the
> wake_up_process() call made while holding (console_sem).lock. The reverse
> &rq->__lock --> (console_sem).lock ordering may happen when a printk()
> statement is called while holding the rq lock.
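
That first leg is visible in kernel/locking/semaphore.c (lightly abridged
here): up() wakes the waiter while still holding sem->lock, which for the
console semaphore is (console_sem).lock, and wake_up_process() then takes
p->pi_lock and rq->__lock underneath it:

void up(struct semaphore *sem)
{
	unsigned long flags;

	raw_spin_lock_irqsave(&sem->lock, flags);
	if (likely(list_empty(&sem->wait_list)))
		sem->count++;
	else
		__up(sem);			/* wakeup happens under sem->lock */
	raw_spin_unlock_irqrestore(&sem->lock, flags);
}

static noinline void __sched __up(struct semaphore *sem)
{
	struct semaphore_waiter *waiter = list_first_entry(&sem->wait_list,
					struct semaphore_waiter, list);

	list_del(&waiter->list);
	waiter->up = true;
	wake_up_process(waiter->task);	/* takes p->pi_lock, then rq->__lock */
}
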
> In this case, the printk() is triggered by a SCHED_WARN_ON() statement in
> update_rq_clock() which doesn't call printk_deferred() and so won't have
> LOGLEVEL_SCHED set. I guess there are alternative ways to work around this
> issue, but moving the process wakeup out of the semaphore spinlock will
> solve this problem in case there are other corner cases like that.
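
For context, SCHED_WARN_ON() is a plain WARN_ONCE() when CONFIG_SCHED_DEBUG
is enabled (kernel/sched/sched.h), so it goes through the normal
printk()/console_sem path rather than the deferred one:

#ifdef CONFIG_SCHED_DEBUG
# define SCHED_WARN_ON(x)	WARN_ONCE(x, #x)
#else
# define SCHED_WARN_ON(x)	({ (void)(x), 0; })
#endif
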
> I will update the patch description to include this additional information.
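
The wakeup-moving idea described above could look roughly like the following
sketch, which defers the wakeup through the kernel's wake_q API until after
sem->lock is released. This is only an outline of the approach, not the
actual patch; details such as the changed __up() signature are assumptions:

#include <linux/sched/wake_q.h>

void up(struct semaphore *sem)
{
	unsigned long flags;
	DEFINE_WAKE_Q(wake_q);

	raw_spin_lock_irqsave(&sem->lock, flags);
	if (likely(list_empty(&sem->wait_list)))
		sem->count++;
	else
		__up(sem, &wake_q);	/* wake_q_add() the waiter instead of waking it */
	raw_spin_unlock_irqrestore(&sem->lock, flags);

	/*
	 * wake_up_process() now runs without sem->lock held, so
	 * (console_sem).lock no longer nests outside pi_lock/rq->__lock.
	 */
	wake_up_q(&wake_q);
}
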
The right solution is to burn printk_deferred at the stake and most of
printk along with it (they're working on it).

Hitting that WARN is the real problem, the rest is collateral damage and
I'm really not interested in fixing that.