Re: [PATCH printk v3 2/7] printk: Reduce console_unblank() usage in unsafe scenarios

From: Sergey Senozhatsky
Date: Mon Jul 17 2023 - 23:39:43 EST


On (23/07/17 21:52), John Ogness wrote:
>
> A semaphore is not NMI-safe, even when using down_trylock(). Both
> down_trylock() and up() are using internal spinlocks and up()
> might even call wake_up_process().
>
> In the panic() code path it gets even worse because the internal
> spinlocks of the semaphore may have been taken by a CPU that has
> been stopped.
>
> To reduce the risk of deadlocks caused by the console semaphore in
> the panic path, make the following changes:
>
> - First check if any consoles have implemented the unblank()
> callback. If not, then there is no reason to take the console
> semaphore anyway. (This check is also useful for the non-panic
> path since the locking/unlocking of the console lock can be
> quite expensive due to console printing.)
>
> - If the panic path is in NMI context, bail out without attempting
> to take the console semaphore or calling any unblank() callbacks.
> Bailing out is acceptable because console_unblank() would already
> bail out if the console semaphore is contended. The alternative of
> ignoring the console semaphore and calling the unblank() callbacks
> anyway is a bad idea because these callbacks are also not NMI-safe.
>
> If consoles with unblank() callbacks exist and console_unblank() is
> called from a non-NMI panic context, it will still attempt a
> down_trylock(). This could still result in a deadlock if one of the
> stopped CPUs is holding the semaphore internal spinlock. But this
> is a risk that the kernel has been (and continues to be) willing
> to take.
>
> Signed-off-by: John Ogness <john.ogness@xxxxxxxxxxxxx>

Reviewed-by: Sergey Senozhatsky <senozhatsky@xxxxxxxxxxxx>