Re: [PATCH] lockdep: Panic on warning if panic_on_warn is set

From: Vincent Whitchurch
Date: Fri Aug 19 2022 - 07:00:06 EST


On Thu, Aug 18, 2022 at 11:49:17PM +0200, Boqun Feng wrote:
> On Thu, Aug 18, 2022 at 01:42:58PM +0200, Vincent Whitchurch wrote:
> > There does not seem to be any way to get the system to panic if a
> > lockdep warning is emitted, since those warnings don't use the normal
> > WARN() infrastructure. Panicking on any lockdep warning can be
> > desirable when the kernel is being run in a controlled environment
> > solely for the purpose of testing. Make lockdep respect panic_on_warn
> > to allow this, similar to KASAN and others.
> >
>
> I'm not completely against this, but could you explain why you want to
> panic on lockdep warning? I assume you want to have a kdump so that you
> can understand the lock bugs closely? But lockdep discovers lock issue
> possibility, so it's not an after-the-fact detector. In other words, when
> lockdep warns, the deadlock cases don't happen in the meanwhile. And
> also lockdep tries very hard to print useful information to locate the
> issues.

I'm not trying to obtain a kdump in this case. I test device drivers
under UML[0] and I want to make the tests stop and fail immediately if
the driver triggers any kind of problem which results in splats in the
log. I achieve this using panic_on_warn, panic_on_taint, and oops=panic
which result in a panic and an error exit code from UML.

[0] https://lore.kernel.org/lkml/20220311162445.346685-1-vincent.whitchurch@xxxxxxxx/

For lockdep, without this patch, I would be forced to parse the logs
after each test to determine if the test triggered a lockdep splat or not.

> This patch adds lockdep_panic() to a few places, and it's a pain for
> maintaining. So why do you want to panic on lockdep warning?

It's adding the call to a lot of places since there is no existing
common function indicating the end of a lockdep warning. I can move the
already duplicated dump_stack() calls into the new function too so that
some code is removed. The "stack backtrace" could possibly be
consolidated too, but one of the call sites uses printk instead of
pr_warn so I wasn't sure if it was OK to change that to a warn too.
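
To illustrate the consolidation described above, here is a rough sketch
written as a userspace mock so that it compiles on its own. The helper
name lockdep_report_end() and the panic message are hypothetical and not
taken from the patch; dump_stack(), panic() and panic_on_warn are
simplified stand-ins for the kernel facilities of the same names, and
the counters exist only so the behaviour can be checked.

```c
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-ins for kernel facilities (assumptions for this sketch). */
bool panic_on_warn;	/* mirrors the kernel's panic_on_warn setting */
int stack_dumps;	/* counts dump_stack() calls, for checking only */
int panics;		/* counts panic() calls, for checking only */

void dump_stack(void)
{
	stack_dumps++;
	puts("(stack backtrace would be printed here)");
}

/* The real panic() never returns; this mock only records the call. */
void panic(const char *msg)
{
	panics++;
	printf("Kernel panic - not syncing: %s\n", msg);
}

/*
 * Hypothetical common tail for every lockdep report: print the
 * backtrace header, dump the stack, then honour panic_on_warn.
 * Each call site would end with a single call to this function
 * instead of its own dump_stack().
 */
void lockdep_report_end(void)
{
	printf("\nstack backtrace:\n");
	dump_stack();

	if (panic_on_warn)
		panic("lockdep: panic_on_warn set");
}
```

With a single tail function like this, the panic_on_warn check lives in
one place. Whether the header line may be printed at the same log level
at every call site is the open question mentioned above.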