Re: [patch] mm, debug: allow suppressing panic on CONFIG_DEBUG_VM checks

From: David Rientjes
Date: Mon May 22 2023 - 20:57:46 EST


On Mon, 22 May 2023, Linus Torvalds wrote:

> On Mon, May 22, 2023 at 11:39 AM David Rientjes <rientjes@xxxxxxxxxx> wrote:
> >
> > I think VM_BUG_ON*() and friends are used to crash the kernel for
> > debugging so that we get a crash dump and because some variants don't
> > exist for VM_WARN_ON().
>
> I do think that from a VM developer standpoint, I think it should be
> fine to just effectively turn VM_BUG_ON() into WARN_ON_ONCE() together
> with panic_on_warn.
>
> Maybe we could even extend 'panic_on_warn' to be a bitmap and
> effectively have a "don't panic on non-VM warnings" option.
>

I hadn't thought of that approach; it would definitely help us achieve our
goal of emitting warnings on a small set of production hosts that we don't
want to crash. It's also very clean.

Right now kernel.panic_on_warn can only be 0 or 1. We can keep the
lowest bit (bit 0) as "panic on all warnings" and then add bit 1 as "panic
on debug VM warnings." When CONFIG_DEBUG_VM is enabled, set the new bit by
default so there's no behavior change.

Then, we can keep VM_BUG_ON*() and friends around and extend them to check
whether they should BUG() after the WARN_ON(1) or not.
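
Roughly, and only as an untested userspace sketch of the semantics rather
than a patch: the PANIC_ON_WARN_* names, the enum, and the helpers below are
all made up for illustration and don't exist in the tree.

```c
#include <stdbool.h>

/* Hypothetical bit layout for kernel.panic_on_warn (assumption, not merged code). */
#define PANIC_ON_WARN_ALL      (1u << 0) /* bit 0: panic on all warnings (today's "1") */
#define PANIC_ON_WARN_DEBUG_VM (1u << 1) /* bit 1: panic on debug VM warnings */

/* What a debug VM check ends up doing, modelled as a value so it can be tested. */
enum vm_check_result {
	VM_CHECK_OK,   /* condition was false, nothing happens */
	VM_CHECK_WARN, /* warning emitted, kernel keeps running */
	VM_CHECK_BUG,  /* warning emitted, then BUG()/panic */
};

/* Default sysctl value: set bit 1 under CONFIG_DEBUG_VM so behavior is unchanged. */
static unsigned int panic_on_warn_default(bool config_debug_vm)
{
	return config_debug_vm ? PANIC_ON_WARN_DEBUG_VM : 0u;
}

/* A plain WARN_ON() only panics when bit 0 is set. */
static bool warn_should_panic(unsigned int panic_on_warn)
{
	return (panic_on_warn & PANIC_ON_WARN_ALL) != 0;
}

/* Model of VM_BUG_ON(cond): always warn on failure, BUG() only if either bit is set. */
static enum vm_check_result vm_bug_on(bool cond, unsigned int panic_on_warn)
{
	if (!cond)
		return VM_CHECK_OK;
	if (panic_on_warn & (PANIC_ON_WARN_ALL | PANIC_ON_WARN_DEBUG_VM))
		return VM_CHECK_BUG;
	return VM_CHECK_WARN;
}
```

With the default from panic_on_warn_default(true), a failed check still ends
in BUG() as it does today; with the value set to 0 it degrades to a warning.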

On our production hosts, we'll just set kernel.panic_on_warn to 0.

I'll give it a few days to see if anybody else has any comments or
concerns; if not, I'll send a v2 based on this.