Re: [PATCH] mm/slub: use WARN_ON() for some slab errors

From: Miles Chen
Date: Mon Jan 21 2019 - 23:21:52 EST

On Mon, 2019-01-21 at 22:02 +0000, Christopher Lameter wrote:
> On Mon, 21 Jan 2019, miles.chen@xxxxxxxxxxxx wrote:
> > From: Miles Chen <miles.chen@xxxxxxxxxxxx>
> >
> > When debugging with slub.c, sometimes we have to trigger a panic in
> > order to get the coredump file. To do that, we have to modify slub.c and
> > rebuild kernel. To make debugging easier, use WARN_ON() for these slab
> > errors so we can dump stack trace by default or set panic_on_warn to
> > trigger a panic.
> These locations really should dump stack and not terminate. There is
> subsequent processing that should be done.

Understood. We should not terminate the process in the normal case. The
change only terminates the process when panic_on_warn is set.
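
To make the intended behavior concrete, here is a small userspace model
(not kernel code). The names model_warn_on() and enum outcome are made
up for illustration. It only assumes what is stated above: a warning
dumps the stack and lets processing continue unless panic_on_warn is
set.

```c
#include <stdbool.h>

/* Hypothetical outcomes of reaching a warning site. */
enum outcome {
	OUTCOME_NONE,		/* condition false: nothing happens */
	OUTCOME_WARNED,		/* stack dumped, processing continues */
	OUTCOME_PANICKED,	/* panic_on_warn set: system panics */
};

/*
 * Userspace model of the decision described above. It does not call
 * or reimplement the kernel's WARN_ON(); it only mirrors the policy.
 */
static enum outcome model_warn_on(bool condition, bool panic_on_warn)
{
	if (!condition)
		return OUTCOME_NONE;
	return panic_on_warn ? OUTCOME_PANICKED : OUTCOME_WARNED;
}
```

With panic_on_warn unset, model_warn_on(true, false) yields
OUTCOME_WARNED, so the subsequent processing still runs.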

> Slub terminates by default. The messages you are modifying are only
> enabled if the user specified that special debugging should be one
> (typically via a kernel parameter slub_debug).

I'm a little bit confused about this: Do you mean that I should use the
following approach?

1. Add a special debugging flag (say SLAB_PANIC_ON_ERROR) and call
panic() by:

	if (s->flags & SLAB_PANIC_ON_ERROR)
		panic("slab error");

2. SLAB_PANIC_ON_ERROR should be set via the slub_debug parameter.
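
The two steps above could be sketched roughly as follows. This is a
userspace model only, not a patch: SLAB_PANIC_ON_ERROR does not exist
in the kernel, its bit value here is arbitrary, struct cache_model
stands in for struct kmem_cache, and the option letter 'X' is a
hypothetical choice for the slub_debug string.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag; the bit value is arbitrary for this model. */
#define SLAB_PANIC_ON_ERROR	0x1u

/* Stand-in for struct kmem_cache; only the flags field is modeled. */
struct cache_model {
	unsigned int flags;
};

/*
 * Step 2: derive the flag from a slub_debug-style option string.
 * Only the hypothetical letter 'X' is recognized here; every other
 * letter is ignored by this model.
 */
static unsigned int parse_debug_flags(const char *opts)
{
	unsigned int flags = 0;

	for (; opts != NULL && *opts != '\0'; opts++) {
		if (*opts == 'X')
			flags |= SLAB_PANIC_ON_ERROR;
	}
	return flags;
}

/*
 * Step 1: the check a slab error path would make before calling
 * panic(). The model returns the decision instead of panicking.
 */
static bool should_panic_on_error(const struct cache_model *s)
{
	return (s->flags & SLAB_PANIC_ON_ERROR) != 0;
}
```

In this model a cache whose flags come from parse_debug_flags("X")
would take the panic path, and any string without 'X' would keep the
current behavior.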

> It does not make sense to terminate the process here.

Thanks for your comment. Sometimes it's useful to trigger a panic and get
its coredump file before any restore/reset processing, because we can
then examine the unmodified data in the coredump file.

I have carried BUG() for these slab errors in internal branches for a few
years, and it does help with both software issues and bit-flipping issues.
It is quite useful during the development stage.