Re: [PATCH v3 2/2] mm: slub: call WARN() when the slab detect an error

From: Harry Yoo
Date: Fri Feb 21 2025 - 03:31:41 EST


On Thu, Feb 20, 2025 at 12:39:44PM +0900, Hyesoo Yu wrote:
> If a slab object is corrupted or an error occurs in its internal
> value, continuing after restoration may cause other side effects.
> At this point, it is difficult to debug because the problem occurred
> in the past. It is useful to use WARN() to catch errors at the point
> of issue because WARN() could trigger panic for system debugging when
> panic_on_warn is enabled. WARN() is added where the error is detected,
> in slab_err and object_err.
>
> There are cases where slab_err is called before meaningful logs are
> printed. If the WARN() in slab_err causes a panic, these logs will not
> be printed. WARN() should be called after these logs are printed. Thus
> __slab_err, which calls WARN(), is split out of slab_err() and is
> called after printing logs.
>
> Changes in v3:
> - move the WARN from slab_fix to slab_err, object_err to use WARN on
> all error reporting paths.
>
> Changes in v2:
> - Replace direct calling with BUG_ON with the use of WARN in slab_fix.
>
> Signed-off-by: Hyesoo Yu <hyesoo.yu@xxxxxxxxxxx>
> Change-Id: I90b2ea9ffc58e3826f7ae9f1a774bb48c2d43bf4
> ---
> mm/slub.c | 31 +++++++++++++++++++------------
> 1 file changed, 19 insertions(+), 12 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index de62fed12236..7f0583a71cda 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -5473,8 +5481,7 @@ static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n)
> remove_partial(n, slab);
> list_add(&slab->slab_list, &discard);
> } else {
> - list_slab_objects(s, slab,
> - "Objects remaining in %s on __kmem_cache_shutdown()");
> + list_slab_objects(s, slab);

Could you remove WARN() in kmem_cache_destroy()?

When a cache is destroyed with remaining objects, two warnings are
printed: one from list_slab_objects() and another from kmem_cache_destroy().
The latter becomes redundant with this patch.

The WARN() was added there because it is good to catch such an error.
At that time, slab_err() and object_err() did not call WARN().

> }
> }
> spin_unlock_irq(&n->list_lock);
> --
> 2.28.0
>

--
Cheers,
Harry