Re: [PATCH v3 1/2] mm: slub: Print the broken data before restoring slub.

From: Harry Yoo
Date: Fri Feb 21 2025 - 03:20:14 EST


On Thu, Feb 20, 2025 at 12:39:43PM +0900, Hyesoo Yu wrote:
> Previously, the restore occurred after printing the object in slub.
> After commit 47d911b02cbe ("slab: make check_object() more consistent"),
> the bytes are printed after the restore. This information about the bytes
> before the restore is highly valuable for debugging purposes.
> For instance, in the event of a cache issue, it displays byte patterns
> by breaking them down into 64-byte units. Without this information,
> we can only speculate on how it was broken. Hence the corrupted regions
> should be printed prior to the restoration process. However if an object
> breaks in multiple places, the same log may be output multiple times.
> Therefore the slub log is reported only once to prevent redundant printing,
> by sending a parameter indicating whether an error has occurred previously.
>
> Changes in v3:
> - Change the parameter type of check_bytes_and_report.
>
> Changes in v2:
> - Instead of using print_section every time on check_bytes_and_report,
> just print it once for the entire slub object before the restore.
>
> Signed-off-by: Hyesoo Yu <hyesoo.yu@xxxxxxxxxxx>
> Change-Id: I73cf76c110eed62506643913517c957c05a29520
> ---
> mm/slub.c | 29 ++++++++++++++---------------
> 1 file changed, 14 insertions(+), 15 deletions(-)
>

> @@ -1212,11 +1213,14 @@ check_bytes_and_report(struct kmem_cache *s, struct slab *slab,
> 	if (slab_add_kunit_errors())
> 		goto skip_bug_print;
>
> -	slab_bug(s, "%s overwritten", what);
> 	pr_err("0x%p-0x%p @offset=%tu. First byte 0x%x instead of 0x%x\n",
> 					fault, end - 1, fault - addr,
> 					fault[0], value);
>
> +	scnprintf(buf, 100, "%s overwritten", what);
> +	if (slab_obj_print)
> +		object_err(s, slab, object, buf);


Wait, I think it's better to keep printing "%s overwritten" regardless
of slab_obj_print and only call __slab_err() if slab_obj_print == true,
as discussed here [1]? Because if there are multiple errors, users
should know about each of them.

[1] https://lore.kernel.org/all/2ff52c5e-4b6b-4b3d-9047-f00967315d3e@xxxxxxx
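
To illustrate the control flow I have in mind, here is a minimal user-space
sketch. It is not kernel code: every helper below is a made-up stand-in
(print_overwritten(), dump_object_once(), report_corruption(),
check_regions() and the two counters are hypothetical), and only the
slab_obj_print flag is taken from the patch. The region names passed in are
just example strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Illustrative counters so a caller can observe what was reported. */
static int headers_printed;
static int objects_dumped;

/* Hypothetical stand-in for the short "%s overwritten" message. */
static void print_overwritten(const char *what)
{
	fprintf(stderr, "BUG: %s overwritten\n", what);
	headers_printed++;
}

/* Hypothetical stand-in for the verbose dump of the whole object. */
static void dump_object_once(void)
{
	fprintf(stderr, "(full object dump would go here)\n");
	objects_dumped++;
}

/*
 * Report one corrupted region. The short message is printed for every
 * fault; the verbose dump is gated on slab_obj_print.
 */
static void report_corruption(const char *what, bool slab_obj_print)
{
	assert(what != NULL);

	print_overwritten(what);
	if (slab_obj_print)
		dump_object_once();
}

/* Walk all faulty regions of one object; only the first one dumps it. */
static void check_regions(const char *const *faulty, int n)
{
	bool slab_obj_print = true;

	for (int i = 0; i < n; i++) {
		report_corruption(faulty[i], slab_obj_print);
		slab_obj_print = false;
	}
}
```

With three faulty regions, this prints three "overwritten" lines but dumps
the object only once, so no individual error is hidden from the user.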

--
Cheers,
Harry