Re: [PATCH v6 1/4] mm/slub: enable debugging memory wasting of kmalloc

From: Hyeonggon Yoo
Date: Mon Oct 31 2022 - 07:36:46 EST


On Mon, Oct 31, 2022 at 10:05:58AM +0000, John Thomson wrote:
> On Mon, 31 Oct 2022, at 02:36, Feng Tang wrote:
> > Hi John,
> >
> > Thanks for the bisecting and reporting!
> >
> > On Mon, Oct 31, 2022 at 05:30:24AM +0800, Vlastimil Babka wrote:
> >> On 10/30/22 20:23, John Thomson wrote:
> >> > On Tue, 13 Sep 2022, at 06:54, Feng Tang wrote:
> >> >> kmalloc's API family is critical for mm, with one nature that it will
> >> >> round up the request size to a fixed one (mostly power of 2). Say
> >> >> when user requests memory for '2^n + 1' bytes, actually 2^(n+1) bytes
> >> >> could be allocated, so in worst case, there is around 50% memory
> >> >> space waste.
> >> >
> >> >
> >> > I have a ralink mt7621 router running Openwrt, using the mips ZBOOT kernel, and appear to have bisected
> >> > a very-nearly-clean kernel v6.1-rc2 boot issue to this commit.
> >> > I have 3 commits atop 6.1-rc2: fix a ZBOOT compile error, use the Openwrt LZMA options,
> >> > and enable DEBUG_ZBOOT for my platform. I am compiling my kernel within the Openwrt build system.
> >> > No guarantees this is not due to something I am doing wrong, but any insight would be greatly appreciated.
> >> >
> >> >
> >> > On UART, No indication of the (once extracted) kernel booting:
> >> >
> >> > transfer started ......................................... transfer ok, time=2.01s
> >> > setting up elf image... OK
> >> > jumping to kernel code
> >> > zimage at: 80BA4100 810D4720
> >> > Uncompressing Linux at load address 80001000
> >> > Copy device tree to address 80B96EE0
> >> > Now, booting the kernel...
> >>
> >> It's weird that the commit would cause no output so early, SLUB code is
> >> run only later.
> >
> > I noticed your cmdline has a console setting. Could you enable
> > earlyprintk in the cmdline, e.g. "earlyprintk=ttyS0,115200", to see
> > whether any more messages are printed?
>
> Still nothing from vmlinux on UART with earlyprintk unless I revert the commit.
>
> >
> > Also I want to confirm this is a boot failure and not only a boot
> > message missing.
>
> Yes, boot failure.
> Network comes up automatically on successful boot. That does not happen when there is no kernel UART output.

It is really weird: I see no boot issue in my MIPS emulation with almost the same
config but a different target, the Malta board that QEMU supports. It just boots fine.

Can you attach a debugger to the board?
(I haven't tried that myself; I have only attached one to QEMU.)

[...]

> >> >
> >> >
> >> > possibly relevant config options:
> >> > grep -E '(SLUB|SLAB)' .config
> >> > # SLAB allocator options
> >> > # CONFIG_SLAB is not set
> >> > CONFIG_SLUB=y
> >> > CONFIG_SLAB_MERGE_DEFAULT=y
> >> > # CONFIG_SLAB_FREELIST_RANDOM is not set
> >> > # CONFIG_SLAB_FREELIST_HARDENED is not set
> >> > # CONFIG_SLUB_STATS is not set
> >> > CONFIG_SLUB_CPU_PARTIAL=y
> >> > # end of SLAB allocator options
> >> > # CONFIG_SLUB_DEBUG is not set
> >>
> >> Also not having CONFIG_SLUB_DEBUG enabled means most of the code the
> >> patch/commit touches is not even active.
> >> Could this be some miscompile or code layout change exposing some
> >> different bug, hmm.
>
> Yes, it could be.

What happens with clang?

>
> >> Is it any different if you do enable CONFIG_SLUB_DEBUG ?
>
> No change
>
> >> Or change to CONFIG_SLAB? (that would be really weird if not)
>
> This boots fine
>
> > I haven't found any clue from the code either, and I compiled
> > kernel with the config above and tested booting on an Alder-lake
> > desktop and a QEMU, which boot fine.
> >
> > Could you provide the full kernel config and dmesg (in compressed
> > format if you think it's too big), so we can check more?
>
> Attached
>
> > Thanks,
> > Feng
>
> vmlinux is bigger, and entry point is larger (0x8074081c vs 0x807407dc revert vs 0x8073fcbc),
> so that may be it? Or not?
> revert + SLUB_DEBUG + SLUB_DEBUG_ON is bigger still, but does successfully boot.
> vmlinux entry point is 0x8074705c
>
>
> transfer started ......................................... transfer ok, time=2.01s
> setting up elf image... OK
> jumping to kernel code
> zimage at: 80BA4100 810D6FA0
> Uncompressing Linux at load address 80001000
> Copy device tree to address 80B9EEE0
> Now, booting the kernel...
> [ 0.000000] Linux version 6.1.0-rc2 (john@john) (mipsel-openwrt-linux-musl-gcc (OpenWrt GCC 11.3.0 r19724+16-1521d5f453) 11.3.0, GNU ld (GNU Binutils) 2.37) #0 SMP Fri Oct 28 03:48:10 2022
>
>
> I will keep looking.
>
> Thank you,
> --
> John Thomson



--
Thanks,
Hyeonggon