Re: ODEBUG: Out of memory. ODEBUG disabled

From: Qian Cai
Date: Sat Nov 10 2018 - 09:16:59 EST


On 11/10/18 at 8:59 AM, Waiman Long wrote:

> On 11/09/2018 08:45 PM, Qian Cai wrote:
> >> Sent: Friday, November 09, 2018 at 5:08 PM
> >> From: "Waiman Long" <longman@xxxxxxxxxx>
> >> To: "Qian Cai" <cai@xxxxxx>, "Yang Shi" <yang.shi@xxxxxxxxxxxxxxxxx>
> >> Cc: "open list" <linux-kernel@xxxxxxxxxxxxxxx>, "Thomas Gleixner" <tglx@xxxxxxxxxxxxx>, "Arnd Bergmann" <arnd@xxxxxxxx>, "Joel Fernandes (Google)" <joel@xxxxxxxxxxxxxxxxx>, "Zhong Jiang" <zhongjiang@xxxxxxxxxx>
> >> Subject: Re: ODEBUG: Out of memory. ODEBUG disabled
> >>
> >> On 11/09/2018 04:51 PM, Qian Cai wrote:
> >>>> On Nov 9, 2018, at 4:42 PM, Yang Shi <yang.shi@xxxxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 11/9/18 1:36 PM, Qian Cai wrote:
> >>>>> It is a bit annoying that on this aarch64 server with 64 CPUs,
> >>>>> booting the latest mainline (3541833fd1f2) causes object debugging
> >>>>> to always run out of memory.
> >>>> Could you please paste the detailed failure log?
> >>> I assume you mean dmesg.
> >>>
> >>> Here is the dmesg for 64 CPUs,
> >>> https://paste.ubuntu.com/p/BnhvXXhn7k/
> >>>>> I have to boot the kernel with only 16 CPUs instead (nr_cpus=16)
> >>>>> to make it work. Is it expected that object debugging is not going
> >>>>> to work with large machines?
> >>>> I don't think so. I suppose it works well with a large number of CPUs on x86.
> >>> Here is the one with nr_cpus workaround,
> >>> https://paste.ubuntu.com/p/qMpd2CCPSV/
> >> The debugobjects code has a set of 1024 statically allocated debug
> >> objects that can be used in early boot before the slab memory allocator
> >> is initialized. Apparently, the system may have used up all the
> >> statically allocated objects. Try doubling ODEBUG_POOL_SIZE to see if it
> >> helps.
> > Great, you are right. Doubling the size makes it work. Does it make sense
> > to have a kconfig option instead?
>
> First, I think you need to figure out why your system needs to use up
> so many debug objects in early boot. If there is a legitimate reason for
> this behavior, we can talk about having a kconfig option to increase that.
Does anybody else not hit the ODEBUG OOM on a machine with 64 or more
CPUs? As mentioned, restricting it to 16 CPUs works fine. How can I
figure out why the system uses so many debug objects?
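
To make the failure mode discussed above concrete, here is a toy user-space
model of a fixed static pool that switches itself off once it is exhausted.
It is only a sketch and not the code in lib/debugobjects.c: the names
toy_pool, toy_pool_alloc() and toy_early_boot() are hypothetical, and the
per-CPU object count is a made-up parameter, not something measured on this
machine. Only the 1024 figure comes from the thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model; NOT the kernel's debugobjects implementation. */
#define POOL_SIZE 1024	/* static pool size quoted in the thread */

struct toy_pool {
	size_t capacity;	/* objects reserved statically */
	size_t used;		/* objects handed out so far */
	size_t max_used;	/* high-water mark of 'used' */
	bool enabled;		/* cleared on the first failed allocation */
};

static void toy_pool_init(struct toy_pool *p, size_t capacity)
{
	p->capacity = capacity;
	p->used = 0;
	p->max_used = 0;
	p->enabled = true;
}

/*
 * Hand out one object. When the pool is empty, tracking is disabled for
 * good, loosely mirroring the "Out of memory. ODEBUG disabled" message.
 */
static bool toy_pool_alloc(struct toy_pool *p)
{
	if (!p->enabled)
		return false;
	if (p->used == p->capacity) {
		p->enabled = false;
		return false;
	}
	p->used++;
	if (p->used > p->max_used)
		p->max_used = p->used;
	return true;
}

/*
 * Hypothetical early boot: every CPU registers 'objs_per_cpu' objects
 * before any dynamic allocator could refill the pool.
 * Returns true if the pool survived.
 */
static bool toy_early_boot(struct toy_pool *p, unsigned int nr_cpus,
			   unsigned int objs_per_cpu)
{
	for (unsigned int cpu = 0; cpu < nr_cpus; cpu++)
		for (unsigned int i = 0; i < objs_per_cpu; i++)
			if (!toy_pool_alloc(p))
				return false;
	return true;
}
```

With an assumed 20 objects per CPU, toy_early_boot() succeeds for 16 CPUs
(320 objects), fails for 64 CPUs against a 1024-entry pool, and succeeds
again once the pool is doubled to 2048. That matches the pattern reported
here, but the real per-CPU demand is exactly what still needs to be measured.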