Re: [syzbot] BUG: MAX_LOCKDEP_KEYS too low! (2)

From: Dmitry Vyukov
Date: Fri May 21 2021 - 03:07:36 EST


On Thu, May 20, 2021 at 7:02 AM Tetsuo Handa
<penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On 2021/05/20 5:09, Dmitry Vyukov wrote:
> > On Wed, May 19, 2021 at 9:58 PM Randy Dunlap <rdunlap@xxxxxxxxxxxxx> wrote:
> >>
> >> On 5/19/21 12:48 PM, Dmitry Vyukov wrote:
> >>> On Wed, May 19, 2021 at 7:35 PM syzbot
> >>> <syzbot+a70a6358abd2c3f9550f@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>> Hello,
> >>>>
> >>>> syzbot found the following issue on:
> >>>>
> >>>> HEAD commit: b81ac784 net: cdc_eem: fix URL to CDC EEM 1.0 spec
> >>>> git tree: net
> >>>> console output: https://syzkaller.appspot.com/x/log.txt?x=15a257c3d00000
> >>>> kernel config: https://syzkaller.appspot.com/x/.config?x=5b86a12e0d1933b5
> >>>> dashboard link: https://syzkaller.appspot.com/bug?extid=a70a6358abd2c3f9550f
> >>>>
> >>>> Unfortunately, I don't have any reproducer for this issue yet.
> >>>>
> >>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >>>> Reported-by: syzbot+a70a6358abd2c3f9550f@xxxxxxxxxxxxxxxxxxxxxxxxx
> >>>>
> >>>> BUG: MAX_LOCKDEP_KEYS too low!
> >>>
> >>
> >> include/linux/lockdep.h
> >>
> >> #define MAX_LOCKDEP_KEYS_BITS 13
> >> #define MAX_LOCKDEP_KEYS (1UL << MAX_LOCKDEP_KEYS_BITS)
> >
> > Ouch, so it's not configurable yet :(
>
> I didn't try to make this value configurable, for
>
> > Unless, of course, we identify the offender that produced thousands of
> > lock classes in the log and fix it.
>
> number of currently active locks should decrease over time.
> If this message is printed, increasing this value is unlikely to help.
>
> We have https://lkml.kernel.org/r/c099ad52-0c2c-b886-bae2-c64bd8626452@xxxxxxxxx
> which seems to be unresolved.
>
> Regarding this report, is cleanup of bonding devices too slow to keep up
> with their creation?
>
> We might need to throttle creation of BPF, bonding etc., which involve WQ operations for cleanup?

I see, thanks for digging into it.

Unbounded asynchronous queueing is always a recipe for disaster... I
assume such issues can affect production as well, if some program
creates namespaces/devices in a loop. So I think ideally such things
are throttled/restricted in the kernel, e.g. new namespaces/devices
are not created if some threshold is reached.
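
To make the idea concrete, here is a minimal user-space sketch of
such a threshold check. It is not kernel code and not an existing
API: pending_cleanup, MAX_PENDING_CLEANUP and the three helpers are
hypothetical names, and the limit of 64 is arbitrary.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical cap on objects whose teardown is still queued. */
#define MAX_PENDING_CLEANUP 64

/* Number of objects handed to async cleanup but not yet torn down. */
static atomic_int pending_cleanup;

/* Called when an object is handed to the asynchronous cleanup queue. */
void cleanup_queued(void)
{
	atomic_fetch_add(&pending_cleanup, 1);
}

/* Called by the cleanup worker once teardown of one object is done. */
void cleanup_done(void)
{
	int prev = atomic_fetch_sub(&pending_cleanup, 1);

	assert(prev > 0); /* more completions than submissions is a bug */
	(void)prev;
}

/* Creation is refused while the backlog is at or above the cap. */
bool may_create(void)
{
	return atomic_load(&pending_cleanup) < MAX_PENDING_CLEANUP;
}
```

A creation path would call may_create() first and fail (or wait) when
it returns false, so the backlog stays bounded instead of growing
with the creation rate. The check and the later increment are not one
atomic step, so the cap is approximate under concurrency.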

Potentially syzkaller could throttle creation of new
namespaces/devices if we find a good and reliable way to monitor
the backlog, for example the length of a particular workqueue. It
may also help with OOMs. But so far I haven't found such a way.
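
If such a backlog number were available, the fuzzer-side policy could
be as simple as the sketch below. Everything here is an assumption
for illustration: throttle_delay_ms and MAX_DELAY_MS are hypothetical
names, the backlog value is assumed to come from some probe that does
not exist today, and 1 ms per excess item is an arbitrary choice.

```c
#include <assert.h>

/* Hypothetical upper bound on the delay between creations. */
#define MAX_DELAY_MS 1000

/*
 * Delay to apply before creating the next namespace/device:
 * none while the backlog is within the limit, otherwise 1 ms per
 * excess item, capped at MAX_DELAY_MS.
 */
long throttle_delay_ms(long backlog, long limit)
{
	long excess;

	assert(backlog >= 0 && limit >= 0);
	if (backlog <= limit)
		return 0;
	excess = backlog - limit;
	return excess < MAX_DELAY_MS ? excess : MAX_DELAY_MS;
}
```

The caller would sleep for the returned time before the next
creation, which slows creation down in proportion to how far cleanup
has fallen behind.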