Re: BUG: MAX_STACK_TRACE_ENTRIES too low! (2)

From: Eric Biggers
Date: Wed Jul 10 2019 - 14:02:47 EST


On Wed, Jul 10, 2019 at 10:46:00AM -0700, Bart Van Assche wrote:
> On 7/10/19 10:21 AM, Eric Biggers wrote:
> > With my simplified reproducer, on commit 669de8bda87b ("kernel/workqueue: Use
> > dynamic lockdep keys for workqueues") I see:
> >
> > WARNING: CPU: 3 PID: 189 at kernel/locking/lockdep.c:747 register_lock_class+0x4f6/0x580
> >
> > and then somewhat later:
> >
> > BUG: MAX_LOCKDEP_KEYS too low!
> >
> > If on top of that I cherry pick commit 28d49e282665 ("locking/lockdep: Shrink
> > struct lock_class_key"), I see instead:
> >
> > BUG: MAX_STACK_TRACE_ENTRIES too low!
> >
> > I also see that on mainline.
> >
> > Alternatively, if I check out 669de8bda87b and revert it, I don't see anything.
>
> Hi Eric,
>
> Is the rdma_ucm code the only code that triggers the "BUG:
> MAX_STACK_TRACE_ENTRIES too low!" complaint or is this complaint also
> triggered by other kernel code? I'm asking this because I think that
> fixing this would require to implement garbage collection for the
> stack_trace[] array in the lockdep code. That would make the lockdep
> code slower. I don't think that making the lockdep code slower would be
> welcome.
>
> Bart.

I already mentioned that io_uring triggers it too.

Those are just the two cases that syzbot happened to generate reproducers for. I
expect there are many others, since many places in the kernel allocate
workqueues. AFAICS most are stored in static or global variables, which avoids
this issue, but there are still many cases where a workqueue is owned by some
dynamically allocated structure that can have a much shorter lifetime.

You can also check the other syzbot reports that look similar
(https://lore.kernel.org/lkml/20190710055838.GC2152@xxxxxxxxxxxxxxx/).
Two of them have C reproducers too.

- Eric