Re: BUG: MAX_LOCKDEP_CHAINS too low!

From: Dmitry Vyukov
Date: Tue Jan 14 2020 - 05:19:11 EST


On Tue, Jan 14, 2020 at 9:43 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Jan 09, 2020 at 11:59:25AM +0100, Dmitry Vyukov wrote:
>
> > Or are there some ID leaks in lockdep? syzbot has a bunch of very
> > simple reproducers for these bugs, so not really a maximally diverse
> > load. And I think I saw these bugs in large numbers when testing
> > just a single subsystem too, e.g. netfilter.
>
> Can you share one of the simple ones with me? A .c file I can run on my
> regular test box that should make it go *splat*?
>
> Often in the past hitting these limits was the result of some
> particularly poor annotation.
>
> For instance, locks in per-cpu data used to trigger this, since
> static locks don't need explicit {mutex,spin_lock}_init() calls and
> instead use their (static) address as the key. This worked fine for
> global state, but per-cpu is an exception: there it causes a nr_cpus
> explosion in lockdep state, because you get nr_cpus different
> addresses.
>
> Now, we fixed that particular issue:
>
> 383776fa7527 ("locking/lockdep: Handle statically initialized PER_CPU locks properly")
>
> but maybe there's something else going on.
>
> Just blindly bumping the number without analysis of what exactly is
> happening is never a good idea.


Each of these has a fair number of C reproducers:

BUG: MAX_LOCKDEP_KEYS too low!
https://syzkaller.appspot.com/bug?id=8a18efe79140782a88dcd098808d6ab20ed740cc

BUG: MAX_LOCKDEP_ENTRIES too low!
https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b

BUG: MAX_LOCKDEP_CHAINS too low!
https://syzkaller.appspot.com/bug?id=bf037f4725d40a8d350b2b1b3b3e0947c6efae85

BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
https://syzkaller.appspot.com/bug?id=381cb436fe60dc03d7fd2a092b46d7f09542a72a
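
A quick way to tell which of these resources a given reproducer is
actually exhausting is to watch /proc/lockdep_stats while it runs
(needs CONFIG_LOCKDEP; each line reports current usage against the
limit, e.g. "lock-classes: ... [max: ...]"). A cat in a shell loop
does the job; the same thing as a small C program:

#include <stdio.h>
#include <unistd.h>

int main(void)
{
        for (;;) {
                char line[256];
                FILE *f = fopen("/proc/lockdep_stats", "r");

                if (!f) {
                        perror("fopen /proc/lockdep_stats");
                        return 1;
                }
                while (fgets(line, sizeof(line), f))
                        fputs(line, stdout);
                fclose(f);
                sleep(5);
        }
}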


I just took the latest reproducer for one of the bugs:
https://syzkaller.appspot.com/text?tag=ReproC&x=15a096aee00000
ran it on next-20200114 with this config:
https://github.com/google/syzkaller/blob/master/dashboard/config/upstream-kasan.config
and got:

root@syzkaller:~# ./a.out
[ 110.512842][ T6486] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
[ 110.518638][ T6486] turning off the locking correctness validator.
[ 110.520779][ T6486] CPU: 3 PID: 6486 Comm: kworker/u9:1 Not tainted 5.5.0-rc6-next-20200114 #1
[ 110.522496][ T6486] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebu4
[ 110.524367][ T6486] Workqueue: hci0 hci_power_on
[ 110.525126][ T6486] Call Trace:
[ 110.525633][ T6486] dump_stack+0x199/0x216
[ 110.526294][ T6486] __lock_acquire.cold.64+0x18/0x385
[ 110.527084][ T6486] ? lock_downgrade+0x900/0x900
[ 110.527942][ T6486] ? __kasan_check_write+0x14/0x20
[ 110.528716][ T6486] ? do_raw_spin_lock+0x132/0x2e0
[ 110.529499][ T6486] ? mark_held_locks+0x130/0x130
[ 110.530284][ T6486] ? kfree_skbmem+0xfb/0x1c0
[ 110.531020][ T6486] ? kfree_skbmem+0xfb/0x1c0
[ 110.531748][ T6486] ? hci_sock_dev_event+0x335/0x580
[ 110.532578][ T6486] lock_acquire+0x194/0x410
[ 110.533305][ T6486] ? skb_dequeue+0x22/0x180
[ 110.534028][ T6486] _raw_spin_lock_irqsave+0x99/0xd0
[ 110.534873][ T6486] ? skb_dequeue+0x22/0x180
[ 110.535581][ T6486] skb_dequeue+0x22/0x180
[ 110.536263][ T6486] skb_queue_purge+0x26/0x40
[ 110.536996][ T6486] ? vhci_open+0x310/0x310
[ 110.537794][ T6486] vhci_flush+0x3b/0x50
[ 110.538481][ T6486] hci_dev_do_close+0x5c8/0x1070
[ 110.539297][ T6486] ? hci_dev_open+0x290/0x290
[ 110.540068][ T6486] hci_power_on+0x1d8/0x690
[ 110.540805][ T6486] ? hci_error_reset+0xf0/0xf0
[ 110.541603][ T6486] ? rcu_read_lock_sched_held+0x9c/0xd0
[ 110.542504][ T6486] ? rcu_read_lock_any_held.part.11+0x50/0x50
[ 110.543478][ T6486] ? trace_hardirqs_on+0x67/0x240
[ 110.544300][ T6486] process_one_work+0x9b5/0x1750
[ 110.545138][ T6486] ? pwq_dec_nr_in_flight+0x320/0x320
[ 110.545976][ T6486] ? lock_acquire+0x194/0x410
[ 110.546708][ T6486] worker_thread+0x8b/0xd20
[ 110.547414][ T6486] ? process_one_work+0x1750/0x1750
[ 110.548317][ T6486] kthread+0x365/0x450
[ 110.548960][ T6486] ? kthread_mod_delayed_work+0x1b0/0x1b0
[ 110.549867][ T6486] ret_from_fork+0x24/0x30


There are probably some clusters of reproducers; I see several that
use /dev/vhci. But there may be other kinds of reproducers as well
(potentially for different issues).
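
For what it's worth, here is my guess at the rough shape of the vhci
reproducers (a sketch, not the actual syzkaller program): opening
/dev/vhci eventually registers a fresh virtual HCI device (the driver
creates it from a delayed worker if userspace writes no setup packet),
and closing it tears the device down again via the same
hci_dev_do_close() path seen in the trace above, so a loop like this
keeps churning through hci devices and their lock state:

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
        for (;;) {
                int fd = open("/dev/vhci", O_RDWR);

                if (fd < 0)
                        return 1;
                /* Give the driver's delayed open-timeout worker time
                 * to register the new hciN device. */
                sleep(2);
                close(fd);
        }
}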