Re: [d_alloc_parallel] WARNING: bad unlock balance detected!

From: Al Viro
Date: Mon Nov 06 2017 - 21:33:35 EST


On Tue, Nov 07, 2017 at 10:01:13AM +0800, Fengguang Wu wrote:
> Hi,
>
> Here is a warning in v4.14-rc8 -- it's not necessarily a new bug.

Why is it a bug at all?

> [ 428.512005] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
> LKP: HOSTNAME vm-lkp-wsx03-openwrt-i386-8, MAC , kernel 4.14.0-rc8 158, serial console /dev/ttyS0
> [ 429.798345] Kernel tests: Boot OK!
> [ 430.761760]
> [ 430.766166] =====================================
> [ 430.775297] WARNING: bad unlock balance detected!
> [ 430.784342] 4.14.0-rc8 #158 Not tainted
> [ 430.792153] -------------------------------------
> [ 430.801319] pidof/1024 is trying to release lock (rcu_preempt_state) at:
> [ 430.813514] [<c10e4348>] rcu_read_unlock_special+0x5f8/0x620
> [ 430.824041] but there are no more locks to release!

Er... yes? What of that? Since when is rcu_read_lock() not allowed to
be used under an rwsem?

> [ 430.833342]
> [ 430.833342] other info that might help us debug this:
> [ 430.845985] 2 locks held by pidof/1024:
> [ 430.853826] #0: (&sb->s_type->i_mutex_key){....}, at: [<c1266efa>] lookup_slow+0x8a/0x310
> [ 430.869344] #1: (rcu_read_lock){....}, at: [<c128094e>] d_alloc_parallel+0x7e/0xd10

No shit - we are doing RCU cache chain walk while holding ->i_rwsem. As in
	down_read(&rwsem);
	...
	rcu_read_lock();
	...
	rcu_read_unlock();

Why is that a problem? If we are suddenly not allowed to have an RCU reader
section while holding any kind of a blocking lock, a *lot* of places in the
kernel are screwed.

Please, explain.