Re: [PATCH v1] rcu: Fix and improve RCU read lock checks when !CONFIG_DEBUG_LOCK_ALLOC

From: Gao Xiang
Date: Wed Jul 12 2023 - 22:10:42 EST




On 2023/7/13 10:02, Gao Xiang wrote:


On 2023/7/13 08:32, Joel Fernandes wrote:
On Wed, Jul 12, 2023 at 02:20:56PM -0700, Sandeep Dhavale wrote:
[..]
As such this patch looks correct to me, one thing I noticed is that
you can check rcu_is_watching() like the lockdep-enabled code does.
That will tell you also if a reader-section is possible because in
extended-quiescent-states, RCU readers should be non-existent or
that's a bug.

Please correct me if I am wrong, reading from the comment in
kernel/rcu/update.c rcu_read_lock_held_common()
..
  * The reason for this is that RCU ignores CPUs that are
  * in such a section, considering these as in extended quiescent state,
  * so such a CPU is effectively never in an RCU read-side critical section
  * regardless of what RCU primitives it invokes.

It seems RCU will treat this as the lock not being held, rather than it
being a fact that the lock is not held. Is my understanding correct?

If RCU treats it as a lock not held, that is a fact for RCU ;-). Maybe you
mean it is not a fact for erofs?

I'm not sure I get what you mean. EROFS doesn't take any RCU read lock
here:

z_erofs_decompressqueue_endio() is actually a "bio->bi_end_io", which
previously could be called in two scenarios:

 1) under softirq context, which is actually part of device I/O completion;

 2) under threaded context, as when dm-verity or similar callers invoke it.

But EROFS needs to decompress in a threaded context anyway, so we trigger
a workqueue to handle case 1).


Recently, someone reported that there could be a case 3) [I think it was
introduced recently, but I have had no time to dig into it]:

 case 3: under RCU read lock context, which is shown by this:
https://lore.kernel.org/r/4a8254eb-ac39-1e19-3d82-417d3a7b9f94@xxxxxxxxxxxxxxxxx/T/#u

Sorry about the incorrect link (I really don't know who initially reported
this and on which device):

https://lore.kernel.org/linux-erofs/161f1615-3d85-cf47-d2d5-695adf1ca7d4@xxxxxxxxxxxxxxxxx/T/#t


 and this RCU read lock is taken in __blk_mq_run_dispatch_ops().

But as the commit shows, for performance reasons we only need to trigger
a workqueue for cases 1) and 3).
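
To make the three cases above concrete, here is a rough userspace model of
the decision. It is only a sketch, not the actual EROFS code: the enum
names and the helpers ctx_may_block() and choose_path() are hypothetical,
and it assumes that decompression may block, which is why only the
threaded caller is handled inline.

```c
#include <stdbool.h>

/* Hypothetical model of the three bi_end_io calling contexts above. */
enum endio_ctx {
	CTX_SOFTIRQ,        /* case 1: device I/O completion in softirq */
	CTX_THREAD,         /* case 2: threaded caller such as dm-verity */
	CTX_RCU_READ_LOCK,  /* case 3: caller holds the RCU read lock */
};

enum decompress_path {
	PATH_INLINE,     /* decompress directly in the calling thread */
	PATH_WORKQUEUE,  /* defer decompression to a workqueue */
};

/*
 * Assumption: decompression may block, and blocking is not allowed in
 * softirq context or inside an RCU read-side critical section.
 */
bool ctx_may_block(enum endio_ctx ctx)
{
	return ctx == CTX_THREAD;
}

/* Pick the cheap inline path whenever the caller is allowed to block. */
enum decompress_path choose_path(enum endio_ctx ctx)
{
	return ctx_may_block(ctx) ? PATH_INLINE : PATH_WORKQUEUE;
}
```

In this model only case 2) avoids the extra workqueue hop; cases 1) and 3)
are deferred.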

Hopefully this makes it clearer.

Thanks,
Gao Xiang