On Thu, Jul 13, 2023 at 10:02:17AM +0800, Gao Xiang wrote:
On 2023/7/13 08:32, Joel Fernandes wrote:
On Wed, Jul 12, 2023 at 02:20:56PM -0700, Sandeep Dhavale wrote:
[..]
As such this patch looks correct to me. One thing I noticed is that
you can check rcu_is_watching() like the lockdep-enabled code does.
That will also tell you whether a reader section is possible, because in
extended quiescent states RCU readers should be non-existent (anything
else is a bug).
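For illustration, a minimal sketch of what "check rcu_is_watching() like
the lockdep-enabled code does" could look like; the helper name
rcu_reader_possible_here() is made up here and is not part of the patch
or of the kernel:

#include <linux/rcupdate.h>

/* Hypothetical helper, for illustration only. */
static bool rcu_reader_possible_here(void)
{
	/*
	 * If RCU is not watching this CPU, it is in an extended quiescent
	 * state, so a legitimate RCU read-side critical section cannot be
	 * running here (one that is would be a bug).
	 */
	if (!rcu_is_watching())
		return false;

	/* Otherwise a reader may be active; other checks have to decide. */
	return true;
}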
Please correct me if I am wrong; reading from the comment in
kernel/rcu/update.c rcu_read_lock_held_common():
..
* The reason for this is that RCU ignores CPUs that are
* in such a section, considering these as in extended quiescent state,
* so such a CPU is effectively never in an RCU read-side critical section
* regardless of what RCU primitives it invokes.
It seems RCU will treat this as the lock not being held, rather than it
being a fact that the lock is not held. Is my understanding correct?
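For reference, rcu_read_lock_held_common() looks roughly like the sketch
below (paraphrased from memory, not copied verbatim from kernel/rcu/update.c;
check the current tree for the exact code). The part relevant to the question
is the rcu_is_watching() branch: when RCU is not watching, the answer is
forced to "not held":

/* Paraphrased sketch of kernel/rcu/update.c:rcu_read_lock_held_common(). */
static bool rcu_read_lock_held_common(bool *ret)
{
	if (!debug_lockdep_rcu_enabled()) {
		*ret = true;	/* lockdep off: assume held */
		return true;
	}
	if (!rcu_is_watching()) {
		*ret = false;	/* extended quiescent state: treated as "not held" */
		return true;
	}
	if (!rcu_lockdep_current_cpu_online()) {
		*ret = false;	/* offline CPU: likewise */
		return true;
	}
	return false;		/* otherwise defer to lockdep's lock_is_held() */
}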
If RCU treats it as a lock not held, that is a fact for RCU ;-). Maybe you
mean it is not a fact for erofs?
I'm not sure I get what you mean; EROFS itself doesn't take any RCU read
lock here (see the explanation below).
The key point is that we need lockdep to report errors when
rcu_read_lock(), rcu_dereference(), and friends are used when RCU is
not watching. We also need lockdep to report an error when someone
uses rcu_dereference() when RCU is not watching, but also forgets the
rcu_read_lock().
And this is the job of rcu_read_lock_held(), which is one reason why
that rcu_is_watching() check is needed.
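As a made-up example of the kind of misuse lockdep-RCU is expected to flag
(struct foo, gp and broken_read() are invented names, not erofs code):

#include <linux/rcupdate.h>

struct foo {
	int val;
};

static struct foo __rcu *gp;	/* invented global, for illustration */

static int broken_read(void)
{
	/*
	 * rcu_dereference() without rcu_read_lock(): under CONFIG_PROVE_RCU
	 * this should trigger a lockdep-RCU splat, and the same report must
	 * also fire if it happens while RCU is not watching this CPU.
	 */
	struct foo *p = rcu_dereference(gp);

	return p ? p->val : 0;
}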
z_erofs_decompressqueue_endio() is actually a "bio->bi_end_io", which
previously could be called under two scenarios:
1) under softirq context, which is actually part of device I/O completion;
2) under threaded context, like what dm-verity or similar callers do.
But EROFS needs to decompress in a threaded context anyway, so we trigger
a workqueue to resolve case 1).
Recently, someone reported there could be a case 3) [I think it was
introduced recently, but I haven't had time to dig into it]:
3) under RCU read lock context, which is shown here:
https://lore.kernel.org/r/4a8254eb-ac39-1e19-3d82-417d3a7b9f94@xxxxxxxxxxxxxxxxx/T/#u
and such an RCU read lock is taken in __blk_mq_run_dispatch_ops().
But as the commit shows, we only need to trigger a workqueue for cases 1)
and 3), for performance reasons.
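To make the three cases concrete, here is a rough sketch of the shape of
that decision in the bi_end_io handler. The structure and function names
are simplified stand-ins and do not match fs/erofs/zdata.c, and using
in_softirq()/irqs_disabled()/rcu_read_lock_any_held() is just one plausible
way to detect cases 1) and 3):

#include <linux/bio.h>
#include <linux/interrupt.h>
#include <linux/rcupdate.h>
#include <linux/workqueue.h>

struct z_erofs_queue_sketch {		/* simplified stand-in */
	struct work_struct work;
};

static void z_erofs_decompress_work_sketch(struct work_struct *work)
{
	/* The actual decompression would happen here, and it may sleep. */
}

static void z_erofs_endio_sketch(struct bio *bio)
{
	struct z_erofs_queue_sketch *q = bio->bi_private;

	if (in_softirq() || irqs_disabled() || rcu_read_lock_any_held())
		/* cases 1) and 3): cannot sleep here, punt to a worker */
		queue_work(system_unbound_wq, &q->work);
	else
		/* case 2): already in a sleepable threaded context (e.g.
		 * dm-verity), so just run the work function inline */
		z_erofs_decompress_work_sketch(&q->work);
	bio_put(bio);
}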
Just out of curiosity, exactly how much is it costing to trigger the
workqueue?