Re: [PATCH] rcu: Only check tasks blocked on leaf rcu_nodes for PREEMPT_RCU

From: Paul E. McKenney
Date: Thu Jul 21 2022 - 21:16:19 EST


On Fri, Jul 22, 2022 at 08:52:13AM +0800, Zqiang wrote:
> In PREEMPT_RCU kernel, for multi-node rcu tree, if the RCU read
> critical section is preempted, the current task are only queued
> leaf rcu_node blkd list, for single-node rcu tree, the root node
> is also leaf node.
>
> This commit add rcu_is_leaf_node() to filter out checks for non-leaf
> rcu_node.
>
> Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>

First, thank you for digging into this!

> ---
> kernel/rcu/tree_plugin.h | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index b2219577fbe2..a9df11ec65af 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -693,6 +693,8 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
>
> RCU_LOCKDEP_WARN(preemptible(), "rcu_preempt_check_blocked_tasks() invoked with preemption enabled!!!\n");
> raw_lockdep_assert_held_rcu_node(rnp);
> + if (!rcu_is_leaf_node(rnp))
> + goto end;
> if (WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp)))
> dump_blkd_tasks(rnp, 10);
> if (rcu_preempt_has_tasks(rnp) &&

So you are adding a simple check for all rcu_node structures
that removes two simple checks for non-leaf rcu_node structures.

Assuming that the costs of all these checks is roughly the same (but
please feel free to actually measure them on real hardware), what what
fraction of the rcu_node structures must be non-leaf for this change to
be a win? Then for what configurations using default fanouts is this
fraction exceeded?

The default fanout is 16 from non-leaf to leaf and 64 from non-leaf
to non-leaf (32 for non-leaf to non-leaf on 32-bit systems).

Hey, you wanted some algebra practice anyway. ;-)

> @@ -703,6 +705,7 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
> trace_rcu_unlock_preempted_task(TPS("rcu_preempt-GPS"),
> rnp->gp_seq, t->pid);
> }
> +end:
> WARN_ON_ONCE(rnp->qsmask);
> }
>
> @@ -1178,7 +1181,8 @@ static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags)
> */
> static void rcu_preempt_boost_start_gp(struct rcu_node *rnp)
> {
> - rnp->boost_time = jiffies + RCU_BOOST_DELAY_JIFFIES;
> + if (rcu_is_leaf_node(rnp))
> + rnp->boost_time = jiffies + RCU_BOOST_DELAY_JIFFIES;

And here you are adding a check on all nodes to eliminate an addition
and a store on non-leaf nodes. Same questions as above, with the same
invitation to actually measure the costs of these operations.

Thanx, Paul

> }
>
> /*
> --
> 2.25.1
>