Re: [syzbot] KCSAN: data-race in call_rcu / rcu_gp_kthread
From: Marco Elver
Date: Tue Jul 20 2021 - 10:17:46 EST
On Tue, 20 Jul 2021 at 15:18, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
[...]
> Good catch! And yes, this would be hard to reproduce.
>
> How about as shown below?
Acked-by: Marco Elver <elver@xxxxxxxxxx>
I was merely a little surprised syzbot was able to exercise RCU in a
way that resulted in a data race your torture runs hadn't found yet
(or perhaps it did and missed?).
Thanks,
-- Marco
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> commit 43e0f01f3b6f510dbe31d02a8f4c909c45deff04
> Author: Paul E. McKenney <paulmck@xxxxxxxxxx>
> Date: Tue Jul 20 06:16:27 2021 -0700
>
> rcu: Mark accesses to rcu_state.n_force_qs
>
> This commit marks accesses to the rcu_state.n_force_qs. These data
> races are hard to make happen, but syzkaller was equal to the task.
>
> Reported-by: syzbot+e08a83a1940ec3846cd5@xxxxxxxxxxxxxxxxxxxxxxxxx
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index a7379c44a2366..245bca7cdf6ee 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1913,7 +1913,7 @@ static void rcu_gp_fqs(bool first_time)
> struct rcu_node *rnp = rcu_get_root();
>
> WRITE_ONCE(rcu_state.gp_activity, jiffies);
> - rcu_state.n_force_qs++;
> + WRITE_ONCE(rcu_state.n_force_qs, rcu_state.n_force_qs + 1);
> if (first_time) {
> /* Collect dyntick-idle snapshots. */
> force_qs_rnp(dyntick_save_progress_counter);
> @@ -2556,7 +2556,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
> /* Reset ->qlen_last_fqs_check trigger if enough CBs have drained. */
> if (count == 0 && rdp->qlen_last_fqs_check != 0) {
> rdp->qlen_last_fqs_check = 0;
> - rdp->n_force_qs_snap = rcu_state.n_force_qs;
> + rdp->n_force_qs_snap = READ_ONCE(rcu_state.n_force_qs);
> } else if (count < rdp->qlen_last_fqs_check - qhimark)
> rdp->qlen_last_fqs_check = count;
>
> @@ -2904,10 +2904,10 @@ static void __call_rcu_core(struct rcu_data *rdp, struct rcu_head *head,
> } else {
> /* Give the grace period a kick. */
> rdp->blimit = DEFAULT_MAX_RCU_BLIMIT;
> - if (rcu_state.n_force_qs == rdp->n_force_qs_snap &&
> + if (READ_ONCE(rcu_state.n_force_qs) == rdp->n_force_qs_snap &&
> rcu_segcblist_first_pend_cb(&rdp->cblist) != head)
> rcu_force_quiescent_state();
> - rdp->n_force_qs_snap = rcu_state.n_force_qs;
> + rdp->n_force_qs_snap = READ_ONCE(rcu_state.n_force_qs);
> rdp->qlen_last_fqs_check = rcu_segcblist_n_cbs(&rdp->cblist);
> }
> }
> @@ -4134,7 +4134,7 @@ int rcutree_prepare_cpu(unsigned int cpu)
> /* Set up local state, ensuring consistent view of global state. */
> raw_spin_lock_irqsave_rcu_node(rnp, flags);
> rdp->qlen_last_fqs_check = 0;
> - rdp->n_force_qs_snap = rcu_state.n_force_qs;
> + rdp->n_force_qs_snap = READ_ONCE(rcu_state.n_force_qs);
> rdp->blimit = blimit;
> rdp->dynticks_nesting = 1; /* CPU not up, no tearing. */
> rcu_dynticks_eqs_online();