Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head

From: Paul E. McKenney
Date: Mon Feb 17 2020 - 13:16:18 EST


On Mon, Feb 17, 2020 at 01:38:51PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 14, 2020 at 04:25:18PM -0800, paulmck@xxxxxxxxxx wrote:
> > From: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> >
> > The RCU tasks list of callbacks, rcu_tasks_cbs_head, is sampled locklessly
> > by rcu_tasks_kthread() when waiting for work to do. This commit therefore
> > applies READ_ONCE() to that lockless sampling and WRITE_ONCE() to the
> > single potential store outside of rcu_tasks_kthread.
> >
> > This data race was reported by KCSAN. Not appropriate for backporting
> > due to failure being unlikely.
>
> What failure is possible here? AFAICT this is (again) one of them
> load-complare-against-constant-discard patterns that are impossible to
> mess up.

First, please keep in mind that this is RCU code. Rather uncomplicated
for RCU, to be sure, but still RCU code.

The failure modes are thus as follows:

o I produce a patch for which KCSAN gives a legitimate warning,
but this warning is obscured by a pile of other warnings.
Yes, we should continue improving KCSAN's ability to adapt
to the users desired compiler-optimization risk level, but
in RCU's case that risk level is set quite low.

In RCU, what others are calling false positives are therefore
addressed. Yes, this does cost me a bit of work, but it is
trivial compared to the work required to track down a real bug.

o Someone optimizes or otherwise changes the wait/wakeup code,
which inadvertently gives the compiler more scope for mischief.

In short, within RCU, I am handling all KCSAN complaints. This is looking
to be an extremely inexpensive insurance policy for RCU. Other subsystems
are of course free to make their own tradeoffs, and subsystems having
less-aggressive concurrency control might be well-advised to take a
different path than the one I am taking.

Thanx, Paul

> > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > ---
> > kernel/rcu/update.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> > index 6c4b862..a27df76 100644
> > --- a/kernel/rcu/update.c
> > +++ b/kernel/rcu/update.c
> > @@ -528,7 +528,7 @@ void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
> > rhp->func = func;
> > raw_spin_lock_irqsave(&rcu_tasks_cbs_lock, flags);
> > needwake = !rcu_tasks_cbs_head;
> > - *rcu_tasks_cbs_tail = rhp;
> > + WRITE_ONCE(*rcu_tasks_cbs_tail, rhp);
> > rcu_tasks_cbs_tail = &rhp->next;
> > raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags);
> > /* We can't create the thread unless interrupts are enabled. */
> > @@ -658,7 +658,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
> > /* If there were none, wait a bit and start over. */
> > if (!list) {
> > wait_event_interruptible(rcu_tasks_cbs_wq,
> > - rcu_tasks_cbs_head);
> > + READ_ONCE(rcu_tasks_cbs_head));
> > if (!rcu_tasks_cbs_head) {
> > WARN_ON(signal_pending(current));
> > schedule_timeout_interruptible(HZ/10);
> > --
> > 2.9.5
> >