Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head

From: Joel Fernandes
Date: Mon Feb 17 2020 - 14:32:27 EST


On Mon, Feb 17, 2020 at 07:38:01PM +0100, Marco Elver wrote:
> On Mon, 17 Feb 2020 at 19:23, Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
> >
> > On Mon, Feb 17, 2020 at 01:38:51PM +0100, Peter Zijlstra wrote:
> > > On Fri, Feb 14, 2020 at 04:25:18PM -0800, paulmck@xxxxxxxxxx wrote:
> > > > From: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> > > >
> > > > The RCU tasks list of callbacks, rcu_tasks_cbs_head, is sampled locklessly
> > > > by rcu_tasks_kthread() when waiting for work to do. This commit therefore
> > > > applies READ_ONCE() to that lockless sampling and WRITE_ONCE() to the
> > > > single potential store outside of rcu_tasks_kthread.
> > > >
> > > > This data race was reported by KCSAN. Not appropriate for backporting
> > > > due to failure being unlikely.
> > >
> > > What failure is possible here? AFAICT this is (again) one of them
> > > load-compare-against-constant-discard patterns that are impossible to
> > > mess up.
> >
> > You mean that because we are only testing for NULL, load/store tearing of
> > rcu_tasks_cbs_head is not an issue, right?
> >
> > I agree. Even with invented stores, worst case we have a false-wakeup and go
> > right back to sleep. Or, we read a partial rcu_tasks_cbs_head, and then go
> > acquire the lock and read the whole thing correctly under lock.
> >
> > I wonder if we can teach KCSAN to actually ignore this kind of situation so
> > we don't need to employ READ_ONCE() for no reason. Basically ask it to not
> > bother if the read was only NULL-testing. +Marco since it is KCSAN related.
>
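
For reference, the "test for NULL locklessly, then read the whole thing
under the lock" pattern discussed above can be sketched in user space
roughly as follows. This is only an illustration under stated
assumptions: the *_SKETCH macros are simplified volatile-cast stand-ins
(not the kernel's READ_ONCE()/WRITE_ONCE() definitions, and they rely on
the GCC/Clang __typeof__ extension), a pthread mutex stands in for
rcu_tasks_cbs_lock, and all names ending in _sketch are hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for READ_ONCE()/WRITE_ONCE(); assumptions for this sketch. */
#define READ_ONCE_SKETCH(x)	(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

struct cb_sketch {
	struct cb_sketch *next;
};

/* Hypothetical counterparts of rcu_tasks_cbs_lock/_head/_tail. */
static pthread_mutex_t cbs_lock_sketch = PTHREAD_MUTEX_INITIALIZER;
static struct cb_sketch *cbs_head_sketch;
static struct cb_sketch **cbs_tail_sketch = &cbs_head_sketch;

/* Producer: append under the lock. Returns true if the list was empty. */
static bool enqueue_sketch(struct cb_sketch *cb)
{
	bool needwake;

	cb->next = NULL;
	pthread_mutex_lock(&cbs_lock_sketch);
	needwake = !cbs_head_sketch;
	/* This store may race with the consumer's lockless check, so mark it. */
	WRITE_ONCE_SKETCH(*cbs_tail_sketch, cb);
	cbs_tail_sketch = &cb->next;
	pthread_mutex_unlock(&cbs_lock_sketch);
	return needwake;
}

/* Consumer: lockless NULL test first, then detach the list under the lock. */
static struct cb_sketch *take_all_sketch(void)
{
	struct cb_sketch *list;

	/* Only the NULL/non-NULL answer is used; the value itself is discarded. */
	if (!READ_ONCE_SKETCH(cbs_head_sketch))
		return NULL;

	/* The real read happens here, serialized against the producer. */
	pthread_mutex_lock(&cbs_lock_sketch);
	list = cbs_head_sketch;
	cbs_head_sketch = NULL;		/* only the consumer resets the head */
	cbs_tail_sketch = &cbs_head_sketch;
	pthread_mutex_unlock(&cbs_lock_sketch);
	return list;
}
```

In this sketch a stale or spurious non-NULL answer from the lockless
test costs one extra lock acquisition at worst, since the list is always
re-read under the lock before it is used.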
> This came up before. It requires somehow making the compiler tell us
> what type of operation we're doing and in what context:
> https://lore.kernel.org/lkml/CANpmjNNZQsatHexXHm4dXvA0na6r9xMgVD5R+-8d7VXEBRi32w@xxxxxxxxxxxxxx/

Oh, wow. Ok.

> In particular:
>
> > > This particular rule relies on semantic analysis that is beyond what
> > > the TSAN instrumentation currently supports. Right now we support GCC
> > > and Clang; changing the compiler probably means we'd end up with only
> > > one (probably Clang), and many more years before the change has
> > > propagated to the majority of used compiler versions. It'd be good if
> > > we can do this purely as a change in the kernel's codebase.
>
> Load/store tearing might not be an issue, but we also have to be aware
> of things like load fusing, e.g. in a loop.

Ok. Makes sense.
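
To make the load-fusing point concrete, here is a minimal user-space
sketch of a polling loop. It is an illustration only: READ_ONCE_SKETCH
is a simplified volatile-cast stand-in (an assumption, not the kernel's
READ_ONCE() definition; it uses the GCC/Clang __typeof__ extension), and
head_sketch and the poll_* helpers are hypothetical names.

```c
#include <stddef.h>

/* Simplified stand-in for READ_ONCE(); an assumption for this sketch. */
#define READ_ONCE_SKETCH(x)	(*(const volatile __typeof__(x) *)&(x))

struct cb_sketch {
	struct cb_sketch *next;
};

/* Hypothetical stand-in for a locklessly sampled list head. */
static struct cb_sketch *head_sketch;

/*
 * Plain load in a loop: nothing in the loop body modifies head_sketch,
 * so the compiler is allowed to fuse the loads, i.e. load once before
 * the loop and reuse that value on every iteration.
 */
static int poll_plain_sketch(int max_spins)
{
	for (int i = 0; i < max_spins; i++) {
		if (head_sketch)
			return i;	/* iteration at which work was seen */
	}
	return -1;			/* nothing seen within max_spins */
}

/*
 * Marked load: the volatile access forces a fresh load from memory on
 * each iteration, so an update made by another thread can be observed.
 */
static int poll_once_sketch(int max_spins)
{
	for (int i = 0; i < max_spins; i++) {
		if (READ_ONCE_SKETCH(head_sketch))
			return i;
	}
	return -1;
}
```

Single-threaded, both helpers return the same results; the difference
only matters when another thread updates head_sketch while the loop is
spinning.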

> That being said, there may be ways to get similar results without yet
> changing the compiler.

Interesting. Thanks!

- Joel


> Thanks,
> -- Marco
>
> > thanks,
> >
> > - Joel
> >
> >
> > > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > > > ---
> > > > kernel/rcu/update.c | 4 ++--
> > > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> > > > index 6c4b862..a27df76 100644
> > > > --- a/kernel/rcu/update.c
> > > > +++ b/kernel/rcu/update.c
> > > > @@ -528,7 +528,7 @@ void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
> > > >  	rhp->func = func;
> > > >  	raw_spin_lock_irqsave(&rcu_tasks_cbs_lock, flags);
> > > >  	needwake = !rcu_tasks_cbs_head;
> > > > -	*rcu_tasks_cbs_tail = rhp;
> > > > +	WRITE_ONCE(*rcu_tasks_cbs_tail, rhp);
> > > >  	rcu_tasks_cbs_tail = &rhp->next;
> > > >  	raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags);
> > > >  	/* We can't create the thread unless interrupts are enabled. */
> > > > @@ -658,7 +658,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
> > > >  		/* If there were none, wait a bit and start over. */
> > > >  		if (!list) {
> > > >  			wait_event_interruptible(rcu_tasks_cbs_wq,
> > > > -						 rcu_tasks_cbs_head);
> > > > +						 READ_ONCE(rcu_tasks_cbs_head));
> > > >  			if (!rcu_tasks_cbs_head) {
> > > >  				WARN_ON(signal_pending(current));
> > > >  				schedule_timeout_interruptible(HZ/10);
> > > > --
> > > > 2.9.5
> > > >