Re: [PATCH rcu 14/16] rxrpc: Use call_rcu_hurry() instead of call_rcu()
From: Joel Fernandes
Date: Wed Nov 30 2022 - 18:25:42 EST
On Wed, Nov 30, 2022 at 11:05 PM David Howells <dhowells@xxxxxxxxxx> wrote:
>
> Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
>
> > > Note that this conflicts with my patch:
> > >
> > > rxrpc: Don't hold a ref for connection workqueue
> > > https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/commit/?h=rxrpc-next&id=450b00011290660127c2d76f5c5ed264126eb229
> > >
> > > which should render it unnecessary. It's a little ahead of yours in the
> > > net-next queue, if that means anything.
> >
> > Could you clarify why it is unnecessary?
>
> Rather than tearing down parts of the connection it only logs a trace line,
> frees the memory and decrements the counter on the namespace. This is used to
> account that all the pieces of memory allocated in that namespace are gone
> before the namespace is removed to check for leaks. The RCU cleanup used to
> use some other stuff (such as the peer hash) in the rxrpc_net struct but no
> longer will after the patches I submitted.
>
> > After your patch, you are still doing a wake up in your call_rcu() callback:
> >
> > -       ASSERTCMP(refcount_read(&conn->ref), ==, 0);
> > +       if (atomic_dec_and_test(&rxnet->nr_conns))
> > +               wake_up_var(&rxnet->nr_conns);
> > +}
> >
> > Are you saying the code can now tolerate delays? What if the RCU
> > callback is invoked after arbitrarily long delays, making the sleeping
> > process wait?
>
> True. But that now only holds up the destruction of a net namespace and the
> removal of the rxrpc module.
>
> > If you agree, you can convert the call_rcu() to call_rcu_hurry() in
> > your patch itself. Would you be willing to do that? If not, that's
> > totally OK and I can send a patch later once yours is in (after
> > further testing).
>
> I can add it to part 4 (see my rxrpc-ringless-5 branch) if it is necessary.

OK, sounds good. On module removal, rcu_barrier() will flush out any
pending callbacks, so that should not be an issue.

Based on your message, I think we can drop this patch then. Since Paul
is already dropping it, no other action is needed.

(I just realized that my patch was not fixing a test failure, as the
other net ones did. Rather, we found the issue by static analysis,
i.e. by programmatically auditing all callbacks in the kernel that do
wake-ups.)
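
To illustrate the pattern that audit looks for, here is a rough userspace
sketch (not kernel code) of a deferred callback doing the final wake-up.
Every name in it (fake_net, conn_cb, simulate_teardown) is hypothetical,
and the sleep merely stands in for a callback whose invocation is
delayed. It does not model RCU grace periods at all.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical stand-in for rxnet->nr_conns and the waiter sleeping on it. */
struct fake_net {
	atomic_int nr_conns;
	pthread_mutex_t lock;
	pthread_cond_t drained;
};

struct deferred {
	struct fake_net *net;
	int nconns;
	int delay_ms;
};

/* Models the callback body: drop the count, wake the waiter on the last one. */
static void conn_cb(struct fake_net *net)
{
	if (atomic_fetch_sub(&net->nr_conns, 1) == 1) {
		pthread_mutex_lock(&net->lock);
		pthread_cond_broadcast(&net->drained);
		pthread_mutex_unlock(&net->lock);
	}
}

/* Models delayed invocation: no callback runs until the delay has passed. */
static void *deferred_thread(void *arg)
{
	struct deferred *d = arg;
	struct timespec ts = {
		.tv_sec = d->delay_ms / 1000,
		.tv_nsec = (long)(d->delay_ms % 1000) * 1000000L,
	};

	nanosleep(&ts, NULL);
	for (int i = 0; i < d->nconns; i++)
		conn_cb(d->net);
	return NULL;
}

/* Returns roughly how many milliseconds the teardown path had to wait. */
static long simulate_teardown(int nconns, int delay_ms)
{
	struct fake_net net;
	struct deferred d = { &net, nconns, delay_ms };
	struct timespec t0, t1;
	pthread_t tid;

	assert(nconns > 0 && delay_ms >= 0);
	atomic_init(&net.nr_conns, nconns);
	pthread_mutex_init(&net.lock, NULL);
	pthread_cond_init(&net.drained, NULL);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	pthread_create(&tid, NULL, deferred_thread, &d);

	/* The teardown path sleeps here until the last callback has run. */
	pthread_mutex_lock(&net.lock);
	while (atomic_load(&net.nr_conns) != 0)
		pthread_cond_wait(&net.drained, &net.lock);
	pthread_mutex_unlock(&net.lock);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	pthread_join(tid, NULL);
	pthread_cond_destroy(&net.drained);
	pthread_mutex_destroy(&net.lock);
	return (long)(t1.tv_sec - t0.tv_sec) * 1000L +
	       (t1.tv_nsec - t0.tv_nsec) / 1000000L;
}
```

Calling simulate_teardown(3, 50) from a main() should report a wait of
about 50 ms or more: the sleeper cannot make progress before the
callbacks are invoked, which is why delayed invocation matters for any
callback that ends in a wake-up.
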
thanks,
- Joel