Re: [PATCH net 1/2] netpoll: Use rcu_access_pointer() in __netpoll_setup
From: Breno Leitao
Date: Tue Nov 19 2024 - 05:26:06 EST
Hello Herbert,
On Tue, Nov 19, 2024 at 11:28:33AM +0800, Herbert Xu wrote:
> On Mon, Nov 18, 2024 at 03:15:17AM -0800, Breno Leitao wrote:
> > The ndev->npinfo pointer in __netpoll_setup() is RCU-protected but is being
> > accessed directly for a NULL check. While no RCU read lock is held in this
> > context, we should still use proper RCU primitives for consistency and
> > correctness.
> >
> > Replace the direct NULL check with rcu_access_pointer(), which is the
> > appropriate primitive when only checking for NULL without dereferencing
> > the pointer. This function provides the necessary ordering guarantees
> > without requiring RCU read-side protection.
> >
> > Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
> > Fixes: 8fdd95ec162a ("netpoll: Allow netpoll_setup/cleanup recursion")
> > ---
> > net/core/netpoll.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/core/netpoll.c b/net/core/netpoll.c
> > index aa49b92e9194babab17b2e039daf092a524c5b88..45fb60bc4803958eb07d4038028269fc0c19622e 100644
> > --- a/net/core/netpoll.c
> > +++ b/net/core/netpoll.c
> > @@ -626,7 +626,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev)
> > goto out;
> > }
> >
> > - if (!ndev->npinfo) {
> > + if (!rcu_access_pointer(ndev->npinfo)) {
> > npinfo = kmalloc(sizeof(*npinfo), GFP_KERNEL);
> > if (!npinfo) {
> > err = -ENOMEM;
>
> This is completely bogus. Think about it, we are setting ndev->npinfo,
> meaning that we must have some form of synchronisation over this that
> guarantees us to be the only writer.
Correct. __netpoll_setup() should have the RTNL lock held. In the most
common case, it is done through:
	netpoll_setup() {
		rtnl_lock();
		...
		__netpoll_setup()
		...
		rtnl_unlock();
	}
> So why does it need RCU protection for reading?
Good question. As I understand it, the point is to make accesses to
RCU pointers explicit. In fact, the same function that this patch
changes (__netpoll_setup()) later uses rtnl_dereference(), under the
same RTNL lock.
Moreover, the RCU documentation has an explicit example of this, in
Documentation/RCU/Design/Requirements/Requirements.rst, in the
"Performance and Scalability" section. It says:
	spin_lock(&gp_lock);
	p = rcu_access_pointer(gp);
	if (!p) {
		spin_unlock(&gp_lock);
		return false;
	}
> Assuming that this code isn't completely bonkers, then the correct
> primitive to use should be rcu_dereference_protected.
I looked at rcu_dereference_protected() as well, and I thought it is
meant for when you are dereferencing the pointer, which is a more
expensive approach. In the code above, we only need to check whether
the pointer is assigned. Looking at the code, using
rcu_access_pointer() under the update-side lock seems to be a common
pattern, so that is what I chose.
On the other hand, I understand we want to call an RCU primitive with
_protected() semantics, so I looked for a possible
`rcu_access_pointer_protected()`, but this beast does not exist.
Anyway, I am happy to change it, if that is the correct API.
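To make the two options concrete, here is a minimal userspace sketch of
the check and the later publication. It is not kernel code: the macros
below are simplified stand-ins that I wrote for illustration (the real
ones live in include/linux/rcupdate.h and include/linux/rtnetlink.h),
rtnl_held is a hypothetical flag modelling lockdep_rtnl_is_held(),
model_netpoll_setup() is a made-up name, and struct netpoll_info is
trimmed to a single plain counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace stand-ins for kernel primitives (assumptions, not the real ones). */
static bool rtnl_held;                  /* models lockdep_rtnl_is_held() */
static void rtnl_lock(void)   { rtnl_held = true; }
static void rtnl_unlock(void) { rtnl_held = false; }

/* Fetch the pointer value only; the caller must not dereference it. */
#define rcu_access_pointer(p)		__atomic_load_n(&(p), __ATOMIC_RELAXED)
/* Plain load, with the update-side lock condition checked. */
#define rcu_dereference_protected(p, c)	(assert(c), (p))
#define rtnl_dereference(p)		rcu_dereference_protected(p, rtnl_held)
/* Publish the pointer with release ordering. */
#define rcu_assign_pointer(p, v)	__atomic_store_n(&(p), (v), __ATOMIC_RELEASE)

struct netpoll_info { int refcnt; };	/* trimmed, hypothetical layout */
struct net_device { struct netpoll_info *npinfo; };

/* Models the NULL check and the publication done in __netpoll_setup(). */
static int model_netpoll_setup(struct net_device *ndev)
{
	struct netpoll_info *npinfo;

	assert(rtnl_held);			/* caller must hold RTNL */

	if (!rcu_access_pointer(ndev->npinfo)) {	/* NULL check, no deref */
		npinfo = calloc(1, sizeof(*npinfo));
		if (!npinfo)
			return -1;		/* the kernel returns -ENOMEM */
	} else {
		/* About to dereference, so use the _protected() flavour. */
		npinfo = rtnl_dereference(ndev->npinfo);
	}
	npinfo->refcnt++;

	/* last thing to do is link it to the net device structure */
	rcu_assign_pointer(ndev->npinfo, npinfo);
	return 0;
}
```

Swapping the NULL check for rtnl_dereference(ndev->npinfo) would give
the same result in this model; the only difference is that the lock
condition is then also asserted at the check itself.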
> Fixes header should be set to the commit that introduced the broken
> RCU marking:
>
> commit 5fbee843c32e5de2d8af68ba0bdd113bb0af9d86
> Author: Cong Wang <amwang@xxxxxxxxxx>
> Date: Tue Jan 22 21:29:39 2013 +0000
>
> netpoll: add RCU annotation to npinfo field
When 8fdd95ec162a was created, npinfo was already an RCU pointer,
although without the RCU annotation, which came later (5fbee843c).
That is the reason I chose to fix 8fdd95ec162a.
For instance, checking out 8fdd95ec162a, at the end of
__netpoll_setup() I see an RCU primitive, indicating that
ndev->npinfo was already an RCU-protected pointer:
	/* last thing to do is link it to the net device structure */
	rcu_assign_pointer(ndev->npinfo, npinfo);
Thanks for the feedback and the good pointers.
--breno