Re: [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink()
From: Edward Adam Davis
Date: Thu May 14 2026 - 03:35:10 EST
On Wed, 13 May 2026 20:46:55 -0300, Jason Gunthorpe wrote:
> On Wed, May 13, 2026 at 02:17:28PM -0400, Leon Romanovsky wrote:
> >
> > On Thu, 07 May 2026 20:50:10 +0800, Edward Adam Davis wrote:
> > > We must serialize calls to nldev_dellink() or risk a crash as syzbot
> > > reported:
> > >
> > > Call Trace:
> > > udp_tunnel_sock_release+0x6d/0x80 net/ipv4/udp_tunnel_core.c:197
> > > rxe_release_udp_tunnel drivers/infiniband/sw/rxe/rxe_net.c:294 [inline]
> > > rxe_sock_put drivers/infiniband/sw/rxe/rxe_net.c:639 [inline]
> > > rxe_net_del+0xfb/0x290 drivers/infiniband/sw/rxe/rxe_net.c:660
> > > rxe_dellink+0x15/0x20 drivers/infiniband/sw/rxe/rxe.c:254
> > >
> > > [...]
> >
> > Applied, thanks!
> >
> > [1/1] RDMA/nldev: add mutual exclusion in nldev_dellink()
> > https://git.kernel.org/rdma/rdma/c/0b28000b64f40d
>
> This seems like a rxe bug, I would have expected the lock to be inside
> rxe to protect its racy implementation of rxe_net_del(), which looks
> like it is possibly also triggered by NETDEV_UNREGISTER...
No, it was triggered by RDMA_NLDEV_CMD_DELLINK, as the call trace shows
(rxe_dellink() -> rxe_net_del()).
>
> ie it should not change nldev_dellink().
While this could be fixed within rxe, the same issue would affect any other
rxe-like driver that later implements the "dellink" interface. Handling it
in nldev_dellink() therefore seems more appropriate.
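
To illustrate the idea, here is a rough user-space sketch (pthreads, not
kernel code, and not the applied patch). Every name in it (fake_dev,
fake_driver_dellink, fake_core_dellink, dellink_lock) is hypothetical. It
only shows that a single lock at the common entry point serializes the
driver callback without the driver taking a lock of its own:

```c
/* User-space illustration only. All names here are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16

struct fake_dev {
	int sock_refs;     /* stands in for the driver's tunnel socket */
	int release_calls; /* how many times the teardown really ran */
};

/* One lock at the common entry point, shared by every driver. */
static pthread_mutex_t dellink_lock = PTHREAD_MUTEX_INITIALIZER;

/* Driver-side teardown: check-then-release, racy if run concurrently. */
static void fake_driver_dellink(struct fake_dev *dev)
{
	if (dev->sock_refs > 0) {
		dev->sock_refs--;
		dev->release_calls++;
	}
}

/* Core-side entry point: serializes all callers before the callback. */
static void fake_core_dellink(struct fake_dev *dev)
{
	pthread_mutex_lock(&dellink_lock);
	fake_driver_dellink(dev);
	pthread_mutex_unlock(&dellink_lock);
}

static void *dellink_thread(void *arg)
{
	fake_core_dellink(arg);
	return NULL;
}

/* Issue n concurrent dellink requests; return the number of releases. */
static int run_concurrent_dellink(int n)
{
	struct fake_dev dev = { .sock_refs = 1, .release_calls = 0 };
	pthread_t tids[MAX_THREADS];
	int created = 0;
	int i;

	if (n > MAX_THREADS)
		n = MAX_THREADS;
	for (i = 0; i < n; i++) {
		if (pthread_create(&tids[created], NULL, dellink_thread, &dev) == 0)
			created++;
	}
	for (i = 0; i < created; i++)
		pthread_join(tids[i], NULL);
	return dev.release_calls;
}
```

With the lock held in fake_core_dellink(), run_concurrent_dellink(8) releases
the socket once regardless of how the threads interleave. A second rxe-like
driver plugged in behind the same entry point would get the same protection
for free, which is the argument for doing it in the core.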
Edward