Re: [PATCH RFC 0/9] nfs/sunrpc: stop holding netns references in client-side NFS and RPC objects
From: Jeff Layton
Date: Mon Mar 17 2025 - 17:58:10 EST
On Mon, 2025-03-17 at 21:35 +0000, Trond Myklebust wrote:
> On Mon, 2025-03-17 at 16:59 -0400, Jeff Layton wrote:
> > We have a long-standing problem with containers that have NFS mounts
> > in
> > them. Best practice is to unmount gracefully, of course, but
> > sometimes
> > containers just spontaneously die (e.g. SIGSEGV in the init task in
> > the
> > container). When that happens the orchestrator will see that all of
> > the
> > tasks are dead, and will detach the mount namespace and kill off the
> > network connection.
> >
> > If there are RPCs in flight at the time, the rpc_clnt will try to
> > retransmit them indefinitely, but there is no hope of them ever
> > contacting the server since nothing in userland can reach the netns
> > at that point to fix anything.
> >
> > This patchset takes the approach of changing various nfs client and
> > sunrpc objects to not hold a netns reference. Instead, when a nfs_net
> > or
> > sunrpc_net is exiting, all nfs_server, nfs_client and rpc_clnt
> > objects
> > associated with it are shut down, and the pre_exit functions block
> > until they are gone.
> >
> > With this approach, when the last userland task in the container
> > exits,
> > the NFS and RPC clients get cleaned up automatically. As a bonus,
> > this
> > fixes another bug with the gssproxy RPC client that causes net
> > namespace
> > leaks in any container where it runs (details in the patch
> > descriptions).
> >
>
> So with this approach, what happens if the NFS mount was created in a
> container, but got bind mounted somewhere else?
>
The lifetime of these objects are tied to the net namespace. If it gets
bind-mounted into a different mount namespace, while the tasks are
setns()'ed into the correct net namespace, then I expect the mount
would end up shut down at that point and be unusable, just like if you
echo 1 into the shutdown file in sysfs.
Hopefully no one is doing anything that silly. You wouldn't be able to
upcall, for one thing, since there wouldn't be any more userland
processes attached to the netns.
I'll test that scenario and get back to you though. I do want to make
sure that that's not going to lead to a crash or anything.
--
Jeff Layton <jlayton@xxxxxxxxxx>