Re: [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1
From: bfields@xxxxxxxxxxxx
Date: Tue Sep 28 2021 - 10:57:52 EST
On Tue, Sep 28, 2021 at 02:27:33PM +0000, Trond Myklebust wrote:
> On Tue, 2021-09-28 at 10:17 -0400, bfields@xxxxxxxxxxxx wrote:
> > On Tue, Sep 28, 2021 at 02:04:49PM +0000, Trond Myklebust wrote:
> > > On Tue, 2021-09-28 at 09:49 -0400, bfields@xxxxxxxxxxxx wrote:
> > > > On Tue, Sep 28, 2021 at 01:30:17PM +0000, Trond Myklebust wrote:
> > > > > On Tue, 2021-09-28 at 11:14 +0800, Wang Hai wrote:
> > > > > > When use-gss-proxy is set to 1, write_gssp() creates a rpc client
> > > > > > in gssp_rpc_create(), this increases the netns refcount by 2,
> > > > > > these refcounts are supposed to be released in
> > > > > > rpcsec_gss_exit_net(), but it will never happen because
> > > > > > rpcsec_gss_exit_net() is triggered only when the netns refcount
> > > > > > gets to 0, specifically:
> > > > > > refcount=0 -> cleanup_net() -> ops_exit_list -> rpcsec_gss_exit_net
> > > > > > It is a deadlock situation here, refcount will never get to 0
> > > > > > unless rpcsec_gss_exit_net() is called. So, in this case, the
> > > > > > netns refcount should not be increased.
> > > > > >
> > > > > > In this case, xprt will take a netns refcount which is not
> > > > > > supposed to be taken. Add a new flag to rpc_create_args called
> > > > > > RPC_CLNT_CREATE_NO_NET_REF for not increasing the netns refcount.
> > > > > >
> > > > > > It is safe not to hold the netns refcount, because when
> > > > > > cleanup_net(), it will hold the gssp_lock and then shut down the
> > > > > > rpc client synchronously.
> > > > > >
> > > > > >
> > > > > I don't like this solution at all. Adding this kind of flag is
> > > > > going to lead to problems down the road.
> > > > >
> > > > > Is there any reason whatsoever why we need this RPC client to exist
> > > > > when there is no active knfsd server? IOW: Is there any reason why
> > > > > we shouldn't defer creating this RPC client for when knfsd starts
> > > > > up in this net namespace, and why we can't shut it down when knfsd
> > > > > shuts down?
> > > >
> > > > The rpc create is done in the context of the process that writes to
> > > > /proc/net/rpc/use-gss-proxy to get the right namespaces. I don't
> > > > know how hard it would be to capture that information for a later
> > > > create.
> > > >
> > >
> > > svcauth_gss_proxy_init() uses the net namespace SVC_NET(rqstp) (i.e.
> > > the knfsd namespace) in the call to gssp_accept_sec_context_upcall().
> > >
> > > IOW: the net namespace used in the call to find the RPC client is the
> > > one set up by knfsd, and so if use-gss-proxy was set in a different
> > > namespace than the one used by knfsd, then it won't be found.
> >
> > Right. If you've got multiple containers, you don't want to find a
> > gss-proxy from a different container.
> >
>
> Exactly. So there is no namespace context to capture in the RPC client
> other than what's already in knfsd.
>
> The RPC client doesn't capture any other process context. It can cache
> a user cred in order to capture the user namespace, but that
> information appears to be unused by this gssd RPC client.
OK, that's good to know, thanks.
It's doing a path lookup (it uses an AF_LOCAL socket), and I'm not
assuming that will get the same result across containers. Is there an
easy way to do just that path lookup here and delay the rest till knfsd
startup?
--b.
>
> So I'll repeat my question: Why can't we set this gssd RPC client up at
> knfsd startup time, and tear it down when knfsd is shut down?
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@xxxxxxxxxxxxxxx
>
>
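
For anyone following along, here is a rough userspace sketch of the
refcount cycle described in the patch, next to the alternative of tying
the client's lifetime to the server. It is only a toy model: every toy_*
name is made up for illustration and none of this is kernel code. The
only detail taken from the thread is that creating the client takes 2
netns references, and that the exit hook runs only once the count
reaches 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not kernel APIs. */
struct toy_netns {
	int refs;           /* stand-in for the netns refcount */
	int client_refs;    /* references pinned by the proxy RPC client */
	bool exit_hook_ran; /* stand-in for the per-net exit hook having run */
};

static void toy_netns_init(struct toy_netns *ns)
{
	ns->refs = 1;       /* the namespace's one ordinary user */
	ns->client_refs = 0;
	ns->exit_hook_ran = false;
}

/* Creating the client takes 2 references, as in the patch description. */
static void toy_client_create(struct toy_netns *ns)
{
	ns->refs += 2;
	ns->client_refs += 2;
}

/* Destroying the client gives back whatever it pinned. */
static void toy_client_destroy(struct toy_netns *ns)
{
	ns->refs -= ns->client_refs;
	ns->client_refs = 0;
}

/* Dropping a reference; the exit hook fires only when the count hits 0. */
static void toy_netns_put(struct toy_netns *ns)
{
	assert(ns->refs > 0);
	if (--ns->refs == 0)
		ns->exit_hook_ran = true;
}

/*
 * Client torn down only from the exit hook: the last ordinary user
 * leaves, but the client's 2 references keep the count above 0, so
 * the hook that would destroy the client never runs.
 */
bool toy_teardown_in_exit_hook(void)
{
	struct toy_netns ns;

	toy_netns_init(&ns);
	toy_client_create(&ns);  /* refs: 1 -> 3 */
	toy_netns_put(&ns);      /* refs: 3 -> 2, stuck here */
	return ns.exit_hook_ran;
}

/*
 * Client created at server start and destroyed at server stop: its
 * references are gone before the last user leaves, so the count can
 * reach 0 and the exit hook runs.
 */
bool toy_teardown_at_server_stop(void)
{
	struct toy_netns ns;

	toy_netns_init(&ns);
	toy_client_create(&ns);  /* server start, refs: 1 -> 3 */
	toy_client_destroy(&ns); /* server stop,  refs: 3 -> 1 */
	toy_netns_put(&ns);      /* refs: 1 -> 0, hook runs */
	return ns.exit_hook_ran;
}
```

In this model toy_teardown_in_exit_hook() returns false and
toy_teardown_at_server_stop() returns true. It says nothing about the
AF_LOCAL path lookup question, which is the part still open above.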