Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals
From: Trond Myklebust
Date: Mon Jan 27 2020 - 10:54:38 EST
On Mon, 2020-01-27 at 14:45 +0000, Robert Milkowski wrote:
> On Thu, 23 Jan 2020 at 19:08, Trond Myklebust <
> trondmy@xxxxxxxxxxxxxxx> wrote:
> > On Wed, 2020-01-22 at 19:10 +0000, Schumaker, Anna wrote:
> > > Hi Robert,
> > >
> > > On Mon, 2020-01-20 at 17:55 +0000, Robert Milkowski wrote:
> > > > > -----Original Message-----
> > > > > From: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > > > > Sent: 30 December 2019 15:37
> > > > > To: Robert Milkowski <rmilkowski@xxxxxxxxx>
> > > > > Cc: Linux NFS Mailing List <linux-nfs@xxxxxxxxxxxxxxx>; Trond
> > > > > Myklebust
> > > > > <trond.myklebust@xxxxxxxxxxxxxxx>; Anna Schumaker
> > > > > <anna.schumaker@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx
> > > > > Subject: Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not
> > > > > do
> > > > > implicit
> > > > > lease renewals
> > > > >
> > > > >
> > > > >
> > > > > > On Dec 30, 2019, at 10:20 AM, Robert Milkowski <
> > > > > > rmilkowski@xxxxxxxxx>
> > > > > wrote:
> > > > > > From: Robert Milkowski <rmilkowski@xxxxxxxxx>
> > > > > >
> > > > > > Currently, each time nfs4_do_fsinfo() is called it will do
> > > > > > an
> > > > > > implicit
> > > > > > NFS4 lease renewal, which is not compliant with the NFS4
> > > > > specification.
> > > > > > This can result in a lease being expired by an NFS server.
> > > > > >
> > > > > > Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing
> > > > > > leases")
> > > > > > introduced implicit client lease renewal in
> > > > > > nfs4_do_fsinfo(),
> > > > > > which
> > > > > > can result in the NFSv4.0 lease to expire on a server side,
> > > > > > and
> > > > > > servers returning NFS4ERR_EXPIRED or
> > > > > > NFS4ERR_STALE_CLIENTID.
> > > > > >
> > > > > > This can easily be reproduced by frequently unmounting a
> > > > > > sub-
> > > > > > mount,
> > > > > > then stat'ing it to get it mounted again, which will delay
> > > > > > or
> > > > > > even
> > > > > > completely prevent client from sending RENEW operations if
> > > > > > no
> > > > > > other
> > > > > > NFS operations are issued. Eventually nfs server will
> > > > > > expire
> > > > > > client's
> > > > > > lease and return an error on file access or next RENEW.
> > > > > >
> > > > > > This can also happen when a sub-mount is automatically
> > > > > > unmounted due
> > > > > > to inactivity (after nfs_mountpoint_expiry_timeout), then
> > > > > > it is
> > > > > > mounted again via stat(). This can result in a short window
> > > > > > during
> > > > > > which client's lease will expire on a server but not on a
> > > > > > client.
> > > > > > This specific case was observed on production systems.
> > > > > >
> > > > > > This patch makes an explicit lease renewal instead of an
> > > > > > implicit one,
> > > > > > by adding RENEW to a compound operation issued by
> > > > > > nfs4_do_fsinfo(),
> > > > > > similarly to NFSv4.1 which adds SEQUENCE operation.
> > > > > >
> > > > > > Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing
> > > > > > leases")
> > > > > > Signed-off-by: Robert Milkowski <rmilkowski@xxxxxxxxx>
> > > > >
> > > > > Reviewed-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > > > >
> > > > >
> > > >
> > > > How do we progress it further?
> > >
> > > Thanks for following up! I have the patch included in my linux-
> > > next
> > > branch for
> > > the next merge window.
> >
> > NACK. This is the wrong way to solve the problem. Lease renewal in
> > NFSv4 does not need to be tied to fsinfo. It creates an unnecessary
> > extra error condition that has absolutely nothing to do with the
> > functionality of retrieving per-filesystem attributes.
>
> Well, we also do it for NFSv4.1+ with the SEQUENCE operation being
> send by fsinfo, and as Chuck pointed out
> it makes sense to do similarly for 4.0, which is what this patch
> does.
That's a bogus argument. The NFSv4.x SEQUENCE is mandated by the
protocol in order to manage the only once semantics via the session
slot mechanism. While it does implicitly renew the lease, that's not at
all the reason why it is present in this particular RPC call.
> Or as per the v2 version of the patch, not do the implicit RENEW for
> 4.0, which was a simpler patch,
> but then not in-line with 4.1+.
>
> > All that needs to be done here is to move the setting of clp-
> > > cl_last_renewal _out_ of nfs4_set_lease_period(), and just have
> > nfs4_proc_setclientid_confirm() and nfs4_update_session() call
> > do_renew_lease().
> >
>
> This would also require nfs4_setup_state_renewal() to call
> do_renew_lease() I think - at least it currently calls
> nfs4_set_lease_period().
This is the mechanism we're replacing. There is no need for a call to
do_renew_lease() in nfs4_setup_state_renewal().
> Also, iirc fsinfo() not setting cl_last_renewal leads to
> cl_last_renewal initialization issues under some circumstances.
>
> Then the RFC 7530 in section 16.34.5 states:
> "SETCLIENTID_CONFIRM does not establish or renew a lease.", so
> calling
> do_renew_lease() from nfs4_setclientid_confirm() doesn't seem to be
> ok.
So just move the do_renew_lease() to nfs4_proc_setclientid(). The
successful combination SETCLIENTID+SETCLIENTID_CONFIRM definitely
_does_ establish a lease.
> I'm not sure if is is valid to do implicit lease renewal in
> nfs4_update_session() either...
Ditto. Move to the exchange_id call.
>
> Anyway, the patch would be something like (haven't tested it yet):
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 76d3716..a7af864 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5019,16 +5019,13 @@ static int nfs4_do_fsinfo(struct nfs_server
> *server, struct nfs_fh *fhandle, str
> struct nfs4_exception exception = {
> .interruptible = true,
> };
> - unsigned long now = jiffies;
> int err;
>
> do {
> err = _nfs4_do_fsinfo(server, fhandle, fsinfo);
> trace_nfs4_fsinfo(server, fhandle, fsinfo->fattr,
> err);
> if (err == 0) {
> - nfs4_set_lease_period(server->nfs_client,
> - fsinfo->lease_time * HZ,
> - now);
> + nfs4_set_lease_period(server->nfs_client,
> fsinfo->lease_time * HZ)
> break;
> }
> err = nfs4_handle_exception(server, err, &exception);
> @@ -6146,6 +6143,10 @@ int nfs4_proc_setclientid_confirm(struct
> nfs_client *clp,
> clp->cl_clientid);
> status = rpc_call_sync(clp->cl_rpcclient, &msg,
> RPC_TASK_TIMEOUT |
> RPC_TASK_NO_ROUND_ROBIN);
> + if(status == 0) {
> + unsigned long now = jiffies;
The variable 'now' needs to be set before the RPC call in order to
avoid overestimating the remaining lease period.
> + do_renew_lease(clp, now);
> + }
> trace_nfs4_setclientid_confirm(clp, status);
> dprintk("NFS reply setclientid_confirm: %d\n", status);
> return status;
> @@ -8590,6 +8591,8 @@ static void nfs4_update_session(struct
> nfs4_session *session,
> if (res->flags & SESSION4_BACK_CHAN)
> memcpy(&session->bc_attrs, &res->bc_attrs,
> sizeof(session->bc_attrs));
> + unsigned long now = jiffies;
> + do_renew_lease(session->clp, now);
> }
>
> static int _nfs4_proc_create_session(struct nfs_client *clp,
> diff --git a/fs/nfs/nfs4renewd.c b/fs/nfs/nfs4renewd.c
> index 6ea431b..ff876dd 100644
> --- a/fs/nfs/nfs4renewd.c
> +++ b/fs/nfs/nfs4renewd.c
> @@ -138,15 +138,12 @@
> *
> * @clp: pointer to nfs_client
> * @lease: new value for lease period
> - * @lastrenewed: time at which lease was last renewed
> */
> void nfs4_set_lease_period(struct nfs_client *clp,
> - unsigned long lease,
> - unsigned long lastrenewed)
> + unsigned long lease)
> {
> spin_lock(&clp->cl_lock);
> clp->cl_lease_time = lease;
> - clp->cl_last_renewal = lastrenewed;
> spin_unlock(&clp->cl_lock);
>
> /* Cap maximum reconnect timeout at 1/2 lease period */
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index 3455232..d7b02fd 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -102,7 +102,8 @@ static int nfs4_setup_state_renewal(struct
> nfs_client *clp)
> now = jiffies;
> status = nfs4_proc_get_lease_time(clp, &fsinfo);
> if (status == 0) {
> - nfs4_set_lease_period(clp, fsinfo.lease_time * HZ,
> now);
> + nfs4_set_lease_period(clp, fsinfo.lease_time * HZ);
> + nfs4_do_renew_lease(clp, now);
As stated above, this call to nfs4_do_renew_lease() is not needed.
> nfs4_schedule_state_renewal(clp);
> }
>
>
>
> But it potentially has issues, as just pointed out, mainly with not
> being compliant with rfc again.
See above.
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx