Re: nfs: infinite loop in fcntl(F_SETLKW)

From: J. Bruce Fields
Date: Sat Apr 12 2008 - 20:09:16 EST


On Fri, Apr 11, 2008 at 09:12:23PM +0200, Miklos Szeredi wrote:
> > > OK. So the correct fix here should really be applied to fcntl_setlk().
> > > There is absolutely no reason why we should be looping at all if the
> > > filesystem has a ->lock() method.
> > >
> > > In fact, this looping behaviour was introduced recently in commit
> > > 7723ec9777d9832849b76475b1a21a2872a40d20.
> >
> > Apologies, that was indeed a behavioral change introduced in a commit
> > that claimed just to be shuffling code around.
>
> Yeah, that patch looks totally wrong. It's not generally a good idea
> to do a loop where the exit condition depends on something you don't
> control. And error values from filesystem methods are typically like
> that. For example with fuse, the error code could come from an
> unprivileged userspace process.
>
> I didn't realize this aspect of the bug previously, because I
> concentrated on the lockd inconsistency.

So, does this patch on its own fix the problem you saw?

Any extra eyes welcome....

--b.

commit e56100676b9ea3b2d5f3e937c3ce8a5149cffb84
Author: J. Bruce Fields <bfields@xxxxxxxxxxxxxx>
Date: Sat Apr 12 18:12:15 2008 -0400

locks: fix possible infinite loop in fcntl(F_SETLKW) over nfs

Miklos Szeredi found the bug:

"Basically what happens is that on the server nlm_fopen() calls
nfsd_open() which returns -EACCES, to which nlm_fopen() returns
NLM_LCK_DENIED.

"On the client this will turn into a -EAGAIN (nlm_stat_to_errno()),
which in turn will cause fcntl_setlk() to retry forever."

So, for example, opening a file on an nfs filesystem, changing
permissions to forbid further access, then trying to lock the file,
could result in an infinite loop.
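
The scenario above could be exercised from userspace with something like the following. This is only a sketch: the helper name lock_after_chmod() is made up for illustration, and whether it actually hits the loop depends on the file living on an nfs mount and on the client and server versions involved. On a local filesystem the lock is simply granted.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a file, remove all permission bits,
 * then request a blocking whole-file write lock.
 * Returns 0 if the lock was granted, or a negative errno value.
 */
int lock_after_chmod(const char *path)
{
	struct flock fl = {
		.l_type = F_WRLCK,	/* exclusive lock */
		.l_whence = SEEK_SET,
		.l_start = 0,
		.l_len = 0,		/* 0 means "to end of file" */
	};
	int fd, ret = 0;

	assert(path != NULL);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -errno;

	/* Forbid further access while we still hold the open descriptor. */
	if (fchmod(fd, 0) < 0) {
		ret = -errno;
		goto out;
	}

	/* Blocking lock request; this is the call that ends up in fcntl_setlk(). */
	if (fcntl(fd, F_SETLKW, &fl) < 0)
		ret = -errno;
out:
	close(fd);
	return ret;
}
```

Calling lock_after_chmod() on a path under an nfs mount is the interesting case; on an affected kernel the fcntl() call would be expected to spin rather than return.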

And Trond Myklebust identified the culprit, from Marc Eshel and me:

7723ec9777d9832849b76475b1a21a2872a40d20 "locks: factor out
generic/filesystem switch from setlock code"

That commit claimed to just be reshuffling code, but actually introduced
a behavioral change by calling the lock method repeatedly as long as it
returned -EAGAIN.

We assumed this would be safe, since we assumed a lock of type SETLKW
would only return with either success or an error other than -EAGAIN.
However, nfs can in fact return -EAGAIN in this situation, and
independently of whether that behavior is correct or not, we don't
actually need this change, and it seems far safer not to depend on such
assumptions about the filesystem's ->lock method.

Therefore, revert the problematic part of the original commit. This
leaves vfs_lock_file() and its other callers unchanged, while returning
fcntl_setlk and fcntl_setlk64 to their former behavior.
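
To make the difference concrete, here is a small userspace model of the two control flows. It is not kernel code: lock_fn, always_eagain(), retrying_caller() and single_call_caller() are names invented for this sketch, and max_tries is an artificial cap that exists only so the demonstration terminates. The model assumes the wait between retries returns immediately, which is presumably what happens when the filesystem never queued the lock as blocked.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a filesystem's ->lock() method. */
typedef int (*lock_fn)(void *ctx);

/* Models a ->lock() that always reports -EAGAIN and counts its calls. */
int always_eagain(void *ctx)
{
	int *calls = ctx;

	(*calls)++;
	return -EAGAIN;
}

/*
 * Shape of the loop being reverted: keep calling the method for a
 * blocking request as long as it returns -EAGAIN.  The real loop had
 * no bound; max_tries is added here for the demonstration only.
 */
int retrying_caller(lock_fn lock, void *ctx, int blocking, int max_tries)
{
	int error = 0;
	int i;

	assert(max_tries > 0);

	for (i = 0; i < max_tries; i++) {
		error = lock(ctx);
		if (error != -EAGAIN || !blocking)
			break;
		/* The kernel would wait here; modelled as returning at once. */
	}
	return error;
}

/* Shape after the revert: call the method once and return its result. */
int single_call_caller(lock_fn lock, void *ctx)
{
	return lock(ctx);
}
```

With always_eagain() as the method, retrying_caller() uses up every one of its max_tries attempts for a blocking request, while single_call_caller() makes exactly one call and hands -EAGAIN back to its caller.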

Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxxxxxx>
Cc: Miklos Szeredi <mszeredi@xxxxxxx>
Cc: Trond Myklebust <trond.myklebust@xxxxxxxxxx>
Cc: Marc Eshel <eshel@xxxxxxxxxxxxxxx>

diff --git a/fs/locks.c b/fs/locks.c
index d83fab1..43c0af2 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1801,17 +1801,21 @@ again:
 	if (error)
 		goto out;

-	for (;;) {
-		error = vfs_lock_file(filp, cmd, file_lock, NULL);
-		if (error != -EAGAIN || cmd == F_SETLK)
-			break;
-		error = wait_event_interruptible(file_lock->fl_wait,
-				!file_lock->fl_next);
-		if (!error)
-			continue;
+	if (filp->f_op && filp->f_op->lock != NULL)
+		error = filp->f_op->lock(filp, cmd, file_lock);
+	else {
+		for (;;) {
+			error = posix_lock_file(filp, file_lock, NULL);
+			if (error != -EAGAIN || cmd == F_SETLK)
+				break;
+			error = wait_event_interruptible(file_lock->fl_wait,
+					!file_lock->fl_next);
+			if (!error)
+				continue;

-		locks_delete_block(file_lock);
-		break;
+			locks_delete_block(file_lock);
+			break;
+		}
 	}

 	/*
@@ -1925,17 +1929,21 @@ again:
 	if (error)
 		goto out;

-	for (;;) {
-		error = vfs_lock_file(filp, cmd, file_lock, NULL);
-		if (error != -EAGAIN || cmd == F_SETLK64)
-			break;
-		error = wait_event_interruptible(file_lock->fl_wait,
-				!file_lock->fl_next);
-		if (!error)
-			continue;
+	if (filp->f_op && filp->f_op->lock != NULL)
+		error = filp->f_op->lock(filp, cmd, file_lock);
+	else {
+		for (;;) {
+			error = posix_lock_file(filp, file_lock, NULL);
+			if (error != -EAGAIN || cmd == F_SETLK64)
+				break;
+			error = wait_event_interruptible(file_lock->fl_wait,
+					!file_lock->fl_next);
+			if (!error)
+				continue;

-		locks_delete_block(file_lock);
-		break;
+			locks_delete_block(file_lock);
+			break;
+		}
 	}

 	/*