Re: 2.6.28-rc3 truncates nfsd results

From: J. Bruce Fields
Date: Wed Nov 05 2008 - 18:07:37 EST


Seems good to me; I cleaned up commit message, as below, and I'll submit
in a few days. If you didn't mind me adding a signed-off-by for you,
that'd also be good. I hope we've got it right this time!

--b.

commit 19e1e96e0b51f9ad4cb0329e6b32a742cf9db713
Author: Doug Nazar <nazard@xxxxxxxxxxxx>
Date: Wed Nov 5 06:16:28 2008 -0500

Fix nfsd truncation of readdir results

Commit 8d7c4203 "nfsd: fix failure to set eof in readdir in some
situations" introduced a bug: on a directory in an exported ext3
filesystem with dir_index unset, a READDIR will only return about 250
entries, even if the directory was larger.

Bisected it back to this commit; reverting it fixes the problem.

It turns out that in this case ext3 reads a block at a time, then
returns from readdir, which means we can end up with buf.full==0 but
with more entries in the directory still to be read. Before 8d7c4203
(but after c002a6c797 "Optimise NFS readdir hack slightly"), this would
cause us to return the READDIR result immediately, but with the eof bit
unset. That could cause a performance regression (because the client
would need more roundtrips to the server to read the whole directory),
but no loss in correctness, since the cleared eof bit caused the client
to send another readdir. After 8d7c4203, the setting of the eof bit
made this a correctness problem.

So, move nfserr_eof into the loop and remove the buf.full check so that
we loop until buf.used==0. The following seems to do the right thing
and reduces the network traffic since we don't return a READDIR result
until the buffer is full.

Tested on an empty directory & large directory; eof is properly sent and
there are no more short buffers.

Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxxxxxx>

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 848a03e..4433c8f 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1875,11 +1875,11 @@ static int nfsd_buffered_readdir(struct file *file, filldir_t func,
return -ENOMEM;

offset = *offsetp;
- cdp->err = nfserr_eof; /* will be cleared on successful read */

while (1) {
unsigned int reclen;

+ cdp->err = nfserr_eof; /* will be cleared on successful read */
buf.used = 0;
buf.full = 0;

@@ -1912,9 +1912,6 @@ static int nfsd_buffered_readdir(struct file *file, filldir_t func,
de = (struct buffered_dirent *)((char *)de + reclen);
}
offset = vfs_llseek(file, 0, SEEK_CUR);
- cdp->err = nfserr_eof;
- if (!buf.full)
- break;
}

done:
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/