Please assist in locating NFS race in old vendor kernel

From: Nikita V. Youshchenko
Date: Wed May 09 2007 - 13:40:17 EST


Hello.

I'm currently working with an embedded system that runs a very old,
2.4.17-based vendor kernel.

Among other issues, there is a race in the NFS code that I'm currently
trying to understand and fix.

The race is the following.
Some I/O is done on a file located on an NFS-mounted filesystem. At some
moment, ftruncate() is called. And in very rare but still reproducible
cases:
- first, inode->i_size is set to the new value, in process context
- then, inode->i_size is restored to the old value, in rpciod context.
I was able to catch this by adding some logging at the point where
__nfs_refresh_inode() updates inode->i_size.

It looks like the inode size is being clobbered (the old value is restored)
by the completion handler of an RPC operation that was started before
ftruncate().

The test on which the race is reproduced more or less reliably
is "fsx-linux" from the LTP suite.
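
For illustration, a narrower user-space check of the symptom might look
like the sketch below. It is not taken from fsx-linux: the helper name
check_truncate_size() and its parameters are hypothetical, and it assumes
that the stale i_size becomes visible through fstat() shortly after
ftruncate() returns. Whether this alone is enough to hit the race is not
certain, since the race needs an RPC still in flight at truncate time.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of fsx-linux or the kernel).
 * Writes old_size bytes to path, truncates the file to new_size, then
 * calls fstat() up to 'polls' times and compares st_size to new_size.
 *
 * Returns  0 if st_size matched new_size on every poll,
 *          1 if a different size was observed (the symptom),
 *         -1 on any I/O error.
 */
int check_truncate_size(const char *path, off_t old_size, off_t new_size,
                        int polls)
{
    char buf[4096];
    struct stat st;
    off_t done = 0;
    int i, result = 0;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;

    /* Generate some write traffic before the truncate. */
    memset(buf, 'x', sizeof(buf));
    while (done < old_size) {
        size_t want = sizeof(buf);
        ssize_t n;

        if ((off_t)want > old_size - done)
            want = (size_t)(old_size - done);
        n = write(fd, buf, want);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        done += n;
    }

    if (ftruncate(fd, new_size) < 0) {
        close(fd);
        return -1;
    }

    /* After ftruncate() the reported size should stay at new_size. */
    for (i = 0; i < polls; i++) {
        if (fstat(fd, &st) < 0) {
            result = -1;
            break;
        }
        if (st.st_size != new_size) {
            result = 1;
            break;
        }
    }

    close(fd);
    return result;
}
```

Calling this in a loop from main() with a path on the NFS mount, and
stopping at the first return value of 1, would be one way to use it. On a
local filesystem it is expected to return 0 every time.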

The system in question is SMP, which makes the race happen more often.

I guess the race was fixed long ago in more modern kernels.
But I'm not familiar with the NFS code, and after some attempts to find
the relevant fix I feel lost.

Could someone more familiar with the NFS implementation please point me
where to look?

Nikita
