On Sat, Jan 31, 2004 at 09:59:06AM -0600, Steve Lord wrote:
The vn_revalidate call is called out of linvfs_setattr,
which is called with the i_sem held, it is also potentially called out
of linvfs_getattr, although since the i_size is always maintained
as it is changed, this call should not actually be updating the size.
Possibly changing the code in vn_revalidate to do this:
if (i_size_read(inode) < va.va_size))
i_size_write(inode, va.va_size);
Would be a good starting point, I suspect those calls from the nfs
revalidate call are not really going to change the inode size. My
guess is this will make your problem go away.
Probably some larger code restructure is needed so that revalidate
knows if the i_sem is held or not at this point.
I think the right fix is to update the Linux i_size always shortly after
di_size is updated. There's a lot of updates in the directory code while
should be handled by an i_size_write in the matching linvfs routines, and
there's a few more but we should be able to handle those without
vn_revalidate aswell.
The O_DIRECT write case is the hard one. In XFS's internal view of
the world, the inode size is maintained via the XFS_ILOCK, but we
only hold that across metadata manipulation within the fs code,
not across I/O such as a call to generic_file_aio_write_nolock.
Right now the only way I see of dealing with that is to make
writes which we know will extent the file hold the i_sem for
the duration in the O_DIRECT case.
That's the more difficult case. Any reson why you'd hold i_sem
for the whole O_DIRECT I/O instead of just for updating i_sem?
Note that the EOF zeroing code also calls i_size_write.