Re: [PATCH 1/3] vfs: add new f_op->syncfs vector
From: Jan Kara
Date: Thu Dec 17 2020 - 04:58:15 EST
On Thu 17-12-20 00:49:35, Al Viro wrote:
> [Christoph added to Cc...]
> On Wed, Dec 16, 2020 at 06:31:47PM -0500, Vivek Goyal wrote:
> > Current implementation of __sync_filesystem() ignores the return code
> > from ->sync_fs(). I am not sure why that's the case. There must have
> > been some historical reason for this.
> >
> > Ignoring ->sync_fs() return code is problematic for overlayfs where
> > it can return error if sync_filesystem() on upper super block failed.
> > That error will simply be lost and sycnfs(overlay_fd), will get
> > success (despite the fact it failed).
> >
> > If we modify existing implementation, there is a concern that it will
> > lead to user space visible behavior changes and break things. So
> > instead implement a new file_operations->syncfs() call which will
> > be called in syncfs() syscall path. Return code from this new
> > call will be captured. And all the writeback error detection
> > logic can go in there as well. Only filesystems which implement
> > this call get affected by this change. Others continue to fallback
> > to existing mechanism.
>
> That smells like a massive source of confusion down the road. I'd just
> looked through the existing instances; many always return 0, but quite
> a few sometimes try to return an error:
> fs/btrfs/super.c:2412: .sync_fs = btrfs_sync_fs,
> fs/exfat/super.c:204: .sync_fs = exfat_sync_fs,
> fs/ext4/super.c:1674: .sync_fs = ext4_sync_fs,
> fs/f2fs/super.c:2480: .sync_fs = f2fs_sync_fs,
> fs/gfs2/super.c:1600: .sync_fs = gfs2_sync_fs,
> fs/hfsplus/super.c:368: .sync_fs = hfsplus_sync_fs,
> fs/nilfs2/super.c:689: .sync_fs = nilfs_sync_fs,
> fs/ocfs2/super.c:139: .sync_fs = ocfs2_sync_fs,
> fs/overlayfs/super.c:399: .sync_fs = ovl_sync_fs,
> fs/ubifs/super.c:2052: .sync_fs = ubifs_sync_fs,
> is the list of such. There are 4 method callers:
> dquot_quota_sync(), dquot_disable(), __sync_filesystem() and
> sync_fs_one_sb(). For sync_fs_one_sb() we want to ignore the
> return value; for __sync_filesystem() we almost certainly
> do *not* - it ends with return __sync_blockdev(sb->s_bdev, wait),
> after all. The question for that one is whether we want
> __sync_blockdev() called even in case of ->sync_fs() reporting
> a failure, and I suspect that it's safer to call it anyway and
> return the first error value we'd got. No idea about quota
> situation.
WRT quota situation: All the ->sync_fs() calls there are due to cache
coherency reasons (we need to get quota changes to disk, then prune quota
files's page cache, and then userspace can read current quota structures
from the disk). We don't want to fail dquot_disable() just because caches
might be incoherent so ignoring ->sync_fs() return value there is fine.
With dquot_quota_sync() it might make some sense to return the error -
that's just a backend for Q_SYNC quotactl(2). OTOH I'm not sure anybody
really cares - Q_SYNC is rarely used.
Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR