Re: [PATCH 09/11] vfs, fscache: Add an IS_KERNEL_FILE() macro for the S_KERNEL_FILE flag
From: Christian Brauner
Date: Wed Jan 19 2022 - 06:16:11 EST
On Wed, Jan 19, 2022 at 09:18:05AM +0000, David Howells wrote:
> Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>
> > On Tue, Jan 18, 2022 at 05:40:14PM +0000, David Howells wrote:
> > > Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> > >
> > > > On Tue, Jan 18, 2022 at 01:54:54PM +0000, David Howells wrote:
> > > > > Add an IS_KERNEL_FILE() macro to test the S_KERNEL_FILE inode flag as is
> > > > > common practice for the other inode flags[1].
> > > >
> > > > Please fix the flag to have a sensible name first, as the naming of the
> > > > flag and this new helper is utterly wrong as we already discussed.
> > >
> > > And I suggested a new name, which you didn't comment on.
> >
> > Again, look at the semantics of the flag: The only thing it does in the
> > VFS is to prevent a rmdir. So you might want to name it after that.
> >
> > Or in fact drop the flag entirely. We don't have that kind of
> > protection for other in-kernel file use or important userspace daemons
> > either. I can't see why cachefiles is the magic snowflake here that
> > suddenly needs semantics no one else has.
>
> The flag cannot just be dropped - it's an important part of the interaction
> with cachefilesd with regard to culling. Culling to free up space is
> offloaded to userspace rather than being done within the kernel.
>
> Previously, cachefiles, the kernel module, had to maintain a huge tree of
> records of every backing inode that it was currently using so that it could
> forbid cachefilesd to cull one when cachefilesd asked. I've reduced that to a
> single bit flag on the inode struct, thereby saving both memory and time. You
> can argue whether it's worth sacrificing an inode flag bit for that, but the
> flag can be reused for any other kernel service that wants to similarly mark
> an inode in use.
>
> Further, it's used as a mark to prevent cachefiles accidentally using an inode
> twice - say someone misconfigures a second cache overlapping the first - and,
> again, this works if some other kernel driver wants to mark inode it is using
> in use. Cachefiles will refuse to use them if it ever sees them, so no
> problem there.
>
> And it's not true that we don't have that kind of protection for other
> in-kernel file use. See S_SWAPFILE. I did consider using that, but that has
> other side effects. I mentioned that perhaps I should make swapon set
> S_KERNEL_FILE also. Also blockdevs have some exclusion also, I think.
>
> The rmdir thing should really apply to rename and unlink also. That's to
> prevent someone, cachefilesd included, causing cachefiles to malfunction by
> removing the directories it created. Possibly this should be a separate bit
> to S_KERNEL_FILE, maybe S_NO_DELETE.
>
> So I could change S_KERNEL_FILE to S_KERNEL_LOCK, say, or maybe S_EXCLUSIVE.
[ ] S_REMOVE_PROTECTED
[ ] S_UNREMOVABLE
[ ] S_HELD_BUSY
[ ] S_KERNEL_BUSY
[ ] S_BUSY_INTERNAL
[ ] S_BUSY
[ ] S_HELD
?