Re: [PATCH 09/11] vfs, fscache: Add an IS_KERNEL_FILE() macro for the S_KERNEL_FILE flag
From: David Howells
Date: Wed Jan 19 2022 - 04:18:30 EST
Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> On Tue, Jan 18, 2022 at 05:40:14PM +0000, David Howells wrote:
> > Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> >
> > > On Tue, Jan 18, 2022 at 01:54:54PM +0000, David Howells wrote:
> > > > Add an IS_KERNEL_FILE() macro to test the S_KERNEL_FILE inode flag as is
> > > > common practice for the other inode flags[1].
> > >
> > > Please fix the flag to have a sensible name first, as the naming of the
> > > flag and this new helper is utterly wrong as we already discussed.
> >
> > And I suggested a new name, which you didn't comment on.
>
> Again, look at the semantics of the flag: The only thing it does in the
> VFS is to prevent a rmdir. So you might want to name it after that.
>
> Or in fact drop the flag entirely. We don't have that kind of
> protection for other in-kernel file use or important userspace daemons
> either. I can't see why cachefiles is the magic snowflake here that
> suddenly needs semantics no one else has.
The flag cannot just be dropped - it's an important part of the interaction
with cachefilesd with regard to culling. Culling to free up space is
offloaded to userspace rather than being done within the kernel.
Previously, cachefiles, the kernel module, had to maintain a huge tree of
records of every backing inode that it was currently using so that it could
forbid cachefilesd to cull one when cachefilesd asked. I've reduced that to a
single bit flag on the inode struct, thereby saving both memory and time. You
can argue whether it's worth sacrificing an inode flag bit for that, but the
flag can be reused for any other kernel service that wants to similarly mark
an inode in use.
Further, it's used as a mark to prevent cachefiles accidentally using an inode
twice - say someone misconfigures a second cache overlapping the first - and,
again, this works if some other kernel driver wants to mark inode it is using
in use. Cachefiles will refuse to use them if it ever sees them, so no
problem there.
And it's not true that we don't have that kind of protection for other
in-kernel file use. See S_SWAPFILE. I did consider using that, but that has
other side effects. I mentioned that perhaps I should make swapon set
S_KERNEL_FILE also. Also blockdevs have some exclusion also, I think.
The rmdir thing should really apply to rename and unlink also. That's to
prevent someone, cachefilesd included, causing cachefiles to malfunction by
removing the directories it created. Possibly this should be a separate bit
to S_KERNEL_FILE, maybe S_NO_DELETE.
So I could change S_KERNEL_FILE to S_KERNEL_LOCK, say, or maybe S_EXCLUSIVE.
David