Re: [PATCH v2] fs: rename S_KERNEL_FILE

From: David Howells
Date: Fri Jan 28 2022 - 05:12:35 EST


Christoph Hellwig <hch@xxxxxx> wrote:

> S_KERNEL_FILE is grossly misnamed. We have plenty of files hold

"held".

> open by the kernel kernel using filp_open.

You said "kernel" twice.

And so what? Cachefiles holds files open by filp_open - but it can only do so
temporarily otherwise EMFILE/ENFILE and OOMs start to become a serious problem
because it could end up holding thousands of files open (one or two of the
xfstests cause this to happen).

Further, holding the file open *doesn't* prevent cachefilesd from trying to
cull the file to make space whilst it's "in use".

Yet further, I'm not holding the directories that form the cache layout open
with filp_open at all - I'm not reading them, so that would just be a waste of
resources - but I really don't want cachefilesd culling them because it sees
they're empty but cachefiles is holding them ready.

> This flag OTOH signals the inode as being a special snowflake that
> cachefiles holds onto that can't be unlinked because of ..., erm, pixie
> dust.

Really? I presume you read the explanation I gave of the races that are a
consequence of splitting the driver between the kernel and userspace? I could
avoid them - or at least mitigate them - by keeping my own list of all the
inodes in use by cachefiles so that cachefilesd can query it. I did, in fact,
use to have such a list, but the core kernel already has such lists and the
facilities to translate pathnames into internal objects, so my stuff ought to
be redundant - all I need is one inode flag.

Further, that inode flag can be shared with anyone else who wants to do
something similar. It's just an "I'm using this" lock. There's no particular
reason to limit its use to cachefiles. In fact, it is better if it is then
shared so that in the unlikely event that two drivers try to use the same
file, an error will occur.

I do use it to defend cachefiles against itself also. In the event that
there's a bug or a race and it tries to reuse its own cache - particularly
with something like NFS that can have multiple superblocks for the same
source, just with different I/O parameters, for example - this can lead to
data corruption. I try to defend against it in fscache also, but there can be
delayed effects due to object finalisation being done in the background after
fscache has returned to the netfs.

Now, I do think there's an argument to be made for splitting the flag into
two, as I advanced in a proposed patch. One piece would just be an "I'm using
this", the other would be a "don't delete this" flag. Both are of potential
use to other drivers.

I take it you'd prefer this to be done a different way?

David